
Defer

Defer is a powerful feature that makes it possible to run a subset of models or tests in a sandbox environment without having to first build their upstream parents. This can save time and computational resources when you want to test a small number of models in a large project.

Use 'defer' to modify end-of-pipeline models by pointing to production models, instead of running everything upstream.

Defer requires that a manifest from a previous dbt invocation be passed via the --state flag or the DBT_STATE environment variable. Together with the state: selection method, these features enable "Slim CI". Read more about state.

An alternative command that accomplishes similar functionality for different use cases is dbt clone - see the docs for clone for more information.

It is possible to use separate state for state:modified and --defer by passing paths to different manifests to --state/DBT_STATE and --defer-state/DBT_DEFER_STATE, respectively. This enables more granular control in cases where you want to compare against logical state from one environment or past point in time, and defer to applied state from a different environment or point in time. If --defer-state is not specified, deferral will use the manifest supplied to --state. In most cases, you will want to use the same state for both: compare logical changes against production, and also "fail over" to the production environment for unbuilt upstream resources.
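
The fallback described above can be sketched in a few lines of Python. This is an illustration only, not dbt's implementation: the function name pick_manifest_paths and its parameters are hypothetical, and the sketch assumes that an explicitly passed flag takes precedence over the matching environment variable.

```python
import os


def pick_manifest_paths(state=None, defer_state=None, env=None):
    """Return (comparison_path, deferral_path) for one invocation.

    state / defer_state stand in for the --state and --defer-state flags.
    Assumption: an explicit flag wins over its environment variable.
    """
    env = os.environ if env is None else env
    # Path used for state: comparisons (for example state:modified).
    state_path = state or env.get("DBT_STATE")
    # Path used for deferral; falls back to the comparison path when unset.
    defer_path = defer_state or env.get("DBT_DEFER_STATE") or state_path
    return state_path, defer_path


# Only --state given: both roles share the same manifest location.
print(pick_manifest_paths(state="prod-artifacts", env={}))
# Separate locations for comparison and deferral.
print(pick_manifest_paths(state="last-ci-run", defer_state="prod-artifacts", env={}))
```

The first call returns the same path twice, which matches the common case of comparing against and deferring to production.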

Usage

dbt run --select [...] --defer --state path/to/artifacts
dbt test --select [...] --defer --state path/to/artifacts

By default, dbt uses the target namespace to resolve ref calls.

When --defer is enabled, dbt resolves ref calls using the state manifest instead, but only if:

  1. The node isn’t among the selected nodes, and
  2. It doesn’t exist in the database (or --favor-state is used).

Ephemeral models are never deferred, since they serve as "passthroughs" for other ref calls.
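
The resolution rule above can be expressed as a small decision function. The following is a minimal sketch under stated assumptions, not dbt's internal code: Node, resolve_ref, and every parameter name are hypothetical, manifests are reduced to plain dictionaries, and a node that is missing from the state manifest is assumed to resolve in the target namespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical, simplified stand-in for a manifest node."""
    name: str
    schema: str
    materialized: str = "view"


def resolve_ref(name, current, state, selected, existing, favor_state=False):
    """Return the 'schema.name' relation that ref(name) points to under --defer.

    current, state: dicts mapping node name -> Node (current project, state manifest).
    selected: names of the nodes selected in this invocation.
    existing: set of 'schema.name' relations already present in the target database.
    """
    node = current[name]
    state_node = state.get(name)
    defer = (
        state_node is not None                 # assumption: nothing to defer to otherwise
        and node.materialized != "ephemeral"   # ephemeral models are never deferred
        and name not in selected               # condition 1: not a selected node
        # condition 2: missing from the database, or --favor-state is used
        and (favor_state or f"{node.schema}.{name}" not in existing)
    )
    chosen = state_node if defer else node
    return f"{chosen.schema}.{chosen.name}"


current = {"model_a": Node("model_a", "dev_alice"), "model_b": Node("model_b", "dev_alice")}
state = {"model_a": Node("model_a", "prod"), "model_b": Node("model_b", "prod")}

# model_a is unselected and absent from the dev schema, so it is deferred.
print(resolve_ref("model_a", current, state, selected={"model_b"}, existing=set()))
# model_b is selected, so it always resolves in the target namespace.
print(resolve_ref("model_b", current, state, selected={"model_b"}, existing=set()))
```

With an empty development schema, the first call prints prod.model_a and the second prints dev_alice.model_b. Passing existing={"dev_alice.model_a"} makes the first call resolve to dev_alice.model_a unless favor_state=True is also set.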

When using defer, you may be selecting from production datasets, development datasets, or a mix of both. Note that this can yield unexpected results:

  • if you apply env-specific limits in dev but not prod, as you may end up selecting more data than you expect
  • when executing tests that depend on multiple parents (e.g. relationships), since you're testing "across" environments

Deferral requires both --defer and --state to be set, either by passing flags explicitly or by setting environment variables (DBT_DEFER and DBT_STATE). If you use dbt Cloud, read about how to set up CI jobs.

Favor state

When --favor-state is passed, dbt prioritizes node definitions from the --state directory. However, this doesn’t apply if the node is also part of the selected nodes.

Example

In my local development environment, I create all models in my target schema, dev_alice. In production, the same models are created in a schema named prod.

I access the dbt-generated artifacts (namely manifest.json) from a production run, and copy them into a local directory called prod-run-artifacts.

run

I've been working on model_b:

models/model_b.sql
select

id,
count(*)

from {{ ref('model_a') }}
group by 1

I want to test my changes. Nothing exists in my development schema, dev_alice.

dbt run --select "model_b"
target/run/my_project/model_b.sql
create or replace view dev_alice.model_b as (

select

id,
count(*)

from dev_alice.model_a
group by 1

)

Unless I had previously run model_a into this development environment, dev_alice.model_a will not exist, thereby causing a database error.

test

I also have a relationships test that establishes referential integrity between model_a and model_b:

models/resources.yml
version: 2

models:
  - name: model_b
    columns:
      - name: id
        tests:
          - relationships:
              to: ref('model_a')
              field: id

(A bit silly, since all the data in model_b had to come from model_a, but suspend your disbelief.)

dbt test --select "model_b"
target/compiled/.../relationships_model_b_id__id__ref_model_a_.sql
select count(*) as validation_errors
from (
    select id as id from dev_alice.model_b
) as child
left join (
    select id as id from dev_alice.model_a
) as parent on parent.id = child.id
where child.id is not null
  and parent.id is null

The relationships test requires both model_a and model_b. Because I did not build model_a in my previous dbt run, dev_alice.model_a does not exist and this test query fails.
