website/docs/best-practices/how-we-structure/5-the-rest-of-the-project.md
5 additions & 3 deletions
@@ -102,12 +102,14 @@ We’ve focused heavily thus far on the primary area of action in our dbt project
### Project splitting
- One important, growing consideration in the analytics engineering ecosystem is how and when to split a codebase into multiple dbt projects. Our present stance on this for most projects, particularly for teams starting out, is straightforward: you should avoid it unless you have no other option or it saves you from an even more complex workaround. If you do have the need to split up your project, it’s completely possible through the use of private packages, but the added complexity and separation is, for most organizations, a hindrance, not a help, at present. That said, this is very likely subject to change! [We want to create a world where it’s easy to bring lots of dbt projects together into a cohesive lineage](https://github.com/dbt-labs/dbt-core/discussions/5244). In a world where it’s simple to break up monolithic dbt projects into multiple connected projects, perhaps inside of a modern mono repo, the calculus will be different, and the below situations we recommend against may become totally viable. So watch this space!
+ One important, growing consideration in the analytics engineering ecosystem is how and when to split a codebase into multiple dbt projects. Currently, our advice for most teams, especially those just starting, is fairly simple: in most cases, we recommend doing so with [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro)! dbt Mesh allows organizations to handle complexity by connecting several dbt projects rather than relying on one big, monolithic project. This approach is designed to speed up development while maintaining governance.
- - ❌ **Business groups or departments.** Conceptual separations within the project are not a good reason to split up your project. Splitting up, for instance, marketing and finance modeling into separate projects will not only add unnecessary complexity but destroy the unifying effect of collaborating across your organization on cohesive definitions and business logic.
- - ❌ **ML vs Reporting use cases.** Similarly to the point above, splitting a project up based on different use cases, particularly more standard BI versus ML features, is a common idea. We tend to discourage it for the time being. As with the previous point, a foundational goal of implementing dbt is to create a single source of truth in your organization. The features you’re providing to your data science teams should be coming from the same marts and metrics that serve reports on executive dashboards.
+ As breaking up monolithic dbt projects into smaller, connected projects, potentially within a modern mono repo, becomes easier, the scenarios we currently advise against may soon become feasible. So watch this space!
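For readers new to dbt Mesh, here is a minimal sketch of how connected projects reference one another, assuming a hypothetical upstream project named `jaffle_marketing` (project and model names are illustrative, not from the docs):

```yaml
# dependencies.yml in the downstream project -- a hedged sketch.
# `jaffle_marketing` is a hypothetical upstream dbt project whose
# public models this project wants to reference.
projects:
  - name: jaffle_marketing

# A downstream model can then use a two-argument ref, for example:
#   select * from {{ ref('jaffle_marketing', 'fct_campaigns') }}
```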
+ - ✅ **Business groups or departments.** Conceptual separations within the project are the primary reason to split up your project. This allows your business domains to own their own data products and still collaborate using dbt Mesh. For more information about dbt Mesh, please refer to our [dbt Mesh FAQs](/best-practices/how-we-mesh/mesh-5-faqs).
- ✅ **Data governance.** Structural, organizational needs — such as data governance and security — are one of the few worthwhile reasons to split up a project. If, for instance, you work at a healthcare company with only a small team cleared to access raw data with PII in it, you may need to split out your staging models into their own projects to preserve those policies. In that case, you would import your staging project into the project that builds on those staging models as a [private package](https://docs.getdbt.com/docs/build/packages/#private-packages).
- ✅ **Project size.** At a certain point, your project may grow to have simply too many models to present a viable development experience. If you have 1000s of models, it absolutely makes sense to find a way to split up your project.
+ - ❌ **ML vs Reporting use cases.** Similarly to the point above, splitting a project up based on different use cases, particularly more standard BI versus ML features, is a common idea. We tend to discourage it for the time being. As with the previous point, a foundational goal of implementing dbt is to create a single source of truth in your organization. The features you’re providing to your data science teams should be coming from the same marts and metrics that serve reports on executive dashboards.
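To make the private-package pattern from the data governance point concrete, here is a minimal sketch of the downstream project's `packages.yml`; the repository URL and revision tag are illustrative, not from the docs:

```yaml
# packages.yml -- hedged sketch of importing a staging-only project as a
# private package; the git URL and revision are illustrative.
packages:
  - git: "https://github.com/example-org/staging-project.git"
    revision: 1.0.0
```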
website/docs/docs/build/snapshots.md
20 additions & 12 deletions
@@ -52,20 +52,25 @@ It is not possible to "preview data" or "compile sql" for snapshots in dbt Cloud
<VersionBlock firstVersion="1.9">
- In dbt Cloud Versionless and dbt Core v1.9 and later, snapshots are configurations defined in YAML files (typically in your snapshots directory). You'll configure your snapshot to tell dbt how to detect record changes.
+ Configure your snapshots in YAML files to tell dbt how to detect record changes. Define snapshot configurations in YAML, alongside your models, for a cleaner, faster, and more consistent setup.
<File name='snapshots/orders_snapshot.yml'>
```yaml
snapshots:
-   - name: orders_snapshot
-     relation: source('jaffle_shop', 'orders')
+   - name: string
+     relation: relation  # source('my_source', 'my_table') or ref('my_model')
```
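As a concrete instance of the spec above, a minimal filled-in snapshot definition might look like the following; the `jaffle_shop` source and the `id`/`updated_at` columns are illustrative:

```yaml
# snapshots/orders_snapshot.yml -- hedged sketch of a complete definition
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
```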
@@ -82,6 +87,7 @@ The following table outlines the configurations available for snapshots:
| [check_cols](/reference/resource-configs/check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] |
| [updated_at](/reference/resource-configs/updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at |
| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True |
+ | [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names) | Customize the names of the snapshot meta fields | No | dictionary |
- In versions prior to v1.9, the `target_schema` (required) and `target_database` (optional) configurations defined a single schema or database to build a snapshot across users and environment. This created problems when testing or developing a snapshot, as there was no clear separation between development and production environments. In v1.9, `target_schema` became optional, allowing snapshots to be environment-aware. By default, without `target_schema` or `target_database` defined, snapshots now use the `generate_schema_name` or `generate_database_name` macros to determine where to build. Developers can still set a custom location with [`schema`](/reference/resource-configs/schema) and [`database`](/reference/resource-configs/database) configs, consistent with other resource types.
- A number of other configurations are also supported (for example, `tags` and `post-hook`). For the complete list, refer to [Snapshot configurations](/reference/snapshot-configs).
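As a hedged illustration of those additional configs, snapshots can also be configured project-wide in `dbt_project.yml`; the project name and the grant statement below are illustrative, not from the docs:

```yaml
# dbt_project.yml -- hedged sketch; `my_project` and the grant are examples
snapshots:
  my_project:
    +tags: ["snapshots"]
    +post-hook: "grant select on {{ this }} to role reporter"
```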
@@ -160,7 +166,7 @@ To add a snapshot to your project, follow these steps. For users on versions 1.8
### Configuration best practices
- <Expandable alt_header="Use thetimestamp strategy where possible">
+ <Expandable alt_header="Use the timestamp strategy where possible">
This strategy handles column additions and deletions better than the `check` strategy.
@@ -188,9 +194,9 @@ Snapshots can't be rebuilt. Because of this, it's a good idea to put snapshots i
</Expandable>
- <Expandable alt_header="Use ephemeral model to clean or tranform data before snapshotting">
+ <Expandable alt_header="Use ephemeral model to clean or transform data before snapshotting">
- If you need to clean or transform your data before snapshotting, create an ephemeral model (or a staging model) that applies the necessary transformations. Then, reference this model in your snapshot configuration. This approach keeps your snapshot definitions clean and allows you to test and run transformations separately.
+ If you need to clean or transform your data before snapshotting, create an ephemeral model or a staging model that applies the necessary transformations. Then, reference this model in your snapshot configuration. This approach keeps your snapshot definitions clean and allows you to test and run transformations separately.
</Expandable>
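A minimal sketch of that ephemeral-model pattern, assuming a hypothetical staging model `stg_orders`; all names are illustrative, and the two YAML snippets below live in separate files:

```yaml
# models/staging/stg_orders.yml -- hedged sketch. `stg_orders` cleans the
# raw data and is materialized as ephemeral, so only the snapshot table
# is written to the warehouse.
models:
  - name: stg_orders
    config:
      materialized: ephemeral

# snapshots/orders_snapshot.yml -- the snapshot then selects from the
# cleaned model instead of the raw source.
snapshots:
  - name: orders_snapshot
    relation: ref('stg_orders')
    config:
      unique_key: order_id
      strategy: timestamp
      updated_at: updated_at
```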
</VersionBlock>
@@ -203,6 +209,8 @@ When you run the [`dbt snapshot` command](/reference/commands/snapshot):
- The `dbt_valid_to` column will be updated for any existing records that have changed
- The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null`
+ Note that these column names can be customized to your team's or organization's conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config.
Snapshots can be referenced in downstream models the same way as referencing models — by using the [ref](/reference/dbt-jinja-functions/ref) function.
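To illustrate the `snapshot_meta_column_names` note above, a hedged sketch follows; the custom names are illustrative, and the default meta fields being renamed are `dbt_valid_from`, `dbt_valid_to`, `dbt_scd_id`, and `dbt_updated_at`:

```yaml
# snapshots/orders_snapshot.yml -- hedged sketch of renaming meta fields
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      snapshot_meta_column_names:
        dbt_valid_from: start_date
        dbt_valid_to: end_date
        dbt_scd_id: scd_id
        dbt_updated_at: modified_date
```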
website/docs/docs/dbt-versions/release-notes.md
1 addition & 1 deletion
@@ -26,7 +26,7 @@ Release notes are grouped by month for both multi-tenant and virtual private cloud
- **New**: In dbt Cloud Versionless, [Snapshots](/docs/build/snapshots) have been updated to use YAML configuration files instead of SQL snapshot blocks. This new feature simplifies snapshot management and improves performance, and will soon be released in dbt Core 1.9.
- Who does this affect? New users on Versionless can define snapshots using the new YAML specification. Users upgrading to Versionless who use snapshots can keep their existing configuration or can choose to migrate their snapshot definitions to YAML.
- Users on dbt 1.8 and earlier: No action is needed; existing snapshots will continue to work as before. However, we recommend upgrading to Versionless to take advantage of the new snapshot features.
- - **Behavior change:** Set [`state_modified_compare_more_unrendered`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments.
+ - **Behavior change:** Set [`state_modified_compare_more_unrendered_values`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments.
- **Behavior change:** Set the [`skip_nodes_if_on_run_start_fails`](/reference/global-configs/behavior-changes#failures-in-on-run-start-hooks) flag to `True` to skip all selected resources from running if there is a failure on an `on-run-start` hook.
- **Enhancement**: In dbt Cloud Versionless, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This will also be released in dbt Core 1.9.
- **New**: In dbt Cloud Versionless, the `snapshot_meta_column_names` config allows for customizing the snapshot metadata columns. This feature allows an organization to align these automatically-generated column names with their conventions, and will be included in the upcoming dbt Core 1.9 release.
website/docs/reference/global-configs/behavior-changes.md

When the dbt Cloud Maturity is "TBD," it means we have not yet determined the exact date when these flags' default values will change. Affected users will see deprecation warnings in the meantime, and they will receive emails providing advance warning ahead of the maturity date. If you are seeing a deprecation warning, you can either:
- Migrate your project to support the new behavior, and then set the flag to `True` to stop seeing the warnings.
@@ -85,7 +85,7 @@ Set the `skip_nodes_if_on_run_start_fails` flag to `True` to skip all selected resources
The flag is `False` by default.
- Set `state_modified_compare_more_unrendered` to `True` to reduce false positives during `state:modified` checks (especially when configs differ by target environment like `prod` vs. `dev`).
+ Set `state_modified_compare_more_unrendered_values` to `True` to reduce false positives during `state:modified` checks (especially when configs differ by target environment like `prod` vs. `dev`).
Setting the flag to `True` changes the `state:modified` comparison from using rendered values to unrendered values instead. It accomplishes this by persisting `unrendered_config` during model parsing and `unrendered_database` and `unrendered_schema` configs during source parsing.
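For anyone opting in, behavior change flags are set in `dbt_project.yml`; a minimal sketch of enabling the two flags discussed above:

```yaml
# dbt_project.yml -- hedged sketch of opting into the behavior changes
flags:
  state_modified_compare_more_unrendered_values: true
  skip_nodes_if_on_run_start_fails: true
```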
website/docs/reference/node-selection/state-comparison-caveats.md
2 additions & 2 deletions
@@ -46,15 +46,15 @@ dbt test -s "state:modified" --exclude "test_name:relationships"
<VersionBlock firstVersion="1.9">
- To reduce false positives during `state:modified` selection due to env-aware logic, you can set the `state_modified_compare_more_unrendered` [behavior flag](/reference/global-configs/behavior-changes#behavior-change-flags) to `True`.
+ To reduce false positives during `state:modified` selection due to env-aware logic, you can set the `state_modified_compare_more_unrendered_values` [behavior flag](/reference/global-configs/behavior-changes#behavior-change-flags) to `True`.
</VersionBlock>
<VersionBlock lastVersion="1.8">
State comparison works by identifying discrepancies between two manifests. Those discrepancies could be the result of:
1. Changes made to a project in development
- 2. Env-aware logic that causes different behavior based on the `target`, env vars, etc., which can be avoided if you upgrade to dbt Core 1.9 and set the `state_modified_compare_more_unrendered` [behavior flag](/reference/global-configs/behavior-changes#behavior-change-flags) to `True`.
+ 2. Env-aware logic that causes different behavior based on the `target`, env vars, etc., which can be avoided if you upgrade to dbt Core 1.9 and set the `state_modified_compare_more_unrendered_values` [behavior flag](/reference/global-configs/behavior-changes#behavior-change-flags) to `True`.
State comparison detects env-aware config in `dbt_project.yml`. This target-based config won't register as a modification:
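The page goes on to show an example; as a hedged sketch of the kind of target-based config meant here (the project name and the conditional are illustrative):

```yaml
# dbt_project.yml -- hedged sketch of env-aware config. The rendered
# value differs between dev and prod, but the unrendered source text
# is identical, which is what the new flag compares.
models:
  my_project:
    +materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
```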
website/docs/reference/resource-configs/snowflake-configs.md
3 additions & 0 deletions
@@ -9,6 +9,8 @@ To-do:
- use the reference doc structure for this article / split into separate articles
--->
+ <VersionBlock firstVersion="1.9">
## Iceberg table format <Lifecycle status="beta"/>
The dbt-snowflake adapter supports the Iceberg table format. It is available for three of the Snowflake materializations:
@@ -95,6 +97,7 @@ There are some limitations to the implementation you need to be aware of:
- When you use Iceberg tables with dbt, your query result is materialized in Iceberg format. However, dbt often creates intermediary objects as temporary and transient tables for certain materializations, such as incremental ones. It is not possible to configure these temporary objects to also be Iceberg-formatted. You may see non-Iceberg tables created in the logs to support specific materializations, but they will be dropped after usage.
- You cannot incrementally update a preexisting incremental model to be an Iceberg table. To do so, you must fully rebuild the table with the `--full-refresh` flag.
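As a hedged sketch of enabling the Iceberg format on a model (the model name and external volume are illustrative; confirm exact config names against your dbt-snowflake version):

```yaml
# models/marts/fct_orders.yml -- hedged sketch; `fct_orders` and
# `my_external_volume` are illustrative names.
models:
  - name: fct_orders
    config:
      materialized: table
      table_format: iceberg
      external_volume: my_external_volume
```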