After discussion with @saraleon1 -- this PR adds a Discourse link to the incremental strategies discussion for large datasets and simplifies the 'When should I use an incremental model?' section. The Discourse link has been shared with users a few times, and Sara suggested it would be helpful to have it in the docs.
### Current content section

### New proposed content + link
- [Discourse article](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303)
- In addition to these considerations for incremental models, it's important to understand their limits and challenges, particularly with large datasets. For more insights into efficient strategies, performance considerations, and the handling of late-arriving data in incremental models, refer to the [On the Limits of Incrementality](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303) Discourse discussion.
Changed file: `website/docs/docs/build/incremental-models.md` (+9 −5)
```diff
@@ -154,17 +154,21 @@ For detailed usage instructions, check out the [dbt run](/reference/commands/run
 
 # Understanding incremental models
 ## When should I use an incremental model?
-It's often desirable to build models as tables in your data warehouse since downstream queries are more performant. While the `table` materialization also creates your models as tables, it rebuilds the table on each dbt run. These runs can become problematic in that they use a lot of compute when either:
-* source data tables have millions, or even billions, of rows.
-* the transformations on the source data are computationally expensive (that is, take a long time to execute), for example, complex Regex functions, or UDFs are being used to transform data.
 
-Like many things in programming, incremental models are a trade-off between complexity and performance. While they are not as straightforward as the `view` and `table` materializations, they can lead to significantly better performance of your dbt runs.
+Building models as tables in your data warehouse is often preferred for better query performance. However, using `table` materialization can be computationally intensive, especially when:
+
+- Source data has millions or billions of rows.
+- Data transformations on the source data are computationally expensive (take a long time to execute) and complex, like using Regex or UDFs.
+
+Incremental models offer a balance between complexity and improved performance compared to `view` and `table` materializations and offer better performance of your dbt runs.
+
+In addition to these considerations for incremental models, it's important to understand their limitations and challenges, particularly with large datasets. For more insights into efficient strategies, performance considerations, and the handling of late-arriving data in incremental models, refer to the [On the Limits of Incrementality](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303) Discourse discussion.
 
 ## Understanding the is_incremental() macro
 The `is_incremental()` macro will return `True` if _all_ of the following conditions are met:
 * the destination table already exists in the database
 * dbt is _not_ running in full-refresh mode
-*the running model is configured with `materialized='incremental'`
+* The running model is configured with `materialized='incremental'`
 
 Note that the SQL in your model needs to be valid whether `is_incremental()` evaluates to `True` or `False`.
```
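The conditions above can be sketched in a minimal incremental model. This is an illustrative example only; the source relation (`raw_app_data.events`) and the `event_time` column are hypothetical, while `config`, `is_incremental()`, and `{{ this }}` are standard dbt constructs:

```sql
-- models/events.sql (hypothetical file name)
{{ config(materialized='incremental') }}

select * from raw_app_data.events

{% if is_incremental() %}

  -- This filter is only applied on an incremental run, i.e. when the
  -- destination table already exists, dbt is not in full-refresh mode,
  -- and the model is configured with materialized='incremental'.
  where event_time > (select max(event_time) from {{ this }})

{% endif %}
```

On a first run (or with `--full-refresh`), the `where` clause is omitted and the whole table is built; on subsequent runs, only rows newer than the latest `event_time` already in the table are processed.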