
Commit c5e0f60

add discourse link + simplify language (#4675)
After discussion with @saraleon1, this PR adds a Discourse link to the incremental strategies discussion for large datasets and simplifies the "When should I use an incremental model?" section. The Discourse link has been shared with users a few times, and Sara suggested it would be helpful to have it in the docs.

### Current content section

![Screenshot 2023-12-19 at 13 06 23](https://github.com/dbt-labs/docs.getdbt.com/assets/89008547/fc3f9b52-3e19-4ee3-bac9-e16212118dbf)

### New proposed content + link

- [Discourse article](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303)
- In addition to these considerations for incremental models, it's important to understand their limits and challenges, particularly with large datasets. For more insights into efficient strategies, performance considerations, and the handling of late-arriving data in incremental models, refer to the [On the Limits of Incrementality](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303) discourse discussion.
2 parents c9e88fa + 03eb38d commit c5e0f60

File tree

1 file changed (+9 lines, −5 lines)


website/docs/docs/build/incremental-models.md

Lines changed: 9 additions & 5 deletions
@@ -154,17 +154,21 @@ For detailed usage instructions, check out the [dbt run](/reference/commands/run
 
 # Understanding incremental models
 ## When should I use an incremental model?
-It's often desirable to build models as tables in your data warehouse since downstream queries are more performant. While the `table` materialization also creates your models as tables, it rebuilds the table on each dbt run. These runs can become problematic in that they use a lot of compute when either:
-* source data tables have millions, or even billions, of rows.
-* the transformations on the source data are computationally expensive (that is, take a long time to execute), for example, complex Regex functions, or UDFs are being used to transform data.
 
-Like many things in programming, incremental models are a trade-off between complexity and performance. While they are not as straightforward as the `view` and `table` materializations, they can lead to significantly better performance of your dbt runs.
+Building models as tables in your data warehouse is often preferred for better query performance. However, using `table` materialization can be computationally intensive, especially when:
+
+- Source data has millions or billions of rows.
+- Data transformations on the source data are computationally expensive (take a long time to execute) and complex, like using Regex or UDFs.
+
+Incremental models offer a balance between complexity and improved performance compared to `view` and `table` materializations and offer better performance of your dbt runs.
+
+In addition to these considerations for incremental models, it's important to understand their limitations and challenges, particularly with large datasets. For more insights into efficient strategies, performance considerations, and the handling of late-arriving data in incremental models, refer to the [On the Limits of Incrementality](https://discourse.getdbt.com/t/on-the-limits-of-incrementality/303) discourse discussion.
 
 ## Understanding the is_incremental() macro
 The `is_incremental()` macro will return `True` if _all_ of the following conditions are met:
 * the destination table already exists in the database
 * dbt is _not_ running in full-refresh mode
-* the running model is configured with `materialized='incremental'`
+* The running model is configured with `materialized='incremental'`
 
 Note that the SQL in your model needs to be valid whether `is_incremental()` evaluates to `True` or `False`.
0 commit comments
