Skip to content

Commit a46d0da

Browse files
Merge pull request #413 from BradCr/feature/model-too-many-joins
Feature/model too many joins
2 parents 6c50591 + c88170c commit a46d0da

16 files changed

+88
-12
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,3 +22,4 @@ package-lock.yml
2222

2323
# MacOS
2424
.DS_Store
25+
integration_tests/package-lock.yml

dbt_project.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,7 @@ vars:
6363

6464
# -- DAG variables --
6565
models_fanout_threshold: 3
66+
too_many_joins_threshold: 7
6667

6768
# -- Naming conventions variables --
6869
# to add a new "model type", update the variable model_types

docs/customization/overriding-variables.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -38,16 +38,18 @@ vars:
3838
3939
## DAG Variables
4040
41-
| variable | description | default |
42-
| ----------- | ----------- | ----------- |
43-
| `models_fanout_threshold` | threshold for unacceptable model fanout for `fct_model_fanout` | 3 models |
41+
| variable | description | default |
42+
| ----------- | ------------ | ----------- |
43+
| `models_fanout_threshold` | threshold for unacceptable model fanout for `fct_model_fanout` | 3 models |
44+
| `too_many_joins_threshold` | threshold for the number of references to flag in `fct_too_many_joins` | 7 references |
4445

4546
```yaml title="dbt_project.yml"
46-
# set your model fanout threshold to 10 instead of 3
47+
# set your model fanout threshold to 10 instead of 3 and too many joins from 6 instead of 7
4748
4849
vars:
4950
dbt_project_evaluator:
5051
models_fanout_threshold: 10
52+
too_many_joins_threshold: 6
5153
```
5254

5355
## Naming Convention Variables

docs/rules.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -19,12 +19,13 @@ hide:
1919
|Modeling |[Root Models](../rules/modeling/#root-models) |`fct_root_models`|
2020
|Modeling |[Staging Models Dependent on Downstream Models](../rules/modeling/#staging-models-dependent-on-downstream-models) |`fct_staging_dependent_on_marts_or_intermediate`|
2121
|Modeling |[Unused Sources](../rules/modeling/#unused-sources) |`fct_unused_sources`|
22+
|Modeling |[Models with Too Many Joins](../rules/modeling/#models-with-too-many-joins) |`fct_too_many_joins`|
2223
|Testing |[Missing Primary Key Tests](../rules/testing/#missing-primary-key-tests) |`fct_missing_primary_key_tests`|
2324
|Testing |[Test Coverage](../rules/testing/#test-coverage) |`fct_test_coverage`|
2425
|Documentation |[Undocumented Models](../rules/documentation/#undocumented-models) |`fct_undocumented_models`|
2526
|Documentation |[Documentation Coverage](../rules/documentation/#documentation-coverage) |`fct_documentation_coverage`|
26-
|Documentation |[Undocumented Source Tables](../rules/documentation/#undocumented-source-tables) |`fct_undocumented_source_tables`|
27-
|Documentation |[Undocumented Sources](../rules/documentation/#undocumented-sources) |`fct_undocumented_sources`|
27+
|Documentation |[Undocumented Source Tables](../rules/documentation/#undocumented-source-tables) |`fct_undocumented_source_tables`|
28+
|Documentation |[Undocumented Sources](../rules/documentation/#undocumented-sources) |`fct_undocumented_sources`|
2829
|Structure |[Test Directories](../rules/structure/#test-directories) |`fct_test_directories`|
2930
|Structure |[Model Naming Conventions](../rules/structure/#model-naming-conventions) |`fct_model_naming_conventions`|
3031
|Structure |[Source Directories](../rules/structure/#source-directories) |`fct_source_directories`|

docs/rules/modeling.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -429,3 +429,27 @@ or any other nested information.
429429
```
430430

431431
![A refactored DAG showing three sources which are each being referenced by an accompanying staging model](https://user-images.githubusercontent.com/30663534/159603703-6e94b00b-07d1-4f47-89df-8e5685d9fcf0.png){ width=500 }
432+
433+
---
434+
435+
## Models with Too Many Joins
436+
437+
`fct_too_many_joins` ([source](https://github.com/dbt-labs/dbt-project-evaluator/tree/main/models/marts/dag/fct_too_many_joins.sql)) shows models with a reference to too many other models or sources.
438+
439+
The number of different references to start raising errors is set to 7 by default, but you can set your own threshold by overriding the `too_many_joins_threshold` variable. [See overriding variables section.](../customization/overriding-variables.md)
440+
441+
**Example**
442+
443+
`fct_model_1` directly references seven (7) staging models upstream.
444+
445+
![A DAG showing a model that directly references seven staging models upstream.](https://github.com/BradCr/dbt-project-evaluator/assets/151274228/46ea1f78-1bd7-436b-b15b-f63c726601a1){ width=600 }
446+
447+
**Reason to Flag**
448+
449+
This likely represents a model in which too much is being done. Having a model that too many upstream models introduces a lot of code complexity, which can be challenging to understand and maintain.
450+
451+
**How to Remediate**
452+
453+
Bringing together a reasonable number (typically 4 to 6) of entities or concepts (staging models, or perhaps other intermediate models) that will be joined with another similarly purposed intermediate model to generate a mart. Rather than having too many joins, we can join two intermediate models that each house a piece of the complexity, giving us increased readability, flexibility, testing surface area, and insight into our components.
454+
455+
![A DAG showing a model that directly references only two intermediate models. The intermediate models reference three and four staging models upstream.](https://github.com/BradCr/dbt-project-evaluator/assets/151274228/4b630e3c-f13a-443c-94e5-2d93c713c8f2){ width=700 }

integration_tests/dbt_project.yml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -66,4 +66,5 @@ vars:
6666
my_table_reference: 'grace_table'
6767
new_model_type_folder_name: 'my_new_models'
6868
new_model_type_prefixes: 'nwmdl_'
69-
insert_batch_size: 100
69+
insert_batch_size: 100
70+
too_many_joins_threshold: 3
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
{{ ref('stg_model_1') }}
2+
{{ ref('stg_model_2') }}
3+
{{ ref('stg_model_3') }}

integration_tests/seeds/dag/dag_seeds.yml

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -90,4 +90,14 @@ seeds:
9090
compare_model: ref('fct_hard_coded_references')
9191
compare_columns:
9292
- model
93-
- hard_coded_references
93+
- hard_coded_references
94+
95+
- name: test_fct_too_many_joins
96+
tests:
97+
- dbt_utils.equality:
98+
name: equality_fct_too_many_joins
99+
compare_model: ref('fct_too_many_joins')
100+
compare_columns:
101+
- resource_name
102+
- file_path
103+
- join_count
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
resource_name,file_path,join_count
2+
fct_model_10,"models/marts/fct_model_10.sql",3
Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,2 @@
11
total_models,documented_models,documentation_coverage_pct,staging_documentation_coverage_pct,intermediate_documentation_coverage_pct,marts_documentation_coverage_pct,other_documentation_coverage_pct
2-
15,4,26.67,20.00,0.00,0.00,75.00
2+
16,4,25,20.00,0.00,0.00,75.00

0 commit comments

Comments
 (0)