
Commit 0089336

updated according to dbt-teradata 1.8.2 (#6577)
2 parents cfc8ff2 + 13e36ab

File tree: 2 files changed (+43, −42 lines)

website/docs/docs/core/connect-data-platform/teradata-setup.md

Lines changed: 27 additions & 11 deletions
@@ -38,17 +38,19 @@ import SetUpPages from '/snippets/_setup-pages-intro.md';
 | 1.6.x |||||
 | 1.7.x |||||
 | 1.8.x |||||
+| 1.9.x |||||

 ## dbt dependent packages version compatibility

-| dbt-teradata | dbt-core | dbt-teradata-util | dbt-util |
-|--------------|------------|-------------------|----------------|
-| 1.2.x | 1.2.x | 0.1.0 | 0.9.x or below |
-| 1.6.7 | 1.6.7 | 1.1.1 | 1.1.1 |
-| 1.7.x | 1.7.x | 1.1.1 | 1.1.1 |
-| 1.8.x | 1.8.x | 1.1.1 | 1.1.1 |
-| 1.8.x | 1.8.x | 1.2.0 | 1.2.0 |
-| 1.8.x | 1.8.x | 1.3.0 | 1.3.0 |
+| dbt-teradata | dbt-core | dbt-teradata-util | dbt-util |
+|--------------|----------|-------------------|----------------|
+| 1.2.x | 1.2.x | 0.1.0 | 0.9.x or below |
+| 1.6.7 | 1.6.7 | 1.1.1 | 1.1.1 |
+| 1.7.x | 1.7.x | 1.1.1 | 1.1.1 |
+| 1.8.x | 1.8.x | 1.1.1 | 1.1.1 |
+| 1.8.x | 1.8.x | 1.2.0 | 1.2.0 |
+| 1.8.x | 1.8.x | 1.3.0 | 1.3.0 |
+| 1.9.x | 1.9.x | 1.3.0 | 1.3.0 |


 ### Connecting to Teradata
@@ -95,7 +97,6 @@ Parameter | Default | Type | Description
 `browser_tab_timeout` | `"5"` | quoted integer | Specifies the number of seconds to wait before closing the browser tab after Browser Authentication is completed. The default is 5 seconds. The behavior is under the browser's control, and not all browsers support automatic closing of browser tabs.
 `browser_timeout` | `"180"` | quoted integer | Specifies the number of seconds that the driver will wait for Browser Authentication to complete. The default is 180 seconds (3 minutes).
 `column_name` | `"false"` | quoted boolean | Controls the behavior of cursor `.description` sequence `name` items. Equivalent to the Teradata JDBC Driver `COLUMN_NAME` connection parameter. False specifies that a cursor `.description` sequence `name` item provides the AS-clause name if available, or the column name if available, or the column title. True specifies that a cursor `.description` sequence `name` item provides the column name if available, but has no effect when StatementInfo parcel support is unavailable.
-`connect_failure_ttl` | `"0"` | quoted integer | Specifies the time-to-live in seconds to remember the most recent connection failure for each IP address/port combination. The driver subsequently skips connection attempts to that IP address/port for the duration of the time-to-live. The default value of zero disables this feature. The recommended value is half the database restart time. Equivalent to the Teradata JDBC Driver `CONNECT_FAILURE_TTL` connection parameter.
 `connect_timeout` | `"10000"` | quoted integer | Specifies the timeout in milliseconds for establishing a TCP socket connection. Specify 0 for no timeout. The default is 10 seconds (10000 milliseconds).
 `cop` | `"true"` | quoted boolean | Specifies whether COP Discovery is performed. Equivalent to the Teradata JDBC Driver `COP` connection parameter.
 `coplast` | `"false"` | quoted boolean | Specifies how COP Discovery determines the last COP hostname. Equivalent to the Teradata JDBC Driver `COPLAST` connection parameter. When `coplast` is `false` or omitted, or COP Discovery is turned off, then no DNS lookup occurs for the coplast hostname. When `coplast` is `true`, and COP Discovery is turned on, then a DNS lookup occurs for a coplast hostname.
@@ -110,7 +111,7 @@ Parameter | Default | Type | Description
 `log` | `"0"` | quoted integer | Controls debug logging. Somewhat equivalent to the Teradata JDBC Driver `LOG` connection parameter. This parameter's behavior is subject to change in the future. This parameter's value is currently defined as an integer in which the 1-bit governs function and method tracing, the 2-bit governs debug logging, the 4-bit governs transmit and receive message hex dumps, and the 8-bit governs timing. Compose the value by adding together 1, 2, 4, and/or 8.
 `logdata` | | string | Specifies extra data for the chosen logon authentication method. Equivalent to the Teradata JDBC Driver `LOGDATA` connection parameter.
 `logon_timeout` | `"0"` | quoted integer | Specifies the logon timeout in seconds. Zero means no timeout.
-`logmech` | `"TD2"` | string | Specifies the logon authentication method. Equivalent to the Teradata JDBC Driver `LOGMECH` connection parameter. Possible values are `TD2` (the default), `JWT`, `LDAP`, `KRB5` for Kerberos, or `TDNEGO`.
+`logmech` | `"TD2"` | string | Specifies the logon authentication method. Equivalent to the Teradata JDBC Driver `LOGMECH` connection parameter. Possible values are `TD2` (the default), `JWT`, `LDAP`, `BROWSER`, `KRB5` for Kerberos, or `TDNEGO`.
 `max_message_body` | `"2097000"` | quoted integer | Specifies the maximum Response Message size in bytes. Equivalent to the Teradata JDBC Driver `MAX_MESSAGE_BODY` connection parameter.
 `partition` | `"DBC/SQL"` | string | Specifies the database partition. Equivalent to the Teradata JDBC Driver `PARTITION` connection parameter.
 `request_timeout` | `"0"` | quoted integer | Specifies the timeout for executing each SQL request. Zero means no timeout.
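The new `BROWSER` value is set through the `logmech` field of a target in `profiles.yml`. As a minimal sketch (profile, host, user, and schema names here are placeholders, not from this commit):

```yaml
my_teradata_project:
  target: dev
  outputs:
    dev:
      type: teradata
      host: vantage.example.com
      user: dbt_user
      schema: dbt_dev
      tmode: ANSI
      # Opens a browser window for interactive authentication;
      # see the Browser authentication limitations below.
      logmech: BROWSER
```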
@@ -210,7 +211,8 @@ For using cross-DB macros, teradata-utils as a macro namespace will not be used,

 ##### <a name="hash"></a>hash

-`Hash` macro needs an `md5` function implementation. Teradata doesn't support `md5` natively. You need to install a User Defined Function (UDF):
+The `hash` macro needs an `md5` function implementation. Teradata doesn't support `md5` natively, so you need to install a User Defined Function (UDF) and can optionally specify the `md5_udf` [variable](/docs/build/project-variables). <br>
+If not specified, the code defaults to `GLOBAL_FUNCTIONS.hash_md5`. See the following instructions on how to install the custom UDF:
 1. Download the md5 UDF implementation from Teradata (registration required): https://downloads.teradata.com/download/extensibility/md5-message-digest-udf.
 1. Unzip the package and go to the `src` directory.
 1. Start up `bteq` and connect to your database.
@@ -228,6 +230,12 @@ For using cross-DB macros, teradata-utils as a macro namespace will not be used,
 ```sql
 GRANT EXECUTE FUNCTION ON GLOBAL_FUNCTIONS TO PUBLIC WITH GRANT OPTION;
 ```
+To use a custom hash function instead of the default, add the `md5_udf` variable in `dbt_project.yml`:
+```yaml
+vars:
+  md5_udf: Custom_database_name.hash_method_function
+```
+
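For illustration only (model and column names are hypothetical), a model can then invoke dbt-core's cross-DB `hash` macro, which dbt-teradata resolves to the configured md5 UDF:

```sql
select
    {{ dbt.hash('customer_email') }} as email_hash
from {{ ref('stg_customers') }}
```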
 ##### <a name="last_day"></a>last_day

 `last_day` in `teradata_utils`, unlike the corresponding macro in `dbt_utils`, doesn't support `quarter` datepart.
@@ -241,6 +249,14 @@ dbt-teradata 1.8.0 and later versions support unit tests, enabling you to valida

 ## Limitations

+### Browser authentication
+* When running a dbt job with `logmech` set to `BROWSER`, the initial authentication opens a browser window where you must enter your username and password.
+* After authentication, this window remains open, requiring you to manually switch back to the dbt console.
+* For every subsequent connection, a new browser tab briefly opens, displays the message "TERADATA BROWSER AUTHENTICATION COMPLETED," and silently reuses the existing session.
+* However, focus stays on the browser window, so you need to manually switch back to the dbt console each time.
+* This behavior is the default functionality of the teradatasql driver and cannot be avoided at this time.
+* To prevent session expiration and the need to re-enter credentials, keep the authentication browser window open until the job completes.
+
 ### Transaction mode
 Both ANSI and TERA modes are now supported in dbt-teradata. TERA mode's support was introduced with dbt-teradata 1.7.1; it is an initial implementation.

website/docs/reference/resource-configs/teradata-configs.md

Lines changed: 16 additions & 31 deletions
@@ -12,25 +12,6 @@ id: "teradata-configs"
   +quote_columns: false #or `true` if you have csv column headers with spaces
 ```
-* *Enable view column types in docs* - Teradata Vantage has a dbscontrol configuration flag called `DisableQVCI`. This flag instructs the database to create `DBC.ColumnsJQV` with view column type definitions. To enable this functionality you need to:
-  1. Enable QVCI mode in Vantage. Use `dbscontrol` utility and then restart Teradata. Run these commands as a privileged user on a Teradata node:
-  ```bash
-  # option 551 is DisableQVCI. Setting it to false enables QVCI.
-  dbscontrol << EOF
-  M internal 551=false
-  W
-  EOF
-
-  # restart Teradata
-  tpareset -y Enable QVCI
-  ```
-  2. Instruct `dbt` to use `QVCI` mode. Include the following variable in your `dbt_project.yml`:
-  ```yaml
-  vars:
-    use_qvci: true
-  ```
-  For example configuration, see [dbt_project.yml](https://github.com/Teradata/dbt-teradata/blob/main/test/catalog/with_qvci/dbt_project.yml) in `dbt-teradata` QVCI tests.
 ## Models

 ### <Term id="table" />
@@ -348,6 +329,11 @@ If a user sets some key-value pair with value as `'{model}'`, internally this `'
 - For example, if the model the user is running is `stg_orders`, `{model}` will be replaced with `stg_orders` in runtime.
 - If no `query_band` is set by the user, the default query_band used will be: ```org=teradata-internal-telem;appname=dbt;```

+## Unit testing
+* Unit testing is supported in dbt-teradata, allowing users to write and execute unit tests using the `dbt test` command.
+* For detailed guidance, refer to the [dbt unit tests documentation](/docs/build/documentation).
+> In Teradata, reusing the same alias across multiple common table expressions (CTEs) or subqueries within a single model is not permitted and results in parsing errors; assign a unique alias to each CTE or subquery to ensure proper query execution.
+
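As a sketch of what such a unit test looks like (model, input, and row values are hypothetical; the YAML shape is dbt-core's standard `unit_tests:` block):

```yaml
unit_tests:
  - name: test_order_status_passthrough
    model: stg_orders
    given:
      - input: ref('raw_orders')
        rows:
          - {order_id: 1, status: "completed"}
    expect:
      rows:
        - {order_id: 1, status: "completed"}
```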
 ## valid_history incremental materialization strategy
 _This is available in early access_

@@ -361,26 +347,27 @@ In temporal databases, valid time is crucial for applications like historical re
         unique_key='id',
         on_schema_change='fail',
         incremental_strategy='valid_history',
-        valid_from='valid_from_column',
-        history_column_in_target='history_period_column'
+        valid_period='valid_period_col',
+        use_valid_to_time='no',
     )
 }}
 ```

 The `valid_history` incremental strategy requires the following parameters:
-* `valid_from` &mdash; Column in the source table of **timestamp** datatype indicating when each record became valid.
-* `history_column_in_target` &mdash; Column in the target table of **period** datatype that tracks history.
+* `unique_key`: The primary key of the model (excluding the valid time components), specified as a column name or list of column names.
+* `valid_period`: Name of the model column indicating the period for which the record is considered to be valid. The datatype must be `PERIOD(DATE)` or `PERIOD(TIMESTAMP)`.
+* `use_valid_to_time`: Whether the end bound of the valid period in the input is considered by the strategy when building the valid timeline. Use `no` if you consider your record to be valid until changed (and supply any value greater than the begin bound for the end bound of the period; a typical convention is `9999-12-31` or `9999-12-31 23:59:59.999999`). Use `yes` if you know until when the record is valid (typically this is a correction in the history timeline).

 The valid_history strategy in dbt-teradata involves several critical steps to ensure the integrity and accuracy of historical data management:
 * Remove duplicates and conflicting values from the source data:
   * This step ensures that the data is clean and ready for further processing by eliminating any redundant or conflicting records.
-  * The process of removing duplicates and conflicting values from the source data involves using a ranking mechanism to ensure that only the highest-priority records are retained. This is accomplished using the SQL RANK() function.
+  * Primary-key duplicates (two or more records with the same values for the `unique_key` fields and the BEGIN() bound of the `valid_period` field) are removed from the dataset produced by the model. If such duplicates exist, the row with the lowest values for the non-primary-key fields (compared in the order specified in the model) is retained. Full-row duplicates are always de-duplicated.
 * Identify and adjust overlapping time slices:
-  * Overlapping time periods in the data are detected and corrected to maintain a consistent and non-overlapping timeline.
-* Manage records needing to be overwritten or split based on the source and target data:
+  * Overlapping or adjacent time periods in the data are corrected to maintain a consistent and non-overlapping timeline. To achieve this, the macro adjusts the valid period end bound of a record to align with the begin bound of the next record (if they overlap or are adjacent) within the same `unique_key` group. If `use_valid_to_time = 'yes'`, the valid period end bound provided in the source data is used. Otherwise, a default end date is applied for missing bounds, and adjustments are made accordingly.
+* Manage records needing to be adjusted, deleted, or split based on the source and target data:
   * This involves handling scenarios where records in the source data overlap with or need to replace records in the target data, ensuring that the historical timeline remains accurate.
-* Utilize the TD_NORMALIZE_MEET function to compact history:
-  * This function helps to normalize and compact the history by merging adjacent time periods, improving the efficiency and performance of the database.
+* Compact history:
+  * Normalize and compact the history by merging records of adjacent time periods with the same value, optimizing database storage and performance. The TD_NORMALIZE_MEET function is used for this purpose.
 * Delete existing overlapping records from the target table:
   * Before inserting new or updated records, any existing records in the target table that overlap with the new data are removed to prevent conflicts.
 * Insert the processed data into the target table:
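Pieced together from the fragments above, a complete model using the new parameters might look like the following sketch (table and column names are placeholders):

```sql
{{
    config(
        materialized='incremental',
        unique_key='id',
        on_schema_change='fail',
        incremental_strategy='valid_history',
        valid_period='valid_period_col',
        use_valid_to_time='no',
    )
}}
select id, status, valid_period_col
from {{ ref('stg_source') }}
```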
@@ -416,9 +403,7 @@ These steps collectively ensure that the valid_history strategy effectively mana
 ```


-:::info
-The target table must already exist before running the model. Ensure the target table is created and properly structured with the necessary columns, including a column that tracks the history with period datatype, before running a dbt model.
-:::
+

 ## Common Teradata-specific tasks
 * *collect statistics* - when a table is created or modified significantly, there might be a need to tell Teradata to collect statistics for the optimizer. It can be done using the `COLLECT STATISTICS` command. You can perform this step using dbt's `post-hooks`, e.g.:
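A post-hook of this kind might look like the following sketch in `dbt_project.yml` (project, model, and column names are hypothetical):

```yaml
models:
  my_teradata_project:
    my_model:
      +post-hook:
        - "COLLECT STATISTICS ON {{ this }} COLUMN (customer_id)"
```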
