This repository was archived by the owner on May 17, 2024. It is now read-only.
README.md (+13 -13)
@@ -16,20 +16,27 @@ A data diff is the value-level comparison between two tables—used to identify
 
 There is a lot you can do with data-diff: you can test SQL code by comparing development or staging environment data to production, or compare source and target data to identify discrepancies when moving data between databases.
 
-# Use Cases
+# data-diff OSS & Datafold Cloud
+data-diff is an open source utility for running stateless diffs as a great single player experience.
 
-### Data Migration & Replication Testing
-data-diff is a powerful tool for comparing data when you're moving it between systems. Use it to ensure data accuracy and identify discrepancies during tasks like:
-- **Migrating** to a new data warehouse (e.g., Oracle -> Snowflake)
-- **Converting SQL** to a new transformation framework (e.g., stored procedures -> dbt)
-- Continuously **replicating data** from an OLTP database to OLAP data warehouse (e.g., MySQL -> Redshift)
+
+
+Scale up with [Datafold Cloud](https://www.datafold.com/) to make data diffing a company-wide experience to both supercharge your data diffing CLI experience (ex: data-diff --dbt --cloud) and run diffs manually in your CI process and within the Datafold UI. This includes [column-level lineage](https://www.datafold.com/column-level-lineage) with BI tool integrations, [CI testing](https://docs.datafold.com/deployment_testing/how_it_works/), faster cross-database diffing, and diff history.
+
+# Use Cases
 
 ### Data Development Testing
 When developing SQL code, data-diff helps you validate and preview changes by comparing data between development/staging environments and production. Here's how it works:
 1. Make a change to your SQL code
 2. Run the SQL code to create a new dataset
 3. Compare this dataset with its production version or other iterations
 
+### Data Migration & Replication Testing
+data-diff is a powerful tool for comparing data when you're moving it between systems. Use it to ensure data accuracy and identify discrepancies during tasks like:
+- **Migrating** to a new data warehouse (e.g., Oracle -> Snowflake)
+- **Validating SQL transformations** from legacy solutions (e.g., stored procedures) to new transformation frameworks (e.g., dbt)
+- Continuously **replicating data** from an OLTP database to OLAP data warehouse (e.g., MySQL -> Redshift)
@@ -213,13 +220,6 @@ Your database not listed here?
 
 For detailed algorithm and performance insights, explore [here](https://github.com/datafold/data-diff/blob/master/docs/technical-explanation.md), or head to our docs to [learn more about how Datafold diffs data](https://docs.datafold.com/data_diff/how-datafold-diffs-data).
 
-
-# data-diff OSS & Datafold Cloud
-
-data-diff is an open source utility for running stateless diffs as a great single player experience.
-
-Scale up with [Datafold Cloud](https://www.datafold.com/) to make data diffing a company-wide experience to both supercharge your data diffing CLI experience (ex: data-diff --dbt --cloud) and run diffs manually in the UI. This includes [column-level lineage](https://www.datafold.com/column-level-lineage), [CI testing](https://docs.datafold.com/deployment_testing/how_it_works/), and diff history.
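
Reviewer note: the "Data Development Testing" steps in the README text (change the SQL, rebuild the dataset, compare it with production) describe a value-level comparison keyed on a primary key. The sketch below illustrates that idea with Python's standard `sqlite3` module. It is not data-diff's implementation or API: the function `naive_table_diff`, the tables `orders_prod` and `orders_dev`, and their columns are all hypothetical names chosen for illustration, and loading every row into memory is only reasonable for small tables.

```python
import sqlite3


def naive_table_diff(conn, prod_table, dev_table, key, columns):
    """Compare two tables row by row on a shared key column.

    Illustrative only: table and column names are interpolated directly,
    so they must be trusted identifiers, and all rows are read into memory.
    """
    cols = ", ".join([key, *columns])
    prod = {r[0]: r[1:] for r in conn.execute(f"SELECT {cols} FROM {prod_table}")}
    dev = {r[0]: r[1:] for r in conn.execute(f"SELECT {cols} FROM {dev_table}")}
    return {
        # Keys present only in the development dataset.
        "added": sorted(dev.keys() - prod.keys()),
        # Keys present only in production.
        "removed": sorted(prod.keys() - dev.keys()),
        # Keys present in both whose column values differ.
        "changed": sorted(k for k in prod.keys() & dev.keys() if prod[k] != dev[k]),
    }


def build_demo_connection():
    """Create an in-memory database with a 'production' and a 'development' table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders_prod (id INTEGER PRIMARY KEY, amount INTEGER, status TEXT);
        CREATE TABLE orders_dev  (id INTEGER PRIMARY KEY, amount INTEGER, status TEXT);
        INSERT INTO orders_prod VALUES (1, 100, 'paid'), (2, 250, 'open'), (3, 75, 'paid');
        INSERT INTO orders_dev  VALUES (1, 100, 'paid'), (2, 250, 'paid'), (4, 40, 'open');
        """
    )
    return conn


result = naive_table_diff(
    build_demo_connection(), "orders_prod", "orders_dev", "id", ["amount", "status"]
)
print(result)
```

In this toy data, row 4 exists only in development, row 3 exists only in production, and row 2 has a different `status`, so the three lists each contain one key. A real tool would avoid pulling full tables across the network; the README links to a technical explanation of how data-diff approaches that.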