This repository was archived by the owner on May 17, 2024. It is now read-only.

Commit e619e42

sar009 and sungchun12 authored
Ability to install all oss supported database adapters. (#842)
* Ability to install all database adapters.
  Signed-off-by: Sarad Mohanan <[email protected]>
* Update pyproject.toml
  Co-authored-by: Sung Won Chung <[email protected]>
* Update README.md
  Co-authored-by: Sung Won Chung <[email protected]>
* Update pyproject.toml
  Co-authored-by: Sung Won Chung <[email protected]>
* pyodbc comment and breaking dependency
  Signed-off-by: Sarad Mohanan <[email protected]>
* update readme
  Signed-off-by: Sarad Mohanan <[email protected]>
* Update pyproject.toml
  Co-authored-by: Sung Won Chung <[email protected]>
* Update README.md
  Co-authored-by: Sung Won Chung <[email protected]>
* remove cloud dbs
  Signed-off-by: Sarad Mohanan <[email protected]>
* Update README.md
  Co-authored-by: Sung Won Chung <[email protected]>

---------

Signed-off-by: Sarad Mohanan <[email protected]>
Co-authored-by: Sung Won Chung <[email protected]>
1 parent 42e6a9f commit e619e42

2 files changed, +19 -10 lines changed

README.md (+14 -9)
@@ -24,7 +24,7 @@ data-diff is a powerful tool for comparing data when you're moving it between sy
 - **Converting SQL** to a new transformation framework (e.g., stored procedures -> dbt)
 - Continuously **replicating data** from an OLTP database to OLAP data warehouse (e.g., MySQL -> Redshift)

-### Data Development Testing
+### Data Development Testing
 When developing SQL code, data-diff helps you validate and preview changes by comparing data between development/staging environments and production. Here's how it works:
 1. Make a change to your SQL code
 2. Run the SQL code to create a new dataset
@@ -33,7 +33,7 @@ When developing SQL code, data-diff helps you validate and preview changes by co
 # dbt Integration
 <p align="left">
 <img alt="dbt" src="https://seeklogo.com/images/D/dbt-logo-E4B0ED72A2-seeklogo.com.png" width="10%" />
-</p>
+</p>

 data-diff integrates with [dbt Core](https://github.com/dbt-labs/dbt-core) to seamlessly compare local development to production datasets.

@@ -46,9 +46,9 @@ Learn more about how data-diff works with dbt:
 # Getting Started

 ### ⚡ Validating dbt model changes between dev and prod
-Looking to use data-diff in dbt development?
+Looking to use data-diff in dbt development?

-Development testing with Datafold enables you to see the impact of dbt code changes on data as you write the code, whether in your IDE or CLI.
+Development testing with Datafold enables you to see the impact of dbt code changes on data as you write the code, whether in your IDE or CLI.

 Head over to [our `data-diff` + `dbt` documentation](https://docs.datafold.com/development_testing/cli) to get started with a development testing workflow!

@@ -61,6 +61,11 @@ To compare data between databases, install `data-diff` with specific database ad
 pip install data-diff 'data-diff[postgresql,snowflake]' -U
 ```

+Additionally, you can install all open source supported database adapters as follows.
+```
+pip install data-diff 'data-diff[all-oss-supported-dbs]' -U
+```
+
 2. Run `data-diff` with connection URIs

 Then, we compare tables between PostgreSQL and Snowflake using the hashdiff algorithm:
@@ -75,13 +80,13 @@ data-diff \
 -c <columns to compare> \
 -w <filter condition>
 ```
-3. Set up your configuration
+3. Set up your configuration

 You can use a `toml` configuration file to run your `data-diff` job. In this example, we compare tables between MotherDuck (hosted DuckDB) and Snowflake using the hashdiff algorithm:

 ```toml
 ## DATABASE CONNECTION ##
-[database.duckdb_connection]
+[database.duckdb_connection]
 driver = "duckdb"
 # filepath = "datafold_demo.duckdb" # local duckdb file example
 # filepath = "md:" # default motherduck connection example
@@ -202,10 +207,10 @@ Your database not listed here?
 * Time complexity approximates COUNT(*) operation when there are few differences
 * Performance degrades when datasets have a large number of differences

-</details>
+</details>
 <br>

-For detailed algorithm and performance insights, explore [here](https://github.com/datafold/data-diff/blob/master/docs/technical-explanation.md), or head to our docs to [learn more about how Datafold diffs data](https://docs.datafold.com/data_diff/how-datafold-diffs-data).
+For detailed algorithm and performance insights, explore [here](https://github.com/datafold/data-diff/blob/master/docs/technical-explanation.md), or head to our docs to [learn more about how Datafold diffs data](https://docs.datafold.com/data_diff/how-datafold-diffs-data).


 # data-diff OSS & Datafold Cloud
@@ -216,7 +221,7 @@ Scale up with [Datafold Cloud](https://www.datafold.com/) to make data diffing a

 ## Contributors

-We thank everyone who contributed so far!
+We thank everyone who contributed so far!

 We'd love to see your face here: [Contributing Instructions](CONTRIBUTING.md)
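
The README change in this file documents a second install command next to the existing per-adapter one. As a rough illustration of how the two command lines relate, the sketch below builds the argument list for a given set of extras and prints it in shell-quoted form. The helper name `build_install_command` is hypothetical and not part of data-diff; the extras names come from the README text in this diff.

```python
import shlex


def build_install_command(extras):
    """Return the pip argument list for installing data-diff with the given extras.

    Hypothetical helper for illustration only; it mirrors the command shape
    shown in the README and does not run pip.
    """
    if not extras:
        return ["pip", "install", "data-diff", "-U"]
    requirement = "data-diff[" + ",".join(extras) + "]"
    return ["pip", "install", "data-diff", requirement, "-U"]


# Per-adapter install, as in the existing README example.
print(shlex.join(build_install_command(["postgresql", "snowflake"])))

# Install every open source supported adapter, as added by this commit.
print(shlex.join(build_install_command(["all-oss-supported-dbs"])))
```

The square brackets are why the README quotes the requirement: without quoting, some shells would try to expand `data-diff[...]` as a glob pattern.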

pyproject.toml (+5 -1)
@@ -74,12 +74,16 @@ redshift = ["psycopg2"]
 snowflake = ["snowflake-connector-python", "cryptography"]
 presto = ["presto-python-client"]
 oracle = ["oracledb"]
-mssql = ["pyodbc"]
+mssql = ["pyodbc"] # natively supported in Datafold Cloud only
 # databricks = ["databricks-sql-connector"]
 trino = ["trino"]
 clickhouse = ["clickhouse-driver"]
 vertica = ["vertica-python"]
 duckdb = ["duckdb"]
+all-oss-supported-dbs = [
+"preql", "mysql-connector-python", "psycopg2", "snowflake-connector-python", "cryptography", "presto-python-client",
+"oracledb", "trino", "clickhouse-driver", "vertica-python", "duckdb"
+]

 [tool.poetry.group.dev.dependencies]
 pre-commit = "^3.5.0"
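
One way to confirm that the new `all-oss-supported-dbs` extra pulled in every driver is to ask the standard library which of those distributions are installed. The sketch below is an illustration, not part of data-diff: the list is copied from the hunk above, the function name `installed_versions` is hypothetical, and `importlib.metadata` needs Python 3.8 or later.

```python
from importlib import metadata

# Distribution names copied from the `all-oss-supported-dbs` list in the hunk above.
# `pyodbc` (mssql) and `databricks-sql-connector` are not part of that list.
ALL_OSS_SUPPORTED_DBS = [
    "preql", "mysql-connector-python", "psycopg2", "snowflake-connector-python",
    "cryptography", "presto-python-client", "oracledb", "trino",
    "clickhouse-driver", "vertica-python", "duckdb",
]


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(ALL_OSS_SUPPORTED_DBS).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Checking distribution names rather than import names avoids guessing module names, since several of these packages are imported under a different name than the one used to install them.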
