[CT-983] Support dbt Python models in OSS Apache Spark #510
Comments
let us know if you'd like to help on this issue!
Hi Cody,
thanks for reaching out. Yes, I would like to help, but my available time is really stretched at the moment.
Regards
Sebastian
Hi, is there any update or current plan on this?
I'd like to vouch that this will be an important feature: it would open dbt up to data engineers who mostly work with Spark but also want the rigor and data-quality framework of dbt.
I have a non-production-grade sample that uses the same approach as dbt-duckdb to run Python models: https://github.com/timvw/dbt-spark/tree/support-sparksession-python-local
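An approach like the one in that branch boils down to executing the compiled model file against an in-process SparkSession. A minimal sketch of the model contract such a runner would execute (assumptions: `model(dbt, session)` follows dbt's documented Python-model signature, and `"upstream_model"` is a hypothetical ref used purely for illustration; this is not the linked branch's actual code):

```python
def model(dbt, session):
    # `session` is a SparkSession when running on Spark. dbt.ref(...) resolves
    # an upstream model to a Spark DataFrame at run time; whatever DataFrame
    # this function returns is what the adapter materializes as a table.
    upstream = dbt.ref("upstream_model")
    return upstream.filter("id IS NOT NULL")
```

A local-SparkSession runner would then do little more than build a session, import this function, call it, and persist the result with something like `df.write.saveAsTable(...)`.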
I think there's a meaningful question here about how we expand Spark support in dbt. For scalability and maintainability, it would make more sense to split the adapter apart to support specific Spark services, but we haven't had the capacity to prioritize that work yet. Given the age of this issue and that broader plan, I'm going to go ahead and close it.
Context: dbt-labs/dbt-spark#407
The current implementation depends on Databricks APIs that are not available in OSS Apache Spark. We would like help from knowledgeable and interested community members who could spec out an implementation using Spark-only functionality.
The entry point is `submit_python_job`: https://github.com/dbt-labs/dbt-spark/blob/7f6cffecf38b7c41aa441eb020d464ba1e20bf9e/dbt/adapters/spark/impl.py#L392
A potentially useful Spark doc: Submitting Applications
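For a Spark-only implementation, one route that doc suggests is shelling out to `spark-submit` with the compiled model file. A hedged sketch of assembling such an invocation (`--master` and `--name` are standard `spark-submit` flags, but the helper function and how the adapter would wire it up are assumptions, not the actual dbt-spark code):

```python
import shlex

def build_spark_submit_cmd(script_path, master="local[*]", app_name="dbt-python-model"):
    """Assemble a spark-submit command for a compiled dbt Python model.

    How the adapter would choose master/app name from profile config is an
    open design question; the defaults here are illustrative only.
    """
    return [
        "spark-submit",
        "--master", master,
        "--name", app_name,
        script_path,
    ]

cmd = build_spark_submit_cmd("/tmp/compiled_model.py")
print(shlex.join(cmd))
# spark-submit --master 'local[*]' --name dbt-python-model /tmp/compiled_model.py
```

An adapter method like `submit_python_job` could then write the compiled model code to a temporary file and run the command with `subprocess.run(cmd, check=True)`, raising on a non-zero exit.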