schema_update empty in load_packages #2267

Closed
gsarviareply opened this issue Feb 4, 2025 · 3 comments

@gsarviareply

dlt version

1.6.0

Describe the problem

When a pipeline runs with the schema contract set to evolve for columns and data type, each load package in the output of pipeline.run() contains a schema_update variable that is always empty, even if a new column has been added to the source we extract data from.

Expected behavior

When the schema_contract is set to evolve, the schema_update variable should be populated with the columns that changed.

Steps to reproduce

To replicate the issue:

  • Run the pipeline a first time to obtain the original schema and information from a SQL table
  • Then add a new column in the source table and populate it
  • Rerun the pipeline with the schema_contract set to evolve for columns and datatype
  • Check the value of the schema_update variable within the load_packages in the output of the pipeline run
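The check in the last step can be sketched roughly as below. This is only an illustration, assuming the object returned by pipeline.run() exposes load_packages and that each package's schema_update maps table names to a partial table schema with a "columns" key, as in the dlt schema evolution demo. The helper name collect_schema_updates and the stand-in data are hypothetical and are not taken from the reporter's code.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def collect_schema_updates(load_info: Any) -> Dict[str, List[str]]:
    """Map each table name to the column names listed in schema_update."""
    changes: Dict[str, List[str]] = {}
    for package in load_info.load_packages:
        # Assumption: schema_update maps table name -> partial table schema.
        for table_name, table in package.schema_update.items():
            columns = table.get("columns", {})
            changes.setdefault(table_name, []).extend(columns)
    return changes


# Stand-in for the value returned by pipeline.run(); the data is made up.
fake_load_info = SimpleNamespace(
    load_packages=[
        SimpleNamespace(schema_update={}),
        SimpleNamespace(
            schema_update={
                "customers": {"columns": {"email": {"data_type": "text"}}}
            }
        ),
    ]
)

changes = collect_schema_updates(fake_load_info)
print(changes or "schema_update is empty in every load package")
```

In a real run, the value returned by pipeline.run() would be passed in place of fake_load_info. An empty result on the second run would correspond to the behavior reported here.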

Operating system

Windows

Runtime environment

Local

Python version

3.10

dlt data source

MySQL

dlt destination

No response

Other deployment details

No response

Additional information

No response

@sh-rp
Collaborator

sh-rp commented Feb 17, 2025

Please provide a minimal, complete code example that reproduces this so we can take a look. Thanks :)

@sh-rp sh-rp self-assigned this Feb 17, 2025
@sh-rp
Collaborator

sh-rp commented Feb 24, 2025

closing for inactivity

@sh-rp sh-rp closed this as completed Feb 24, 2025
@github-project-automation github-project-automation bot moved this from Todo to Done in dlt core library Feb 24, 2025
@gsarviareply
Author

I followed the example reported here:

https://github.com/dlt-hub/dlt_demos/blob/main/schema_evolution.ipynb

The difference from the example is that I am reading data from a MySQL table. In the first run, the schema is inferred from the MySQL table. Afterwards, a new column is added to the table in MySQL and a second pipeline run is performed, specifying the import_schema_path in the pipeline.run() (the path where the schema was written in the first step).
