[Feature] Improve how Cosmos renders & runs seeds #1576

@tatiana

Description

Currently, if the dbt project contains seeds, Cosmos renders them and attempts to run dbt seed, unless the end user customises how seeds are rendered:
https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#customizing-how-nodes-are-rendered-experimental

However, in most cases, seeds do not need to be continuously run.

It would be great if Cosmos could allow users to opt for different ways of rendering/running seeds, similar to source nodes:

  • SeedRenderingBehavior.ALWAYS: Current behaviour of Cosmos, always add the seed tasks in the DAG/TaskGroup and run seeds
  • SeedRenderingBehavior.NONE: New behaviour, don't render any seeds in the DAG/TaskGroup
  • SeedRenderingBehavior.WHEN_SEED_CHANGES: Only render and run a seed if its .csv file has changed since the last execution
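
To make the three options concrete, here is a minimal sketch of how they might be expressed. `SeedRenderingBehavior` does not exist in Cosmos today; the enum values and the `should_render_seed` helper are assumptions for illustration only, not a proposed final API.

```python
from enum import Enum


class SeedRenderingBehavior(Enum):
    """Hypothetical enum mirroring the three options proposed in this issue."""

    ALWAYS = "always"  # current behaviour: always render and run seeds
    NONE = "none"  # never render seeds in the DAG/TaskGroup
    WHEN_SEED_CHANGES = "when_seed_changes"  # only if the .csv file changed


def should_render_seed(behavior: SeedRenderingBehavior, seed_changed: bool) -> bool:
    """Hypothetical helper: decide whether a seed task is added to the DAG/TaskGroup."""
    if behavior is SeedRenderingBehavior.ALWAYS:
        return True
    if behavior is SeedRenderingBehavior.NONE:
        return False
    # WHEN_SEED_CHANGES: render only when the caller detected a change
    return seed_changed
```

How `seed_changed` would be computed is left open here, since it depends on where the state is stored.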

There are a few open questions:

  • Should we still render seeds so users understand the topology, even if we don't run them in SeedRenderingBehavior.NONE and SeedRenderingBehavior.WHEN_SEED_CHANGES, and use an EmptyOperator in those cases?
  • Where could we store the state recording whether a .csv was run and whether it has changed? As an Airflow variable? In a remote object store? Somewhere else?
  • It would probably be neat to render the source nodes but skip them unless they should run
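
One possible way to detect a changed seed, regardless of where the state ends up living, is to compare a checksum of the .csv file with the one recorded after the last successful run. The sketch below is only an illustration: the function names and the key format are hypothetical, and a plain dict stands in for whichever backend is chosen (Airflow variable, remote object store, or something else).

```python
import hashlib
from pathlib import Path
from typing import MutableMapping


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large seed files are not loaded into memory at once
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _state_key(path: Path) -> str:
    # Hypothetical key format; a real implementation would need a stable,
    # project-unique identifier for each seed
    return f"seed_checksum::{path.name}"


def seed_has_changed(path: Path, state: MutableMapping[str, str]) -> bool:
    """True if no checksum is stored yet or the stored one differs."""
    return state.get(_state_key(path)) != file_checksum(path)


def record_seed_run(path: Path, state: MutableMapping[str, str]) -> None:
    """Store the current checksum; intended to be called after a successful run."""
    state[_state_key(path)] = file_checksum(path)
```

Keeping the check and the write separate means the stored checksum is only updated once the seed has actually been loaded, so a failed run would be retried on the next execution.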

I'd love your thoughts on these ideas.

Use case/motivation

Allow users to opt out of running seeds and only run seeds when needed.

Related issues

No response

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!

Metadata

Assignees

No one assigned

    Labels

    • area:execution: Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc
    • area:rendering: Related to rendering, like Jinja, Airflow tasks, etc
    • customer request: An Astronomer customer requested this
    • dbt:seed: Primarily related to dbt seed command or functionality
    • enhancement: New feature or request
    • stale: Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed
    • triage-needed: Items need to be reviewed / assigned to milestone
