[components][docs] Add components doc on moving definitions into comp…
…onentw
benpankow committed Feb 10, 2025
1 parent 1f2eb79 commit eabb2ea
Showing 32 changed files with 371 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -73,5 +73,6 @@ Now, your code location is ready to use components! `dg` can be used to scaffold

## Next steps

- [Migrate existing definitions to components](./migrating-definitions)
- [Add a new component to your code location](./using-a-component)
- [Create a new component type](./creating-a-component)
58 changes: 58 additions & 0 deletions docs/docs/guides/preview/components/migrating-definitions.md
@@ -0,0 +1,58 @@
---
title: 'Migrating existing Definitions to components'
sidebar_position: 350
---

:::note
This guide covers migrating existing Python `Definitions` to components and assumes a components-enabled project. To set one up, see the [getting started guide](./) or the [Making an existing code location components-compatible](./existing-code-location) guide.
:::

When adding components to an existing Dagster code location, it is often useful to restructure your definitions into component folders. This makes it easier to eventually migrate them entirely to components.

## Example project

Let's walk through an example of how to migrate existing definitions to components, with a project that has the following structure:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/1-tree.txt" />

The root `Definitions` object combines definitions from various nested modules:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/2-definitions-before.py" title="my_existing_project/definitions.py" />

Each of these modules defines its own `Definitions` object:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/3-inner-definitions-before.py" title="my_existing_project/elt/definitions.py" />

We'll migrate the `elt` module to a component.

## Create a Definitions component

We'll start by creating a `Definitions` component for the `elt` module:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/4-scaffold.txt" />


This creates a new folder in `my_existing_project/components/elt-definitions`, with a `component.yaml` file. This component is rather simple: it just points to a file that contains a `Definitions` object.

Check failure on line 35 in docs/docs/guides/preview/components/migrating-definitions.md

GitHub Actions / runner / vale

[vale] reported by reviewdog 🐶 [Vale.Avoid] Avoid using 'simple'. Raw Output: {"message": "[Vale.Avoid] Avoid using 'simple'.", "location": {"path": "docs/docs/guides/preview/components/migrating-definitions.md", "range": {"start": {"line": 35, "column": 135}}}, "severity": "ERROR"}
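
To make "points to a file which contains a `Definitions` object" concrete, here is a minimal sketch of what loading an object from such a file could look like. It is not the `dagster_components` implementation. The helper name `load_defs_from_file`, the `defs` attribute name, and the idea that the path resolves relative to the component folder are assumptions made for illustration. A plain dictionary stands in for a real `Definitions` object so the sketch runs without Dagster installed.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_defs_from_file(component_dir: Path, definitions_path: str, attr: str = "defs"):
    """Import the file at component_dir / definitions_path and return its `attr` object.

    Hypothetical helper: resolving the path relative to the component folder
    is an assumption, not documented Dagster behavior.
    """
    file_path = component_dir / definitions_path
    spec = importlib.util.spec_from_file_location("component_definitions", file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {file_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return getattr(module, attr)


# Demo with a stand-in file; a dict replaces the real Definitions object.
with tempfile.TemporaryDirectory() as tmp:
    component_dir = Path(tmp) / "elt-definitions"
    component_dir.mkdir()
    (component_dir / "definitions.py").write_text("defs = {'assets': ['my_elt_asset']}\n")
    loaded = load_defs_from_file(component_dir, "definitions.py")
    print(loaded)
```

The point of the sketch is that the component itself holds no logic: everything it contributes comes from the Python file it references.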

Let's move the `elt` module's `definitions.py` file to the new component folder:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/6-mv.txt" />

Now, we can update the `component.yaml` file to point to the new `definitions.py` file:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/5-component-yaml.txt" title="my_existing_project/components/elt-definitions/component.yaml" />

Finally, we can update the root `definitions.py` file to no longer explicitly load the `elt` module's `Definitions`:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/7-definitions-after.py" title="my_existing_project/definitions.py" />

Now, our project structure looks like this:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/8-tree-after.txt" />

We can repeat the same process for our other modules.
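
If you have many modules, the file moves can be scripted. The following is a rough sketch under stated assumptions, not a supported tool: `move_definitions_into_component` is a hypothetical helper, the `<module>-definitions` folder name only mirrors the `elt-definitions` example above, and the `component.yaml` it writes is a hand-written stand-in for what `dg component scaffold` creates. It leaves the rest of the module folder in place and does not touch the root `definitions.py`, which still needs to be updated by hand.

```python
import shutil
import tempfile
from pathlib import Path

# Same content as the component.yaml shown earlier in this guide.
COMPONENT_YAML = """type: definitions@dagster_components

params:
  definitions_path: definitions.py
"""


def move_definitions_into_component(package_dir: Path, module_name: str) -> Path:
    """Move <package>/<module>/definitions.py into components/<module>-definitions/."""
    source = package_dir / module_name / "definitions.py"
    component_dir = package_dir / "components" / f"{module_name}-definitions"
    component_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(component_dir / "definitions.py"))
    # Stand-in for the file that `dg component scaffold` would normally create.
    (component_dir / "component.yaml").write_text(COMPONENT_YAML)
    return component_dir


# Demo against a throwaway copy of the layout, not a real project.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "my_existing_project"
    (package_dir / "analytics").mkdir(parents=True)
    (package_dir / "analytics" / "definitions.py").write_text("defs = None\n")
    component_dir = move_definitions_into_component(package_dir, "analytics")
    moved = sorted(p.name for p in component_dir.iterdir())
    source_gone = not (package_dir / "analytics" / "definitions.py").exists()
    print(moved, source_gone)
```

Any imports inside the moved `definitions.py` that rely on its old package location would need to be checked after the move.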

## Next steps

- [Add a new component to your code location](./using-a-component)
- [Create a new component type](./creating-a-component)
@@ -0,0 +1,21 @@
tree

.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── definitions.py
│   ├── components
│   ├── definitions.py
│   └── elt
│       ├── __init__.py
│       └── definitions.py
├── my_existing_project_tests
│   ├── __init__.py
│   └── test_assets.py
└── pyproject.toml

6 directories, 11 files
@@ -0,0 +1,13 @@
from pathlib import Path

import dagster_components as dg_components

import dagster as dg
from my_existing_project.analytics import definitions as analytics_definitions
from my_existing_project.elt import definitions as elt_definitions

defs = dg.Definitions.merge(
    dg.load_definitions_from_module(elt_definitions),
    dg.load_definitions_from_module(analytics_definitions),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
@@ -0,0 +1,9 @@
from dagster import asset
from dagster._core.definitions.definitions_class import Definitions


@asset
def my_elt_asset(): ...


defs = Definitions(assets=[my_elt_asset])
@@ -0,0 +1,3 @@
dg component scaffold 'definitions@dagster_components' elt-definitions

Creating a Dagster component instance folder at /.../my-existing-project/my_existing_project/components/elt-definitions.
@@ -0,0 +1,4 @@
type: definitions@dagster_components

params:
  definitions_path: definitions.py
@@ -0,0 +1 @@
mv my_existing_project/elt/definitions.py my_existing_project/components/elt-definitions && rm -rf my_existing_project/elt
@@ -0,0 +1,11 @@
from pathlib import Path

import dagster_components as dg_components

import dagster as dg
from my_existing_project.analytics import definitions as analytics_definitions

defs = dg.Definitions.merge(
    dg.load_definitions_from_module(analytics_definitions),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
@@ -0,0 +1,22 @@
tree

.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── definitions.py
│   ├── components
│   │   └── elt-definitions
│   │       ├── component.yaml
│   │       └── definitions.py
│   └── definitions.py
├── my_existing_project_tests
│   ├── __init__.py
│   └── test_assets.py
├── pyproject.toml
└── uv.lock

6 directories, 12 files
@@ -0,0 +1 @@
Sample existing project for testing docs for the "Migrating existing Definitions to components" guide.
@@ -0,0 +1 @@

@@ -0,0 +1,6 @@
from dagster import asset


@asset
def my_asset():
    pass
@@ -0,0 +1,9 @@
from dagster import asset
from dagster._core.definitions.definitions_class import Definitions


@asset
def my_analytics_asset(): ...


defs = Definitions(assets=[my_analytics_asset])
@@ -0,0 +1,13 @@
from pathlib import Path

import dagster_components as dg_components

import dagster as dg
from my_existing_project.analytics import definitions as analytics_definitions
from my_existing_project.elt import definitions as elt_definitions

defs = dg.Definitions.merge(
    dg.load_definitions_from_module(elt_definitions),
    dg.load_definitions_from_module(analytics_definitions),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
@@ -0,0 +1,9 @@
from dagster import asset
from dagster._core.definitions.definitions_class import Definitions


@asset
def my_elt_asset(): ...


defs = Definitions(assets=[my_elt_asset])
@@ -0,0 +1 @@

@@ -0,0 +1 @@

@@ -0,0 +1,30 @@
[project]
name = "my_existing_project"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.9,<3.13"
dependencies = [
    "dagster",
    "dagster-components",
]

[project.optional-dependencies]
dev = [
    "dagster-webserver",
    "pytest>8",
]

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[tool.dg]
is_code_location = true

[tool.dagster]
module_name = "my_existing_project.definitions"
code_location_name = "my_existing_project"

[tool.setuptools.packages.find]
exclude=["my_existing_project_tests"]
@@ -0,0 +1,157 @@
import os
import re
import subprocess
from pathlib import Path
from tempfile import TemporaryDirectory

import pytest

from dagster._utils.env import environ
from docs_beta_snippets_tests.snippet_checks.guides.components.utils import (
    DAGSTER_ROOT,
    EDITABLE_DIR,
    MASK_EDITABLE_DAGSTER,
    MASK_JAFFLE_PLATFORM,
    MASK_SLING_DOWNLOAD_DUCKDB,
    MASK_SLING_PROMO,
    MASK_SLING_WARNING,
    MASK_TIME,
)
from docs_beta_snippets_tests.snippet_checks.utils import (
    _run_command,
    check_file,
    compare_tree_output,
    create_file,
    re_ignore_after,
    re_ignore_before,
    run_command_and_snippet_output,
)

MASK_MY_EXISTING_PROJECT = (r" \/.*?\/my-existing-project", " /.../my-existing-project")


COMPONENTS_SNIPPETS_DIR = (
    DAGSTER_ROOT
    / "examples"
    / "docs_beta_snippets"
    / "docs_beta_snippets"
    / "guides"
    / "components"
    / "migrating-definitions"
)


MY_EXISTING_PROJECT = Path(__file__).parent / "my-existing-project"


def test_components_docs_migrating_definitions(update_snippets: bool) -> None:
    snip_no = 0

    def next_snip_no():
        nonlocal snip_no
        snip_no += 1
        return snip_no

    with (
        TemporaryDirectory() as tempdir,
        environ(
            {
                "COLUMNS": "90",
                "NO_COLOR": "1",
                "HOME": "/tmp",
                "DAGSTER_GIT_REPO_DIR": str(DAGSTER_ROOT),
            }
        ),
    ):
        os.chdir(tempdir)

        _run_command(f"cp -r {MY_EXISTING_PROJECT} . && cd my-existing-project")
        _run_command(r"find . -type d -name __pycache__ -exec rm -r {} \+")
        _run_command(
            r"find . -type d -name my_existing_project.egg-info -exec rm -r {} \+"
        )

        run_command_and_snippet_output(
            cmd="tree",
            snippet_path=COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-tree.txt",
            update_snippets=update_snippets,
            custom_comparison_fn=compare_tree_output,
        )

        check_file(
            Path("my_existing_project") / "definitions.py",
            COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-definitions-before.py",
            update_snippets=update_snippets,
        )

        check_file(
            Path("my_existing_project") / "elt" / "definitions.py",
            COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-inner-definitions-before.py",
            update_snippets=update_snippets,
        )

        _run_command(cmd="uv venv")
        _run_command(cmd="uv sync")
        _run_command(
            f"uv add --editable '{EDITABLE_DIR / 'dagster-components'!s}' '{DAGSTER_ROOT / 'python_modules' / 'dagster'!s}' '{DAGSTER_ROOT / 'python_modules' / 'dagster-webserver'!s}'"
        )

        run_command_and_snippet_output(
            cmd="dg component scaffold 'definitions@dagster_components' elt-definitions",
            snippet_path=COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-scaffold.txt",
            update_snippets=update_snippets,
            snippet_replace_regex=[MASK_MY_EXISTING_PROJECT],
        )

        create_file(
            Path("my_existing_project")
            / "components"
            / "elt-definitions"
            / "component.yaml",
            """type: definitions@dagster_components
params:
  definitions_path: definitions.py
""",
            COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-component-yaml.txt",
        )

        run_command_and_snippet_output(
            cmd="mv my_existing_project/elt/definitions.py my_existing_project/components/elt-definitions && rm -rf my_existing_project/elt",
            snippet_path=COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-mv.txt",
            update_snippets=update_snippets,
        )

        create_file(
            Path("my_existing_project") / "definitions.py",
            """from pathlib import Path
import dagster_components as dg_components
import dagster as dg
from my_existing_project.analytics import definitions as analytics_definitions
defs = dg.Definitions.merge(
    dg.load_definitions_from_module(analytics_definitions),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
""",
            COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-definitions-after.py",
        )

        _run_command(r"find . -type d -name __pycache__ -exec rm -r {} \+")
        _run_command(
            r"find . -type d -name my_existing_project.egg-info -exec rm -r {} \+"
        )

        run_command_and_snippet_output(
            cmd="tree",
            snippet_path=COMPONENTS_SNIPPETS_DIR / f"{next_snip_no()}-tree-after.txt",
            update_snippets=update_snippets,
            custom_comparison_fn=compare_tree_output,
        )

        # validate loads
        _run_command(
            "uv run dagster asset materialize --select '*' -m 'my_existing_project.definitions'"
        )
