Merge pull request #48 from rich-iannone/fix-col-schema-step-report-refinements

fix: add snapshot tests for `col_schema_match()` step reports
rich-iannone authored Jan 30, 2025
2 parents ad1cc68 + d5626af commit 0440931
Showing 11 changed files with 763 additions and 2 deletions.
285 changes: 283 additions & 2 deletions tests/manual_tests/schema_step_reports.qmd
@@ -47,6 +47,95 @@ df_01 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

1-1. Use `complete=False` / `in_order=True`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", "String"),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=True,  # default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_01_1 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

1-2. Use `complete=True` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", "String"),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=True,  # default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_01_2 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

1-3. Use `complete=False` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", "String"),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_01_3 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```
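
The three variants above, together with the default call, cover every combination of `complete` and `in_order`. As a rough sketch, the combinations could be generated rather than written out by hand. `schema_match_variants` is a hypothetical helper, not part of pointblank; only the keyword names are taken from the calls above.

```python
from itertools import product


def schema_match_variants():
    """Build keyword arguments for each `complete` / `in_order` combination.

    Hypothetical helper for illustration only; it does not call pointblank.
    """
    variants = []
    for complete, in_order in product([True, False], repeat=2):
        variants.append(
            {
                "complete": complete,
                "in_order": in_order,
                # The remaining options stay at the values used in the tests.
                "case_sensitive_colnames": True,
                "case_sensitive_dtypes": True,
                "full_match_dtypes": True,
            }
        )
    return variants


variants = schema_match_variants()
for kwargs in variants:
    print(kwargs["complete"], kwargs["in_order"])
```

Each dictionary could then be unpacked into the call, as in `.col_schema_match(schema=schema, **kwargs)`, assuming `schema` and `tbl` are defined as in the test file.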

2. Schema matches completely; option taken to match any of two different dtypes for column "a", but
all dtypes correct.
@@ -79,6 +168,97 @@ df_02 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

2-1. Use `complete=False` / `in_order=True`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", ["String", "Int64"]),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=True,  # default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_02_1 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```


2-2. Use `complete=True` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", ["String", "Int64"]),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=True,  # default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_02_2 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

2-3. Use `complete=False` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("a", ["String", "Int64"]),
        ("b", "Int64"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_02_3 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```
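
In case 2, column "a" is given a list of acceptable dtypes rather than a single one. A minimal sketch of that idea in plain Python follows. `dtype_matches` is a hypothetical function written for illustration and is not pointblank's implementation; the table dtypes are the ones shown as targets in the snapshot output (`String`, `Int64`, `Float64`).

```python
def dtype_matches(actual, expected):
    """Return True if `actual` equals `expected` or is one of its alternatives.

    `expected` may be a single dtype name or a list of acceptable names.
    Hypothetical helper; exact, case-sensitive comparison is assumed.
    """
    alternatives = [expected] if isinstance(expected, str) else list(expected)
    return actual in alternatives


# Target dtypes as they appear in the snapshot output.
table_dtypes = {"a": "String", "b": "Int64", "c": "Float64"}

# Same column specification as the case 2 schema.
schema_columns = [
    ("a", ["String", "Int64"]),
    ("b", "Int64"),
    ("c", "Float64"),
]

results = {
    name: dtype_matches(table_dtypes[name], expected)
    for name, expected in schema_columns
}
print(results)
```

Under these assumptions every column matches, which is consistent with the description "all dtypes correct".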

3. Schema has all three columns accounted for but in an incorrect order; dtypes correct.

```{python}
@@ -109,6 +289,97 @@ df_03 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

3-1. Use `complete=False` / `in_order=True`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("b", "Int64"),
        ("a", "String"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=True,  # default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_03_1 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

3-2. Use `complete=True` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("b", "Int64"),
        ("a", "String"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=True,  # default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_03_2 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```

3-3. Use `complete=False` / `in_order=False`

```{python}
import pointblank as pb
schema = pb.Schema(
    columns=[
        ("b", "Int64"),
        ("a", "String"),
        ("c", "Float64"),
    ]
)
validation = (
    pb.Validate(data=tbl)
    .col_schema_match(
        schema=schema,
        complete=False,  # non-default
        in_order=False,  # non-default
        case_sensitive_colnames=True,  # default
        case_sensitive_dtypes=True,  # default
        full_match_dtypes=True,  # default
    )
    .interrogate()
)
df_03_3 = validation.get_step_report(i=-99)
validation.get_step_report(i=1)
```
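
Case 3 is where `in_order` should matter, since the schema lists `b` before `a`. The following is a simplified model of how the two flags are assumed to interact, written in plain Python. `columns_match` is a hypothetical function, and the rules it encodes (in particular the subsequence reading of `complete=False` with `in_order=True`) are assumptions for illustration, not pointblank's actual implementation.

```python
def columns_match(table_cols, schema_cols, complete=True, in_order=True):
    """Compare column names under an assumed reading of the two flags.

    complete=True  -> schema must name every table column (no extras).
    in_order=True  -> schema columns must follow the table's column order.
    """
    if complete and in_order:
        return table_cols == schema_cols
    if complete:
        # Same columns, any order.
        return sorted(table_cols) == sorted(schema_cols)
    if in_order:
        # Schema must be a subsequence of the table's columns.
        remaining = iter(table_cols)
        return all(col in remaining for col in schema_cols)
    # Schema columns must all exist in the table, in any order.
    return set(schema_cols) <= set(table_cols)


table_cols = ["a", "b", "c"]   # target order shown in the snapshots
schema_cols = ["b", "a", "c"]  # order used in the case 3 schema

for complete in (True, False):
    for in_order in (True, False):
        outcome = columns_match(table_cols, schema_cols, complete, in_order)
        print(f"complete={complete}, in_order={in_order}: {outcome}")
```

In this model the reordered schema only passes when `in_order=False`, which is the distinction the 3-1, 3-2, and 3-3 variants are set up to exercise.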


4. Schema has all three columns accounted for but in an incorrect order; option taken to match any
of two different dtypes for column "a", but all dtypes correct.

@@ -316,7 +587,12 @@ validation.get_step_report(i=1)
import pointblank as pb
schema = pb.Schema(
-    columns=[("a", ["String", "Int64"]), ("b", "Int64"), ("c", "Float64"), ("d", "String")]
+    columns=[
+        ("a", ["String", "Int64"]),
+        ("b", "Int64"),
+        ("c", "Float64"),
+        ("d", "String"),
+    ]
)
validation = (
@@ -341,7 +617,12 @@ validation.get_step_report(i=1)
```{python}
import pointblank as pb
-schema = pb.Schema(columns=[("a", ["String", "Int64"]), ("c", "Float64"), ("d", "String")])
+schema = pb.Schema(columns=[
+        ("a", ["String", "Int64"]),
+        ("c", "Float64"),
+        ("d", "String"),
+    ]
+)
validation = (
pb.Validate(data=tbl)
@@ -0,0 +1,21 @@
shape: (3, 8)
┌────────────┬────────────┬────────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ index_targ ┆ col_name_t ┆ dtype_targ ┆ index_exp ┆ col_name_ ┆ col_name_ ┆ dtype_exp ┆ dtype_exp │
│ et         ┆ arget      ┆ et         ┆ ---       ┆ exp       ┆ exp_corre ┆ ---       ┆ _correct  │
│ ---        ┆ ---        ┆ ---        ┆ i64       ┆ ---       ┆ ct        ┆ str       ┆ ---       │
│ i64        ┆ str        ┆ str        ┆           ┆ str       ┆ ---       ┆           ┆ str       │
│            ┆            ┆            ┆           ┆           ┆ str       ┆           ┆           │
╞════════════╪════════════╪════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│ 1          ┆ a          ┆ String     ┆ 1         ┆ a         ┆ <span sty ┆ String    ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 2          ┆ b          ┆ Int64      ┆ 2         ┆ b         ┆ <span sty ┆ Int64     ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 3          ┆ c          ┆ Float64    ┆ 3         ┆ c         ┆ <span sty ┆ Float64   ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
└────────────┴────────────┴────────────┴───────────┴───────────┴───────────┴───────────┴───────────┘
@@ -0,0 +1,21 @@
shape: (3, 8)
┌────────────┬────────────┬────────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ index_targ ┆ col_name_t ┆ dtype_targ ┆ index_exp ┆ col_name_ ┆ col_name_ ┆ dtype_exp ┆ dtype_exp │
│ et         ┆ arget      ┆ et         ┆ ---       ┆ exp       ┆ exp_corre ┆ ---       ┆ _correct  │
│ ---        ┆ ---        ┆ ---        ┆ str       ┆ ---       ┆ ct        ┆ str       ┆ ---       │
│ i64        ┆ str        ┆ str        ┆           ┆ str       ┆ ---       ┆           ┆ str       │
│            ┆            ┆            ┆           ┆           ┆ str       ┆           ┆           │
╞════════════╪════════════╪════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│ 1          ┆ a          ┆ String     ┆ 1         ┆ a         ┆ <span sty ┆ String    ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 2          ┆ b          ┆ Int64      ┆ 2         ┆ b         ┆ <span sty ┆ Int64     ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 3          ┆ c          ┆ Float64    ┆ 3         ┆ c         ┆ <span sty ┆ Float64   ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
└────────────┴────────────┴────────────┴───────────┴───────────┴───────────┴───────────┴───────────┘
@@ -0,0 +1,21 @@
shape: (3, 8)
┌────────────┬────────────┬────────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ index_targ ┆ col_name_t ┆ dtype_targ ┆ index_exp ┆ col_name_ ┆ col_name_ ┆ dtype_exp ┆ dtype_exp │
│ et         ┆ arget      ┆ et         ┆ ---       ┆ exp       ┆ exp_corre ┆ ---       ┆ _correct  │
│ ---        ┆ ---        ┆ ---        ┆ str       ┆ ---       ┆ ct        ┆ str       ┆ ---       │
│ i64        ┆ str        ┆ str        ┆           ┆ str       ┆ ---       ┆           ┆ str       │
│            ┆            ┆            ┆           ┆           ┆ str       ┆           ┆           │
╞════════════╪════════════╪════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│ 1          ┆ a          ┆ String     ┆ 1         ┆ a         ┆ <span sty ┆ String    ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 2          ┆ b          ┆ Int64      ┆ 2         ┆ b         ┆ <span sty ┆ Int64     ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
│ 3          ┆ c          ┆ Float64    ┆ 3         ┆ c         ┆ <span sty ┆ Float64   ┆ <span sty │
│            ┆            ┆            ┆           ┆           ┆ le='color ┆           ┆ le='color │
│            ┆            ┆            ┆           ┆           ┆ : #4CA64C ┆           ┆ : #4CA64C │
│            ┆            ┆            ┆           ┆           ┆ ;'>…      ┆           ┆ ;'>…      │
└────────────┴────────────┴────────────┴───────────┴───────────┴───────────┴───────────┴───────────┘