Question: How are non-narwhals-supported datatypes handled? #30

@dangotbanned

Hey there, just dropping by as I seem to be causing some failures 🫣 for Validoopsie in a narwhals PR I'm working on:

I think the cause is that my change raises an error for DuckDB's MAP type (the logs below show `UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)`).

The change I made doesn't have to make it in, but I haven't been able to track down what Validoopsie currently intends to do with a datatype that narwhals doesn't support. What is the intended behavior?
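To make the question concrete, here is a minimal, library-free sketch of the two behaviors I have in mind. It is not Validoopsie's or narwhals' actual code: `resolve_dtype`, `check_whole_schema`, `check_requested_only`, the `_KNOWN` mapping, the `strict` flag, and the `"Map"` column name are all hypothetical, and the local `UnsupportedDTypeError` class only stands in for the error named in the logs. It assumes the failures come from resolving every column's dtype up front, which is my reading of the logs rather than something I have confirmed.

```python
class UnsupportedDTypeError(Exception):
    """Hypothetical stand-in for the error that appears in the logs."""


# Hypothetical native-to-generic dtype table (assumption, not the real mapping).
_KNOWN = {
    "BIGINT": "Int64",
    "DOUBLE": "Float64",
    "VARCHAR": "String",
    "BOOLEAN": "Boolean",
}


def resolve_dtype(native: str, *, strict: bool) -> str:
    """Translate one native dtype name.

    strict=False: unsupported types become the placeholder "Unknown".
    strict=True: unsupported types raise UnsupportedDTypeError.
    """
    if native in _KNOWN:
        return _KNOWN[native]
    if strict:
        raise UnsupportedDTypeError(native)
    return "Unknown"


def check_whole_schema(native_schema: dict, expected: dict, *, strict: bool) -> dict:
    """Resolve every column first, then compare only the expected ones."""
    resolved = {col: resolve_dtype(t, strict=strict) for col, t in native_schema.items()}
    failing = [col for col, want in expected.items() if resolved[col] != want]
    return {"status": "Fail" if failing else "Success", "failing_items": failing}


def check_requested_only(native_schema: dict, expected: dict, *, strict: bool) -> dict:
    """Resolve only the columns that the check was asked about."""
    failing = [
        col
        for col, want in expected.items()
        if resolve_dtype(native_schema[col], strict=strict) != want
    ]
    return {"status": "Fail" if failing else "Success", "failing_items": failing}


# A frame with one checked column and one unrelated MAP column.
schema = {"IntegerType": "BIGINT", "Map": "MAP(VARCHAR, VARCHAR)"}
expected = {"IntegerType": "Int64"}

lenient = check_whole_schema(schema, expected, strict=False)
per_column = check_requested_only(schema, expected, strict=True)

try:
    check_whole_schema(schema, expected, strict=True)
    whole_schema_raised = False
except UnsupportedDTypeError:
    whole_schema_raised = True
```

In this sketch, strict resolution only breaks the check when the whole schema is resolved eagerly. Resolving just the requested columns keeps the check passing as long as nobody asks about the MAP column, so which of these matches the intended behavior is what I'm hoping to find out.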

Pytest run details

warning: The `tool.uv.dev-dependencies` field (used in `pyproject.toml`) is deprecated and will be removed in a future release; use `dependency-groups.dev` instead
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-8.4.2, pluggy-1.6.0
rootdir: /home/runner/work/narwhals/narwhals/validoopsie
configfile: pytest.ini
plugins: anyio-4.11.0, env-1.2.0
collected 1086 items

tests/test_adding_custom_validation.py ............                      [  1%]
tests/test_dataset.py ..............................                     [  3%]
tests/test_raise_results.py ..................                           [  5%]
tests/test_validation_catalogue/test_DateValidation/test_column_match_date_format.py . [  5%]
.......................                                                  [  7%]
tests/test_validation_catalogue/test_DateValidation/test_date_to_be_between.py . [  7%]
........................................................................ [ 14%]
........................................................................ [ 21%]
.....                                                                    [ 21%]
tests/test_validation_catalogue/test_EqualityValidation/test_pair_column_equality.py . [ 21%]
.............................                                            [ 24%]
tests/test_validation_catalogue/test_NullValidation/test_column_be_null.py . [ 24%]
.............................                                            [ 27%]
tests/test_validation_catalogue/test_NullValidation/test_column_not_be_null.py . [ 27%]
.................                                                        [ 28%]
tests/test_validation_catalogue/test_StringValidation/test_length_to_be_between.py . [ 28%]
........................................................................ [ 35%]
........................................................................ [ 42%]
.....                                                                    [ 42%]
tests/test_validation_catalogue/test_StringValidation/test_length_to_be_equal_to.py . [ 42%]
...........................................................              [ 48%]
tests/test_validation_catalogue/test_StringValidation/test_not_pattern_match.py . [ 48%]
...............................................                          [ 52%]
tests/test_validation_catalogue/test_StringValidation/test_pattern_match.py . [ 52%]
...............................................                          [ 56%]
tests/test_validation_catalogue/test_TypesValidation/test_type_check.py . [ 56%]
...F.....F.......s...F.....F.....F.....F...........F.................F.. [ 63%]
.....                                                                    [ 64%]
tests/test_validation_catalogue/test_UniqueValidation/test_column_unique_pair.py . [ 64%]
....................................................s............        [ 70%]
tests/test_validation_catalogue/test_UniqueValidation/test_column_unique_value_count_to_be_between.py . [ 70%]
........................................................................ [ 76%]
.......................                                                  [ 79%]
tests/test_validation_catalogue/test_UniqueValidation/test_column_unique_values_to_be_in_list.py . [ 79%]
.......................                                                  [ 81%]
tests/test_validation_catalogue/test_ValuesValidation/test_column_values_to_be_between.py . [ 81%]
........................................................................ [ 87%]
.................                                                        [ 89%]
tests/test_validation_catalogue/test_ValuesValidation/test_columns_sum_to_be_between.py . [ 89%]
........................................................................ [ 96%]
...........                                                              [ 97%]
tests/test_validation_catalogue/test_ValuesValidation/test_columns_sum_to_be_equal_to.py . [ 97%]
.............................                                            [100%]

=================================== FAILURES ===================================
_______________ test_type_check_success_single_column[duckdb_df] _______________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_success_single_column(sample_data: Frame) -> None:
        ds = TypeCheck("IntegerType", IntegerType)
        result = ds.__execute_check__(frame=sample_data)
>       assert result["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:94: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.284 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
________ test_type_check_success_single_column_specific_type[duckdb_df] ________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_success_single_column_specific_type(sample_data: Frame) -> None:
        ds = TypeCheck("IntegerType", Int64)
        result = ds.__execute_check__(frame=sample_data)
>       assert result["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:100: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.432 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
_______________ test_date_check_success_single_column[duckdb_df] _______________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘
request = <FixtureRequest for <Function test_date_check_success_single_column[duckdb_df]>>

    def test_date_check_success_single_column(
        sample_data: Frame,
        request: pytest.FixtureRequest,
    ) -> None:
        if request.node.callspec.id == "pandas":
            pytest.skip("Pandas does not support Date type")
        ds = TypeCheck("Date", Date)
        result = ds.__execute_check__(frame=sample_data)
>       assert result["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:117: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.512 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
_____________ test_type_check_success_multiple_columns[duckdb_df] ______________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_success_multiple_columns(sample_data: Frame) -> None:
        frame_schema_definition = {
            "IntegerType": IntegerType,
            "FloatType": FloatType,
            "String": String,
            "Boolean": Boolean,
        }
        ds = TypeCheck(frame_schema_definition=frame_schema_definition)
        result = ds.__execute_check__(frame=sample_data)
>       assert result["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:129: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.553 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
_____________ test_type_check_failure_multiple_columns[duckdb_df] ______________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_failure_multiple_columns(sample_data: Frame) -> None:
        frame_schema_definition = {
            "IntegerType": IntegerType,
            "FloatType": FloatType,
            "String": FloatType,
        }
        ds = TypeCheck(None, None, frame_schema_definition=frame_schema_definition)
        result = ds.__execute_check__(frame=sample_data)
        assert result["result"]["status"] == "Fail"
>       assert result["result"]["failing_items"][0] == "String"
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E       KeyError: 'failing_items'

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:141: KeyError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.596 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
_________________ test_type_check_threshold_success[duckdb_df] _________________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_threshold_success(sample_data: Frame) -> None:
        frame_schema_definition = {
            "IntegerType": IntegerType,
            "FloatType": FloatType,
            "String": String,
            "Boolean": Boolean,
            "Date": Date,
            "Duration": Duration,
            "List": List,
        }
        ds = TypeCheck(
            None,
            None,
            frame_schema_definition=frame_schema_definition,
            threshold=0.75,
        )
        result = ds.__execute_check__(frame=sample_data)
>       assert result["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:161: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.639 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
________________ test_type_check_integration_success[duckdb_df] ________________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_integration_success(sample_data: Frame) -> None:
        vd = Validate(sample_data)
        vd.TypeValidation.TypeCheck("IntegerType", IntegerType)
        result = vd.results
        key = list(vd.results.keys())[-1]
>       assert result[key]["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:189: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.726 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
___________ test_type_check_integration_threshold_success[duckdb_df] ___________

sample_data = ┌───────────────────────────────────────┐
|          Narwhals LazyFrame           |
| Use `.to_native` to see native output |
└───────────────────────────────────────┘

    def test_type_check_integration_threshold_success(sample_data: Frame) -> None:
        vd = Validate(sample_data)
        frame_schema_definition = {
            "IntegerType": IntegerType,
            "FloatType": FloatType,
            "Boolean": Boolean,
            "String": String,
            "Date": Date,
            "Duration": Duration,
        }
        vd.TypeValidation.TypeCheck(
            None,
            None,
            frame_schema_definition=frame_schema_definition,
            threshold=0.8,
        )
        result = vd.results
        key = list(vd.results.keys())[-1]
>       assert result[key]["result"]["status"] == "Success"
E       AssertionError: assert 'Fail' == 'Success'
E         
E         - Success
E         + Fail

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py:225: AssertionError
----------------------------- Captured stderr call -----------------------------
2025-10-10 08:50:46.867 | ERROR    | validoopsie.util.base_util_functions:log_exception_summary:61 - An error occurred while validating TypeCheck:
UnsupportedDTypeError - MAP(VARCHAR, VARCHAR)
=============================== warnings summary ===============================
tests/test_adding_custom_validation.py::test_adding_custom_validation[pyspark]
  DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
  given by the platformdirs library.  To remove this warning and
  see the appropriate new directories, set the environment variable
  `JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
  The use of platformdirs will be the default in `jupyter_core` v6

tests/test_validation_catalogue/test_TypesValidation/test_type_check.py: 26 warnings
  PerformanceWarning: Resolving the schema of a LazyFrame is a potentially expensive operation. Use `LazyFrame.collect_schema()` to get the schema without this warning.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_success_single_column[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_success_single_column_specific_type[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_date_check_success_single_column[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_success_multiple_columns[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_failure_multiple_columns[duckdb_df] - KeyError: 'failing_items'
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_threshold_success[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_integration_success[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
FAILED tests/test_validation_catalogue/test_TypesValidation/test_type_check.py::test_type_check_integration_threshold_success[duckdb_df] - AssertionError: assert 'Fail' == 'Success'
  
  - Success
  + Fail
====== 8 failed, 1076 passed, 2 skipped, 27 warnings in 73.86s (0:01:13) =======
Error: Process completed with exit code 1.
