Merge pull request #38 from cparmet/new-asserts
Add new assert methods
cparmet authored Nov 17, 2024
2 parents 0828abd + d069aba commit d9cc400
Showing 10 changed files with 1,244 additions and 855 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -158,12 +158,14 @@ Types:
- `.check.assert_type()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type)

Values:
- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
- `.check.assert_less_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than)
- `.check.assert_greater_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than)
- `.check.assert_negative()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative)
- `.check.assert_no_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_no_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_no_nulls)
- `.check.assert_nrows()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_nrows)
- `.check.assert_positive()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive)
- `.check.assert_same_nrows()`: Check that this DataFrame/Series has the same number of rows as another DataFrame/Series, for example to validate 1:1 joins - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_same_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_same_nrows)
- `.check.assert_unique()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique)
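
The row-count assertions added in this commit follow a check-then-pass-through pattern: evaluate a condition, optionally raise, and return the object unchanged so the method chain can continue. The sketch below imitates that pattern in plain Python as a rough illustration only. It is not the library's implementation: rows are modeled as a list of dicts, and `Rows`, `assert_rows`, and this local `DataError` class are hypothetical stand-ins (the real methods delegate to `.check.assert_data()`).

```python
from typing import Callable, List, Type

Rows = List[dict]  # hypothetical stand-in for a DataFrame: one dict per row


class DataError(Exception):
    """Hypothetical stand-in for the library's default exception."""


def assert_rows(
    rows: Rows,
    condition: Callable[[Rows], bool],
    fail_message: str,
    raise_exception: bool = True,
    exception_to_raise: Type[BaseException] = DataError,
) -> Rows:
    """Check a condition on the rows, then return them unchanged."""
    if not condition(rows):
        if raise_exception:
            raise exception_to_raise(fail_message)
        print(fail_message)  # warn only, keep the pipeline running
    return rows


def assert_nrows(rows: Rows, nrows: int, **kwargs) -> Rows:
    """Pass only if there are exactly `nrows` rows."""
    return assert_rows(rows, lambda r: len(r) == nrows, "Assert nrows failed", **kwargs)


def assert_same_nrows(rows: Rows, other: Rows, **kwargs) -> Rows:
    """Pass only if `rows` and `other` have the same number of rows."""
    return assert_rows(
        rows, lambda r: len(r) == len(other), "Assert same_nrows failed", **kwargs
    )


data = [{"x": 1}, {"x": 2}, {"x": 3}]
checked = assert_nrows(data, 3)  # passes and hands back the same object
warned = assert_nrows(data, 5, raise_exception=False)  # prints the message, does not raise
```

Returning the input untouched is what lets a check sit in the middle of a chain without altering the data flowing through it.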

### Visualize data
4 changes: 3 additions & 1 deletion docs/usage.md
@@ -131,12 +131,14 @@ Types:
- `.check.assert_type()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type)

Values:
- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
- `.check.assert_less_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than)
- `.check.assert_greater_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than)
- `.check.assert_negative()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative)
- `.check.assert_no_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_no_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_no_nulls)
- `.check.assert_nrows()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_nrows)
- `.check.assert_positive()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive)
- `.check.assert_same_nrows()`: Check that this DataFrame/Series has the same number of rows as another DataFrame/Series, for example to validate 1:1 joins - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_same_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_same_nrows)
- `.check.assert_unique()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique)
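
To see why a same-row-count check is useful for validating 1:1 joins, the following dependency-free sketch shows how a duplicate key in the right-hand table inflates the result of a left join. Everything here is illustrative: the tables, the `left_join` helper, and the column names are assumptions made for the example, and pandas itself is not used.

```python
from typing import Dict, List

transactions = [
    {"txn_id": 1, "product_id": "A"},
    {"txn_id": 2, "product_id": "B"},
]
# "A" appears twice, so this is not a clean one-to-one lookup table.
products = [
    {"product_id": "A", "name": "apple"},
    {"product_id": "A", "name": "apricot"},
    {"product_id": "B", "name": "banana"},
]


def left_join(left: List[Dict], right: List[Dict], on: str) -> List[Dict]:
    """Naive left join: one output row per matching pair, or the bare left row if no match."""
    joined = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[on] == l_row[on]]
        if not matches:
            joined.append(dict(l_row))
        for r_row in matches:
            joined.append({**l_row, **r_row})
    return joined


joined = left_join(transactions, products, on="product_id")

# The condition a same-row-count assertion would test after the join.
same_nrows = len(joined) == len(transactions)
print(len(transactions), len(joined), same_nrows)  # prints: 2 3 False
```

The join silently grows from 2 rows to 3, which is the kind of drift that comparing the row count against the left table catches right after the merge step.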

### Visualize data
89 changes: 89 additions & 0 deletions pandas_checks/DataFrameChecks.py
@@ -544,6 +544,48 @@ def assert_no_nulls(
        )
        return self._obj

    def assert_nrows(
        self,
        nrows: int,
        fail_message: str = " ㄨ Assert nrows failed ",
        pass_message: str = " ✔️ Assert nrows passed ",
        raise_exception: bool = True,
        exception_to_raise: Type[BaseException] = DataError,
        verbose: bool = False,
    ) -> pd.DataFrame:
        """Tests whether DataFrame has a given number of rows. Optionally raises an exception. Does not modify the DataFrame itself.

        Example:
            (
                iris
                .check.assert_nrows(20)
            )

            # See docs for .check.assert_data() for examples of how to customize assertions

        Args:
            nrows: The expected number of rows.
            fail_message: Message to display if the condition fails.
            pass_message: Message to display if the condition passes.
            raise_exception: Whether to raise an exception if the condition fails.
            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
            verbose: Whether to display the pass message if the condition passes.

        Returns:
            The original DataFrame, unchanged.
        """

        self._obj.check.assert_data(
            condition=lambda df: df.shape[0] == nrows,
            fail_message=fail_message,
            pass_message=pass_message,
            raise_exception=raise_exception,
            exception_to_raise=exception_to_raise,
            message_shows_condition=False,
            verbose=verbose,
        )
        return self._obj

    def assert_positive(
        self,
        fail_message: str = " ㄨ Assert positive failed ",
@@ -598,6 +640,53 @@ def assert_positive(
        )
        return self._obj

    def assert_same_nrows(
        self,
        other: Union[pd.DataFrame, pd.Series],
        fail_message: str = " ㄨ Assert same_nrows failed ",
        pass_message: str = " ✔️ Assert same_nrows passed ",
        raise_exception: bool = True,
        exception_to_raise: Type[BaseException] = DataError,
        verbose: bool = False,
    ) -> pd.DataFrame:
        """Tests whether DataFrame has the same number of rows as another DataFrame/Series.
        Optionally raises an exception. Does not modify the DataFrame itself.

        Example:
            # Validate that an expected one-to-one join didn't add rows due to duplicate keys in the right table.
            (
                transactions_df
                .merge(how="left", right=products_df, on="product_id")
                .check.assert_same_nrows(transactions_df, "Left join changed row count! Check for duplicate `product_id` keys in products_df.")
            )

            # See docs for .check.assert_data() for examples of how to customize assertions

        Args:
            other: The DataFrame or Series whose row count this DataFrame is expected to match.
            fail_message: Message to display if the condition fails.
            pass_message: Message to display if the condition passes.
            raise_exception: Whether to raise an exception if the condition fails.
            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
            verbose: Whether to display the pass message if the condition passes.

        Returns:
            The original DataFrame, unchanged.
        """

        self._obj.check.assert_data(
            condition=lambda df: df.shape[0] == other.shape[0],
            fail_message=fail_message,
            pass_message=pass_message,
            raise_exception=raise_exception,
            exception_to_raise=exception_to_raise,
            message_shows_condition=False,
            verbose=verbose,
        )
        return self._obj

    def assert_str(
        self,
        fail_message: Union[str, None] = None,
89 changes: 89 additions & 0 deletions pandas_checks/SeriesChecks.py
@@ -514,6 +514,49 @@ def assert_no_nulls(
        )
        return self._obj

    def assert_nrows(
        self,
        nrows: int,
        fail_message: str = " ㄨ Assert nrows failed ",
        pass_message: str = " ✔️ Assert nrows passed ",
        raise_exception: bool = True,
        exception_to_raise: Type[BaseException] = DataError,
        verbose: bool = False,
    ) -> pd.Series:
        """Tests whether Series has a given number of rows. Optionally raises an exception. Does not modify the Series itself.

        Example:
            (
                iris
                ["species"]
                .check.assert_nrows(20)
            )

            # See docs for .check.assert_data() for examples of how to customize assertions

        Args:
            nrows: The expected number of rows.
            fail_message: Message to display if the condition fails.
            pass_message: Message to display if the condition passes.
            raise_exception: Whether to raise an exception if the condition fails.
            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
            verbose: Whether to display the pass message if the condition passes.

        Returns:
            The original Series, unchanged.
        """

        self._obj.check.assert_data(
            condition=lambda s: s.shape[0] == nrows,
            fail_message=fail_message,
            pass_message=pass_message,
            raise_exception=raise_exception,
            exception_to_raise=exception_to_raise,
            message_shows_condition=False,
            verbose=verbose,
        )
        return self._obj

    def assert_positive(
        self,
        fail_message: str = " ㄨ Assert positive failed ",
@@ -566,6 +609,52 @@ def assert_positive(
        )
        return self._obj

    def assert_same_nrows(
        self,
        other: Union[pd.DataFrame, pd.Series],
        fail_message: str = " ㄨ Assert same_nrows failed ",
        pass_message: str = " ✔️ Assert same_nrows passed ",
        raise_exception: bool = True,
        exception_to_raise: Type[BaseException] = DataError,
        verbose: bool = False,
    ) -> pd.Series:
        """Tests whether Series has the same number of rows as another DataFrame/Series.
        Optionally raises an exception. Does not modify the Series itself.

        Example:
            (
                df1
                ["column"]
                .check.assert_same_nrows(df2)
            )

            # See docs for .check.assert_data() for examples of how to customize assertions

        Args:
            other: The DataFrame or Series whose row count this Series is expected to match.
            fail_message: Message to display if the condition fails.
            pass_message: Message to display if the condition passes.
            raise_exception: Whether to raise an exception if the condition fails.
            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
            verbose: Whether to display the pass message if the condition passes.

        Returns:
            The original Series, unchanged.
        """

        self._obj.check.assert_data(
            condition=lambda s: s.shape[0] == other.shape[0],
            fail_message=fail_message,
            pass_message=pass_message,
            raise_exception=raise_exception,
            exception_to_raise=exception_to_raise,
            message_shows_condition=False,
            verbose=verbose,
        )
        return self._obj

    def assert_str(
        self,
        fail_message: Union[str, None] = None,