Commit d9cc400

Merge pull request #38 from cparmet/new-asserts
Add new assert methods
2 parents 0828abd + d069aba commit d9cc400

10 files changed: +1244 -855 lines changed

README.md

Lines changed: 3 additions & 1 deletion
@@ -158,12 +158,14 @@ Types:
 - `.check.assert_type()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type)

 Values:
+- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
 - `.check.assert_less_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than)
 - `.check.assert_greater_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than)
 - `.check.assert_negative()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative)
 - `.check.assert_no_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_no_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_no_nulls)
-- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
+- `.check.assert_nrows()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_nrows)
 - `.check.assert_positive()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive)
+- `.check.assert_same_nrows()`: Check that this DataFrame/Series has the same number of rows as another DataFrame/Series, for example to validate 1:1 joins - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_same_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_same_nrows)
 - `.check.assert_unique()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique)

 ### Visualize data

docs/usage.md

Lines changed: 3 additions & 1 deletion
@@ -131,12 +131,14 @@ Types:
 - `.check.assert_type()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type)

 Values:
+- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
 - `.check.assert_less_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than)
 - `.check.assert_greater_than()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than)
 - `.check.assert_negative()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative)
 - `.check.assert_no_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_no_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_no_nulls)
-- `.check.assert_all_nulls()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_all_nulls) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_all_nulls)
+- `.check.assert_nrows()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_nrows)
 - `.check.assert_positive()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive)
+- `.check.assert_same_nrows()`: Check that this DataFrame/Series has the same number of rows as another DataFrame/Series, for example to validate 1:1 joins - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_same_nrows) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_same_nrows)
 - `.check.assert_unique()` - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique)

 ### Visualize data

pandas_checks/DataFrameChecks.py

Lines changed: 89 additions & 0 deletions
@@ -544,6 +544,48 @@ def assert_no_nulls(
         )
         return self._obj

+    def assert_nrows(
+        self,
+        nrows: int,
+        fail_message: str = " ㄨ Assert nrows failed ",
+        pass_message: str = " ✔️ Assert nrows passed ",
+        raise_exception: bool = True,
+        exception_to_raise: Type[BaseException] = DataError,
+        verbose: bool = False,
+    ) -> pd.DataFrame:
+        """Tests whether the DataFrame has a given number of rows. Optionally raises an exception. Does not modify the DataFrame itself.
+
+        Example:
+            (
+                iris
+                .check.assert_nrows(20)
+            )
+
+            # See docs for .check.assert_data() for examples of how to customize assertions
+
+        Args:
+            nrows: The expected number of rows.
+            fail_message: Message to display if the condition fails.
+            pass_message: Message to display if the condition passes.
+            raise_exception: Whether to raise an exception if the condition fails.
+            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
+            verbose: Whether to display the pass message if the condition passes.
+
+        Returns:
+            The original DataFrame, unchanged.
+        """
+
+        self._obj.check.assert_data(
+            condition=lambda df: df.shape[0] == nrows,
+            fail_message=fail_message,
+            pass_message=pass_message,
+            raise_exception=raise_exception,
+            exception_to_raise=exception_to_raise,
+            message_shows_condition=False,
+            verbose=verbose,
+        )
+        return self._obj
+
     def assert_positive(
         self,
         fail_message: str = " ㄨ Assert positive failed ",
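The `raise_exception`, `exception_to_raise`, and `verbose` arguments documented above describe a contract that can be sketched without pandas. The following is only an illustration of that contract as the docstring words it: `check_condition` and the local `DataError` class are hypothetical stand-ins, not the library's `assert_data()` implementation, and printing the fail message when `raise_exception` is False is an assumption.

```python
from typing import Any, Callable, Type


class DataError(Exception):
    """Hypothetical stand-in for the DataError referenced in the diff."""


def check_condition(
    obj: Any,
    condition: Callable[[Any], bool],
    fail_message: str,
    pass_message: str = "passed",
    raise_exception: bool = True,
    exception_to_raise: Type[BaseException] = DataError,
    verbose: bool = False,
) -> Any:
    """Evaluate `condition(obj)` and return `obj` unchanged so calls can be chained."""
    if condition(obj):
        if verbose:
            # The pass message is only shown on request.
            print(pass_message)
    elif raise_exception:
        raise exception_to_raise(fail_message)
    else:
        # Assumption: a non-raising failure is still reported.
        print(fail_message)
    return obj


rows = ["a", "b", "c"]
# Mirrors the shape of assert_nrows: compare a row count to an expected value.
check_condition(rows, lambda r: len(r) == 3, "row count mismatch", verbose=True)
```

Returning the input object is what lets a check sit in the middle of a method chain, which matches the `return self._obj` line in the hunk above.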
@@ -598,6 +640,53 @@ def assert_positive(
         )
         return self._obj

+    def assert_same_nrows(
+        self,
+        other: Union[pd.DataFrame, pd.Series],
+        fail_message: str = " ㄨ Assert same_nrows failed ",
+        pass_message: str = " ✔️ Assert same_nrows passed ",
+        raise_exception: bool = True,
+        exception_to_raise: Type[BaseException] = DataError,
+        verbose: bool = False,
+    ) -> pd.DataFrame:
+        """Tests whether the DataFrame has the same number of rows as another DataFrame/Series.
+
+        Optionally raises an exception. Does not modify the DataFrame itself.
+
+        Example:
+            # Validate that an expected one-to-one join didn't add rows due to duplicate keys in the right table.
+            (
+                transactions_df
+                .merge(how="left", right=products_df, on="product_id")
+                .check.assert_same_nrows(transactions_df, "Left join changed row count! Check for duplicate `product_id` keys in products_df.")
+            )
+
+            # See docs for .check.assert_data() for examples of how to customize assertions
+
+        Args:
+            other: The DataFrame or Series expected to have the same number of rows as this DataFrame.
+            fail_message: Message to display if the condition fails.
+            pass_message: Message to display if the condition passes.
+            subset: Optional, which column or columns to check the condition against.
+            raise_exception: Whether to raise an exception if the condition fails.
+            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
+            verbose: Whether to display the pass message if the condition passes.
+
+        Returns:
+            The original DataFrame, unchanged.
+        """
+
+        self._obj.check.assert_data(
+            condition=lambda df: df.shape[0] == other.shape[0],
+            fail_message=fail_message,
+            pass_message=pass_message,
+            raise_exception=raise_exception,
+            exception_to_raise=exception_to_raise,
+            message_shows_condition=False,
+            verbose=verbose,
+        )
+        return self._obj
+
     def assert_str(
         self,
         fail_message: Union[str, None] = None,
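The join scenario in the `assert_same_nrows` docstring can be reproduced with plain pandas. This is a small sketch on made-up data (the column values are hypothetical) that applies the same `shape[0]` comparison as the method's lambda; it does not call the new method itself.

```python
import pandas as pd

# Hypothetical toy data: product_id 2 appears twice in the right-hand table.
transactions_df = pd.DataFrame(
    {"product_id": [1, 2, 3], "amount": [9.5, 20.0, 4.25]}
)
products_df = pd.DataFrame(
    {"product_id": [1, 2, 2, 3], "name": ["pen", "ink", "ink v2", "pad"]}
)

# The duplicate key fans out the matching transaction into two rows.
merged = transactions_df.merge(products_df, how="left", on="product_id")

# Same comparison as the lambda in the diff: df.shape[0] == other.shape[0]
same_nrows = merged.shape[0] == transactions_df.shape[0]
print(merged.shape[0], transactions_df.shape[0], same_nrows)

# After dropping duplicate keys from the right table, the row count is preserved.
deduped = products_df.drop_duplicates(subset="product_id")
merged_ok = transactions_df.merge(deduped, how="left", on="product_id")
print(merged_ok.shape[0] == transactions_df.shape[0])
```

A row-count comparison catches added rows from duplicate keys, but it says nothing about which keys caused them, so the fail message in the docstring example points the reader at the right table.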

pandas_checks/SeriesChecks.py

Lines changed: 89 additions & 0 deletions
@@ -514,6 +514,49 @@ def assert_no_nulls(
         )
         return self._obj

+    def assert_nrows(
+        self,
+        nrows: int,
+        fail_message: str = " ㄨ Assert nrows failed ",
+        pass_message: str = " ✔️ Assert nrows passed ",
+        raise_exception: bool = True,
+        exception_to_raise: Type[BaseException] = DataError,
+        verbose: bool = False,
+    ) -> pd.Series:
+        """Tests whether the Series has a given number of rows. Optionally raises an exception. Does not modify the Series itself.
+
+        Example:
+            (
+                iris
+                ["species"]
+                .check.assert_nrows(20)
+            )
+
+            # See docs for .check.assert_data() for examples of how to customize assertions
+
+        Args:
+            nrows: The expected number of rows.
+            fail_message: Message to display if the condition fails.
+            pass_message: Message to display if the condition passes.
+            raise_exception: Whether to raise an exception if the condition fails.
+            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
+            verbose: Whether to display the pass message if the condition passes.
+
+        Returns:
+            The original Series, unchanged.
+        """
+
+        self._obj.check.assert_data(
+            condition=lambda s: s.shape[0] == nrows,
+            fail_message=fail_message,
+            pass_message=pass_message,
+            raise_exception=raise_exception,
+            exception_to_raise=exception_to_raise,
+            message_shows_condition=False,
+            verbose=verbose,
+        )
+        return self._obj
+
     def assert_positive(
         self,
         fail_message: str = " ㄨ Assert positive failed ",
@@ -566,6 +609,52 @@ def assert_positive(
         )
         return self._obj

+    def assert_same_nrows(
+        self,
+        other: Union[pd.DataFrame, pd.Series],
+        fail_message: str = " ㄨ Assert same_nrows failed ",
+        pass_message: str = " ✔️ Assert same_nrows passed ",
+        raise_exception: bool = True,
+        exception_to_raise: Type[BaseException] = DataError,
+        verbose: bool = False,
+    ) -> pd.Series:
+        """Tests whether the Series has the same number of rows as another DataFrame/Series.
+
+        Optionally raises an exception. Does not modify the Series itself.
+
+        Example:
+            (
+                df1
+                ["column"]
+                .check.assert_same_nrows(df2)
+            )
+
+            # See docs for .check.assert_data() for examples of how to customize assertions
+
+        Args:
+            other: The DataFrame or Series expected to have the same number of rows as this Series.
+            fail_message: Message to display if the condition fails.
+            pass_message: Message to display if the condition passes.
+            subset: Optional, which column or columns to check the condition against.
+            raise_exception: Whether to raise an exception if the condition fails.
+            exception_to_raise: The exception to raise if the condition fails and raise_exception is True.
+            verbose: Whether to display the pass message if the condition passes.
+
+        Returns:
+            The original Series, unchanged.
+        """
+
+        self._obj.check.assert_data(
+            condition=lambda s: s.shape[0] == other.shape[0],
+            fail_message=fail_message,
+            pass_message=pass_message,
+            raise_exception=raise_exception,
+            exception_to_raise=exception_to_raise,
+            message_shows_condition=False,
+            verbose=verbose,
+        )
+        return self._obj
+
     def assert_str(
         self,
         fail_message: Union[str, None] = None,
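Both versions of `assert_same_nrows` compare `shape[0]`, which works whether `other` is a DataFrame or a Series because both expose a `shape` tuple whose first element is the row count. A brief sketch on made-up data follows; the `same_nrows` helper is hypothetical and only mirrors the lambda in the diff.

```python
import pandas as pd

# Hypothetical toy data standing in for the `iris` example in the docstrings.
df = pd.DataFrame(
    {"species": ["setosa", "virginica", "setosa"], "petal_len": [1.4, 5.5, 1.3]}
)
series = df["species"]

# A DataFrame's shape is (rows, columns); a Series' shape is (rows,).
print(df.shape)
print(series.shape)


def same_nrows(left, right) -> bool:
    """Compare row counts of any mix of DataFrame and Series via shape[0]."""
    return left.shape[0] == right.shape[0]


print(same_nrows(series, df))          # Series vs. its parent DataFrame
print(same_nrows(series, df.head(2)))  # Series vs. a shorter DataFrame
```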
