From 531561709e551cca6d4bc8e8ce03980ba438470b Mon Sep 17 00:00:00 2001
From:
Date: Thu, 27 Jun 2024 12:39:14 -0400
Subject: [PATCH] Deployed 221c5eb with MkDocs version: 1.6.0

---
 API reference/DataFrameChecks/index.html | 3 ++-
 API reference/SeriesChecks/index.html    | 3 ++-
 search/search_index.json                 | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/API reference/DataFrameChecks/index.html b/API reference/DataFrameChecks/index.html
index c1681a8..bc3f81c 100644
--- a/API reference/DataFrameChecks/index.html
+++ b/API reference/DataFrameChecks/index.html
@@ -3181,8 +3181,9 @@

- type
+ dtype
+ Type[Any]
diff --git a/API reference/SeriesChecks/index.html b/API reference/SeriesChecks/index.html
index c73068a..4b57961 100644
--- a/API reference/SeriesChecks/index.html
+++ b/API reference/SeriesChecks/index.html
@@ -2496,8 +2496,9 @@

- type
+ dtype
+ Type[Any]
diff --git a/search/search_index.json b/search/search_index.json
index 841b445..a21104d 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"About","text":""},{"location":"#introduction","title":"Introduction","text":"

Pandas Checks is a Python library for data science and data engineering. It adds non-invasive health checks for Pandas method chains.

"},{"location":"#what-are-method-chains","title":"What are method chains?","text":"

Method chains are one of the coolest features of the Pandas library! They allow you to write more functional code with fewer intermediate variables and fewer side effects. If you're familiar with R, method chains are Python's version of dplyr pipes.

"},{"location":"#why-use-pandas-checks","title":"Why use Pandas Checks?","text":"

Pandas Checks adds the ability to inspect and validate your Pandas data at any point in the method chain, without modifying the underlying data. Think of Pandas Checks as a drone you can send up to check on your pipeline, whether it's in exploratory data analysis, prototyping, or production.

That way you don't need to chop up a method chain, or create intermediate variables, every time you need to diagnose, treat, or prevent problems with your data processing pipeline.

As Fleetwood Mac says, you would never break the chain.

"},{"location":"#giving-feedback-and-contributing","title":"Giving feedback and contributing","text":"

If you run into trouble or have questions, I'd love to know. Please open an issue.

Contributions are appreciated! Please open an issue or submit a pull request. Pandas Checks uses the wonderful libraries poetry for package and dependency management, nox for test automation, and mkdocs for docs.

"},{"location":"#license","title":"License","text":"

Pandas Checks is licensed under the BSD-3 License.

\ud83d\udc3c\ud83e\ude7a

"},{"location":"usage/","title":"Usage","text":""},{"location":"usage/#installation","title":"Installation","text":"

First make Pandas Checks available in your environment.

pip install pandas-checks\n

Then import it in your code. It works in Jupyter, IPython, and Python scripts run from the command line.

import pandas_checks\n

After importing, you don't need to access the pandas_checks module directly.

\ud83d\udca1 Tip: You can import Pandas Checks either before or after your code imports Pandas. Just somewhere. \ud83d\ude01

"},{"location":"usage/#basic-usage","title":"Basic usage","text":"

Pandas Checks adds .check methods to Pandas DataFrames and Series.

Say you have a nice function.

\ndef clean_iris_data(iris: pd.DataFrame) -> pd.DataFrame:\n    \"\"\"Preprocess data about pretty flowers.\n\n    Args:\n        iris: The raw iris dataset.\n\n    Returns:\n        The cleaned iris dataset.\n    \"\"\"\n\n    return (\n        iris\n        .dropna() # Drop rows with any null values\n        .rename(columns={\"FLOWER_SPECIES\": \"species\"}) # Rename a column\n        .query(\"species=='setosa'\") # Filter to rows with a certain value\n    )\n

But what if you want to make the chain more robust? Or see what's happening to the data as it flows down the pipeline? Or understand why your new iris CSV suddenly makes the cleaned data look weird?

You can add some .check steps.

\n(\n    iris\n    .dropna()\n    .rename(columns={\"FLOWER_SPECIES\": \"species\"})\n\n    # Validate assumptions\n    .check.assert_positive(subset=[\"petal_length\", \"sepal_length\"])\n\n    # Plot the distribution of a column after cleaning\n    .check.hist(column='petal_length') \n\n    .query(\"species=='setosa'\")\n\n    # Display the first few rows after cleaning\n    .check.head(3)  \n)\n

The .check methods will display the following results:

The .check methods didn't modify how the iris data is processed by your code. They just let you check the data as it flows down the pipeline. That's the difference between Pandas .head() and Pandas Checks .check.head().
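
The pass-through idea can be imitated in plain Pandas with .pipe(). The sketch below is only an analogy, not how Pandas Checks is implemented; the helper name peek and the toy data are assumptions made for illustration.

```python
import pandas as pd


def peek(df: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    '''Print the first n rows, then hand back the same DataFrame untouched.'''
    print(df.head(n))
    return df


# Toy stand-in for the iris data (values are made up)
iris = pd.DataFrame(
    {
        'species': ['setosa', 'setosa', 'virginica'],
        'petal_length': [1.4, 1.3, 6.0],
    }
)

# Calling .head(1) directly would pass only one row downstream.
# The pass-through helper displays one row but lets every row continue.
result = iris.pipe(peek, n=1).loc[lambda df: df['species'] == 'setosa']
```

Here result still holds both setosa rows, because the display step returned the full DataFrame rather than its head.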

"},{"location":"usage/#features","title":"Features","text":""},{"location":"usage/#check-methods","title":"Check methods","text":"

Here's what's in the doctor's bag.

  • Describe

    • Standard Pandas methods:
      • .check.columns() - DataFrame | Series
      • .check.dtypes() for DataFrame | .check.dtype() for Series
      • .check.describe() - DataFrame | Series
      • .check.head() - DataFrame | Series
      • .check.info() - DataFrame | Series
      • .check.memory_usage() - DataFrame | Series
      • .check.nunique() - DataFrame | Series
      • .check.shape() - DataFrame | Series
      • .check.tail() - DataFrame | Series
      • .check.unique() - DataFrame | Series
      • .check.value_counts() - DataFrame | Series
    • New functions in Pandas Checks:
      • .check.function(): Apply an arbitrary lambda function to your data and see the result - DataFrame | Series
      • .check.ncols(): Count columns - DataFrame | Series
      • .check.ndups(): Count rows with duplicate values - DataFrame | Series
      • .check.nnulls(): Count rows with null values - DataFrame | Series
      • .check.print(): Print a string, a variable, or the current dataframe - DataFrame | Series

  • Export interim files

    • .check.write(): Export the current data, inferring file format from the name - DataFrame | Series
  • Time your code

    • .check.print_time_elapsed(start_time): Print the execution time since you called start_time = pdc.start_timer() - DataFrame | Series
    • \ud83d\udca1 Tip: You can also use this stopwatch outside a method chain, anywhere in your Python code:

      ```python
      from pandas_checks import print_elapsed_time, start_timer

      start_time = start_timer()
      ...
      print_elapsed_time(start_time)
      ```

  • Turn off Pandas Checks

    • .check.disable_checks(): Don't run checks, for production mode etc. By default, still runs assertions. - DataFrame | Series
    • .check.enable_checks(): Run checks - DataFrame | Series
  • Validate

    • General
      • .check.assert_data(): Check that data passes an arbitrary condition - DataFrame | Series
    • Types
      • .check.assert_datetime() - DataFrame | Series
      • .check.assert_float() - DataFrame | Series
      • .check.assert_int() - DataFrame | Series
      • .check.assert_str() - DataFrame | Series
      • .check.assert_timedelta() - DataFrame | Series
      • .check.assert_type() - DataFrame | Series
    • Values
      • .check.assert_less_than() - DataFrame | Series
      • .check.assert_greater_than() - DataFrame | Series
      • .check.assert_negative() - DataFrame | Series
      • .check.assert_not_null() - DataFrame | Series
      • .check.assert_null() - DataFrame | Series
      • .check.assert_positive() - DataFrame | Series
      • .check.assert_unique() - DataFrame | Series
  • Visualize

    • .check.hist(): A histogram - DataFrame | Series
    • .check.plot(): An arbitrary plot you can customize - DataFrame | Series
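
For the Time your code entry in the list above, the stopwatch idea can be sketched with the standard library alone. The function names start_stopwatch and format_elapsed are hypothetical stand-ins, not the Pandas Checks API, and the output wording is an assumption.

```python
import time


def start_stopwatch() -> float:
    '''Return a reference time to measure from.'''
    return time.perf_counter()


def format_elapsed(start: float) -> str:
    '''Describe how many seconds have passed since start.'''
    seconds = time.perf_counter() - start
    return f'Time elapsed: {seconds:.4f} seconds'


start = start_stopwatch()
total = sum(range(100_000))  # Some work to time
message = format_elapsed(start)
print(message)
```

Because the start time is an ordinary value, it can be captured once and reported at several later points.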
"},{"location":"usage/#customizing-a-check","title":"Customizing a check","text":"

You can use Pandas Checks methods like the regular Pandas methods. They accept the same arguments. For example, you can pass:

  • .check.head(7)
  • .check.value_counts(column=\"species\", dropna=False, normalize=True)
  • .check.plot(kind=\"scatter\", x=\"sepal_width\", y=\"sepal_length\")

Also, most Pandas Checks methods accept 3 additional arguments:

  1. check_name: text to display before the result of the check
  2. fn: a lambda function that modifies the data displayed by the check
  3. subset: limit a check to certain columns

(\n    iris\n    .check.value_counts(column='species', check_name=\"Varieties after data cleaning\")\n    .assign(species=lambda df: df[\"species\"].str.upper()) # Do your regular Pandas data processing, like upper-casing the values in one column\n    .check.head(n=2, fn=lambda df: df[\"petal_width\"]*2) # Modify the data that gets displayed in the check only\n    .check.describe(subset=['sepal_width', 'sepal_length'])  # Only apply the check to certain columns\n)\n
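
The API reference notes that fn is applied before subset. A minimal sketch of that ordering in plain Pandas follows; prepare_for_display is a hypothetical helper written for this illustration, not a function the library is known to expose.

```python
from typing import Callable, List, Optional, Union

import pandas as pd


def prepare_for_display(
    df: pd.DataFrame,
    fn: Callable[[pd.DataFrame], pd.DataFrame] = lambda df: df,
    subset: Optional[Union[str, List[str]]] = None,
) -> pd.DataFrame:
    '''Apply fn first, then select the subset of columns, for display only.'''
    shown = fn(df)
    if subset is not None:
        shown = shown[subset]
    return shown


iris = pd.DataFrame({'sepal_width': [3.5, 3.0], 'petal_width': [0.2, 0.4]})

shown = prepare_for_display(
    iris,
    fn=lambda df: df.assign(petal_width=df['petal_width'] * 2),
    subset=['petal_width'],
)
print(shown)
```

Only the displayed copy carries the doubled values; iris keeps its original petal_width column.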

"},{"location":"usage/#configuring-pandas-check","title":"Configuring Pandas Check","text":""},{"location":"usage/#global-configuration","title":"Global configuration","text":"

You can change how Pandas Checks works everywhere. For example:

import pandas_checks as pdc\n\n# Set output precision and turn off the cute emojis\npdc.set_format(precision=3, use_emojis=False)\n\n# Don't run any of the calls to Pandas Checks, globally. \npdc.disable_checks()\n

Run pdc.describe_options() to see the arguments you can pass to .set_format().

\ud83d\udca1 Tip: By default, disable_checks() and enable_checks() do not change whether Pandas Checks will run assertion methods (.check.assert_*).

To turn off assertions too, add the argument enable_asserts=False, such as: disable_checks(enable_asserts=False).
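
One way to picture the two switches is as a pair of independent flags. The sketch below is an assumption about the behaviour described in the tip, using a plain dictionary as a stand-in for wherever the library keeps its global options.

```python
# Hypothetical stand-in for the global options; not the library's storage
options = {'enable_checks': True, 'enable_asserts': True}


def disable_checks(enable_asserts: bool = True) -> None:
    '''Turn off checks; assertions keep running unless told otherwise.'''
    options['enable_checks'] = False
    options['enable_asserts'] = enable_asserts


def enable_checks(enable_asserts: bool = True) -> None:
    '''Turn checks back on, with assertions on unless told otherwise.'''
    options['enable_checks'] = True
    options['enable_asserts'] = enable_asserts


disable_checks()
after_default = dict(options)  # Checks off, assertions still on

disable_checks(enable_asserts=False)
after_full_off = dict(options)  # Both off

enable_checks()
after_restore = dict(options)  # Both on again
```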

"},{"location":"usage/#local-configuration","title":"Local configuration","text":"

You can also adjust settings within a method chain by bookending the chain, like this:

# Customize format during one method chain\n(\n    iris\n    .check.set_format(precision=7, use_emojis=False)\n    ... # Any .check methods in here will use the new format\n    .check.reset_format() # Restore default format\n)\n\n# Turn off Pandas Checks during one method chain\n(\n    iris\n    .check.disable_checks()\n    ... # Any .check methods in here will not be run\n    .check.enable_checks() # Turn it back on for the next code\n)\n
"},{"location":"usage/#hybrid-eda-production-data-processing","title":"Hybrid EDA-Production data processing","text":"

Exploratory Data Analysis is often taught as a one-time step we do to plan our production data processing. But sometimes EDA is a cyclical process we go back to for deeper inspection during debugging, code edits, or changes in the input data. If explorations were useful in EDA, they may be useful again.

Unfortunately, it's hard to go back to EDA. It's too out of sync. The prod data processing pipeline has usually evolved too much, making the EDA code a historical artifact full of cobwebs that we can't easily fire up again.

But if you use Pandas Checks during EDA, you could roll your .check methods into your first production code. Then in prod mode, disable Pandas Checks when you don't need it, to save compute and streamline output. Whenever you need to pull out those EDA tools again, enable Pandas Checks globally or locally.

This can make your prod pipeline more transparent and easier to inspect.

"},{"location":"API%20reference/DataFrameChecks/","title":"DataFrame methods","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks._obj","title":"_obj = pandas_obj instance-attribute","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.__init__","title":"__init__(pandas_obj)","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_data","title":"assert_data(condition, subset=None, pass_message=' \u2714\ufe0f Assertion passed ', fail_message=' \u3128 Assertion failed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)","text":"

Tests whether the DataFrame meets a condition. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default condition Callable

Assertion criteria in the form of a lambda function, such as lambda df: df.shape[0]>10.

required subset Union[str, List, None]

Optional, which column or columns to check the condition against. Applied after fn. Subsetting can also be done within the condition, such as lambda df: df['column_name'].sum()>10

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assertion passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assertion failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError message_shows_condition bool

Whether the fail/pass message should also print the assertion criteria

True verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
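
As a rough illustration of the behaviour described above, the following sketch evaluates a condition, raises or prints on failure, and returns the DataFrame as-is. It is not the library's code: assert_condition and DataCheckError are hypothetical names, with DataCheckError standing in for the DataError default.

```python
from typing import Callable, Type

import pandas as pd


class DataCheckError(Exception):
    '''Hypothetical stand-in for the DataError default named above.'''


def assert_condition(
    df: pd.DataFrame,
    condition: Callable[[pd.DataFrame], bool],
    fail_message: str = 'Assertion failed',
    raise_exception: bool = True,
    exception_to_raise: Type[BaseException] = DataCheckError,
) -> pd.DataFrame:
    '''Evaluate condition on df; raise or print on failure; return df as-is.'''
    if not condition(df):
        if raise_exception:
            raise exception_to_raise(fail_message)
        print(fail_message)
    return df


df = pd.DataFrame({'petal_length': [1.4, 1.3, 6.0]})

# Passes: the DataFrame has more than two rows
same_df = assert_condition(df, lambda d: d.shape[0] > 2)

# Fails: capture the exception to show which type is raised
try:
    assert_condition(df, lambda d: d['petal_length'].sum() > 100)
    failure = None
except DataCheckError as err:
    failure = err
```

Returning the input object is what lets an assertion sit in the middle of a method chain.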

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_datetime","title":"assert_datetime(subset=None, pass_message=' \u2714\ufe0f Assert datetime passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns is of datetime or timestamp type. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert datetime passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_float","title":"assert_float(subset=None, pass_message=' \u2714\ufe0f Assert float passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only floats. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert float passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
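
For intuition about what a float check looks at, plain Pandas exposes pandas.api.types.is_float_dtype. Whether Pandas Checks uses this function internally is an assumption; the column names and values below are made up.

```python
import pandas as pd
from pandas.api.types import is_float_dtype

df = pd.DataFrame({'petal_length': [1.4, 1.3], 'count': [3, 4]})

# Map each column name to whether its dtype is a float dtype
float_columns = {col: is_float_dtype(df[col]) for col in df.columns}
print(float_columns)
```

A check restricted to petal_length would pass here, while one covering the whole DataFrame would not, because count holds integers.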

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than","title":"assert_greater_than(min, or_equal_to=True, subset=None, pass_message=' \u2714\ufe0f Assert minimum passed ', fail_message=' \u3128 Assert minimum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether all values in the DataFrame or a subset of its columns are > or >= a value. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default min Any

The minimum value to compare the DataFrame to. Accepts any type that can be used in >, such as int, float, str, datetime.

required or_equal_to bool

Whether to test for >= min (True) or > min (False).

True subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert minimum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert minimum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
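
The effect of or_equal_to is easiest to see at the boundary value. The sketch below reproduces the comparison in plain Pandas; all_at_least is a hypothetical helper, not part of Pandas Checks, and it returns a boolean instead of raising.

```python
import pandas as pd


def all_at_least(df, minimum, or_equal_to=True, subset=None):
    '''Return True when every selected value is >= (or >) minimum.'''
    selected = df if subset is None else df[subset]
    if isinstance(selected, pd.Series):
        selected = selected.to_frame()
    passed = (selected >= minimum) if or_equal_to else (selected > minimum)
    return bool(passed.all().all())


df = pd.DataFrame({'a': [1, 2, 3], 'b': [0, 5, 9]})

inclusive = all_at_least(df, 1, subset='a')  # 1 >= 1 counts
strict = all_at_least(df, 1, or_equal_to=False, subset='a')  # 1 > 1 does not
whole_frame = all_at_least(df, 1)  # Column b contains 0
```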

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_int","title":"assert_int(subset=None, pass_message=' \u2714\ufe0f Assert integeer passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only integers. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert integeer passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than","title":"assert_less_than(max, or_equal_to=True, subset=None, pass_message=' \u2714\ufe0f Assert maximum passed ', fail_message=' \u3128 Assert maximum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether all values in the DataFrame or a subset of its columns are < or <= a value. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default max Any

The maximum value to compare the DataFrame to. Accepts any type that can be used in <, such as int, float, str, datetime.

required or_equal_to bool

Whether to test for <= max (True) or < max (False).

True subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert maximum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert maximum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative","title":"assert_negative(subset=None, assert_not_null=True, pass_message=' \u2714\ufe0f Assert negative passed ', fail_message=' \u3128 Assert negative failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has all negative values. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert negative passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert negative failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_not_null","title":"assert_not_null(subset=None, pass_message=' \u2714\ufe0f Assert no nulls passed ', fail_message=' \u3128 Assert no nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has no nulls. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert no nulls passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert no nulls failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_null","title":"assert_null(subset=None, pass_message=' \u2714\ufe0f Assert all nulls passed ', fail_message=' \u3128 Assert all nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only nulls. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert all nulls passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert all nulls failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive","title":"assert_positive(subset=None, assert_not_null=True, pass_message=' \u2714\ufe0f Assert positive passed ', fail_message=' \u3128 Assert positive failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has all positive values. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert positive passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert positive failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_str","title":"assert_str(subset=None, pass_message=' \u2714\ufe0f Assert string passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only strings. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert string passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_timedelta","title":"assert_timedelta(subset=None, pass_message=' \u2714\ufe0f Assert timedelta passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns is of type timedelta. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert timedelta passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type","title":"assert_type(dtype, subset=None, pass_message=' \u2714\ufe0f Assert type passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns meets a type assumption. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default dtype Type[Any]

The required variable type

required subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert type passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique","title":"assert_unique(subset=None, pass_message=' \u2714\ufe0f Assert unique passed ', fail_message=' \u3128 Assert unique failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has no duplicate rows. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert unique passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert unique failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
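
The subset argument matters for uniqueness, because rows that differ as a whole can still repeat within one column. The sketch below shows this with the standard Pandas duplicated() method; whether Pandas Checks relies on duplicated() internally is an assumption, and the data is made up.

```python
import pandas as pd

df = pd.DataFrame(
    {'id': [1, 2, 2], 'species': ['setosa', 'setosa', 'virginica']}
)

# No two rows are identical across all columns
whole_rows_unique = not df.duplicated().any()

# Restricting to one column changes the answer: id 2 appears twice
id_unique = not df.duplicated(subset=['id']).any()
```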

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.columns","title":"columns(fn=lambda df: df, subset=None, check_name='\ud83c\udfdb\ufe0f Columns')","text":"

Prints the column names of a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before printing columns. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before printing their names. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83c\udfdb\ufe0f Columns'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.describe","title":"describe(fn=lambda df: df, subset=None, check_name='\ud83d\udccf Distributions', **kwargs)","text":"

Displays descriptive statistics about a DataFrame without modifying the DataFrame itself.

See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas describe(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas describe(). Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\udccf Distributions' **kwargs Any

Optional, additional arguments that are accepted by Pandas describe() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to also enable or disable assert statements.

True

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.dtypes","title":"dtypes(fn=lambda df: df, subset=None, check_name='\ud83d\uddc2\ufe0f Data types')","text":"

Displays the data types of a DataFrame's columns without modifying the DataFrame itself.

See Pandas docs for dtypes for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas dtypes. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas .dtypes. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\uddc2\ufe0f Data types'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to globally enable or disable calls to .check.assert_data().

True

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.function","title":"function(fn=lambda df: df, subset=None, check_name=None)","text":"

Applies an arbitrary function on a DataFrame and shows the result, without modifying the DataFrame itself.

Example

.check.function(fn=lambda df: df.shape[0]>10, check_name='Has at least 10 rows?') which will result in 'True' or 'False'

Parameters:

Name Type Description Default fn Callable

A lambda function to apply to the DataFrame. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.get_mode","title":"get_mode(check_name='\ud83d\udc3c\ud83e\ude7a Pandas Checks mode')","text":"

Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default check_name Union[str, None]

An optional name for the check. Will be used as a preface to the printed result.

'\ud83d\udc3c\ud83e\ude7a Pandas Checks mode'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.head","title":"head(n=5, fn=lambda df: df, subset=None, check_name=None)","text":"

Displays the first n rows of a DataFrame, without modifying the DataFrame itself.

See Pandas docs for head() for additional usage information.

Parameters:

Name Type Description Default n int

The number of rows to display.

5 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas head(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas head(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.hist","title":"hist(fn=lambda df: df, subset=[], check_name=None, **kwargs)","text":"

Displays a histogram for the DataFrame, without modifying the DataFrame itself.

See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas hist(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas hist(). Applied after fn.

[] check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas hist() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

If more than one column is passed, displays a grid of histograms.

Only renders in interactive mode (IPython/Jupyter), not in terminal.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.info","title":"info(fn=lambda df: df, subset=None, check_name='\u2139\ufe0f Info', **kwargs)","text":"

Displays summary information about a DataFrame, without modifying the DataFrame itself.

See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas info(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas info(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2139\ufe0f Info' **kwargs Any

Optional, additional arguments that are accepted by Pandas info() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.memory_usage","title":"memory_usage(fn=lambda df: df, subset=None, check_name='\ud83d\udcbe Memory usage', **kwargs)","text":"

Displays the memory footprint of a DataFrame, without modifying the DataFrame itself.

See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas memory_usage(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas memory_usage(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcbe Memory usage' **kwargs Any

Optional, additional arguments that are accepted by Pandas memory_usage() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Include argument deep=True to get further memory usage of object dtypes in the DataFrame. See Pandas docs for memory_usage() for more info.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.ncols","title":"ncols(fn=lambda df: df, subset=None, check_name='\ud83c\udfdb\ufe0f Columns')","text":"

Displays the number of columns in a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of columns. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting the number of columns. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83c\udfdb\ufe0f Columns'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.ndups","title":"ndups(fn=lambda df: df, subset=None, check_name=None, **kwargs)","text":"

Displays the number of duplicated rows in a DataFrame, without modifying the DataFrame itself.

See Pandas docs for duplicated() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of duplicates. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting duplicate rows. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas duplicated() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nnulls","title":"nnulls(fn=lambda df: df, subset=None, by_column=True, check_name='\ud83d\udc7b Rows with NaNs')","text":"

Displays the number of rows with null values in a DataFrame, without modifying the DataFrame itself.

See Pandas docs for isna() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of rows with a null. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting nulls. Applied after fn.

None by_column bool

If True, count null values within each column separately. If False, count rows with a null value in any column.

True check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udc7b Rows with NaNs'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
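The difference between the two by_column settings can be illustrated with plain Pandas. This is a minimal sketch of the two counting strategies described above, not the Pandas Checks implementation; the sample DataFrame and variable names are made up for illustration.

```python
import pandas as pd

# Made-up sample data: column a has 1 null, column b has 2 nulls.
df = pd.DataFrame({'a': [1, None, 3], 'b': [None, None, 6]})

# Like by_column=True: count nulls within each column separately.
nulls_per_column = df.isna().sum()

# Like by_column=False: count rows that have a null in any column.
rows_with_any_null = int(df.isna().any(axis=1).sum())

print(nulls_per_column)
print(rows_with_any_null)
```

The per-column counts do not add up to the row count because one row is null in both columns.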

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nrows","title":"nrows(fn=lambda df: df, subset=None, check_name='\u2630 Rows')","text":"

Displays the number of rows in a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of rows. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are considered when counting rows. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2630 Rows'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nunique","title":"nunique(column, fn=lambda df: df, check_name=None, **kwargs)","text":"

Displays the number of unique values in a single column, without modifying the DataFrame itself.

See Pandas docs for nunique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

The name of a column to count uniques in. Applied after fn.

required fn Callable

An optional lambda function to apply to the DataFrame before running Pandas nunique(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas nunique() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.plot","title":"plot(fn=lambda df: df, subset=None, check_name='', **kwargs)","text":"

Displays a plot of the DataFrame, without modifying the DataFrame itself.

See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas plot(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are plotted. Applied after fn.

None check_name Union[str, None]

An optional title for the plot.

'' **kwargs Any

Optional, additional arguments that are accepted by Pandas plot() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

If you pass a 'title' kwarg, it becomes the plot title, overriding check_name.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.print","title":"print(object=None, fn=lambda df: df, subset=None, check_name=None, max_rows=10)","text":"

Displays text, another object, or (by default) the current DataFrame's head. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default object Any

Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, print the DataFrame's head (with max_rows rows).

None fn Callable

An optional lambda function to apply to the DataFrame before printing object. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are printed. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None max_rows int

Maximum number of rows to print if object=None.

10

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Parameters:

Name Type Description Default start_time float

The index time when the stopwatch started, which comes from the Pandas Checks start_timer()

required lead_in Union[str, None]

Optional text to print before the elapsed time.

'Time elapsed' units str

The units in which to display the elapsed time. Can be \"auto\", \"seconds\", \"minutes\", or \"hours\".

'auto'

Raises:

Type Description ValueError

If units is not one of \"auto\", \"seconds\", \"minutes\", or \"hours\".

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
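The units options above can be sketched in plain Python. This is an illustrative sketch only: format_elapsed is a hypothetical helper, not part of Pandas Checks, and the thresholds used for 'auto' (under a minute shows seconds, under an hour shows minutes) are assumptions rather than the library's documented behavior.

```python
def format_elapsed(seconds, lead_in='Time elapsed', units='auto'):
    # Hypothetical helper; not the Pandas Checks implementation.
    divisors = {'seconds': 1, 'minutes': 60, 'hours': 3600}
    if units == 'auto':
        # Assumed thresholds for picking a readable unit.
        if seconds < 60:
            units = 'seconds'
        elif seconds < 3600:
            units = 'minutes'
        else:
            units = 'hours'
    elif units not in divisors:
        raise ValueError(
            f'units must be auto, seconds, minutes, or hours; got {units!r}'
        )
    value = seconds / divisors[units]
    prefix = f'{lead_in}: ' if lead_in else ''
    return f'{prefix}{value:.2f} {units}'


print(format_elapsed(90))
print(format_elapsed(30, units='seconds'))
```

Passing lead_in=None in this sketch simply drops the prefix, mirroring the optional lead_in parameter above.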

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings. Does not modify the DataFrame itself.

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Does not modify the DataFrame itself.

Run pandas_checks.describe_options() to see a list of available options.

For example, .check.set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to Pandas Checks .check.assert_data() statements globally.

required

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.shape","title":"shape(fn=lambda df: df, subset=None, check_name='\ud83d\udcd0 Shape')","text":"

Displays the DataFrame's dimensions, without modifying the DataFrame itself.

See Pandas docs for shape for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas shape. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are considered when printing the shape. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcd0 Shape'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

See also .check.nrows() and .check.ncols()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.tail","title":"tail(n=5, fn=lambda df: df, subset=None, check_name=None)","text":"

Displays the last n rows of the DataFrame, without modifying the DataFrame itself.

See Pandas docs for tail() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default n int

Number of rows to show.

5 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas tail(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are displayed. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.unique","title":"unique(column, fn=lambda df: df, check_name=None)","text":"

Displays the unique values in a column, without modifying the DataFrame itself.

See Pandas docs for unique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

Column to check for unique values.

required fn Callable

An optional lambda function to apply to the DataFrame before calling Pandas unique(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

fn is applied to the DataFrame before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, e.g. fn=lambda df: df[\"my_column\"].stuff()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.value_counts","title":"value_counts(column, fn=lambda df: df, max_rows=10, check_name=None, **kwargs)","text":"

Displays the value counts for a column, without modifying the DataFrame itself.

See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

Column to check for value counts.

required max_rows int

Maximum number of rows to show in the value counts.

10 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas value_counts(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas value_counts() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

fn is applied to the DataFrame before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, e.g. fn=lambda df: df[\"my_column\"].stuff()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.write","title":"write(path, format=None, fn=lambda df: df, subset=None, verbose=False, **kwargs)","text":"

Exports DataFrame to file, without modifying the DataFrame itself.

Format is inferred from path extension like .csv.

This function uses the corresponding Pandas export function such as to_csv(). See Pandas docs for those functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default path str

Path to write the file to.

required format Union[str, None]

Optional file format to force for the export. If None, format is inferred from the file's extension in path.

None fn Callable

An optional lambda function to apply to the DataFrame before exporting. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are exported. Applied after fn.

None verbose bool

Whether to print a message when the file is written.

False **kwargs Any

Optional, additional keyword arguments to pass to the Pandas export function (.to_csv).

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.
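The way a format can be inferred from the path extension, with an explicit format taking precedence, can be sketched as follows. This is an assumption-laden illustration: pick_writer and the WRITERS mapping are hypothetical and do not reflect which formats Pandas Checks actually supports or how it normalizes the format argument. The method names in the mapping are real Pandas DataFrame export methods.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a Pandas export method name.
WRITERS = {
    '.csv': 'to_csv',
    '.xlsx': 'to_excel',
    '.parquet': 'to_parquet',
    '.feather': 'to_feather',
    '.json': 'to_json',
}


def pick_writer(path, format=None):
    # An explicit format wins; otherwise fall back to the path's extension.
    if format:
        ext = '.' + format.lstrip('.').lower()
    else:
        ext = Path(path).suffix.lower()
    if ext not in WRITERS:
        raise ValueError(f'Unsupported export format: {ext!r}')
    return WRITERS[ext]


print(pick_writer('out/data.csv'))
print(pick_writer('out/data.txt', format='csv'))
```

In this sketch the returned name could then be looked up on the DataFrame with getattr and called with the remaining keyword arguments.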

"},{"location":"API%20reference/SeriesChecks/","title":"Series methods","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks._obj","title":"_obj = pandas_obj instance-attribute","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.__init__","title":"__init__(pandas_obj)","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_data","title":"assert_data(condition, pass_message=' \u2714\ufe0f Assertion passed ', fail_message=' \u3128 Assertion failed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)","text":"

Tests whether Series meets condition. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default condition Callable

Assertion criteria in the form of a lambda function, such as lambda s: s.shape[0]>10.

required pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assertion passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assertion failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError message_shows_condition bool

Whether the fail/pass message should also print the assertion criteria.

True verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.
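The interplay of condition, raise_exception, and exception_to_raise can be sketched in plain Python. This is only an illustration of the control flow described above: check_condition is a hypothetical function, the DataError class here is a local stand-in for the library's exception, and a list stands in for a Series.

```python
class DataError(Exception):
    # Local stand-in for the library's DataError; an assumption for this sketch.
    pass


def check_condition(data, condition, fail_message='Assertion failed',
                    raise_exception=True, exception_to_raise=DataError):
    # Hypothetical helper; returns the data unchanged in every non-raising path.
    if condition(data):
        return data
    if raise_exception:
        raise exception_to_raise(fail_message)
    # With raise_exception=False the failure is only reported.
    print(fail_message)
    return data


values = [1, 2, 3]
print(check_condition(values, lambda s: len(s) > 2))
print(check_condition([1], lambda s: len(s) > 2, raise_exception=False))
```

Returning the input unchanged is what lets such a check sit in the middle of a method chain.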

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_datetime","title":"assert_datetime(pass_message=' \u2714\ufe0f Assert datetime passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is datetime or timestamp. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert datetime passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_float","title":"assert_float(pass_message=' \u2714\ufe0f Assert float passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the values in the Series are floats. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert float passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than","title":"assert_greater_than(min, or_equal_to=True, pass_message=' \u2714\ufe0f Assert minimum passed ', fail_message=' \u3128 Assert minimum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series is > or >= a value. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default min Any

The minimum value to compare Series to. Accepts any type that can be used in >, such as int, float, str, datetime.

required or_equal_to bool

Whether to test for >= min (True) or > min (False).

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert minimum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert minimum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.
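The effect of or_equal_to can be sketched with the standard library. This is a minimal illustration of the comparison semantics only: all_greater_than is a hypothetical function operating on a plain list, and it makes no claim about how Pandas Checks treats null values.

```python
import operator


def all_greater_than(values, min, or_equal_to=True):
    # Hypothetical helper: >= when or_equal_to is True, strict > otherwise.
    compare = operator.ge if or_equal_to else operator.gt
    return all(compare(v, min) for v in values)


print(all_greater_than([1, 2, 3], 1))
print(all_greater_than([1, 2, 3], 1, or_equal_to=False))
print(all_greater_than(['b', 'c'], 'a'))
```

The same comparison works for any mutually comparable type, which matches the note above that min accepts int, float, str, or datetime values.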

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_int","title":"assert_int(pass_message=' \u2714\ufe0f Assert integeer passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the values in the Series are integers. Optionally raises an exception. Does not modify the Series itself.

Parameters:

pass_message: Message to display if the condition passes.

fail_message: Message to display if the condition fails.

raise_exception: Whether to raise an exception if the condition fails.

exception_to_raise: The exception to raise if the condition fails and raise_exception is True.

verbose: Whether to display the pass message if the condition passes.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than","title":"assert_less_than(max, or_equal_to=True, pass_message=' \u2714\ufe0f Assert maximum passed ', fail_message=' \u3128 Assert maximum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series is < or <= a value. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default max Any

The maximum value to compare Series to. Accepts any type that can be used in <, such as int, float, str, datetime.

required or_equal_to bool

Whether to test for <= max (True) or < max (False).

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert maximum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert maximum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative","title":"assert_negative(assert_not_null=True, pass_message=' \u2714\ufe0f Assert negative passed ', fail_message=' \u3128 Assert negative failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all negative values. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert negative passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert negative failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_not_null","title":"assert_not_null(pass_message=' \u2714\ufe0f Assert no nulls passed ', fail_message=' \u3128 Assert no nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has no nulls. Optionally raises an exception. Does not modify the Series itself.

Parameters:

pass_message: Message to display if the condition passes.

fail_message: Message to display if the condition fails.

raise_exception: Whether to raise an exception if the condition fails.

exception_to_raise: The exception to raise if the condition fails and raise_exception is True.

verbose: Whether to display the pass message if the condition passes.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_null","title":"assert_null(pass_message=' \u2714\ufe0f Assert all nulls passed ', fail_message=' \u3128 Assert all nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all nulls. Optionally raises an exception. Does not modify the Series itself.

Parameters:

pass_message: Message to display if the condition passes.

fail_message: Message to display if the condition fails.

raise_exception: Whether to raise an exception if the condition fails.

exception_to_raise: The exception to raise if the condition fails and raise_exception is True.

verbose: Whether to display the pass message if the condition passes.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive","title":"assert_positive(assert_not_null=True, pass_message=' \u2714\ufe0f Assert positive passed ', fail_message=' \u3128 Assert positive failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all positive values. Optionally raises an exception. Does not modify the Series itself.

Args:

assert_not_null: Whether to also enforce that data has no nulls.\npass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_str","title":"assert_str(pass_message=' \u2714\ufe0f Assert string passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the Series is of string type. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_timedelta","title":"assert_timedelta(pass_message=' \u2714\ufe0f Assert timedelta passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is of type timedelta. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type","title":"assert_type(dtype, pass_message=' \u2714\ufe0f Assert type passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series meets type assumption. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default dtype Type[Any]

The required variable type

required pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert type passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique","title":"assert_unique(pass_message=' \u2714\ufe0f Assert unique passed ', fail_message=' \u3128 Assert unique failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has no duplicate rows. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.describe","title":"describe(fn=lambda s: s, check_name='\ud83d\udccf Distribution', **kwargs)","text":"

Displays descriptive statistics about a Series, without modifying the Series itself.

See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas describe(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\udccf Distribution' **kwargs Any

Optional, additional arguments that are accepted by Pandas describe() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the Series itself.

Args: enable_asserts: Optionally, whether to also enable or disable assert statements.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.dtype","title":"dtype(fn=lambda s: s, check_name='\ud83d\uddc2\ufe0f Data type')","text":"

Displays the data type of a Series, without modifying the Series itself.

See Pandas docs for .dtype for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas dtype. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\uddc2\ufe0f Data type'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the Series itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to globally enable or disable calls to .check.assert_data().

True

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.function","title":"function(fn=lambda s: s, check_name=None)","text":"

Applies an arbitrary function on a Series and shows the result, without modifying the Series itself.

Example

.check.function(fn=lambda s: s.shape[0]>10, check_name='Has more than 10 rows?'), which will display 'True' or 'False'.

Parameters:

Name Type Description Default fn Callable

The lambda function to apply to the Series. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

None

Returns:

Type Description Series

The original Series, unchanged.
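
The compute, display, and pass-through behavior described here can be sketched without the library. The helper below is hypothetical and uses a plain list in place of a Series. It only illustrates why the chain continues with the original data rather than with the result of fn.

```python
from typing import Any, Callable, Optional


def check_function(
    data: Any,
    fn: Callable[[Any], Any] = lambda s: s,
    check_name: Optional[str] = None,
) -> Any:
    # Hypothetical stand-in for .check.function(): compute, display, pass through.
    result = fn(data)
    if check_name is not None:
        print(f'{check_name}: {result}')
    else:
        print(result)
    return data  # the caller gets the same object back, not the result of fn


values = [3, 1, 4, 1, 5]
returned = check_function(
    values,
    fn=lambda s: len(s) > 10,
    check_name='More than 10 items?',
)
```

The boolean computed by fn is only displayed. The object that flows on to the next step is the untouched input.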

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.get_mode","title":"get_mode(check_name='\u2699\ufe0f Pandas Checks mode')","text":"

Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the Series itself.

Parameters:

Name Type Description Default check_name Union[str, None]

An optional name for the check. Will be used as a preface to the printed result.

'\u2699\ufe0f Pandas Checks mode'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.head","title":"head(n=5, fn=lambda s: s, check_name=None)","text":"

Displays the first n rows of a Series, without modifying the Series itself.

See Pandas docs for head() for additional usage information.

Parameters:

Name Type Description Default n int

The number of rows to display.

5 fn Callable

An optional lambda function to apply to the Series before running Pandas head(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.hist","title":"hist(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays a histogram for the Series's distribution, without modifying the Series itself.

See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas hist(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas hist() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.info","title":"info(fn=lambda s: s, check_name='\u2139\ufe0f Series info', **kwargs)","text":"

Displays summary information about a Series, without modifying the Series itself.

See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas info(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2139\ufe0f Series info' **kwargs Any

Optional, additional arguments that are accepted by Pandas info() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.memory_usage","title":"memory_usage(fn=lambda s: s, check_name='\ud83d\udcbe Memory usage', **kwargs)","text":"

Displays the memory footprint of a Series, without modifying the Series itself.

See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas memory_usage(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcbe Memory usage' **kwargs Any

Optional, additional arguments that are accepted by Pandas memory_usage() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Include argument deep=True to get further memory usage of object dtypes. See Pandas docs for memory_usage() for more info.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.ndups","title":"ndups(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays the number of duplicated rows in the Series, without modifying the Series itself.

See Pandas docs for duplicated() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting the number of duplicates. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas duplicated() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nnulls","title":"nnulls(fn=lambda s: s, check_name='\ud83d\udc7b Rows with NaNs')","text":"

Displays the number of rows with null values in the Series, without modifying the Series itself.

See Pandas docs for isna() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting rows with nulls. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udc7b Rows with NaNs'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nrows","title":"nrows(fn=lambda s: s, check_name='\u2630 Rows')","text":"

Displays the number of rows in a Series, without modifying the Series itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting the number of rows. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2630 Rows'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nunique","title":"nunique(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays the number of unique rows in a Series, without modifying the Series itself.

See Pandas docs for nunique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas nunique(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas nunique() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.plot","title":"plot(fn=lambda s: s, check_name='', **kwargs)","text":"

Displays a plot of the Series, without modifying the Series itself.

See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas plot(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional title for the plot.

'' **kwargs Any

Optional, additional arguments that are accepted by Pandas plot() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

If you pass a 'title' kwarg, it becomes the plot title, overriding check_name.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.print","title":"print(object=None, fn=lambda s: s, check_name=None, max_rows=10)","text":"

Displays text, another object, or (by default) the current Series's head. Does not modify the Series itself.

Parameters:

Name Type Description Default object Any

Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, print the Series's head (with max_rows rows).

None fn Callable

An optional lambda function to apply to the Series before printing object. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None max_rows int

Maximum number of rows to print if object=None.

10

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Args: start_time: The index time when the stopwatch started, which comes from the Pandas Checks start_timer(). lead_in: Optional text to print before the elapsed time. units: The units in which to display the elapsed time. Can be \"auto\", \"seconds\", \"minutes\", or \"hours\".

Raises:

Type Description ValueError

If units is not one of \"auto\", \"seconds\", \"minutes\", or \"hours\".

Returns:

Type Description Series

The original Series, unchanged.
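
One way the units argument and its ValueError could fit together is sketched below. The helper format_elapsed is hypothetical, and the thresholds used for 'auto' (under 60 seconds, under 3600 seconds, otherwise hours) are an assumption. This reference does not state how the library picks units automatically.

```python
def format_elapsed(
    seconds: float,
    lead_in: str = 'Time elapsed',
    units: str = 'auto',
) -> str:
    # Hypothetical helper, not the library implementation.
    factors = {'seconds': 1.0, 'minutes': 60.0, 'hours': 3600.0}
    if units != 'auto' and units not in factors:
        raise ValueError(f'Unsupported units: {units!r}')
    if units == 'auto':
        # Assumed rule: pick the largest unit that keeps the value at 1 or more.
        if seconds < 60:
            units = 'seconds'
        elif seconds < 3600:
            units = 'minutes'
        else:
            units = 'hours'
    return f'{lead_in}: {seconds / factors[units]:.2f} {units}'


message = format_elapsed(90)
```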

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings. Does not modify the Series itself.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Run pandas_checks.describe_options() to see a list of available options. Does not modify the Series itself.

For example, .check.set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally. Does not modify the Series itself.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to Pandas Checks .check.assert_data() globally.

required

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.shape","title":"shape(fn=lambda s: s, check_name='\ud83d\udcd0 Shape')","text":"

Displays the Series's dimensions, without modifying the Series itself.

See Pandas docs for shape for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas shape. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcd0 Shape'

Returns:

Type Description Series

The original Series, unchanged.

Note

See also .check.nrows()

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.tail","title":"tail(n=5, fn=lambda s: s, check_name=None)","text":"

Displays the last n rows of the Series, without modifying the Series itself.

See Pandas docs for tail() for additional usage information.

Parameters:

Name Type Description Default n int

Number of rows to show.

5 fn Callable

An optional lambda function to apply to the Series before running Pandas tail(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.unique","title":"unique(fn=lambda s: s, check_name=None)","text":"

Displays the unique values in a Series, without modifying the Series itself.

See Pandas docs for unique() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas unique(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.value_counts","title":"value_counts(fn=lambda s: s, max_rows=10, check_name=None, **kwargs)","text":"

Displays the value counts for a Series, without modifying the Series itself.

See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default max_rows int

Maximum number of rows to show in the value counts.

10 fn Callable

An optional lambda function to apply to the Series before running Pandas value_counts(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas value_counts() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.write","title":"write(path, format=None, fn=lambda s: s, verbose=False, **kwargs)","text":"

Exports Series to file, without modifying the Series itself.

Format is inferred from path extension like .csv.

This function uses the corresponding Pandas export function such as to_csv(). See Pandas docs for those functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default path str

Path to write the file to.

required format Union[str, None]

Optional file format to force for the export. If None, format is inferred from the file's extension in path.

None fn Callable

An optional lambda function to apply to the Series before exporting. Example: lambda s: s.dropna().

lambda s: s verbose bool

Whether to print a message when the file is written.

False **kwargs Any

Optional, additional keyword arguments to pass to the Pandas export function (.to_csv).

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.
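
The rule that an explicit format wins, with the file extension as the fallback, can be sketched as follows. The helper infer_export_format is hypothetical, and raising ValueError for a path with no extension is an assumption. This reference does not say how the library handles that case.

```python
from pathlib import Path
from typing import Optional


def infer_export_format(path: str, format: Optional[str] = None) -> str:
    # Hypothetical helper, not the library implementation.
    if format is not None:
        # An explicitly requested format overrides the extension.
        return format.lower().lstrip('.')
    suffix = Path(path).suffix
    if not suffix:
        # Assumed behavior when nothing can be inferred.
        raise ValueError(f'Cannot infer a format from path: {path!r}')
    return suffix.lower().lstrip('.')


inferred = infer_export_format('exports/results.csv')
forced = infer_export_format('exports/results.bin', format='parquet')
```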

"},{"location":"API%20reference/display/","title":"Display","text":"

Utilities for displaying text, tables, and plots in Pandas Checks in both terminal and IPython/Jupyter environments.

"},{"location":"API%20reference/display/#pandas_checks.display._display_check","title":"_display_check(data, name=None)","text":"

Renders the result of a Pandas Checks method.

Parameters:

Name Type Description Default data Any

The data to display.

required name Union[str, None]

The optional name of the check.

None

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_plot","title":"_display_plot()","text":"

Renders the active Pandas Checks matplotlib plot object in an IPython/Jupyter environment with an optional indent.

Returns:

Type Description None

None

Note

It assumes the plot has already been drawn by another function, such as with .plot() or .hist().

"},{"location":"API%20reference/display/#pandas_checks.display._display_plot_title","title":"_display_plot_title(line, lead_in=None, colors={})","text":"

Displays a plot title with optional formatting.

Parameters:

Name Type Description Default line str

The title text to display.

required lead_in Union[str, None]

Optional text to display before the title.

None colors Dict

An optional dictionary containing color settings for the text and lead-in text. See details in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_table","title":"_display_table(table)","text":"

Renders a Pandas DataFrame or Series in an IPython/Jupyter environment with an optional indent.

Parameters:

Name Type Description Default table Union[DataFrame, Series]

The DataFrame or Series to display.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_table_title","title":"_display_table_title(line, lead_in=None, colors={})","text":"

Displays a table title with optional formatting.

Parameters:

Name Type Description Default line str

The title text to display.

required lead_in Union[str, None]

Optional text to display before the title.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See details in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._filter_emojis","title":"_filter_emojis(text)","text":"

Removes emojis from text if user has globally forbidden them.

Parameters:

Name Type Description Default text str

The text to filter emojis from.

required

Returns:

Type Description str

The text with emojis removed if the user's global settings do not allow emojis. Else, the original text.

"},{"location":"API%20reference/display/#pandas_checks.display._format_background_color","title":"_format_background_color(color)","text":"

Applies a background color to text being displayed in the terminal.

Parameters:

Name Type Description Default color str

The background color to format. See syntax in docstring for _render_text().

required

Returns:

Type Description str

The formatted background color.

"},{"location":"API%20reference/display/#pandas_checks.display._lead_in","title":"_lead_in(lead_in, foreground, background)","text":"

Formats a lead-in text with colors.

Parameters:

Name Type Description Default lead_in Union[str, None]

The lead-in text to format.

required foreground str

The foreground color for the lead-in text. See syntax in docstring for _render_text().

required background str

The background color for the lead-in text. See syntax in docstring for _render_text().

required

Returns:

Type Description str

The formatted lead-in text.

"},{"location":"API%20reference/display/#pandas_checks.display._print_table_terminal","title":"_print_table_terminal(table)","text":"

Prints a Pandas table in a terminal with an optional indent.

Parameters:

Name Type Description Default table Union[DataFrame, Series]

A DataFrame or Series.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._render_html_with_indent","title":"_render_html_with_indent(object_as_html)","text":"

Renders HTML with an optional indent.

Parameters:

Name Type Description Default object_as_html str

The HTML to render.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._render_text","title":"_render_text(text, tag, lead_in=None, colors={})","text":"

Renders text with optional formatting.

Parameters:

Name Type Description Default text str

The text to render.

required tag str

The HTML tag to use for rendering.

required lead_in Union[str, None]

Optional text to display before the main text.

None colors Dict

Optional colors for the text and lead-in text. Keys include: - text_color: The foreground color of the main text. - text_background_color: The background or highlight color of the main text. - lead_in_text_color: The foreground color of lead-in text. - lead_in_background_color: The background color of lead-in text. Color values are given as names such as \"blue\" or \"white\". They are passed either to HTML for Jupyter/IPython outputs or to termcolor when code is run in a terminal. For color options when code is run in a terminal, see https://github.com/termcolor/termcolor.

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._warning","title":"_warning(message, lead_in='\ud83d\udc3c\ud83e\ude7a Pandas Checks warning', clean_type=False)","text":"

Displays a warning message.

Parameters:

Name Type Description Default message str

The warning message to display.

required lead_in str

Optional lead-in text to display before the warning message.

'\ud83d\udc3c\ud83e\ude7a Pandas Checks warning' clean_type bool

Optional flag to remove the class type from the message, when running .check.dtype().

False

Returns:

Type Description None

None

"},{"location":"API%20reference/options/","title":"Options","text":"

Utilities for configuring Pandas Checks options.

This module provides functions for setting and managing global options for Pandas Checks, including formatting and disabling checks and assertions.

"},{"location":"API%20reference/options/#pandas_checks.options._initialize_format_options","title":"_initialize_format_options(options=None)","text":"

Initializes or resets Pandas Checks formatting options.

Parameters:

Name Type Description Default options Union[List[str], None]

A list of option names to initialize or reset. If None, all formatting options will be initialized or reset.

None

Returns: None

Note

We separate this function from _initialize_options() so the user can reset just the formatting without changing the mode.

"},{"location":"API%20reference/options/#pandas_checks.options._initialize_options","title":"_initialize_options()","text":"

Initializes (or resets) all Pandas Checks options to their default values.

Returns:

Type Description None

None

Note

We separate this function from _initialize_format_options() so the user can reset just the formatting, if desired, without changing the mode.

"},{"location":"API%20reference/options/#pandas_checks.options._register_option","title":"_register_option(name, default_value, description, validator)","text":"

Registers a Pandas Checks option in the global Pandas context manager.

If the option has already been registered, reset its value.

This method enables setting global formatting for Pandas Checks results and storing variables that will persist across Pandas method chains, which return newly initialized DataFrames at each method (and so reset the DataFrame's attributes).

Parameters:

Name Type Description Default name str

The name of the option to register.

required default_value Any

The default value for the option.

required description str

A description of the option.

required validator Callable

A function to validate the option value.

required

Returns:

Type Description None

None

Note

For more details on the arguments, see the documentation for pandas._config.config.register_option()

"},{"location":"API%20reference/options/#pandas_checks.options._set_option","title":"_set_option(option, value)","text":"

Updates the value of a Pandas Checks option in the global Pandas context manager.

Parameters:

Name Type Description Default option str

The name of the option to set.

required value Any

The value to set for the option.

required

Returns:

Type Description None

None

Raises:

Type Description AttributeError

If the option is not a valid Pandas Checks option.

"},{"location":"API%20reference/options/#pandas_checks.options.describe_options","title":"describe_options()","text":"

Prints all global options for Pandas Checks, their default values, and current values.

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off all calls to Pandas Checks methods and optionally enables or disables check.assert_data(). Does not modify the DataFrame itself.

If this function is called, subsequent calls to .check functions will not be run.

Typically used to 1) globally switch off Pandas Checks, such as during production, or 2) temporarily switch off Pandas Checks, such as for a stable part of a notebook.

Parameters:

Name Type Description Default enable_asserts bool

Whether to also run calls to Pandas Checks .check.assert_data()

True

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Turns on Pandas Checks globally. Subsequent calls to .check methods will be run.

Parameters:

Name Type Description Default enable_asserts bool

Whether to also enable or disable check.assert_data().

True

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/options/#pandas_checks.options.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings.

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Run pandas_checks.describe_options() to see a list of available options.

For example, set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Returns:

Type Description None

None

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}"},{"location":"API%20reference/options/#pandas_checks.options.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to .check.assert_data() globally.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/run_checks/","title":"Run checks","text":"

Utilities for running Pandas Checks data checks.

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._apply_modifications","title":"_apply_modifications(data, fn=lambda df: df, subset=None)","text":"

Applies user's modifications to a data object.

Parameters:

Name Type Description Default data Any

May be any Pandas DataFrame, Series, string, or other variable

required fn Callable

An optional lambda function to modify data

lambda df: df subset Union[str, List, None]

Columns to subset after applying modifications

None

Returns:

Type Description Any

Modified and optionally subsetted data object. If all arguments are defaults, data is returned unchanged.

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._check_data","title":"_check_data(data, check_fn=lambda df: df, modify_fn=lambda df: df, subset=None, check_name=None)","text":"

Runs a selected check on a data object

Parameters:

Name Type Description Default data Any

A Pandas DataFrame, Series, string, or other variable

required check_fn Callable

Function to apply to data for checking. For example, if we're running .check.value_counts(), this function would apply the Pandas value_counts() method.

lambda df: df modify_fn Callable

Optional function to modify data before checking

lambda df: df subset Union[str, List, None]

Optional list of columns or name of column to subset data before running check_fn

None check_name Union[str, None]

Name to use when displaying check result

None

Returns:

Type Description None

None
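
The order of operations described for _apply_modifications() and _check_data() can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: it uses a plain dict of column lists in place of a DataFrame, and the function bodies are hypothetical illustrations, not the pandas_checks source.

```python
def apply_modifications(data, fn=lambda d: d, subset=None):
    # Run the user's modification first, then narrow to the chosen columns.
    data = fn(data)
    if subset is None:
        return data
    columns = [subset] if isinstance(subset, str) else list(subset)
    return {name: data[name] for name in columns}


def check_data(data, check_fn=lambda d: d, modify_fn=lambda d: d,
               subset=None, check_name=None):
    # Modify and subset a temporary view, run the check, display the result.
    result = check_fn(apply_modifications(data, fn=modify_fn, subset=subset))
    if check_name:
        print(check_name)
    print(result)


# Hypothetical stand-in for a small DataFrame.
table = {'species': ['setosa', 'virginica'], 'petal_width': [0.2, 2.1]}

narrowed = apply_modifications(
    table,
    fn=lambda d: {**d, 'petal_width': [w * 2 for w in d['petal_width']]},
    subset='petal_width',
)
check_data(table, check_fn=len, subset=['species'], check_name='Column count')
```

In this model the modification builds a new object, so the input table is left as it was, which mirrors the documented promise that checks do not modify the underlying data.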

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._display_check","title":"_display_check(data, name=None)","text":"

Renders the result of a Pandas Checks method.

Parameters:

Name Type Description Default data Any

The data to display.

required name Union[str, None]

The optional name of the check.

None

Returns:

Type Description None

None

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/timer/","title":"Timer","text":"

Provides a timer utility for tracking the elapsed time of steps within a Pandas method chain.

Note that these functions rely on the pdchecks.enable_checks option being enabled in the Pandas configuration, as it is by default.

"},{"location":"API%20reference/timer/#pandas_checks.timer._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/timer/#pandas_checks.timer.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/timer/#pandas_checks.timer.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='\u23f1\ufe0f Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Parameters:

Name Type Description Default start_time float

The index time when the stopwatch started, which comes from the Pandas Checks start_timer()

required lead_in Union[str, None]

Optional text to print before the elapsed time.

'\u23f1\ufe0f Time elapsed' units str

The units in which to display the elapsed time. Accepted values: - \"auto\" - \"milliseconds\", \"seconds\", \"minutes\", \"hours\" - \"ms\", \"s\", \"m\", \"h\"

'auto'

Returns:

Type Description None

None

Raises:

Type Description ValueError

If units is not one of expected time units

Note

If you change the default values for this function's argument, change them in .check.print_time_elapsed too in DataFrameChecks and SeriesChecks so they're exposed to the user.
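
As a rough illustration of the units argument, the sketch below converts an elapsed time in seconds into the accepted unit names and raises ValueError for anything else. The helper name format_elapsed and the rule used for 'auto' (pick the largest unit that yields a value of at least 1) are assumptions made for this example; the library's actual heuristic may differ.

```python
import time

# Seconds per accepted unit name, including the short aliases.
_SECONDS_PER_UNIT = {
    'milliseconds': 0.001, 'ms': 0.001,
    'seconds': 1, 's': 1,
    'minutes': 60, 'm': 60,
    'hours': 3600, 'h': 3600,
}


def format_elapsed(elapsed_seconds, units='auto'):
    if units == 'auto':
        # Assumed heuristic: largest unit that gives a value >= 1.
        if elapsed_seconds >= 3600:
            units = 'hours'
        elif elapsed_seconds >= 60:
            units = 'minutes'
        elif elapsed_seconds >= 1:
            units = 'seconds'
        else:
            units = 'milliseconds'
    if units not in _SECONDS_PER_UNIT:
        raise ValueError(f'Unexpected units: {units!r}')
    return f'{elapsed_seconds / _SECONDS_PER_UNIT[units]:g} {units}'


start_time = time.time()            # a float timestamp, like start_timer() returns
elapsed = time.time() - start_time  # seconds since the stopwatch started
print(format_elapsed(elapsed))
```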

"},{"location":"API%20reference/timer/#pandas_checks.timer.start_timer","title":"start_timer(verbose=False)","text":"

Starts a Pandas Checks stopwatch to measure run time between operations, such as steps in a Pandas method chain. Use print_time_elapsed() to get timings.

Parameters:

Name Type Description Default verbose bool

Whether to print a message that the timer has started.

False

Returns:

Type Description float

Timestamp as a float

"},{"location":"API%20reference/utils/","title":"Utils","text":"

Utility functions for the pandas_checks package.

"},{"location":"API%20reference/utils/#pandas_checks.utils._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/utils/#pandas_checks.utils._has_nulls","title":"_has_nulls(data, fail_message, raise_exception=True, exception_to_raise=DataError)","text":"

Utility function to check for nulls as part of a larger check

"},{"location":"API%20reference/utils/#pandas_checks.utils._is_type","title":"_is_type(data, dtype)","text":"

Utility function to check if a dataframe's columns or one series has an expected type. Includes special handling for strings, since 'object' type in Pandas may not mean a string

"},{"location":"API%20reference/utils/#pandas_checks.utils._lambda_to_string","title":"_lambda_to_string(lambda_func)","text":"

Create a string representation of a lambda function.

Parameters:

Name Type Description Default lambda_func Callable

An arbitrary function in lambda form

required

Returns:

Type Description str

A string version of lambda_func

Todo

This still returns all arguments to the calling function. They get entangled with the argument when it's a lambda function. Try other ways to get just the argument we want.

"},{"location":"API%20reference/utils/#pandas_checks.utils._series_is_type","title":"_series_is_type(s, dtype)","text":"

Utility function to check if a series has an expected type. Includes special handling for strings, since 'object' type in Pandas may not mean a string

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"About","text":""},{"location":"#introduction","title":"Introduction","text":"

Pandas Checks is a Python library for data science and data engineering. It adds non-invasive health checks for Pandas method chains.

"},{"location":"#what-are-method-chains","title":"What are method chains?","text":"

Method chains are one of the coolest features of the Pandas library! They allow you to write more functional code with fewer intermediate variables and fewer side effects. If you're familiar with R, method chains are Python's version of dplyr pipes.

"},{"location":"#why-use-pandas-checks","title":"Why use Pandas Checks?","text":"

Pandas Checks adds the ability to inspect and validate your Pandas data at any point in the method chain, without modifying the underlying data. Think of Pandas Checks as a drone you can send up to check on your pipeline, whether it's in exploratory data analysis, prototyping, or production.

That way you don't need to chop up a method chain, or create intermediate variables, every time you need to diagnose, treat, or prevent problems with your data processing pipeline.

As Fleetwood Mac says, you would never break the chain.

"},{"location":"#giving-feedback-and-contributing","title":"Giving feedback and contributing","text":"

If you run into trouble or have questions, I'd love to know. Please open an issue.

Contributions are appreciated! Please open an issue or submit a pull request. Pandas Checks uses the wonderful libraries poetry for package and dependency management, nox for test automation, and mkdocs for docs.

"},{"location":"#license","title":"License","text":"

Pandas Checks is licensed under the BSD-3 License.

\ud83d\udc3c\ud83e\ude7a

"},{"location":"usage/","title":"Usage","text":""},{"location":"usage/#installation","title":"Installation","text":"

First make Pandas Checks available in your environment.

pip install pandas-checks\n

Then import it in your code. It works in Jupyter, IPython, and Python scripts run from the command line.

import pandas_checks\n

After importing, you don't need to access the pandas_checks module directly.

\ud83d\udca1 Tip: You can import Pandas Checks either before or after your code imports Pandas. Just somewhere. \ud83d\ude01

"},{"location":"usage/#basic-usage","title":"Basic usage","text":"

Pandas Checks adds .check methods to Pandas DataFrames and Series.

Say you have a nice function.

\ndef clean_iris_data(iris: pd.DataFrame) -> pd.DataFrame:\n    \"\"\"Preprocess data about pretty flowers.\n\n    Args:\n        iris: The raw iris dataset.\n\n    Returns:\n        The cleaned iris dataset.\n    \"\"\"\n\n    return (\n        iris\n        .dropna() # Drop rows with any null values\n        .rename(columns={\"FLOWER_SPECIES\": \"species\"}) # Rename a column\n        .query(\"species=='setosa'\") # Filter to rows with a certain value\n    )\n

But what if you want to make the chain more robust? Or see what's happening to the data as it flows down the pipeline? Or understand why your new iris CSV suddenly makes the cleaned data look weird?

You can add some .check steps.

\n(\n    iris\n    .dropna()\n    .rename(columns={\"FLOWER_SPECIES\": \"species\"})\n\n    # Validate assumptions\n    .check.assert_positive(subset=[\"petal_length\", \"sepal_length\"])\n\n    # Plot the distribution of a column after cleaning\n    .check.hist(column='petal_length') \n\n    .query(\"species=='setosa'\")\n\n    # Display the first few rows after cleaning\n    .check.head(3)  \n)\n

The .check methods will display the following results:

The .check methods didn't modify how the iris data is processed by your code. They just let you check the data as it flows down the pipeline. That's the difference between Pandas .head() and Pandas Checks .check.head().

"},{"location":"usage/#features","title":"Features","text":""},{"location":"usage/#check-methods","title":"Check methods","text":"

Here's what's in the doctor's bag.

  • Describe

    • Standard Pandas methods:
      • .check.columns() - DataFrame | Series
      • .check.dtypes() for DataFrame | .check.dtype() for Series
      • .check.describe() - DataFrame | Series
      • .check.head() - DataFrame | Series
      • .check.info() - DataFrame | Series
      • .check.memory_usage() - DataFrame | Series
      • .check.nunique() - DataFrame | Series
      • .check.shape() - DataFrame | Series
      • .check.tail() - DataFrame | Series
      • .check.unique() - DataFrame | Series
      • .check.value_counts() - DataFrame | Series
    • New functions in Pandas Checks:
      • .check.function(): Apply an arbitrary lambda function to your data and see the result - DataFrame | Series
      • .check.ncols(): Count columns - DataFrame | Series
      • .check.ndups(): Count rows with duplicate values - DataFrame | Series
      • .check.nnulls(): Count rows with null values - DataFrame | Series
      • .check.print(): Print a string, a variable, or the current dataframe - DataFrame | Series

  • Export interim files

    • .check.write(): Export the current data, inferring file format from the name - DataFrame | Series
  • Time your code

    • .check.print_time_elapsed(start_time): Print the execution time since you called start_time = pdc.start_timer() - DataFrame | Series
    • \ud83d\udca1 Tip: You can also use this stopwatch outside a method chain, anywhere in your Python code:

      ```python
      from pandas_checks import print_time_elapsed, start_timer

      start_time = start_timer()
      ...
      print_time_elapsed(start_time)
      ```

  • Turn off Pandas Checks

    • .check.disable_checks(): Don't run checks, for production mode etc. By default, still runs assertions. - DataFrame | Series
    • .check.enable_checks(): Run checks - DataFrame | Series
  • Validate

    • General
      • .check.assert_data(): Check that data passes an arbitrary condition - DataFrame | Series
    • Types
      • .check.assert_datetime() - DataFrame | Series
      • .check.assert_float() - DataFrame | Series
      • .check.assert_int() - DataFrame | Series
      • .check.assert_str() - DataFrame | Series
      • .check.assert_timedelta() - DataFrame | Series
      • .check.assert_type() - DataFrame | Series
    • Values
      • .check.assert_less_than() - DataFrame | Series
      • .check.assert_greater_than() - DataFrame | Series
      • .check.assert_negative() - DataFrame | Series
      • .check.assert_not_null() - DataFrame | Series
      • .check.assert_null() - DataFrame | Series
      • .check.assert_positive() - DataFrame | Series
      • .check.assert_unique() - DataFrame | Series
  • Visualize

    • .check.hist(): A histogram - DataFrame | Series
    • .check.plot(): An arbitrary plot you can customize - DataFrame | Series
"},{"location":"usage/#customizing-a-check","title":"Customizing a check","text":"

You can use Pandas Checks methods like the regular Pandas methods. They accept the same arguments. For example, you can pass:

  • .check.head(7)
  • .check.value_counts(column=\"species\", dropna=False, normalize=True)
  • .check.plot(kind=\"scatter\", x=\"sepal_width\", y=\"sepal_length\")

Also, most Pandas Checks methods accept 3 additional arguments:

  1. check_name: text to display before the result of the check
  2. fn: a lambda function that modifies the data displayed by the check
  3. subset: limit a check to certain columns

(\n    iris\n    .check.value_counts(column='species', check_name=\"Varieties after data cleaning\")\n    .assign(species=lambda df: df[\"species\"].str.upper()) # Do your regular Pandas data processing, like upper-casing the values in one column\n    .check.head(n=2, fn=lambda df: df[\"petal_width\"]*2) # Modify the data that gets displayed in the check only\n    .check.describe(subset=['sepal_width', 'sepal_length'])  # Only apply the check to certain columns\n)\n

"},{"location":"usage/#configuring-pandas-check","title":"Configuring Pandas Check","text":""},{"location":"usage/#global-configuration","title":"Global configuration","text":"

You can change how Pandas Checks works everywhere. For example:

import pandas_checks as pdc\n\n# Set output precision and turn off the cute emojis\npdc.set_format(precision=3, use_emojis=False)\n\n# Don't run any of the calls to Pandas Checks, globally. \npdc.disable_checks()\n

Run pdc.describe_options() to see the arguments you can pass to .set_format().

\ud83d\udca1 Tip: By default, disable_checks() and enable_checks() do not change whether Pandas Checks will run assertion methods (.check.assert_*).

To turn off assertions too, add the argument enable_asserts=False, such as: disable_checks(enable_asserts=False).
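
The interaction between the two switches can be pictured with a small stand-alone model that follows the documented signatures of set_mode(), get_mode(), disable_checks(), and enable_checks(). It is a sketch only: the dictionary keys and function bodies are assumptions for illustration and are not taken from the pandas_checks source.

```python
# Stand-alone model of the two global switches. The key names
# 'enable_checks' and 'enable_asserts' are assumed for illustration.
_mode = {'enable_checks': True, 'enable_asserts': True}


def set_mode(enable_checks, enable_asserts):
    # Set both switches at once.
    _mode['enable_checks'] = enable_checks
    _mode['enable_asserts'] = enable_asserts


def get_mode():
    # Return a copy so callers cannot mutate the global state by accident.
    return dict(_mode)


def disable_checks(enable_asserts=True):
    # Checks go off; assertions follow the argument (on by default).
    set_mode(enable_checks=False, enable_asserts=enable_asserts)


def enable_checks(enable_asserts=True):
    # Checks go on; assertions follow the argument (on by default).
    set_mode(enable_checks=True, enable_asserts=enable_asserts)


disable_checks()
mode_default = get_mode()   # checks off, assertions still on
disable_checks(enable_asserts=False)
mode_silent = get_mode()    # checks off, assertions off too
enable_checks()             # back to the starting state
```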

"},{"location":"usage/#local-configuration","title":"Local configuration","text":"

You can also adjust settings within a method chain by bookending the chain, like this:

# Customize format during one method chain\n(\n    iris\n    .check.set_format(precision=7, use_emojis=False)\n    ... # Any .check methods in here will use the new format\n    .check.reset_format() # Restore default format\n)\n\n# Turn off Pandas Checks during one method chain\n(\n    iris\n    .check.disable_checks()\n    ... # Any .check methods in here will not be run\n    .check.enable_checks() # Turn it back on for the next code\n)\n
"},{"location":"usage/#hybrid-eda-production-data-processing","title":"Hybrid EDA-Production data processing","text":"

Exploratory Data Analysis is often taught as a one-time step we do to plan our production data processing. But sometimes EDA is a cyclical process we go back to for deeper inspection during debugging, code edits, or changes in the input data. If explorations were useful in EDA, they may be useful again.

Unfortunately, it's hard to go back to EDA. It's too out of sync. The prod data processing pipeline has usually evolved too much, making the EDA code a historical artifact full of cobwebs that we can't easily fire up again.

But if you use Pandas Checks during EDA, you could roll your .check methods into your first production code. Then in prod mode, disable Pandas Checks when you don't need it, to save compute and streamline output. Whenever you need to pull out those EDA tools, enable Pandas Checks globally or locally.

This can make your prod pipeline more transparent and easier to inspect.

"},{"location":"API%20reference/DataFrameChecks/","title":"DataFrame methods","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks._obj","title":"_obj = pandas_obj instance-attribute","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.__init__","title":"__init__(pandas_obj)","text":""},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_data","title":"assert_data(condition, subset=None, pass_message=' \u2714\ufe0f Assertion passed ', fail_message=' \u3128 Assertion failed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)","text":"

Tests whether the DataFrame meets a condition. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default condition Callable

Assertion criteria in the form of a lambda function, such as lambda df: df.shape[0]>10.

required subset Union[str, List, None]

Optional, which column or columns to check the condition against. Applied after fn. Subsetting can also be done within the condition, such as lambda df: df['column_name'].sum()>10

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assertion passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assertion failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError message_shows_condition bool

Whether the fail/pass message should also print the assertion criteria

True verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
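
The pass-through behaviour described above (evaluate a condition, optionally raise, always return the original object) can be sketched in a few lines. This is a simplified stand-in, not the library's implementation: it works on any Python object, defines its own DataError class in place of the documented default exception, and omits the pass message, subset, and verbose handling.

```python
class DataError(Exception):
    # Local stand-in for the documented default exception type.
    pass


def assert_data(data, condition, fail_message='Assertion failed',
                raise_exception=True, exception_to_raise=DataError):
    # Evaluate the condition; on failure either raise or just report.
    if not condition(data):
        if raise_exception:
            raise exception_to_raise(fail_message)
        print(fail_message)
    # Always hand back the same object so a method chain can continue.
    return data


rows = [{'petal_length': 1.4}, {'petal_length': 1.3}]

same = assert_data(rows, lambda d: len(d) > 1)   # passes silently

try:
    assert_data(rows, lambda d: len(d) > 10)     # fails and raises
    raised = False
except DataError:
    raised = True

# With raise_exception=False the failure is only reported.
soft = assert_data(rows, lambda d: len(d) > 10, raise_exception=False)
```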

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_datetime","title":"assert_datetime(subset=None, pass_message=' \u2714\ufe0f Assert datetime passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns holds datetime or timestamp values. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert datetime passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_float","title":"assert_float(subset=None, pass_message=' \u2714\ufe0f Assert float passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only floats. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert float passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_greater_than","title":"assert_greater_than(min, or_equal_to=True, subset=None, pass_message=' \u2714\ufe0f Assert minimum passed ', fail_message=' \u3128 Assert minimum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns is > or >= a value. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default min Any

the minimum value to compare DataFrame to. Accepts any type that can be used in >, such as int, float, str, datetime

required or_equal_to bool

whether to test for >= min (True) or > min (False)

True subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert minimum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert minimum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_int","title":"assert_int(subset=None, pass_message=' \u2714\ufe0f Assert integeer passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only integers. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert integeer passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_less_than","title":"assert_less_than(max, or_equal_to=True, subset=None, pass_message=' \u2714\ufe0f Assert maximum passed ', fail_message=' \u3128 Assert maximum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns is < or <= a value. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default max Any

the max value to compare DataFrame to. Accepts any type that can be used in <, such as int, float, str, datetime

required or_equal_to bool

whether to test for <= max (True) or < max (False)

True subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert maximum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert maximum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
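
The or_equal_to switch in assert_less_than() and assert_greater_than() simply chooses between an inclusive and a strict comparison. A minimal sketch, assuming plain Python lists in place of DataFrame columns and using hypothetical helper names:

```python
import operator


def all_less_than(values, max_value, or_equal_to=True):
    # Pick <= or < depending on or_equal_to, then test every value.
    compare = operator.le if or_equal_to else operator.lt
    return all(compare(v, max_value) for v in values)


def all_greater_than(values, min_value, or_equal_to=True):
    # Pick >= or > depending on or_equal_to, then test every value.
    compare = operator.ge if or_equal_to else operator.gt
    return all(compare(v, min_value) for v in values)


inclusive = all_less_than([1, 2, 3], 3)                   # 3 <= 3 is allowed
strict = all_less_than([1, 2, 3], 3, or_equal_to=False)   # 3 < 3 is not
```

Because the comparison operators work on any orderable type, the same pattern covers the strings and datetimes mentioned in the parameter descriptions.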

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_negative","title":"assert_negative(subset=None, assert_not_null=True, pass_message=' \u2714\ufe0f Assert negative passed ', fail_message=' \u3128 Assert negative failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has all negative values. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert negative passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert negative failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_not_null","title":"assert_not_null(subset=None, pass_message=' \u2714\ufe0f Assert no nulls passed ', fail_message=' \u3128 Assert no nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has no nulls. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert no nulls passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert no nulls failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_null","title":"assert_null(subset=None, pass_message=' \u2714\ufe0f Assert all nulls passed ', fail_message=' \u3128 Assert all nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has all nulls. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert all nulls passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert all nulls failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_positive","title":"assert_positive(subset=None, assert_not_null=True, pass_message=' \u2714\ufe0f Assert positive passed ', fail_message=' \u3128 Assert positive failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only positive values. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert positive passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert positive failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_str","title":"assert_str(subset=None, pass_message=' \u2714\ufe0f Assert string passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns contains only strings. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert string passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_timedelta","title":"assert_timedelta(subset=None, pass_message=' \u2714\ufe0f Assert timedelta passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns is of type timedelta. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert timedelta passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_type","title":"assert_type(dtype, subset=None, pass_message=' \u2714\ufe0f Assert type passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns meets a type assumption. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default dtype Type[Any]

The required variable type

required subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert type passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_unique","title":"assert_unique(subset=None, pass_message=' \u2714\ufe0f Assert unique passed ', fail_message=' \u3128 Assert unique failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether the DataFrame or a subset of its columns has no duplicate rows. Optionally raises an exception. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default subset Union[str, List, None]

Optional, which column or columns to check the condition against.

None pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert unique passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert unique failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.columns","title":"columns(fn=lambda df: df, subset=None, check_name='\ud83c\udfdb\ufe0f Columns')","text":"

Prints the column names of a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before printing columns. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before printing their names. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83c\udfdb\ufe0f Columns'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.describe","title":"describe(fn=lambda df: df, subset=None, check_name='\ud83d\udccf Distributions', **kwargs)","text":"

Displays descriptive statistics about a DataFrame without modifying the DataFrame itself.

See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas describe(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas describe(). Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\udccf Distributions' **kwargs Any

Optional, additional arguments that are accepted by Pandas describe() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to also enable or disable assert statements.

True

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.dtypes","title":"dtypes(fn=lambda df: df, subset=None, check_name='\ud83d\uddc2\ufe0f Data types')","text":"

Displays the data types of a DataFrame's columns without modifying the DataFrame itself.

See Pandas docs for dtypes for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas dtypes. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas .dtypes. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\uddc2\ufe0f Data types'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to globally enable or disable calls to .check.assert_data().

True

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.function","title":"function(fn=lambda df: df, subset=None, check_name=None)","text":"

Applies an arbitrary function on a DataFrame and shows the result, without modifying the DataFrame itself.

Example

.check.function(fn=lambda df: df.shape[0]>10, check_name='Has at least 10 rows?') which will result in 'True' or 'False'

Parameters:

Name Type Description Default fn Callable

A lambda function to apply to the DataFrame. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns. Applied after fn.

None check_name Union[str, None]

An optional name for the check to preface the result with.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.get_mode","title":"get_mode(check_name='\ud83d\udc3c\ud83e\ude7a Pandas Checks mode')","text":"

Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default check_name Union[str, None]

An optional name for the check. Will be used as a preface to the printed result.

'\ud83d\udc3c\ud83e\ude7a Pandas Checks mode'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.head","title":"head(n=5, fn=lambda df: df, subset=None, check_name=None)","text":"

Displays the first n rows of a DataFrame, without modifying the DataFrame itself.

See Pandas docs for head() for additional usage information.

Parameters:

Name Type Description Default n int

The number of rows to display.

5 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas head(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas head(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.hist","title":"hist(fn=lambda df: df, subset=[], check_name=None, **kwargs)","text":"

Displays a histogram for the DataFrame, without modifying the DataFrame itself.

See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas hist(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas hist(). Applied after fn.

[] check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas hist() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

If more than one column is passed, displays a grid of histograms.

Only renders in interactive mode (IPython/Jupyter), not in a terminal.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.info","title":"info(fn=lambda df: df, subset=None, check_name='\u2139\ufe0f Info', **kwargs)","text":"

Displays summary information about a DataFrame, without modifying the DataFrame itself.

See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas info(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas info(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2139\ufe0f Info' **kwargs Any

Optional, additional arguments that are accepted by Pandas info() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.memory_usage","title":"memory_usage(fn=lambda df: df, subset=None, check_name='\ud83d\udcbe Memory usage', **kwargs)","text":"

Displays the memory footprint of a DataFrame, without modifying the DataFrame itself.

See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas memory_usage(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before running Pandas memory_usage(). Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcbe Memory usage' **kwargs Any

Optional, additional arguments that are accepted by Pandas memory_usage() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Include argument deep=True to get further memory usage of object dtypes in the DataFrame. See Pandas docs for memory_usage() for more info.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.ncols","title":"ncols(fn=lambda df: df, subset=None, check_name='\ud83c\udfdb\ufe0f Columns')","text":"

Displays the number of columns in a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of columns. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting the number of columns. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83c\udfdb\ufe0f Columns'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.ndups","title":"ndups(fn=lambda df: df, subset=None, check_name=None, **kwargs)","text":"

Displays the number of duplicated rows in a DataFrame, without modifying the DataFrame itself.

See Pandas docs for duplicated() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of duplicates. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting duplicate rows. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas duplicated() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nnulls","title":"nnulls(fn=lambda df: df, subset=None, by_column=True, check_name='\ud83d\udc7b Rows with NaNs')","text":"

Displays the number of rows with null values in a DataFrame, without modifying the DataFrame itself.

See Pandas docs for isna() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of rows with a null. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string to select a subset of columns before counting nulls. Applied after fn.

None by_column bool

If True, count null values within each column separately. If False, count rows with a null value in any column.

True check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udc7b Rows with NaNs'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nrows","title":"nrows(fn=lambda df: df, subset=None, check_name='\u2630 Rows')","text":"

Displays the number of rows in a DataFrame, without modifying the DataFrame itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before counting the number of rows. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are considered when counting rows. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2630 Rows'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.nunique","title":"nunique(column, fn=lambda df: df, check_name=None, **kwargs)","text":"

Displays the number of unique values in a single column, without modifying the DataFrame itself.

See Pandas docs for nunique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

The name of a column to count uniques in. Applied after fn.

required fn Callable

An optional lambda function to apply to the DataFrame before running Pandas nunique(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas nunique() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.plot","title":"plot(fn=lambda df: df, subset=None, check_name='', **kwargs)","text":"

Displays a plot of the DataFrame, without modifying the DataFrame itself.

See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas plot(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are plotted. Applied after fn.

None check_name Union[str, None]

An optional title for the plot.

'' **kwargs Any

Optional, additional arguments that are accepted by Pandas plot() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

If you pass a 'title' kwarg, it becomes the plot title, overriding check_name.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.print","title":"print(object=None, fn=lambda df: df, subset=None, check_name=None, max_rows=10)","text":"

Displays text, another object, or (by default) the current DataFrame's head. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default object Any

Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, print the DataFrame's head (with max_rows rows).

None fn Callable

An optional lambda function to apply to the DataFrame before printing object. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are printed. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None max_rows int

Maximum number of rows to print if object=None.

10

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Parameters:

Name Type Description Default start_time float

The time at which the stopwatch started, as returned by the Pandas Checks start_timer().

required lead_in Union[str, None]

Optional text to print before the elapsed time.

'Time elapsed' units str

The units in which to display the elapsed time. Can be \"auto\", \"seconds\", \"minutes\", or \"hours\".

'auto'

Raises:

Type Description ValueError

If units is not one of \"auto\", \"seconds\", \"minutes\", or \"hours\".

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
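
As a rough illustration of how the units argument could behave, here is a minimal standard-library sketch. The helper name format_elapsed, the thresholds used for 'auto', and the output format are all assumptions made for this example; they are not taken from the Pandas Checks source.

```python
import time

VALID_UNITS = ('auto', 'seconds', 'minutes', 'hours')


def format_elapsed(elapsed_seconds, lead_in='Time elapsed', units='auto'):
    # Hypothetical helper, not part of Pandas Checks.
    if units not in VALID_UNITS:
        raise ValueError(f'units must be one of {VALID_UNITS}, got {units!r}')
    if units == 'auto':
        # Assumed thresholds: under a minute -> seconds, under an hour -> minutes.
        if elapsed_seconds < 60:
            units = 'seconds'
        elif elapsed_seconds < 3600:
            units = 'minutes'
        else:
            units = 'hours'
    divisor = {'seconds': 1, 'minutes': 60, 'hours': 3600}[units]
    value = elapsed_seconds / divisor
    # Skip the prefix entirely when lead_in is None or empty.
    prefix = f'{lead_in}: ' if lead_in else ''
    return f'{prefix}{value:.2f} {units}'


start_time = time.time()
print(format_elapsed(time.time() - start_time))
print(format_elapsed(90))
```

An unrecognized value such as units='days' raises ValueError in this sketch, mirroring the Raises section above.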

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings. Does not modify the DataFrame itself.

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Does not modify the DataFrame itself.

Run pandas_checks.describe_options() to see a list of available options.

For example, .check.set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally. Does not modify the DataFrame itself.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to Pandas Checks .check.assert_data() statements globally.

required

Returns:

Type Description DataFrame

The original DataFrame, unchanged.
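
The relationship between set_mode, enable_checks, disable_checks, and get_mode can be pictured as two global flags. The sketch below is a standalone model of that idea, assuming a module-level dictionary named _mode; Pandas Checks may store and apply its options differently, and the real methods also return the DataFrame so they can sit in a method chain.

```python
# Hypothetical module-level state, used only for this illustration.
_mode = {'enable_checks': True, 'enable_asserts': True}


def set_mode(enable_checks, enable_asserts):
    # Set both flags explicitly.
    _mode['enable_checks'] = enable_checks
    _mode['enable_asserts'] = enable_asserts


def enable_checks(enable_asserts=True):
    # Turn checks on; asserts follow the argument.
    set_mode(True, enable_asserts)


def disable_checks(enable_asserts=True):
    # Turn checks off; asserts stay on unless told otherwise.
    set_mode(False, enable_asserts)


def get_mode():
    # Return a copy so callers cannot mutate the global state by accident.
    return dict(_mode)


disable_checks()  # for example in production: skip displays, keep assertions
print(get_mode())
```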

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.shape","title":"shape(fn=lambda df: df, subset=None, check_name='\ud83d\udcd0 Shape')","text":"

Displays the DataFrame's dimensions, without modifying the DataFrame itself.

See Pandas docs for shape for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the DataFrame before running Pandas shape. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are considered when printing the shape. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcd0 Shape'

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

See also .check.nrows() and .check.ncols()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.tail","title":"tail(n=5, fn=lambda df: df, subset=None, check_name=None)","text":"

Displays the last n rows of the DataFrame, without modifying the DataFrame itself.

See Pandas docs for tail() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default n int

Number of rows to show.

5 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas tail(). Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are displayed. Applied after fn.

None check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.unique","title":"unique(column, fn=lambda df: df, check_name=None)","text":"

Displays the unique values in a column, without modifying the DataFrame itself.

See Pandas docs for unique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

Column to check for unique values.

required fn Callable

An optional lambda function to apply to the DataFrame before calling Pandas unique(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

fn is applied to the dataframe before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, i.e. fn=lambda df: df[\"my_column\"].stuff()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.value_counts","title":"value_counts(column, fn=lambda df: df, max_rows=10, check_name=None, **kwargs)","text":"

Displays the value counts for a column, without modifying the DataFrame itself.

See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default column str

Column to check for value counts.

required max_rows int

Maximum number of rows to show in the value counts.

10 fn Callable

An optional lambda function to apply to the DataFrame before running Pandas value_counts(). Example: lambda df: df.shape[0]>10. Applied before column is selected.

lambda df: df check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas value_counts() method.

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

fn is applied to the dataframe before selecting column. If you want to select the column before modifying it, set column=None and start fn with a column selection, i.e. fn=lambda df: df[\"my_column\"].stuff()

"},{"location":"API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.write","title":"write(path, format=None, fn=lambda df: df, subset=None, verbose=False, **kwargs)","text":"

Exports DataFrame to file, without modifying the DataFrame itself.

Format is inferred from path extension like .csv.

This function uses the corresponding Pandas export function such as to_csv(). See Pandas docs for those functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default path str

Path to write the file to.

required format Union[str, None]

Optional file format to force for the export. If None, format is inferred from the file's extension in path.

None fn Callable

An optional lambda function to apply to the DataFrame before exporting. Example: lambda df: df.shape[0]>10. Applied before subset.

lambda df: df subset Union[str, List, None]

An optional list of column names or a string name of one column to limit which columns are exported. Applied after fn.

None verbose bool

Whether to print a message when the file is written.

False **kwargs Any

Optional, additional keyword arguments to pass to the Pandas export function (such as to_csv()).

{}

Returns:

Type Description DataFrame

The original DataFrame, unchanged.

Note

Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.
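
To make the interplay of path and format concrete, the following sketch shows one way an export method could be chosen. The function pick_exporter and the EXPORTERS mapping are hypothetical names introduced for this example, and the set of supported extensions is an assumption; only the pandas method names themselves (to_csv, to_excel, and so on) are real.

```python
from pathlib import Path

# Assumed mapping from file extension to a pandas export method name.
EXPORTERS = {
    'csv': 'to_csv',
    'xlsx': 'to_excel',
    'feather': 'to_feather',
    'parquet': 'to_parquet',
    'json': 'to_json',
}


def pick_exporter(path, format=None):
    # An explicit format wins; otherwise fall back to the extension in path.
    fmt = (format or Path(path).suffix).lower().lstrip('.')
    if fmt not in EXPORTERS:
        raise ValueError(f'Unsupported export format: {fmt!r}')
    return EXPORTERS[fmt]


print(pick_exporter('output/data.csv'))
print(pick_exporter('output/data.txt', format='parquet'))
```

With a DataFrame df in hand, the chosen name could then be called as getattr(df, pick_exporter(path))(path, **kwargs).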

"},{"location":"API%20reference/SeriesChecks/","title":"Series methods","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks._obj","title":"_obj = pandas_obj instance-attribute","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.__init__","title":"__init__(pandas_obj)","text":""},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_data","title":"assert_data(condition, pass_message=' \u2714\ufe0f Assertion passed ', fail_message=' \u3128 Assertion failed ', raise_exception=True, exception_to_raise=DataError, message_shows_condition=True, verbose=False)","text":"

Tests whether Series meets condition. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default condition Callable

Assertion criteria in the form of a lambda function, such as lambda s: s.shape[0]>10.

required pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assertion passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assertion failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError message_shows_condition bool

Whether the pass/fail message should also print the assertion criteria.

True verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.
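The pass-through behaviour described above can be sketched in plain Python. This is a minimal stand-in, not the library's implementation: the helper name `assert_data_sketch`, the use of a list in place of a Series, and `ValueError` in place of `DataError` are all assumptions made for illustration. Only the parameter names come from the signature above.

```python
from typing import Any, Callable, Type


def assert_data_sketch(
    data: Any,
    condition: Callable[[Any], bool],
    fail_message: str = 'Assertion failed',
    raise_exception: bool = True,
    exception_to_raise: Type[BaseException] = ValueError,
) -> Any:
    # Hypothetical helper: evaluate the condition without touching the data.
    if not condition(data):
        if raise_exception:
            raise exception_to_raise(fail_message)
        # With raise_exception=False the failure is only reported.
        print(fail_message)
    # Always hand back the same object so a method chain can continue.
    return data


values = [3, 1, 4, 1, 5]  # a list stands in for a Series
result = assert_data_sketch(values, lambda s: len(s) > 3)
```

Because the same object is returned whether or not the check passes quietly, the call can sit in the middle of a chain without changing what flows through it.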

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_datetime","title":"assert_datetime(pass_message=' \u2714\ufe0f Assert datetime passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is datetime or timestamp. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert datetime passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_float","title":"assert_float(pass_message=' \u2714\ufe0f Assert float passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is floats. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert float passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_greater_than","title":"assert_greater_than(min, or_equal_to=True, pass_message=' \u2714\ufe0f Assert minimum passed ', fail_message=' \u3128 Assert minimum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series is > or >= a value. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default min Any

the minimum value to compare Series to. Accepts any type that can be used in >, such as int, float, str, datetime

required or_equal_to bool

whether to test for >= min (True) or > min (False)

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert minimum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert minimum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.
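The effect of or_equal_to can be illustrated with a small plain-Python sketch. The helper `all_greater_than` is hypothetical and operates on a list rather than a Series. It only mirrors the comparison rule described above and is not the library's code.

```python
from typing import Any, Iterable


def all_greater_than(values: Iterable[Any], min: Any, or_equal_to: bool = True) -> bool:
    # Hypothetical helper; the parameter name min mirrors the signature above
    # and shadows the builtin only inside this function.
    if or_equal_to:
        return all(v >= min for v in values)
    return all(v > min for v in values)


print(all_greater_than([2, 3, 5], min=2))                     # True
print(all_greater_than([2, 3, 5], min=2, or_equal_to=False))  # False
print(all_greater_than(['b', 'c'], min='a'))                  # True
```

The last call shows that any type supporting the comparison operators, such as str, can serve as the bound.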

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_int","title":"assert_int(pass_message=' \u2714\ufe0f Assert integeer passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is integers. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_less_than","title":"assert_less_than(max, or_equal_to=True, pass_message=' \u2714\ufe0f Assert maximum passed ', fail_message=' \u3128 Assert maximum failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series is < or <= a value. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default max Any

the max value to compare Series to. Accepts any type that can be used in <, such as int, float, str, datetime

required or_equal_to bool

whether to test for <= max (True) or < max (False)

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert maximum passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert maximum failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_negative","title":"assert_negative(assert_not_null=True, pass_message=' \u2714\ufe0f Assert negative passed ', fail_message=' \u3128 Assert negative failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all negative values. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default assert_not_null bool

Whether to also enforce that data has no nulls.

True pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert negative passed ' fail_message str

Message to display if the condition fails.

' \u3128 Assert negative failed ' raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

DataError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_not_null","title":"assert_not_null(pass_message=' \u2714\ufe0f Assert no nulls passed ', fail_message=' \u3128 Assert no nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has no nulls. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_null","title":"assert_null(pass_message=' \u2714\ufe0f Assert all nulls passed ', fail_message=' \u3128 Assert all nulls failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all nulls. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_positive","title":"assert_positive(assert_not_null=True, pass_message=' \u2714\ufe0f Assert positive passed ', fail_message=' \u3128 Assert positive failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has all positive values. Optionally raises an exception. Does not modify the Series itself.

Args:

assert_not_null: Whether to also enforce that data has no nulls.\npass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_str","title":"assert_str(pass_message=' \u2714\ufe0f Assert string passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is strings. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_timedelta","title":"assert_timedelta(pass_message=' \u2714\ufe0f Assert timedelta passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series is of type timedelta. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_type","title":"assert_type(dtype, pass_message=' \u2714\ufe0f Assert type passed ', fail_message=None, raise_exception=True, exception_to_raise=TypeError, verbose=False)","text":"

Tests whether Series meets type assumption. Optionally raises an exception. Does not modify the Series itself.

Parameters:

Name Type Description Default dtype Type[Any]

The required variable type

required pass_message str

Message to display if the condition passes.

' \u2714\ufe0f Assert type passed ' fail_message Union[str, None]

Message to display if the condition fails.

None raise_exception bool

Whether to raise an exception if the condition fails.

True exception_to_raise Type[BaseException]

The exception to raise if the condition fails and raise_exception is True.

TypeError verbose bool

Whether to display the pass message if the condition passes.

False

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_unique","title":"assert_unique(pass_message=' \u2714\ufe0f Assert unique passed ', fail_message=' \u3128 Assert unique failed ', raise_exception=True, exception_to_raise=DataError, verbose=False)","text":"

Tests whether Series has no duplicate rows. Optionally raises an exception. Does not modify the Series itself.

Args:

pass_message: Message to display if the condition passes.\nfail_message: Message to display if the condition fails.\nraise_exception: Whether to raise an exception if the condition fails.\nexception_to_raise: The exception to raise if the condition fails and raise_exception is True.\nverbose: Whether to display the pass message if the condition passes.\n

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.describe","title":"describe(fn=lambda s: s, check_name='\ud83d\udccf Distribution', **kwargs)","text":"

Displays descriptive statistics about a Series, without modifying the Series itself.

See Pandas docs for describe() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas describe(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\udccf Distribution' **kwargs Any

Optional, additional arguments that are accepted by Pandas describe() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off Pandas Checks globally, such as in production mode. Calls to .check functions will not be run. Does not modify the Series itself.

Args: enable_asserts: Optionally, whether to also enable or disable assert statements.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.dtype","title":"dtype(fn=lambda s: s, check_name='\ud83d\uddc2\ufe0f Data type')","text":"

Displays the data type of a Series, without modifying the Series itself.

See Pandas docs for .dtype for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas dtype. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

'\ud83d\uddc2\ufe0f Data type'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Globally enables Pandas Checks. Subsequent calls to .check methods will be run. Does not modify the Series itself.

Parameters:

Name Type Description Default enable_asserts bool

Optionally, whether to globally enable or disable calls to .check.assert_data().

True

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.function","title":"function(fn=lambda s: s, check_name=None)","text":"

Applies an arbitrary function on a Series and shows the result, without modifying the Series itself.

Example

.check.function(fn=lambda s: s.shape[0]>10, check_name='Has at least 10 rows?') which will result in 'True' or 'False'

Parameters:

Name Type Description Default fn Callable

The lambda function to apply to the Series. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check to preface the result with.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.get_mode","title":"get_mode(check_name='\u2699\ufe0f Pandas Checks mode')","text":"

Displays the current values of Pandas Checks global options enable_checks and enable_asserts. Does not modify the Series itself.

Parameters:

Name Type Description Default check_name Union[str, None]

An optional name for the check. Will be used as a preface to the printed result.

'\u2699\ufe0f Pandas Checks mode'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.head","title":"head(n=5, fn=lambda s: s, check_name=None)","text":"

Displays the first n rows of a Series, without modifying the Series itself.

See Pandas docs for head() for additional usage information.

Parameters:

Name Type Description Default n int

The number of rows to display.

5 fn Callable

An optional lambda function to apply to the Series before running Pandas head(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.
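The role of fn can be sketched as follows: the function is applied only to the copy that gets displayed, while the original object is returned untouched. The helper `head_check` is hypothetical and uses a list instead of a Series. The list comprehension stands in for s.dropna().

```python
from typing import Any, Callable, List, Optional


def head_check(
    data: List[Any],
    n: int = 5,
    fn: Callable[[List[Any]], List[Any]] = lambda s: s,
    check_name: Optional[str] = None,
) -> List[Any]:
    # Hypothetical helper: transform a view for display, then slice it.
    preview = fn(data)[:n]
    if check_name is not None:
        print(check_name)
    print(preview)
    # The untransformed input is what continues down the chain.
    return data


raw = [1, None, 3, None, 5, 6]
same = head_check(raw, n=2, fn=lambda s: [v for v in s if v is not None])
```

Here the preview shows the first two non-null values, yet `same` is still the full list including the None entries.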

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.hist","title":"hist(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays a histogram for the Series's distribution, without modifying the Series itself.

See Pandas docs for hist() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas hist(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas hist() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.info","title":"info(fn=lambda s: s, check_name='\u2139\ufe0f Series info', **kwargs)","text":"

Displays summary information about a Series, without modifying the Series itself.

See Pandas docs for info() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas info(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2139\ufe0f Series info' **kwargs Any

Optional, additional arguments that are accepted by Pandas info() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.memory_usage","title":"memory_usage(fn=lambda s: s, check_name='\ud83d\udcbe Memory usage', **kwargs)","text":"

Displays the memory footprint of a Series, without modifying the Series itself.

See Pandas docs for memory_usage() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas memory_usage(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcbe Memory usage' **kwargs Any

Optional, additional arguments that are accepted by Pandas memory_usage() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Include argument deep=True to get further memory usage of object dtypes. See Pandas docs for memory_usage() for more info.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.ndups","title":"ndups(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays the number of duplicated rows in the Series, without modifying the Series itself.

See Pandas docs for duplicated() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting the number of duplicates. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas duplicated() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nnulls","title":"nnulls(fn=lambda s: s, check_name='\ud83d\udc7b Rows with NaNs')","text":"

Displays the number of rows with null values in the Series, without modifying the Series itself.

See Pandas docs for isna() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting rows with nulls. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udc7b Rows with NaNs'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nrows","title":"nrows(fn=lambda s: s, check_name='\u2630 Rows')","text":"

Displays the number of rows in a Series, without modifying the Series itself.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before counting the number of rows. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\u2630 Rows'

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.nunique","title":"nunique(fn=lambda s: s, check_name=None, **kwargs)","text":"

Displays the number of unique rows in a Series, without modifying the Series itself.

See Pandas docs for nunique() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas nunique(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas nunique() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.plot","title":"plot(fn=lambda s: s, check_name='', **kwargs)","text":"

Displays a plot of the Series, without modifying the Series itself.

See Pandas docs for plot() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas plot(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional title for the plot.

'' **kwargs Any

Optional, additional arguments that are accepted by Pandas plot() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Plots are only displayed when code is run in IPython/Jupyter, not in terminal.

If you pass a 'title' kwarg, it becomes the plot title, overriding check_name

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.print","title":"print(object=None, fn=lambda s: s, check_name=None, max_rows=10)","text":"

Displays text, another object, or (by default) the current Series's head. Does not modify the Series itself.

Parameters:

Name Type Description Default object Any

Object to print. Can be anything printable: str, int, list, another DataFrame, etc. If None, print the Series's head (with max_rows rows).

None fn Callable

An optional lambda function to apply to the Series before printing object. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None max_rows int

Maximum number of rows to print if object=None.

10

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Args: start_time: The index time when the stopwatch started, which comes from the Pandas Checks start_timer(). lead_in: Optional text to print before the elapsed time. units: The units in which to display the elapsed time. Can be \"auto\", \"seconds\", \"minutes\", or \"hours\".

Raises:

Type Description ValueError

If units is not one of \"auto\", \"seconds\", \"minutes\", or \"hours\".

Returns:

Type Description Series

The original Series, unchanged.
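A rough sketch of how the units argument might be validated and applied is shown below. The helper `format_elapsed`, its conversion table, and in particular the rule used for 'auto' are assumptions. The text above only states which values are accepted and that anything else raises ValueError.

```python
def format_elapsed(seconds: float, units: str = 'auto') -> str:
    # Hypothetical helper: the conversion factors and the auto rule are assumptions.
    factors = {'seconds': 1.0, 'minutes': 60.0, 'hours': 3600.0}
    if units == 'auto':
        # Assumed rule: pick the largest unit that keeps the value at 1 or more.
        if seconds >= 3600:
            units = 'hours'
        elif seconds >= 60:
            units = 'minutes'
        else:
            units = 'seconds'
    if units not in factors:
        raise ValueError('units must be auto, seconds, minutes, or hours')
    return f'{seconds / factors[units]:.2f} {units}'


print(format_elapsed(90))             # 1.50 minutes
print(format_elapsed(90, 'seconds'))  # 90.00 seconds
```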

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings. Does not modify the Series itself.

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Run pandas_checks.describe_options() to see a list of available options. Does not modify the Series itself.

For example, .check.set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally. Does not modify the Series itself.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to Pandas Checks .check.assert_data() globally.

required

Returns:

Type Description Series

The original Series, unchanged.
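The split between the two global switches can be sketched with module-level state. Everything here is a stand-in: `_mode`, `log`, `check`, and this `assert_data` are hypothetical names that only mimic the behaviour described above, with a list in place of a Series and ValueError in place of the library's exception.

```python
from typing import Any, Callable, Dict, List

# Hypothetical module-level state standing in for the global options.
_mode: Dict[str, bool] = {'enable_checks': True, 'enable_asserts': True}
log: List[Any] = []  # collects check results instead of displaying them


def set_mode(enable_checks: bool, enable_asserts: bool) -> None:
    _mode['enable_checks'] = enable_checks
    _mode['enable_asserts'] = enable_asserts


def check(data: Any, fn: Callable[[Any], Any]) -> Any:
    # Ordinary checks run only while enable_checks is on.
    if _mode['enable_checks']:
        log.append(fn(data))
    return data


def assert_data(data: Any, condition: Callable[[Any], bool]) -> Any:
    # Assertions are governed by their own switch.
    if _mode['enable_asserts'] and not condition(data):
        raise ValueError('Assertion failed')
    return data


data = [1, 2, 3]
check(data, len)  # recorded: checks are on
set_mode(enable_checks=False, enable_asserts=True)
check(data, sum)  # skipped: checks are off
```

With this configuration an assertion still raises on failure, which matches the note that enable_checks does not affect .check.assert_data().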

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.shape","title":"shape(fn=lambda s: s, check_name='\ud83d\udcd0 Shape')","text":"

Displays the Series's dimensions, without modifying the Series itself.

See Pandas docs for shape for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas shape. Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

'\ud83d\udcd0 Shape'

Returns:

Type Description Series

The original Series, unchanged.

Note

See also .check.nrows()

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.tail","title":"tail(n=5, fn=lambda s: s, check_name=None)","text":"

Displays the last n rows of the Series, without modifying the Series itself.

See Pandas docs for tail() for additional usage information.

Parameters:

Name Type Description Default n int

Number of rows to show.

5 fn Callable

An optional lambda function to apply to the Series before running Pandas tail(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.unique","title":"unique(fn=lambda s: s, check_name=None)","text":"

Displays the unique values in a Series, without modifying the Series itself.

See Pandas docs for unique() for additional usage information.

Parameters:

Name Type Description Default fn Callable

An optional lambda function to apply to the Series before running Pandas unique(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.value_counts","title":"value_counts(fn=lambda s: s, max_rows=10, check_name=None, **kwargs)","text":"

Displays the value counts for a Series, without modifying the Series itself.

See Pandas docs for value_counts() for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default max_rows int

Maximum number of rows to show in the value counts.

10 fn Callable

An optional lambda function to apply to the Series before running Pandas value_counts(). Example: lambda s: s.dropna().

lambda s: s check_name Union[str, None]

An optional name for the check, to be printed as preface to the result.

None **kwargs Any

Optional, additional arguments that are accepted by Pandas value_counts() method.

{}

Returns:

Type Description Series

The original Series, unchanged.

"},{"location":"API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.write","title":"write(path, format=None, fn=lambda s: s, verbose=False, **kwargs)","text":"

Exports Series to file, without modifying the Series itself.

Format is inferred from path extension like .csv.

This function uses the corresponding Pandas export function such as to_csv(). See Pandas docs for those functions for additional usage information, including more configuration options you can pass to this Pandas Checks method.

Parameters:

Name Type Description Default path str

Path to write the file to.

required format Union[str, None]

Optional file format to force for the export. If None, format is inferred from the file's extension in path.

None fn Callable

An optional lambda function to apply to the Series before exporting. Example: lambda s: s.dropna().

lambda s: s verbose bool

Whether to print a message when the file is written.

False **kwargs Any

Optional, additional keyword arguments to pass to the Pandas export function (.to_csv).

{}

Returns:

Type Description Series

The original Series, unchanged.

Note

Exporting to some formats such as Excel, Feather, and Parquet may require you to install additional packages.
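The rule that format overrides the path extension can be sketched like this. The helper `infer_format` is hypothetical and shows only the precedence described above. How the library maps a format name to a Pandas export function is not covered here.

```python
from pathlib import Path
from typing import Optional


def infer_format(path: str, format: Optional[str] = None) -> str:
    # An explicit format wins; otherwise fall back to the file extension.
    if format is not None:
        return format.lower().lstrip('.')
    suffix = Path(path).suffix
    if not suffix:
        raise ValueError('Cannot infer a format: path has no extension')
    return suffix.lower().lstrip('.')


print(infer_format('out/data.csv'))            # csv
print(infer_format('data.txt', format='csv'))  # csv
```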

"},{"location":"API%20reference/display/","title":"Display","text":"

Utilities for displaying text, tables, and plots in Pandas Checks in both terminal and IPython/Jupyter environments.

"},{"location":"API%20reference/display/#pandas_checks.display._display_check","title":"_display_check(data, name=None)","text":"

Renders the result of a Pandas Checks method.

Parameters:

Name Type Description Default data Any

The data to display.

required name Union[str, None]

The optional name of the check.

None

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_plot","title":"_display_plot()","text":"

Renders the active Pandas Checks matplotlib plot object in an IPython/Jupyter environment with an optional indent.

Returns:

Type Description None

None

Note

It assumes the plot has already been drawn by another function, such as with .plot() or .hist().

"},{"location":"API%20reference/display/#pandas_checks.display._display_plot_title","title":"_display_plot_title(line, lead_in=None, colors={})","text":"

Displays a plot title with optional formatting.

Parameters:

Name Type Description Default line str

The title text to display.

required lead_in Union[str, None]

Optional text to display before the title.

None colors Dict

An optional dictionary containing color settings for the text and lead-in text. See details in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_table","title":"_display_table(table)","text":"

Renders a Pandas DataFrame or Series in an IPython/Jupyter environment with an optional indent.

Parameters:

Name Type Description Default table Union[DataFrame, Series]

The DataFrame or Series to display.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._display_table_title","title":"_display_table_title(line, lead_in=None, colors={})","text":"

Displays a table title with optional formatting.

Parameters:

Name Type Description Default line str

The title text to display.

required lead_in Union[str, None]

Optional text to display before the title.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See details in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._filter_emojis","title":"_filter_emojis(text)","text":"

Removes emojis from text if user has globally forbidden them.

Parameters:

Name Type Description Default text str

The text to filter emojis from.

required

Returns:

Type Description str

The text with emojis removed if the user's global settings do not allow emojis. Else, the original text.

"},{"location":"API%20reference/display/#pandas_checks.display._format_background_color","title":"_format_background_color(color)","text":"

Applies a background color to text being displayed in the terminal.

Parameters:

Name Type Description Default color str

The background color to format. See syntax in docstring for _render_text().

required

Returns:

Type Description str

The formatted background color.

"},{"location":"API%20reference/display/#pandas_checks.display._lead_in","title":"_lead_in(lead_in, foreground, background)","text":"

Formats a lead-in text with colors.

Parameters:

Name Type Description Default lead_in Union[str, None]

The lead-in text to format.

required foreground str

The foreground color for the lead-in text. See syntax in docstring for _render_text().

required background str

The background color for the lead-in text. See syntax in docstring for _render_text().

required

Returns:

Type Description str

The formatted lead-in text.

"},{"location":"API%20reference/display/#pandas_checks.display._print_table_terminal","title":"_print_table_terminal(table)","text":"

Prints a Pandas table in a terminal with an optional indent.

Parameters:

Name Type Description Default table Union[DataFrame, Series]

A DataFrame or Series.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._render_html_with_indent","title":"_render_html_with_indent(object_as_html)","text":"

Renders HTML with an optional indent.

Parameters:

Name Type Description Default object_as_html str

The HTML to render.

required

Returns:

Type Description None

None

"},{"location":"API%20reference/display/#pandas_checks.display._render_text","title":"_render_text(text, tag, lead_in=None, colors={})","text":"

Renders text with optional formatting.

Parameters:

Name Type Description Default text str

The text to render.

required tag str

The HTML tag to use for rendering.

required lead_in Union[str, None]

Optional text to display before the main text.

None colors Dict

Optional colors for the text and lead-in text. Keys include: - text_color: The foreground color of the main text. - text_background_color: The background or highlight color of the main text. - lead_in_text_color: The foreground color of lead-in text. - lead_in_background_color: The background color of lead-in text. Color values are plain color names such as \"blue\" or \"white\". They are passed to HTML for Jupyter/IPython outputs or to termcolor when code is run in a terminal. For color options when code is run in a terminal, see https://github.com/termcolor/termcolor.

{}

Returns:

Type Description None

None
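
For illustration, a colors dictionary using the four keys listed above might look like the sketch below. The color values are arbitrary examples, and the unknown_color_keys helper is hypothetical: it is only a way to catch misspelled keys and is not part of Pandas Checks.

```python
# The four keys documented for the colors argument of _render_text().
ALLOWED_COLOR_KEYS = {
    'text_color',
    'text_background_color',
    'lead_in_text_color',
    'lead_in_background_color',
}

# Example values only; any color name accepted by the output backend works.
colors = {
    'text_color': 'white',
    'text_background_color': 'blue',
    'lead_in_text_color': 'black',
    'lead_in_background_color': 'yellow',
}


def unknown_color_keys(options: dict) -> set:
    """Hypothetical helper: return keys that are not documented color keys."""
    return set(options) - ALLOWED_COLOR_KEYS


print(unknown_color_keys(colors))
print(unknown_color_keys({'text_colour': 'red'}))
```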

"},{"location":"API%20reference/display/#pandas_checks.display._warning","title":"_warning(message, lead_in='\ud83d\udc3c\ud83e\ude7a Pandas Checks warning', clean_type=False)","text":"

Displays a warning message.

Parameters:

Name Type Description Default message str

The warning message to display.

required lead_in str

Optional lead-in text to display before the warning message.

'\ud83d\udc3c\ud83e\ude7a Pandas Checks warning' clean_type bool

Optional flag to remove the class type from the message, when running .check.dtype().

False

Returns:

Type Description None

None

"},{"location":"API%20reference/options/","title":"Options","text":"

Utilities for configuring Pandas Checks options.

This module provides functions for setting and managing global options for Pandas Checks, including formatting and disabling checks and assertions.

"},{"location":"API%20reference/options/#pandas_checks.options._initialize_format_options","title":"_initialize_format_options(options=None)","text":"

Initializes or resets Pandas Checks formatting options.

Parameters:

Name Type Description Default options Union[List[str], None]

A list of option names to initialize or reset. If None, all formatting options will be initialized or reset.

None

Returns: None

Note

We separate this function from _initialize_options() so that the user can reset just the formatting without changing the mode.

"},{"location":"API%20reference/options/#pandas_checks.options._initialize_options","title":"_initialize_options()","text":"

Initializes (or resets) all Pandas Checks options to their default values.

Returns:

Type Description None

None

Note

We separate this function from _initialize_format_options() so that the user can reset just the formatting, if desired, without changing the mode.

"},{"location":"API%20reference/options/#pandas_checks.options._register_option","title":"_register_option(name, default_value, description, validator)","text":"

Registers a Pandas Checks option in the global Pandas context manager.

If the option has already been registered, reset its value.

This method enables setting global formatting for Pandas Checks results and storing variables that will persist across Pandas method chains, which return newly initialized DataFrames at each method (and so reset the DataFrame's attributes).

Parameters:

Name Type Description Default name str

The name of the option to register.

required default_value Any

The default value for the option.

required description str

A description of the option.

required validator Callable

A function to validate the option value.

required

Returns:

Type Description None

None

Note

For more details on the arguments, see the documentation for pandas._config.config.register_option()

"},{"location":"API%20reference/options/#pandas_checks.options._set_option","title":"_set_option(option, value)","text":"

Updates the value of a Pandas Checks option in the global Pandas context manager.

Parameters:

Name Type Description Default option str

The name of the option to set.

required value Any

The value to set for the option.

required

Returns:

Type Description None

None

Raises:

Type Description AttributeError

If the option is not a valid Pandas Checks option.

"},{"location":"API%20reference/options/#pandas_checks.options.describe_options","title":"describe_options()","text":"

Prints all global options for Pandas Checks, their default values, and current values.

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.disable_checks","title":"disable_checks(enable_asserts=True)","text":"

Turns off all calls to Pandas Checks methods and optionally enables or disables check.assert_data(). Does not modify the DataFrame itself.

If this function is called, subsequent calls to .check functions will not be run.

Typically used to 1) globally switch off Pandas Checks, such as during production, or 2) temporarily switch off Pandas Checks, such as for a stable part of a notebook.

Parameters:

Name Type Description Default enable_asserts bool

Whether to also run calls to Pandas Checks .check.assert_data()

True

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.enable_checks","title":"enable_checks(enable_asserts=True)","text":"

Turns on Pandas Checks globally. Subsequent calls to .check methods will be run.

Parameters:

Name Type Description Default enable_asserts bool

Whether to also enable or disable check.assert_data().

True

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/options/#pandas_checks.options.reset_format","title":"reset_format()","text":"

Globally restores all Pandas Checks formatting options to their default \"factory\" settings.

Returns:

Type Description None

None

"},{"location":"API%20reference/options/#pandas_checks.options.set_format","title":"set_format(**kwargs)","text":"

Configures selected formatting options for Pandas Checks. Run pandas_checks.describe_options() to see a list of available options.

For example, set_format(check_text_tag=\"h1\", use_emojis=False) will globally change Pandas Checks to display text results as H1 headings and remove all emojis.

Returns:

Type Description None

None

Parameters:

Name Type Description Default **kwargs Any

Pairs of setting name and its new value.

{}"},{"location":"API%20reference/options/#pandas_checks.options.set_mode","title":"set_mode(enable_checks, enable_asserts)","text":"

Configures the operation mode for Pandas Checks globally.

Parameters:

Name Type Description Default enable_checks bool

Whether to run any Pandas Checks methods globally. Does not affect .check.assert_data().

required enable_asserts bool

Whether to run calls to .check.assert_data() globally.

required

Returns:

Type Description None

None
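
The relationship between set_mode(), get_mode(), enable_checks(), and disable_checks() can be sketched with a plain dictionary. This is a simplified stand-in, not the library's implementation: the documentation above says the real options live in the global Pandas context manager, and the dictionary key names used here are assumptions.

```python
from typing import Dict

# Hypothetical in-memory store. The library keeps these flags in the global
# Pandas options instead, which this sketch does not reproduce.
_mode: Dict[str, bool] = {'enable_checks': True, 'enable_asserts': True}


def set_mode(enable_checks: bool, enable_asserts: bool) -> None:
    """Set both flags at once."""
    _mode['enable_checks'] = enable_checks
    _mode['enable_asserts'] = enable_asserts


def get_mode() -> Dict[str, bool]:
    """Return a copy so that callers cannot mutate the stored settings."""
    return dict(_mode)


def disable_checks(enable_asserts: bool = True) -> None:
    """Switch checks off; assertions stay on unless told otherwise."""
    set_mode(False, enable_asserts)


def enable_checks(enable_asserts: bool = True) -> None:
    """Switch checks back on."""
    set_mode(True, enable_asserts)


disable_checks()   # checks off, assertions still on
print(get_mode())
enable_checks()    # back to the defaults
print(get_mode())
```

In this sketch the two convenience functions are thin wrappers around set_mode(), which matches how their parameters are described above.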

"},{"location":"API%20reference/run_checks/","title":"Run checks","text":"

Utilities for running Pandas Checks data checks.

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._apply_modifications","title":"_apply_modifications(data, fn=lambda df: df, subset=None)","text":"

Applies user's modifications to a data object.

Parameters:

Name Type Description Default data Any

May be any Pandas DataFrame, Series, string, or other variable

required fn Callable

An optional lambda function to modify data

lambda df: df subset Union[str, List, None]

Columns to subset after applying modifications

None

Returns:

Type Description Any

Modified and optionally subsetted data object. If all arguments are defaults, data is returned unchanged.

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._check_data","title":"_check_data(data, check_fn=lambda df: df, modify_fn=lambda df: df, subset=None, check_name=None)","text":"

Runs a selected check on a data object

Parameters:

Name Type Description Default data Any

A Pandas DataFrame, Series, string, or other variable

required check_fn Callable

Function to apply to data for checking. For example, if we're running .check.value_counts(), this function would apply the Pandas value_counts() method.

lambda df: df modify_fn Callable

Optional function to modify data before checking

lambda df: df subset Union[str, List, None]

Optional list of columns or name of column to subset data before running check_fn

None check_name Union[str, None]

Name to use when displaying check result

None

Returns:

Type Description None

None

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks._display_check","title":"_display_check(data, name=None)","text":"

Renders the result of a Pandas Checks method.

Parameters:

Name Type Description Default data Any

The data to display.

required name Union[str, None]

The optional name of the check.

None

Returns:

Type Description None

None

"},{"location":"API%20reference/run_checks/#pandas_checks.run_checks.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/timer/","title":"Timer","text":"

Provides a timer utility for tracking the elapsed time of steps within a Pandas method chain.

Note that these functions rely on the pdchecks.enable_checks option being enabled in the Pandas configuration, as it is by default.

"},{"location":"API%20reference/timer/#pandas_checks.timer._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/timer/#pandas_checks.timer.get_mode","title":"get_mode()","text":"

Returns whether Pandas Checks is currently running checks and assertions.

Returns:

Type Description Dict[str, bool]

A dictionary containing the current settings.

"},{"location":"API%20reference/timer/#pandas_checks.timer.print_time_elapsed","title":"print_time_elapsed(start_time, lead_in='\u23f1\ufe0f Time elapsed', units='auto')","text":"

Displays the time elapsed since start_time.

Parameters:

Name Type Description Default start_time float

The index time when the stopwatch started, which comes from the Pandas Checks start_timer()

required lead_in Union[str, None]

Optional text to print before the elapsed time.

'\u23f1\ufe0f Time elapsed' units str

The units in which to display the elapsed time. Accepted values: - \"auto\" - \"milliseconds\", \"seconds\", \"minutes\", \"hours\" - \"ms\", \"s\", \"m\", \"h\"

'auto'

Returns:

Type Description None

None

Raises:

Type Description ValueError

If units is not one of expected time units

Note

If you change the default values for this function's arguments, change them in .check.print_time_elapsed too in DataFrameChecks and SeriesChecks so they're exposed to the user.
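
The units handling described above can be sketched as follows. The function name format_elapsed, the two-decimal output format, and the thresholds used for the auto setting are assumptions made for illustration; only the accepted unit names and the ValueError come from the documentation.

```python
# Seconds per unit, keyed by the long unit names accepted for units.
SECONDS_PER_UNIT = {
    'milliseconds': 0.001,
    'seconds': 1.0,
    'minutes': 60.0,
    'hours': 3600.0,
}
# Short aliases accepted for units.
ALIASES = {'ms': 'milliseconds', 's': 'seconds', 'm': 'minutes', 'h': 'hours'}


def format_elapsed(seconds: float, units: str = 'auto') -> str:
    """Hypothetical helper: express an elapsed time in the requested units."""
    units = ALIASES.get(units, units)
    if units == 'auto':
        # Assumed thresholds: pick the largest unit that gives a value >= 1.
        if seconds < 1:
            units = 'milliseconds'
        elif seconds < 60:
            units = 'seconds'
        elif seconds < 3600:
            units = 'minutes'
        else:
            units = 'hours'
    if units not in SECONDS_PER_UNIT:
        raise ValueError(f'Unexpected time units: {units}')
    return f'{seconds / SECONDS_PER_UNIT[units]:.2f} {units}'


print(format_elapsed(90, 'm'))
print(format_elapsed(7200))
```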

"},{"location":"API%20reference/timer/#pandas_checks.timer.start_timer","title":"start_timer(verbose=False)","text":"

Starts a Pandas Checks stopwatch to measure run time between operations, such as steps in a Pandas method chain. Use print_time_elapsed() to get timings.

Parameters:

Name Type Description Default verbose bool

Whether to print a message that the timer has started.

False

Returns:

Type Description float

Timestamp as a float
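
A minimal stand-in for this stopwatch pattern, using only the standard library, is sketched below. It assumes the timestamp comes from time.time(); the function name start_timer_sketch is hypothetical and the library's own clock source is not confirmed here.

```python
import time


def start_timer_sketch(verbose: bool = False) -> float:
    """Hypothetical stand-in: return the current time as a float timestamp."""
    if verbose:
        print('Timer started')
    return time.time()


start = start_timer_sketch()
total = sum(range(100_000))    # placeholder for the work being timed
elapsed = time.time() - start  # seconds since the timer started
print(f'Elapsed: {elapsed:.4f} seconds')
```

The returned float is what would be passed as start_time when reporting the elapsed time later in the chain.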

"},{"location":"API%20reference/utils/","title":"Utils","text":"

Utility functions for the pandas_checks package.

"},{"location":"API%20reference/utils/#pandas_checks.utils._display_line","title":"_display_line(line, lead_in=None, colors={})","text":"

Displays a line of text with optional formatting.

Parameters:

Name Type Description Default line str

The text to display.

required lead_in Union[str, None]

The optional text to display before the main text.

None colors Dict

An optional dictionary containing color options for the text and lead-in text. See syntax in docstring for _render_text().

{}

Returns:

Type Description None

None

"},{"location":"API%20reference/utils/#pandas_checks.utils._has_nulls","title":"_has_nulls(data, fail_message, raise_exception=True, exception_to_raise=DataError)","text":"

Utility function to check for nulls as part of a larger check

"},{"location":"API%20reference/utils/#pandas_checks.utils._is_type","title":"_is_type(data, dtype)","text":"

Utility function to check if a dataframe's columns or one series has an expected type. Includes special handling for strings, since 'object' type in Pandas may not mean a string

"},{"location":"API%20reference/utils/#pandas_checks.utils._lambda_to_string","title":"_lambda_to_string(lambda_func)","text":"

Create a string representation of a lambda function.

Parameters:

Name Type Description Default lambda_func Callable

An arbitrary function in lambda form

required

Returns:

Type Description str

A string version of lambda_func

Todo

This still returns all arguments to the calling function. They get entangled with the argument when it's a lambda function. Try other ways to get just the argument we want.

"},{"location":"API%20reference/utils/#pandas_checks.utils._series_is_type","title":"_series_is_type(s, dtype)","text":"

Utility function to check if a series has an expected type. Includes special handling for strings, since 'object' type in Pandas may not mean a string

"}]} \ No newline at end of file