docs/index.md (18 additions & 5 deletions)
@@ -1,7 +1,11 @@
-# About
-
-<img src="https://raw.githubusercontent.com/cparmet/pandas-checks/main/static/pandas-check-gh-social.jpg" alt="Banner image for Pandas Checks">
+---
+title: About
+---
+
+<img src="https://raw.githubusercontent.com/cparmet/pandas-checks/main/static/pandas-check-gh-social.jpg" alt="Banner image for Pandas Checks" style="max-height: 200px; width: auto;">
 
+[TOC]
+
 ## What is it?
 
 **Pandas Checks** is a Python package for data science and data engineering. It adds non-invasive health checks for [Pandas](https://github.com/pandas-dev/pandas/) method chains.
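
To illustrate what "non-invasive" means in the sentence above, here is a minimal sketch that uses only plain pandas, not the Pandas Checks API. The helper `peek` and the sample data are hypothetical. The helper prints a quick summary and returns the DataFrame unchanged, so the method chain continues as if the check were not there.

```python
import pandas as pd


def peek(df: pd.DataFrame, label: str = "shape") -> pd.DataFrame:
    """Hypothetical helper: report on the data, then return it unchanged."""
    print(f"{label}: {df.shape}")
    return df


# Made-up sample data, for illustration only
raw = pd.DataFrame(
    {
        "species": ["setosa", "virginica", "setosa"],
        "petal_length": [1.4, 5.1, 1.3],
    }
)

result = (
    raw
    .pipe(peek, "before filter")  # inspect without altering the chain
    .query("species == 'setosa'")
    .pipe(peek, "after filter")
    .assign(petal_length_mm=lambda df: df["petal_length"] * 10)
)
```

Removing either `.pipe(peek, ...)` line leaves `result` identical, which is the property the docs call non-invasive.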
@@ -24,8 +28,17 @@ As Fleetwood Mac says, you would never break the chain.
 
 If you run into trouble or have questions, I'd love to know. Please [open an issue](https://github.com/cparmet/pandas-checks/issues).
 
-Contributions are appreciated! Please open an [issue](https://github.com/cparmet/pandas-checks/issues) or submit a [pull request](https://github.com/cparmet/pandas-checks/pulls). Pandas Checks uses the wonderful libraries [poetry](https://python-poetry.org) for package and dependency management, [nox](https://nox.thea.codes/en/stable/) for test automation, and [mkdocs](https://www.mkdocs.org/) for docs.
-
+Contributions are appreciated! Please open an [issue](https://github.com/cparmet/pandas-checks/issues) or submit a [pull request](https://github.com/cparmet/pandas-checks/pulls). To run the tests, run `uv run --group dev nox`
+
+## Acknowledgments
+
+Pandas Checks uses the following wonderful libraries:
+
+- [uv](https://github.com/astral-sh/uv) for package and dependency management
+- [nox](https://nox.thea.codes/en/stable/) for test automation
 - `.check.print()`: Print a string, a variable, or the current dataframe - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.print) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.print)
 
 ### Export interim files
@@ -120,17 +124,20 @@ These methods can be used to disable subsequent Pandas Checks methods, either te
 
 ### Validate data
 Custom:
+
 - `.check.assert_data()`: Check that data passes an arbitrary condition - [DataFrame](https://cparmet.github.io/pandas-checks/API%20reference/DataFrameChecks/#pandas_checks.DataFrameChecks.DataFrameChecks.assert_data) | [Series](https://cparmet.github.io/pandas-checks/API%20reference/SeriesChecks/#pandas_checks.SeriesChecks.SeriesChecks.assert_data)
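
The idea behind "check that data passes an arbitrary condition" can be sketched in plain pandas. This is not the library's implementation: `assert_data` below is a hypothetical stand-alone function, and its `message` and `raise_exception` parameters are assumptions chosen for illustration. It evaluates a caller-supplied condition, raises or warns on failure, and otherwise hands the DataFrame back untouched.

```python
import warnings
from typing import Callable

import pandas as pd


def assert_data(
    df: pd.DataFrame,
    condition: Callable[[pd.DataFrame], bool],
    message: str = "Data assertion failed",
    raise_exception: bool = True,
) -> pd.DataFrame:
    """Hypothetical sketch: validate df, then pass it through unchanged."""
    if not condition(df):
        if raise_exception:
            raise ValueError(message)
        warnings.warn(message)  # softer alternative to raising
    return df


df = pd.DataFrame({"x": [1, 2, 3]})

# Passing check: the chain continues with the same data
checked = df.pipe(assert_data, lambda d: d.shape[0] > 0)

# Failing check on an empty slice: raises with the custom message
try:
    df.iloc[0:0].pipe(assert_data, lambda d: d.shape[0] > 0, "DataFrame has no rows!")
    failed = False
except ValueError as err:
    failed = True
    error_text = str(err)
```

Returning `df` on success is what lets a validation step sit in the middle of a chain without changing its result.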
@@ -210,14 +217,3 @@ You can also adjust settings within a method chain by bookending the chain, like
 .check.enable_checks() # Turn it back on for the next code
 )
 ```
-
-
-### Hybrid EDA-Production data processing
-
-Exploratory Data Analysis is often taught as a one-time step we do to plan our production data processing. But sometimes EDA is a cyclical process we go back to for deeper inspection during debugging, code edits, or changes in the input data. If explorations were useful in EDA, they may be useful again.
-
-Unfortunately, it's hard to go back to the original EDA code. It's too out of sync. The prod data processing pipeline has usually evolved too much, making the EDA code a historical artifact full of cobwebs that we can't easily fire up again.
-
-But if you use Pandas Checks during EDA, you could roll your `.check` methods into your first production code. Then in prod mode, disable Pandas Checks when you don't need it, to save compute and streamline output. When you ever need to pull out those EDA tools, enable Pandas Checks globally or locally.
-
-This can make your prod pipline more transparent and easier to inspect.
pandas_checks/DataFrameChecks.py (4 additions & 4 deletions)
@@ -120,7 +120,7 @@ def assert_data(
 iris
 .check.assert_data(lambda df: df.shape[0]>0)
 
-# Or customize the message displayed when alert fails
+# Or customize the message displayed when assert fails
 .check.assert_data(lambda df: df.shape[0]>0, "Assertion failed, DataFrame has no rows!")
 
 # Or show a warning instead of raising an exception
@@ -1788,7 +1788,7 @@ def unique(
 ) -> pd.DataFrame:
 """Displays the unique values in a column, without modifying the DataFrame itself.
 
-See Pandas docs for [unique()]((https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.unique.html)) for additional usage information, including more configuration options you can pass to this Pandas Checks method.
+See Pandas docs for [unique()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.unique.html) for additional usage information, including more configuration options you can pass to this Pandas Checks method.
 
 Example:
 ```python
@@ -1832,7 +1832,7 @@ def value_counts(
 ) -> pd.DataFrame:
 """Displays the value counts for a column, without modifying the DataFrame itself.
 
-See Pandas docs for [value_counts()]((https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.value_counts.html)) for additional usage information, including more configuration options you can pass to this Pandas Checks method.
+See Pandas docs for [value_counts()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.value_counts.html) for additional usage information, including more configuration options you can pass to this Pandas Checks method.
 
 Example:
 ```python
@@ -1887,7 +1887,7 @@ def write(
 - .tsv # Tab-separated data file
 - .xlsx
 
-This functions uses the corresponding Pandas export function, such as `to_csv()` and `to_feather()`. See [Pandas docs for those corresponding export functions][Pandas docs for those export functions](https://pandas.pydata.org/docs/reference/io.html) for additional usage information, including more configuration options you can pass to this Pandas Checks method.
+This functions uses the corresponding Pandas export function, such as `to_csv()` and `to_feather()`. See [Pandas docs for those corresponding export functions](https://pandas.pydata.org/docs/reference/io.html) for additional usage information, including more configuration options you can pass to this Pandas Checks method.