Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into docs-blog-pointblank-intro
Showing 7 changed files with 140 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
* @rich-iannone @machow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
--- | ||
title: "Style Table Body with `mask=` in `loc.body()`" | ||
html-table-processing: none | ||
author: Jerry Wu | ||
date: 2025-01-24 | ||
freeze: true | ||
jupyter: python3 | ||
format: | ||
html: | ||
code-summary: "Show the Code" | ||
--- | ||
|
||
In Great Tables `0.16.0`, we introduced the `mask=` parameter in `loc.body()`, which lets users apply conditional styling to body cells across many columns with a single Polars expression when working with a Polars DataFrame. This post demonstrates how it works and compares two approaches: | ||
|
||
* **Leveraging the `mask=` parameter in `loc.body()`:** Use Polars expressions for streamlined styling. | ||
* **Utilizing the `locations=` parameter in `GT.tab_style()`:** Pass a list of `loc.body()` objects. | ||
|
||
Let’s dive in. | ||
|
||
### Preparations | ||
We'll use the built-in dataset `gtcars` to create a Polars DataFrame. Next, we'll select the columns `mfr`, `drivetrain`, `year`, and `hp` to create a small pivoted table named `df_mini`. Finally, we'll pass `df_mini` to the `GT` object to create a table named `gt`, using `drivetrain` as the `rowname_col=` and `mfr` as the `groupname_col=`, as shown below: | ||
```{python} | ||
# | code-fold: true | ||
import polars as pl | ||
from great_tables import GT, loc, style | ||
from great_tables.data import gtcars | ||
from polars import selectors as cs | ||
year_cols = ["2014", "2015", "2016", "2017"] | ||
df_mini = ( | ||
    pl.from_pandas(gtcars) | ||
    .filter(pl.col("mfr").is_in(["Ferrari", "Lamborghini", "BMW"])) | ||
    .sort("drivetrain") | ||
    .pivot(on="year", index=["mfr", "drivetrain"], values="hp", aggregate_function="mean") | ||
    .select(["mfr", "drivetrain", *year_cols]) | ||
) | ||
gt = GT(df_mini).tab_stub(rowname_col="drivetrain", groupname_col="mfr").opt_stylize(color="cyan") | ||
gt | ||
``` | ||
|
||
The numbers in the cells represent the average horsepower for each combination of `mfr` and `drivetrain` for a specific year. | ||
|
||
### Leveraging the `mask=` parameter in `loc.body()` | ||
The `mask=` parameter in `loc.body()` accepts a Polars expression that evaluates to a boolean result for each cell. | ||
|
||
Here’s how we can use it to achieve two goals: | ||
|
||
* Highlight the cell text in red if the column datatype is numerical and the cell value exceeds 650. | ||
* Fill the background with light grey if the cell value is missing in the last two columns (`2016` and `2017`). | ||
|
||
```{python} | ||
( | ||
    gt.tab_style( | ||
        style=style.text(color="red"), | ||
        locations=loc.body(mask=cs.numeric().gt(650)) | ||
    ).tab_style( | ||
        style=style.fill(color="lightgrey"), | ||
        locations=loc.body(mask=pl.nth(-2, -1).is_null()), | ||
    ) | ||
) | ||
``` | ||
|
||
In this example: | ||
|
||
* `cs.numeric()` targets numerical columns, and `.gt(650)` checks if the cell value is greater than 650. | ||
* `pl.nth(-2, -1)` targets the last two columns, and `.is_null()` identifies missing values. | ||
|
||
Did you notice that we can use Polars selectors and expressions to identify columns dynamically at runtime? This is a killer feature for pivoted tables, where the column names depend on the data. | ||
|
||
The `mask=` parameter acts as syntactic sugar, streamlining the process and removing the need to loop through columns manually. | ||
|
||
::: {.callout-warning collapse="false"} | ||
## Using `mask=` Independently | ||
`mask=` should not be used in combination with the `columns` or `rows` arguments. Attempting to do so will raise a `ValueError`. | ||
::: | ||
|
||
### Utilizing the `locations=` parameter in `GT.tab_style()` | ||
A more "old-fashioned" approach involves passing a list of `loc.body()` objects to the `locations=` parameter in `GT.tab_style()`: | ||
```{python} | ||
# | eval: false | ||
( | ||
    gt.tab_style( | ||
        style=style.text(color="red"), | ||
        locations=[loc.body(columns=col, rows=pl.col(col).gt(650)) | ||
                   for col in year_cols], | ||
    ).tab_style( | ||
        style=style.fill(color="lightgrey"), | ||
        locations=[loc.body(columns=col, rows=pl.col(col).is_null()) | ||
                   for col in year_cols[-2:]], | ||
    ) | ||
) | ||
``` | ||
|
||
This approach, though functional, demands additional effort: | ||
|
||
* Explicitly preparing the column names in advance. | ||
* Specifying the `columns=` and `rows=` arguments for each `loc.body()` in the loop. | ||
|
||
While effective, it is more verbose and takes more effort to write and maintain than the first approach. | ||
|
||
### Wrapping up | ||
With the introduction of the `mask=` parameter in `loc.body()`, users can now style the table body in a more vectorized manner, applying one expression across many columns much as `df.apply()` applies one function across columns in Pandas, enhancing the overall user experience. | ||
|
||
We extend our gratitude to [@igorcalabria](https://github.com/igorcalabria) for suggesting this feature in [#389](https://github.com/posit-dev/great-tables/issues/389) and providing an insightful explanation of its utility. A special thanks to [@henryharbeck](https://github.com/henryharbeck) for providing the second approach. | ||
|
||
We hope you enjoy this new functionality as much as we do! Have ideas to make Great Tables even better? Share them with us via [GitHub Issues](https://github.com/posit-dev/great-tables/issues). We're always amazed by the creativity of our users! Until the next great table. |