feat(analysis): add built-in plots and view (#17)
* chore(deps): add analysis dependencies

* feat(analysis): add pull request list view

* feat(analysis): implemented state distribution

* feat(analysis): implemented overall timeline distribution

* feat(analysis): implemented author association distribution

* feat(analysis): implemented conventional commit breakdown

* chore(deps): consolidated

* feat(analysis): implemented reaction distribution

* docs(sample): add plot and view samples

* chore(poetry): update lock

* chore(precommit): removed eof fixer

* chore(notebooks): moved view pull requests

* chore(verify): add papermill and jupyter verification scripts

* chore(notebooks): refresh

* chore(ci): verify notebooks

* chore(ci): build project

* chore(pyproject): comment out readme for now

* chore(ci): run poetry shell

* fix(ci): ensure poetry run is used for validate notebooks

* fix(ci): update validate notebooks

* chore(verbose): disable validation

* chore(ci): add amend

* fix(poetry): pin urllib below version 2

python-poetry/poetry#7936
https://github.com/kiran94/prfiesta/actions/runs/5031995017/jobs/9025385196#step:11:26

* chore(ci): restrict verify notebooks to linux

* chore(notebooks): refresh

* chore(notebooks): remove echo

* style(analysis): trailing whitespace
kiran94 authored May 20, 2023
1 parent 1485b9b commit 834cf42
Showing 22 changed files with 2,791 additions and 60 deletions.
9 changes: 9 additions & 0 deletions .github/workflows/main.yml
@@ -56,6 +56,15 @@ jobs:
          lcov-file: coverage.lcov
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Verify Notebooks
        if: matrix.os == 'ubuntu-latest'
        env:
          PYDEVD_DISABLE_FILE_VALIDATION: 1
        run: |
          poetry build
          poetry install
          make validate_notebooks
  terraform-lint:
    runs-on: ubuntu-latest
    timeout-minutes: 10
1 change: 1 addition & 0 deletions .gitignore
@@ -222,5 +222,6 @@ requirements-dev.txt

*.csv
*.parquet
!notebooks/samples_data/*.parquet

coverage.lcov
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -2,7 +2,6 @@ repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-merge-conflict
      - id: check-yaml
61 changes: 61 additions & 0 deletions docs/analysis.md
@@ -0,0 +1,61 @@
# Analysis

prfiesta ships with built-in plots to help analyze your pull request data. These serve as a starting point for your analysis.

**It's recommended to run these within the context of a [Jupyter Notebook](https://docs.jupyter.org/en/latest/)**

## Built In Views

| View | Description | Sample |
| --------------- | --------------- | ------------- |
| `prfiesta.analysis.view.view_pull_request` | Produces a table of pull requests which summarizes contribution. Each pull request is linked to enable further investigation | [Link](../notebooks/views/view_pull_requests.ipynb) |

## Built In Plots

| Plot | Description | Sample |
| --------------- | --------------- | ------ |
| `prfiesta.analysis.plot.plot_overall_timeline` | Produces a plot showing contributions over time, categorized by month and year | [Link](../notebooks/plots/plot_overall_contribution_timeline.ipynb) |
| `prfiesta.analysis.plot.plot_state_distribution` | Produces a plot showing the distribution of state (open or closed PR), categorized by repository | [Link](../notebooks/plots/plot_state_distribution.ipynb) |
| `prfiesta.analysis.plot.plot_author_associations` | Produces a plot showing each author's association with the repository contributed to | [Link](../notebooks/plots/plot_author_association.ipynb) |
| `prfiesta.analysis.plot.plot_conventional_commit_breakdown` | Produces a plot showing the [git conventional commit](https://www.conventionalcommits.org/en/v1.0.0/) breakdown for contributions, categorized by repository name. Note that this requires the user to use conventional commit messages in their pull request titles. | [Link](../notebooks/plots/plot_conventional_commit_breakdown.ipynb) |
| `prfiesta.analysis.plot.plot_reactions` | Produces a plot showing distribution of [GitHub Reactions](https://docs.github.com/en/rest/reactions?apiVersion=2022-11-28) | [Link](../notebooks/plots/plot_reactions.ipynb) |


All prfiesta plots have the same signature:

```python
def plot_NAME(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes]:
    ...
```

*Where `NAME` is the placeholder for the actual plot name.*

- The plotting functions always take a `pd.DataFrame` as the first argument. This DataFrame should originate from the prfiesta collection process.
- The plotting functions always return something which can be displayed in a Jupyter Notebook.
- The plotting functions always take `**kwargs`, which can be used to further customize the output.
- The exact handling of `**kwargs` is up to each plotting function; however, in general the following common *optional* options should exist:

| Option | Type | Description |
| ---------------- | --------------- | --------------- |
| `ax` | `Optional[matplotlib.axes.Axes]` | The `matplotlib` axis to plot into. This allows users to add the plot into a `plt.subplots`. If omitted, the plot is drawn in the default location |
| `palette` | `Optional[str]` | A color palette to apply to the plot (e.g. a [seaborn color palette](https://seaborn.pydata.org/tutorial/color_palettes.html)) |
| `title` | `Optional[str]` | The title of the plot |
| `hue` | `Optional[str]` | Used to distinguish different categories. This is relevant to [seaborn backed plots](https://seaborn.pydata.org/tutorial/color_palettes.html?highlight=hue#vary-hue-to-distinguish-categories) |
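
As a rough illustration of how the `ax` and `title` options combine with `plt.subplots`, the sketch below places two plots side by side. `plot_demo` is a hypothetical stand-in written for this example, not a prfiesta function, and the toy `state` column is an assumption; in practice one of the built-in plots and a DataFrame from the collection process would take their place.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_demo(data: pd.DataFrame, **kwargs) -> plt.Axes:
    """Hypothetical stand-in for a prfiesta plot; honors `ax` and `title` only."""
    ax = kwargs.get("ax")
    if ax is None:
        ax = plt.gca()  # fall back to the current default axes
    data["state"].value_counts().plot.bar(ax=ax)
    if kwargs.get("title"):
        ax.set_title(kwargs["title"])
    return ax


# Toy data (assumed column name); a real frame would come from prfiesta.
data = pd.DataFrame({"state": ["open", "closed", "closed"]})

# Passing `ax` places each plot into its own cell of a shared figure.
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
first = plot_demo(data, ax=left, title="All pull requests")
second = plot_demo(data[data["state"] == "closed"], ax=right, title="Closed only")
```

When `ax` is left out, the stand-in draws on whatever axes matplotlib currently considers active, which mirrors the "default location" behavior described in the table.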


## Building a Custom Plot

You can build your own plot by creating a function that follows this signature:

```python
from typing import Union
import pandas as pd
import matplotlib.pyplot as plt

def plot_my_cool_plot(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes, pd.DataFrame]:
    pass
```

All prfiesta plots should accept the same `kwargs` mentioned in the above table; however, implementation of these `kwargs` per plot is on a best-effort basis.
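
A minimal sketch of what such a custom plot might look like is given below. The function name and the `repository_name` column are assumptions made for illustration. `ax` and `title` are honored, `palette` is handled on a best-effort basis by treating it as a matplotlib colormap name (which is not necessarily how the built-in plots interpret it), and `hue` is ignored.

```python
from typing import Union

import matplotlib.pyplot as plt
import pandas as pd


def plot_pull_requests_per_repository(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes]:
    """Bar chart of pull request counts per repository (illustrative only)."""
    # Common optional options; each one falls back to a default when absent.
    ax = kwargs.get("ax")
    title = kwargs.get("title", "Pull requests per repository")
    palette = kwargs.get("palette")  # best effort: read as a matplotlib colormap name

    if ax is None:
        _, ax = plt.subplots()

    # ASSUMPTION: the frame has a `repository_name` column.
    counts = data["repository_name"].value_counts().sort_index()

    colors = None
    if palette is not None:
        cmap = plt.get_cmap(palette)
        # Spread the bars evenly across the colormap.
        colors = [cmap(i / max(len(counts) - 1, 1)) for i in range(len(counts))]

    ax.bar(counts.index.astype(str), counts.to_numpy(), color=colors)
    ax.set_title(title)
    ax.set_ylabel("Pull requests")
    return ax


# Usage with toy data standing in for a collected prfiesta frame.
sample = pd.DataFrame({"repository_name": ["org/a", "org/b", "org/a"]})
axes = plot_pull_requests_per_repository(sample, title="Sample", palette="viridis")
```

Reading options with `kwargs.get` rather than named parameters keeps the signature identical to the built-in plots, so unknown options are silently ignored instead of raising.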

**If you build a plot which you think will be useful for others then feel free to contribute it to this project 🚀**
4 changes: 4 additions & 0 deletions makefile
@@ -28,6 +28,10 @@ precommit_install:
precommit_run:
	pre-commit run --all-files

validate_notebooks:
	poetry run bash ./notebooks/scripts/run_all.sh './notebooks/plots/*.ipynb' 'notebooks/plots'
	poetry run bash ./notebooks/scripts/run_all.sh './notebooks/views/*.ipynb' 'notebooks/views'

clean:
	rm ./coverage.xml
	rm -rf ./htmlcov
129 changes: 129 additions & 0 deletions notebooks/plots/plot_author_association.ipynb

Large diffs are not rendered by default.

132 changes: 132 additions & 0 deletions notebooks/plots/plot_conventional_commit_breakdown.ipynb

Large diffs are not rendered by default.

132 changes: 132 additions & 0 deletions notebooks/plots/plot_overall_contribution_timeline.ipynb

Large diffs are not rendered by default.

132 changes: 132 additions & 0 deletions notebooks/plots/plot_reactions.ipynb

Large diffs are not rendered by default.

132 changes: 132 additions & 0 deletions notebooks/plots/plot_state_distribution.ipynb

Large diffs are not rendered by default.

Binary file added notebooks/samples_data/charliermarsh.parquet
Binary file not shown.
Binary file added notebooks/samples_data/kiran94.parquet
Binary file not shown.
6 changes: 6 additions & 0 deletions notebooks/scripts/run_all.sh
@@ -0,0 +1,6 @@
#!/bin/bash

LIST_EXPRESSION=$1
CWD=$2

ls $LIST_EXPRESSION | xargs -I _ papermill --cwd $CWD _ _
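
For readers less familiar with the shell pipeline above, the following is a rough Python equivalent, offered only as a sketch: it expands the glob, then runs each notebook through papermill with the same path as input and output so the notebook is overwritten with its executed form. The `run_all` function and its `execute` parameter are hypothetical names introduced here and are not part of the commit. `papermill.execute_notebook` and its `cwd` argument are the library's Python API.

```python
import glob
from typing import Callable, List, Optional


def run_all(list_expression: str, cwd: str,
            execute: Optional[Callable[..., object]] = None) -> List[str]:
    """Execute every notebook matching the glob in place (hypothetical helper)."""
    if execute is None:
        # Imported lazily so the function can be exercised without papermill installed.
        import papermill
        execute = papermill.execute_notebook

    notebooks = sorted(glob.glob(list_expression))
    for notebook in notebooks:
        # Same path for input and output: the notebook is overwritten after execution.
        execute(notebook, notebook, cwd=cwd)
    return notebooks
```

One difference from the script: when the pattern matches nothing, this sketch returns an empty list, whereas `ls` would report an error.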