Commit 834cf42

feat(analysis): add built-in plots and view (#17)
* chore(deps): add analysis dependencies
* feat(analysis): add pull request list view
* feat(analysis): implemented state distribution
* feat(analysis): implemented overall timeline distribution
* feat(analysis): implemented author association distribution
* feat(analysis): implemented conventional commit breakdown
* chore(deps): consolidated
* feat(analysis): implemented reaction distribution
* docs(sample): add plot and view samples
* chore(poetry): update lock
* chore(precommit): removed eof fixer
* chore(notebooks): moved view pull requests
* chore(verify): add papermill and jupyter verification scripts
* chore(notebooks): refresh
* chore(ci): verify notebooks
* chore(ci): build project
* chore(pyproject): comment out readme for now
* chore(ci): run poetry shell
* fix(ci): ensure poetry run is used for validate notebooks
* fix(ci): update validate notebooks
* chore(verbose): disable validation
* chore(ci): add amend
* fix(poetry): pin urllib below version 2 python-poetry/poetry#7936 https://github.com/kiran94/prfiesta/actions/runs/5031995017/jobs/9025385196#step:11:26
* chore(ci): restrict verify notebooks to linux
* chore(notebooks): refresh
* chore(notebooks): remove echo
* style(analysis): trailing whitespace
1 parent 1485b9b commit 834cf42

22 files changed, +2791 -60 lines changed

.github/workflows/main.yml

Lines changed: 9 additions & 0 deletions
@@ -56,6 +56,15 @@ jobs:
           lcov-file: coverage.lcov
           github-token: ${{ secrets.GITHUB_TOKEN }}
 
+      - name: Verify Notebooks
+        if: matrix.os == 'ubuntu-latest'
+        env:
+          PYDEVD_DISABLE_FILE_VALIDATION: 1
+        run: |
+          poetry build
+          poetry install
+          make validate_notebooks
+
   terraform-lint:
     runs-on: ubuntu-latest
     timeout-minutes: 10

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -222,5 +222,6 @@ requirements-dev.txt
 
 *.csv
 *.parquet
+!notebooks/samples_data/*.parquet
 
 coverage.lcov

.pre-commit-config.yaml

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@ repos:
 - repo: https://github.com/pre-commit/pre-commit-hooks
   rev: v2.3.0
   hooks:
-  - id: end-of-file-fixer
   - id: trailing-whitespace
   - id: check-merge-conflict
   - id: check-yaml

docs/analysis.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+# Analysis
+
+prfiesta ships with built-in plots to help analyze your pull request data. These serve as a starting point for your analysis.
+
+**It's recommended to run these within the context of a [Jupyter Notebook](https://docs.jupyter.org/en/latest/)**
+
+## Built In Views
+
+| View | Description | Sample |
+| --------------- | --------------- | ------------- |
+| `prfiesta.analysis.view.view_pull_request` | Produces a table of pull requests which summarizes contribution. Each pull request is linked to enable further investigation | [Link](../notebooks/views/view_pull_requests.ipynb) |
+
+## Built In Plots
+
+| Plot | Description | Sample |
+| --------------- | --------------- | ------ |
+| `prfiesta.analysis.plot.plot_overall_timeline` | Produces a plot showing contributions over time categorized by month and year | [Link](../notebooks/plots/plot_overall_contribution_timeline.ipynb) |
+| `prfiesta.analysis.plot.plot_state_distribution` | Produces a plot showing the distribution of state (open or closed PR) categorized by repository | [Link](../notebooks/plots/plot_state_distribution.ipynb) |
+| `prfiesta.analysis.plot.plot_author_associations` | Produces a plot showing each author's association to the repository contributed to | [Link](../notebooks/plots/plot_author_association.ipynb) |
+| `prfiesta.analysis.plot.plot_conventional_commit_breakdown` | Produces a plot showing the [git conventional commit](https://www.conventionalcommits.org/en/v1.0.0/) breakdown for contributions categorized by repository name. Note that this requires the user to use conventional commit messages in their pull request titles. | [Link](../notebooks/plots/plot_conventional_commit_breakdown.ipynb) |
+| `prfiesta.analysis.plot.plot_reactions` | Produces a plot showing the distribution of [GitHub Reactions](https://docs.github.com/en/rest/reactions?apiVersion=2022-11-28) | [Link](../notebooks/plots/plot_reactions.ipynb) |
+
+
+All prfiesta plots have the same signature:
+
+```python
+def plot_NAME(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes]:
+    ...
+```
+
+*Where `NAME` is the placeholder for the actual plot name.*
+
+- The plotting functions always take a `pd.DataFrame` as the first argument. This DataFrame should originate from the prfiesta collection process.
+- The plotting functions always return something which can be displayed in a Jupyter Notebook
+- The plotting functions always take `**kwargs` which can be used to further customize the output
+    - The exact specifics are up to each plotting function; however, in general the following common *optional* options should exist:
+
+| Option | Type | Description |
+| ---------------- | --------------- | --------------- |
+| `ax` | `Optional[matplotlib.axes.Axes]` | The `matplotlib` axis to plot into. This allows users to add the plot into a `plt.subplots`. If omitted, the plot will just plot into the default location |
+| `palette` | `Optional[str]` | A color palette to apply to the plot (e.g. a [seaborn color palette](https://seaborn.pydata.org/tutorial/color_palettes.html)) |
+| `title` | `Optional[str]` | The title of the plot |
+| `hue` | `Optional[str]` | Used to distinguish different categories. This is relevant to [seaborn backed plots](https://seaborn.pydata.org/tutorial/color_palettes.html?highlight=hue#vary-hue-to-distinguish-categories) |
+
+
+## Building a Custom Plot
+
+You can build your own plot by creating a function that follows this signature:
+
+```python
+from typing import Union
+import pandas as pd
+import matplotlib.pyplot as plt
+
+def plot_my_cool_plot(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes, pd.DataFrame]:
+    pass
+```
+
+All prfiesta plots should accept the same `kwargs` listed in the table above; however, each plot implements them on a best-effort basis.
+
+**If you build a plot which you think will be useful for others then feel free to contribute it to this project 🚀**
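
As an illustration of the signature and the optional `ax` and `title` options described in `docs/analysis.md`, here is a minimal sketch of a custom plot. It is not part of this commit: the function name `plot_pull_requests_per_repository` and the `repository_name` column are assumptions made for the example, and `palette` and `hue` are left out because the sketch uses plain `matplotlib` rather than seaborn.

```python
from typing import Union

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside Jupyter

import matplotlib.pyplot as plt
import pandas as pd


def plot_pull_requests_per_repository(data: pd.DataFrame, **kwargs) -> Union[plt.Figure, plt.Axes]:
    """Hypothetical custom plot: number of pull requests per repository."""
    # Optional options, read on a best-effort basis.
    ax = kwargs.get("ax")
    title = kwargs.get("title", "Pull Requests per Repository")

    # Fall back to a fresh figure when the caller does not supply an axis.
    if ax is None:
        _, ax = plt.subplots()

    # 'repository_name' is an assumed column name, not confirmed by the source.
    counts = data["repository_name"].value_counts().sort_index()

    ax.bar(counts.index.tolist(), counts.to_numpy())
    ax.set_title(title)
    ax.set_xlabel("repository")
    ax.set_ylabel("pull requests")
    return ax
```

Passing `ax=axes[0]` from a `plt.subplots(1, 2)` call would place this plot in the first panel, which is the use case the `ax` option is documented for.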

makefile

Lines changed: 4 additions & 0 deletions
@@ -28,6 +28,10 @@ precommit_install:
 precommit_run:
 	pre-commit run --all-files
 
+validate_notebooks:
+	poetry run bash ./notebooks/scripts/run_all.sh './notebooks/plots/*.ipynb' 'notebooks/plots'
+	poetry run bash ./notebooks/scripts/run_all.sh './notebooks/views/*.ipynb' 'notebooks/views'
+
 clean:
 	rm ./coverage.xml
 	rm -rf ./htmlcov
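
The contents of `notebooks/scripts/run_all.sh` are not shown in this diff. The commit message mentions papermill, so the following is only a rough Python sketch of what a "run every notebook matching a glob, from a given working directory" step might look like. The helper names are hypothetical, and executing each notebook in place with `papermill <input> <output> --cwd <dir>` is an assumption, not the script's confirmed behavior.

```python
import glob
import subprocess
from typing import Iterable, List


def find_notebooks(pattern: str) -> List[str]:
    """Return the notebooks matching a glob pattern in a stable order."""
    return sorted(glob.glob(pattern))


def build_commands(notebooks: Iterable[str], cwd: str) -> List[List[str]]:
    """Build one papermill invocation per notebook (assumed CLI usage)."""
    # Each notebook is executed and written back to the same path.
    return [["papermill", nb, nb, "--cwd", cwd] for nb in sorted(notebooks)]


def validate_notebooks(pattern: str, cwd: str) -> None:
    """Execute every matching notebook; the first failure raises an error."""
    for command in build_commands(find_notebooks(pattern), cwd):
        subprocess.run(command, check=True)
```

Under these assumptions, `validate_notebooks('./notebooks/plots/*.ipynb', 'notebooks/plots')` would mirror the first line of the `validate_notebooks` make target.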

notebooks/plots/plot_author_association.ipynb

Lines changed: 129 additions & 0 deletions
Large diffs are not rendered by default.

notebooks/plots/plot_conventional_commit_breakdown.ipynb

Lines changed: 132 additions & 0 deletions
Large diffs are not rendered by default.

notebooks/plots/plot_overall_contribution_timeline.ipynb

Lines changed: 132 additions & 0 deletions
Large diffs are not rendered by default.

notebooks/plots/plot_reactions.ipynb

Lines changed: 132 additions & 0 deletions
Large diffs are not rendered by default.

notebooks/plots/plot_state_distribution.ipynb

Lines changed: 132 additions & 0 deletions
Large diffs are not rendered by default.
