Commit fa7d9c2

Repository generated with a template
0 parents  commit fa7d9c2

25 files changed, with 536 additions and 0 deletions

.ci/flake8.sh

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+set -ex
+pip install poetry
+poetry install --only lint
+poetry run flake8 .
+poetry run nbqa flake8 notebooks --nbqa-shell

.ci/isort.sh

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+set -ex
+pip install poetry
+poetry install --only lint
+poetry run isort . --check-only
+poetry run nbqa isort notebooks --check-only

.ci/stripped_notebooks_check.sh

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+set -ex
+TEMP=$(mktemp --directory)
+.hooks/strip_notebooks.py --all --no-index --output "$TEMP"
+diff -rq ./stripped "$TEMP" --exclude .gitkeep
+rm -rf "$TEMP"
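
The check above regenerates the stripped notebooks into a temporary directory and fails if that tree differs from the committed `stripped` directory. As a rough illustration of what `diff -rq ... --exclude .gitkeep` reports, here is a minimal Python sketch using the standard-library `filecmp` module. The function name `tree_differences` is hypothetical and not part of the repository; the CI script itself relies on `diff`.

```python
import filecmp
from pathlib import Path


def tree_differences(left: Path, right: Path, exclude=(".gitkeep",)) -> list[str]:
    """List differences between two directory trees, skipping excluded names."""
    found: list[str] = []

    def visit(cmp: filecmp.dircmp) -> None:
        # Entries present on one side only.
        found.extend(f"Only in {cmp.left}: {name}" for name in cmp.left_only)
        found.extend(f"Only in {cmp.right}: {name}" for name in cmp.right_only)
        # Same name but different kinds (for example a file versus a directory).
        found.extend(f"Type mismatch: {Path(cmp.left) / name}" for name in cmp.common_funny)
        # dircmp compares files shallowly (by os.stat signature), so the
        # contents of the common files are compared explicitly here.
        _, mismatched, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        found.extend(f"Files differ: {Path(cmp.left) / name}" for name in mismatched + errors)
        # Recurse into directories that exist on both sides.
        for sub in cmp.subdirs.values():
            visit(sub)

    visit(filecmp.dircmp(left, right, ignore=list(exclude)))
    return found
```

An empty result corresponds to `diff -rq` exiting with status 0, which is the case in which the CI step passes.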

.ci/test.sh

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -ex
+pip install poetry
+poetry install
+poetry run pytest

.flake8

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+[flake8]
+max-line-length = 120
+# W503: we prefer line breaks _before_ operators (as changed in PEP8 in 2016).
+# E203: whitespace before ':'; Black's slice formatting is right here: https://github.com/psf/black/issues/315
+ignore = W503,E203
+show-source = True
+statistics = True
+exclude =
+    .git
+    .vscode
+    .idea
+    __pycache__
+    .ipynb_checkpoints
+    stripped
+    venv
+    .venv
+    dist
+    .hooks
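
To make the two ignored codes concrete, the following short Python sketch shows code that W503 and E203 would otherwise flag. The function names are made up for illustration and do not appear in the repository.

```python
def total_price(net: float, tax: float, shipping: float) -> float:
    # W503 would flag the line breaks *before* the binary operators below.
    # PEP 8 has recommended this style since 2016, so the warning is ignored.
    return (
        net
        + tax
        + shipping
    )


def middle(items: list, offset: int) -> list:
    # E203 would flag the space before ':' in this slice. Black formats
    # slices with complex bounds this way, so the warning is ignored as well.
    return items[offset + 1 : len(items) - offset]
```

Both functions pass flake8 under the configuration above, whereas the default settings would be expected to report E203 for the slice.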

.gitattributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+*.ipynb -diff

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Screenshots**
+If applicable, add screenshots to help explain your problem.
+
+**Configuration (please complete the following information):**
+- OS: [e.g. Windows]
+- Version [e.g. 1.0.1 or commit SHA]
+- [any other (e.g. CUDA version) if applicable]
+
+**Additional context**
+Add any other context about the problem here.
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: enhancement
+assignees: ''
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Are you willing to create a pull request?**
+State if you can create a pull request that solves the problem.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.

.github/ISSUE_TEMPLATE/question.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+---
+name: Question
+about: Ask a question about this project.
+title: ''
+labels: question
+assignees: ''
+
+---
+
+A clear and concise description of the problem and/or the expected behavior.
+
+**Screenshots**
+If applicable, add screenshots to help explain your question.

.github/workflows/codeql.yml

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+name: "CodeQL"
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main" ]
+  schedule:
+    - cron: '0 6 1 * *'
+
+# --- AUTOMATICALLY GENERATED ---
+
+jobs:
+  analyze:
+    name: Analyze
+    # Runner size impacts CodeQL analysis time. To learn more, please see:
+    # - https://gh.io/recommended-hardware-resources-for-running-codeql
+    # - https://gh.io/supported-runners-and-hardware-resources
+    # - https://gh.io/using-larger-runners
+    # Consider using larger runners for possible analysis time improvements.
+    runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
+    timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
+    permissions:
+      # required for all workflows
+      security-events: write
+
+      # only required for workflows in private repositories
+      actions: read
+      contents: read
+
+    strategy:
+      fail-fast: false
+      matrix:
+        language: [ 'python' ]
+        # CodeQL supports [ 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift' ]
+        # Use only 'java-kotlin' to analyze code written in Java, Kotlin or both
+        # Use only 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
+        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
+
+    steps:
+    - name: Checkout repository
+      uses: actions/checkout@v4
+
+    # Initializes the CodeQL tools for scanning.
+    - name: Initialize CodeQL
+      uses: github/codeql-action/init@v3
+      with:
+        languages: ${{ matrix.language }}
+        # If you wish to specify custom queries, you can do so here or in a config file.
+        # By default, queries listed here will override any specified in a config file.
+        # Prefix the list here with "+" to use these queries and those in the config file.
+
+        # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
+        # queries: security-extended,security-and-quality
+
+
+    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
+    # If this step fails, then you should remove it and run the build manually (see below)
+    - name: Autobuild
+      uses: github/codeql-action/autobuild@v3
+
+    # ℹ️ Command-line programs to run using the OS shell.
+    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
+
+    # If the Autobuild fails above, remove it and uncomment the following three lines.
+    # Modify them (or add more) to build your code; refer to the EXAMPLE below for guidance.
+
+    # - run: |
+    #     echo "Run, Build Application using script"
+    #     ./location_of_script_within_repo/buildscript.sh
+
+    - name: Perform CodeQL Analysis
+      uses: github/codeql-action/analyze@v3
+      with:
+        category: "/language:${{matrix.language}}"
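
The `schedule` trigger in this workflow uses the cron expression `0 6 1 * *`, which reads as minute 0, hour 6, day 1 of every month. GitHub Actions evaluates schedules in UTC. The sketch below works out the next matching time for that one expression only. The helper `next_monthly_run` is hypothetical and is not a general cron parser.

```python
from datetime import datetime, timezone


def next_monthly_run(now: datetime) -> datetime:
    """Return the next 06:00 on day 1 of a month that is strictly after `now`."""
    # Candidate in the current month: day 1 at 06:00:00.
    candidate = now.replace(day=1, hour=6, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Already passed this month, so roll over to the first of next month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate


# Usage: the changelog date in this commit, taken at noon UTC.
print(next_monthly_run(datetime(2024, 4, 2, 12, 0, tzinfo=timezone.utc)))
```

In practice GitHub may start scheduled workflows later than the nominal time, so the value computed here is a lower bound rather than an exact start time.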

.github/workflows/tests.yml

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+name: Tests
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main" ]
+
+permissions:
+  contents: read
+
+jobs:
+  stripped_notebooks_check:
+    name: Stripped notebooks check
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+      - run: .ci/stripped_notebooks_check.sh
+
+  isort:
+    name: isort
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+      - run: .ci/isort.sh
+
+  flake8:
+    name: Flake8
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+      - run: .ci/flake8.sh
+
+  test:
+    name: Tests
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.9"
+      - run: .ci/test.sh

.gitignore

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+**/*.pyc
+**/.ipynb_checkpoints
+**/.pytest_cache
+**/__pycache__
+.idea
+.python-version
+.vscode
+venv
+.venv

.hooks/isort.sh

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+#!/usr/bin/env sh
+set -ex
+# isort "fails" if there are any changes made and thus the whole hook fails.
+# This is a deliberate decision made by pre-commit creators, see below:
+# https://github.com/pre-commit/pre-commit/issues/2240#issuecomment-1034028917
+poetry run isort .
+poetry run nbqa isort notebooks

.hooks/strip_notebooks.py

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+#!/usr/bin/env python3
+import argparse
+import json
+import logging
+import shutil
+import subprocess
+from pathlib import Path
+from typing import Optional
+
+LOG = logging.getLogger('strip_notebooks')
+
+
+def get_paths_from_command(*popenargs, **kwargs) -> list[Path]:
+    output = subprocess.check_output(*popenargs, **kwargs)
+    return list(map(Path, output.decode().splitlines(keepends=False)))
+
+
+def process(git: Path, ipynb_path: Path, stripped_path: Path, no_index: bool):
+
+    if not ipynb_path.exists():
+        if stripped_path.exists() and not no_index:
+            LOG.info('Removing %s', stripped_path)
+            subprocess.check_call([git, 'rm', stripped_path])
+        return
+
+    LOG.info('Stripping %s to %s', ipynb_path, stripped_path)
+    with ipynb_path.open(encoding='utf-8') as f_in:
+        data = json.load(f_in)
+    stripped_path.parent.mkdir(parents=True, exist_ok=True)
+    with stripped_path.open('w', encoding='utf-8') as f_out:
+        for cell in data['cells']:
+            cell_type = cell['cell_type']
+            source = cell['source']
+            if cell_type in ('markdown', 'raw'):
+                print('"""', file=f_out)
+                for line in source:
+                    print(line.rstrip('\n'), file=f_out)
+                print('"""', file=f_out)
+            elif cell_type == 'code':
+                for line in source:
+                    print(line.rstrip('\n'), file=f_out)
+                print('# ---', file=f_out)
+            else:
+                raise ValueError(f'Unknown cell type: {cell_type}')
+    if not no_index:
+        LOG.info('Adding %s', stripped_path)
+        subprocess.check_call([git, 'add', stripped_path])
+
+
+def main(
+    all_: bool,
+    no_index: bool,
+    output: Optional[Path],
+    paths: list[Path],
+):
+    logging.basicConfig(level=logging.INFO)
+
+    git = Path(shutil.which('git'))
+    repo_path = get_paths_from_command([git, 'rev-parse', '--show-toplevel'])[0]
+    output = output or (repo_path / 'stripped')
+
+    if all_:
+        paths = get_paths_from_command([git, 'ls-files'])
+    elif not paths:
+        paths = get_paths_from_command([git, 'diff', '--cached', '--name-only', '--no-renames'])
+
+    for path in paths:
+        if path.suffix != '.ipynb':
+            continue
+        stripped_path = (
+            (output / (path.absolute().relative_to(repo_path)))
+            .with_stem(path.stem + '_stripped')
+            .with_suffix('.py')
+        )
+        process(git=git, ipynb_path=path, stripped_path=stripped_path, no_index=no_index)
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--all', action='store_true', dest='all_')
+    parser.add_argument('--no-index', action='store_true', help='do not modify git index')
+    parser.add_argument('--output', type=Path)
+    parser.add_argument('paths', type=Path, nargs='*')
+    main(**vars(parser.parse_args()))
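
To illustrate what the hook above produces, here is a minimal sketch that applies the same cell-stripping rules to an in-memory notebook and derives the output path in the same way. The helpers `strip_cells` and `stripped_path_for` and the sample data are hypothetical and not part of the repository. Like the script, the sketch assumes each cell's `source` is a list of lines (the notebook format also allows a single string) and needs Python 3.9+ for `Path.with_stem`.

```python
import io
from pathlib import Path


def strip_cells(notebook: dict) -> str:
    """Render notebook cells as plain Python text, dropping outputs and metadata."""
    out = io.StringIO()
    for cell in notebook["cells"]:
        cell_type = cell["cell_type"]
        source = cell["source"]
        if cell_type in ("markdown", "raw"):
            # Text cells are wrapped in a triple-quoted block.
            print('"""', file=out)
            for line in source:
                print(line.rstrip("\n"), file=out)
            print('"""', file=out)
        elif cell_type == "code":
            # Code cells are copied verbatim, followed by a separator comment.
            for line in source:
                print(line.rstrip("\n"), file=out)
            print("# ---", file=out)
        else:
            raise ValueError(f"Unknown cell type: {cell_type}")
    return out.getvalue()


def stripped_path_for(path: Path, repo_root: Path, output: Path) -> Path:
    """Mirror the notebook's repo-relative path under `output` as <stem>_stripped.py."""
    relative = path.absolute().relative_to(repo_root)
    return (output / relative).with_stem(path.stem + "_stripped").with_suffix(".py")


example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some text"]},
        {"cell_type": "code", "source": ["import os\n", "print(os.name)"], "outputs": ["ignored"]},
    ]
}
print(strip_cells(example))
```

Because only `cell_type` and `source` are read, cell outputs and execution counts never reach the stripped file, which is what keeps the committed `stripped` tree stable when a notebook is merely re-run.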

.pre-commit-config.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+repos:
+
+  - repo: local
+    hooks:
+      - id: isort
+        name: isort
+        entry: .hooks/isort.sh
+        pass_filenames: false
+        language: script
+
+      - id: strip-notebooks
+        name: Strip notebooks
+        entry: .hooks/strip_notebooks.py
+        pass_filenames: false
+        language: script

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# Changelog
+
+## 0.1.0 April 2, 2024
+* Project created.

CONTRIBUTING.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Contributing to MIM NLP
+
+All contributions and ideas are welcome.
+Feel free to report any [issue](https://github.com/mim-solutions/mim_nlp/issues)
+or suggest a [pull request](https://github.com/mim-solutions/mim_nlp/pulls).

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2024 MIM Solutions sp. z o.o.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
