Commit 06e6831

Merge main

ArturoAmorQ committed
2 parents 6e04ab0 + 7093c62, commit 06e6831

146 files changed, +1869 -1294 lines changed

.github/workflows/deploy-gh-pages.yml

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ jobs:
 
       - name: Upload jupyter-book artifact for preview in PRs
         if: ${{ github.event_name == 'pull_request' }}
-        uses: actions/upload-artifact@v3
+        uses: actions/upload-artifact@v4
         with:
           name: jupyter-book
           path: |

.github/workflows/jupyter-book-pr-preview.yml

Lines changed: 3 additions & 4 deletions
@@ -19,11 +19,10 @@ jobs:
           sha: ${{ github.event.workflow_run.head_sha }}
           context: 'JupyterBook preview'
 
-      - uses: dawidd6/action-download-artifact@v2
+      - uses: actions/download-artifact@v4
         with:
-          github_token: ${{secrets.GITHUB_TOKEN}}
-          workflow: deploy-gh-pages.yml
-          run_id: ${{ github.event.workflow_run.id }}
+          github-token: ${{secrets.GITHUB_TOKEN}}
+          run-id: ${{ github.event.workflow_run.id }}
           name: jupyter-book
 
       - name: Get pull request number

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -33,4 +33,3 @@ doc/_build
 .idea
 *.code-workspace
 .vscode
-

.jupyter/README.md

Lines changed: 0 additions & 1 deletion
@@ -1,2 +1 @@
 This directory is to setup jupyter on binder
-

.jupyter/jupyter_notebook_config.py

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # To use jupytext in binder
-c.ContentsManager.preferred_jupytext_formats_read = 'py:percent' # noqa
+c.ContentsManager.preferred_jupytext_formats_read = "py:percent" # noqa

.pre-commit-config.yaml

Lines changed: 6 additions & 6 deletions
@@ -5,16 +5,16 @@ repos:
       - id: check-yaml
       - id: end-of-file-fixer
         exclude: notebooks
+        exclude_types: [svg]
       - id: trailing-whitespace
         exclude: notebooks
+        exclude_types: [svg]
   - repo: https://github.com/psf/black
     rev: 23.1.0
     hooks:
       - id: black
-  - repo: https://github.com/pycqa/flake8
-    rev: 4.0.1
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.11.2
     hooks:
-      - id: flake8
-        entry: pflake8
-        additional_dependencies: [pyproject-flake8]
-        types: [file, python]
+      - id: ruff
+        args: ["--fix", "--output-format=full"]

CITATION.cff

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+cff-version: 1.2.0
+message: "If you use this content, please cite it as below."
+authors:
+  - name: "The scikit-learn MOOC developers"
+title: "scikit-learn MOOC"
+version: latest
+doi: https://doi.org/10.5281/zenodo.7220306
+url: "https://github.com/INRIA/scikit-learn-mooc"

README.md

Lines changed: 3 additions & 4 deletions
@@ -1,9 +1,8 @@
 # scikit-learn course
 
-📢 📢 📢 A new session of the [Machine learning in Python with scikit-learn
-MOOC](https://www.fun-mooc.fr/en/courses/machine-learning-python-scikit-learn),
-is available starting on November 8th, 2023 and will remain open on self-paced
-mode. Enroll for the full MOOC experience (quizz solutions, executable
+This is the source code for the [Machine learning in Python with scikit-learn
+MOOC](https://www.fun-mooc.fr/en/courses/machine-learning-python-scikit-learn).
+Enroll for the full MOOC experience (quiz solutions, executable
 notebooks, discussion forum, etc ...) !
 
 The MOOC is free and hosted on the [FUN-MOOC](https://fun-mooc.fr/) platform

build_tools/generate-exercise-from-solution.py

Lines changed: 49 additions & 16 deletions
@@ -5,6 +5,10 @@
 from jupytext.myst import myst_to_notebook
 import jupytext
 
+
+WRITE_YOUR_CODE_COMMENT = "# Write your code here."
+
+
 def replace_simple_text(input_py_str):
     result = input_py_str.replace("📃 Solution for", "📝")
     return result
@@ -19,37 +23,62 @@ def remove_solution(input_py_str):
     before this comment and add "# Write your code here." at the end of the
     cell.
     """
-    nb = jupytext.reads(input_py_str, fmt='py:percent')
+    nb = jupytext.reads(input_py_str, fmt="py:percent")
 
-    cell_tags_list = [c['metadata'].get('tags') for c in nb.cells]
-    is_solution_list = [tags is not None and 'solution' in tags
-                        for tags in cell_tags_list]
+    cell_tags_list = [c["metadata"].get("tags") for c in nb.cells]
+    is_solution_list = [
+        tags is not None and "solution" in tags for tags in cell_tags_list
+    ]
     # Completely remove cells with "solution" tags
-    nb.cells = [cell for cell, is_solution in zip(nb.cells, is_solution_list)
-                if not is_solution]
+    nb.cells = [
+        cell
+        for cell, is_solution in zip(nb.cells, is_solution_list)
+        if not is_solution
+    ]
 
     # Partial cell removal based on "# solution" comment
     marker = "# solution"
-    pattern = re.compile(f"^{marker}.*", flags=re.MULTILINE|re.DOTALL)
+    pattern = re.compile(f"^{marker}.*", flags=re.MULTILINE | re.DOTALL)
 
-    cells_to_modify = [c for c in nb.cells if c["cell_type"] == "code" and
-                       marker in c["source"]]
+    cells_to_modify = [
+        c
+        for c in nb.cells
+        if c["cell_type"] == "code" and marker in c["source"]
+    ]
 
     for c in cells_to_modify:
-        c["source"] = pattern.sub("# Write your code here.", c["source"])
+        c["source"] = pattern.sub(WRITE_YOUR_CODE_COMMENT, c["source"])
+
+    previous_cell_is_write_your_code = False
+    all_cells_before_deduplication = nb.cells
+    nb.cells = []
+    for c in all_cells_before_deduplication:
+        if c["cell_type"] == "code" and c["source"] == WRITE_YOUR_CODE_COMMENT:
+            current_cell_is_write_your_code = True
+        else:
+            current_cell_is_write_your_code = False
+        if (
+            current_cell_is_write_your_code
+            and previous_cell_is_write_your_code
+        ):
+            # Drop duplicated "write your code here" cells.
+            continue
+        nb.cells.append(c)
+        previous_cell_is_write_your_code = current_cell_is_write_your_code
 
     # TODO: we could potentially try to avoid changing the input file jupytext
     # header since this info is rarely useful. Let's keep it simple for now.
-    py_nb_str = jupytext.writes(nb, fmt='py:percent')
+    py_nb_str = jupytext.writes(nb, fmt="py:percent")
     return py_nb_str
 
 
 def write_exercise(solution_path, exercise_path):
+    print(f"Writing exercise to {exercise_path} from solution {solution_path}")
     input_str = solution_path.read_text()
 
     output_str = input_str
     for replace_func in [replace_simple_text, remove_solution]:
-        output_str= replace_func(output_str)
+        output_str = replace_func(output_str)
     exercise_path.write_text(output_str)
 
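The partial-removal step in `remove_solution` can be tried in isolation. The sketch below applies the same regular expression to a plain string rather than to a jupytext notebook cell. The `strip_solution` helper and the sample cell source are hypothetical and exist only for illustration.

```python
import re

WRITE_YOUR_CODE_COMMENT = "# Write your code here."
MARKER = "# solution"

# Same pattern as in remove_solution. With re.MULTILINE, "^" matches at the
# start of any line; with re.DOTALL, ".*" also consumes newlines, so the match
# runs from the marker line to the end of the cell source.
PATTERN = re.compile(f"^{MARKER}.*", flags=re.MULTILINE | re.DOTALL)


def strip_solution(cell_source):
    """Hypothetical helper: replace everything from the marker onwards."""
    return PATTERN.sub(WRITE_YOUR_CODE_COMMENT, cell_source)


# Hypothetical cell source, not taken from the course notebooks.
cell_source = "data = load()\n# solution\nmodel.fit(data)\nmodel.score(data)"
print(strip_solution(cell_source))
```

Everything before the `# solution` line is kept, and a cell without the marker is returned unchanged.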

@@ -59,7 +88,9 @@ def write_all_exercises(python_scripts_folder):
     for solution_path in solution_paths:
         exercise_path = Path(str(solution_path).replace("_sol_", "_ex_"))
         if not exercise_path.exists():
-            print(f"{exercise_path} does not exist")
+            print(
+                f"{exercise_path} does not exist, generating it from solution."
+            )
 
         write_exercise(solution_path, exercise_path)

@@ -70,12 +101,14 @@ def write_all_exercises(python_scripts_folder):
     if path.is_dir():
         write_all_exercises(path)
     else:
-        if '_ex_' not in str(path):
+        if "_ex_" not in str(path):
             raise ValueError(
-                f'Path argument should be an exercise file. Path was {path}')
+                f"Path argument should be an exercise file. Path was {path}"
+            )
         solution_path = Path(str(path).replace("_ex_", "_sol_"))
         if not solution_path.exists():
             raise ValueError(
-                f"{solution_path} does not exist, check argument path {path}")
+                f"{solution_path} does not exist, check argument path {path}"
+            )
 
         write_exercise(solution_path, path)
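The deduplication loop added to `remove_solution` in this file can be sketched without jupytext. The example below assumes plain dictionaries standing in for notebook cells; `drop_consecutive_placeholders` is a hypothetical name, and the logic is a condensed restatement of the loop in the diff, not the committed code.

```python
WRITE_YOUR_CODE_COMMENT = "# Write your code here."


def drop_consecutive_placeholders(cells):
    """Keep only the first cell of each run of placeholder code cells."""
    kept = []
    previous_is_placeholder = False
    for cell in cells:
        current_is_placeholder = (
            cell["cell_type"] == "code"
            and cell["source"] == WRITE_YOUR_CODE_COMMENT
        )
        if current_is_placeholder and previous_is_placeholder:
            # Drop duplicated "write your code here" cells.
            continue
        kept.append(cell)
        previous_is_placeholder = current_is_placeholder
    return kept


# Hypothetical cells: plain dicts stand in for notebook cells.
cells = [
    {"cell_type": "markdown", "source": "Exercise"},
    {"cell_type": "code", "source": WRITE_YOUR_CODE_COMMENT},
    {"cell_type": "code", "source": WRITE_YOUR_CODE_COMMENT},
    {"cell_type": "markdown", "source": "Next question"},
    {"cell_type": "code", "source": WRITE_YOUR_CODE_COMMENT},
]
print(len(drop_consecutive_placeholders(cells)))  # 4
```

Only adjacent placeholder cells are merged. A placeholder that follows a markdown cell is kept, so each question still has one cell to write in.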

build_tools/generate-index.py

Lines changed: 12 additions & 5 deletions
@@ -41,7 +41,7 @@ def get_first_title(path):
     elif path.suffix == ".md":
         md_str = path.read_text()
     else:
-        raise ValueError(f"{filename} is not a .py or a .md file")
+        raise ValueError(f"{path} is not a .py or a .md file")
 
     return get_first_title_from_md_str(md_str)
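One way to read the `filename` to `path` change, assuming `filename` was not defined anywhere in scope (the diff does not show this): the old error branch would raise a `NameError` while building the message, before the intended `ValueError` could be raised. The two `check_suffix_*` functions below are hypothetical reductions of that branch, not the real `get_first_title`.

```python
from pathlib import PurePosixPath


def check_suffix_before(path):
    if path.suffix not in {".py", ".md"}:
        # `filename` is assumed to be undefined, as in the removed line.
        raise ValueError(f"{filename} is not a .py or a .md file")


def check_suffix_after(path):
    if path.suffix not in {".py", ".md"}:
        raise ValueError(f"{path} is not a .py or a .md file")


def raised_exception_name(check, path):
    """Return the name of the exception raised by check(path), if any."""
    try:
        check(path)
    except Exception as exc:
        return type(exc).__name__
    return None


bad_path = PurePosixPath("notes.txt")
print(raised_exception_name(check_suffix_before, bad_path))  # NameError
print(raised_exception_name(check_suffix_after, bad_path))  # ValueError
```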

@@ -96,7 +96,9 @@ def get_single_file_markdown(docname):
     # This is simpler to point to inria.github.io generated HTML otherwise
     # there are quirks (MyST in quizzes not supported, slides not working,
     # etc ...)
-    relative_url = str(target).replace("jupyter-book/", "").replace(".md", ".html")
+    relative_url = (
+        str(target).replace("jupyter-book/", "").replace(".md", ".html")
+    )
     target = f"https://inria.github.io/scikit-learn-mooc/{relative_url}"
 
     return f"[{title}]({target})"
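The URL rewriting in this hunk can be illustrated with a small standalone function. `to_published_url` and the sample path are hypothetical; only the two `replace` calls and the base URL come from the diff.

```python
from pathlib import PurePosixPath

BASE_URL = "https://inria.github.io/scikit-learn-mooc"


def to_published_url(target):
    """Map a jupyter-book source path to its generated HTML page URL."""
    relative_url = (
        str(target).replace("jupyter-book/", "").replace(".md", ".html")
    )
    return f"{BASE_URL}/{relative_url}"


# Hypothetical source path, for illustration only.
print(to_published_url(PurePosixPath("jupyter-book/some_module/some_page.md")))
```

Because `str.replace` substitutes every occurrence, a path that contained `.md` elsewhere in its name would also be rewritten there. The sketch keeps that behavior rather than changing it.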
@@ -140,7 +142,9 @@ def test_get_lesson_markdown():
     documents = json_info["documents"]
     print(
         get_lesson_markdown(
-            documents["predictive_modeling_pipeline/01_tabular_data_exploration_index"]
+            documents[
+                "predictive_modeling_pipeline/01_tabular_data_exploration_index"
+            ]
         )
     )

@@ -156,7 +160,8 @@ def get_module_markdown(module_dict, documents):
     module_title = module_dict["caption"]
     heading = f"# {module_title}"
     content = "\n\n".join(
-        get_lesson_markdown(documents[docname]) for docname in module_dict["items"]
+        get_lesson_markdown(documents[docname])
+        for docname in module_dict["items"]
     )
     return f"{heading}\n\n{content}"

@@ -219,7 +224,9 @@ def get_full_index_ipynb(toc_path):
     md_str = get_full_index_markdown(toc_path)
     nb = jupytext.reads(md_str, format=".md")
 
-    nb = nbformat.v4.new_notebook(cells=[nbformat.v4.new_markdown_cell(md_str)])
+    nb = nbformat.v4.new_notebook(
+        cells=[nbformat.v4.new_markdown_cell(md_str)]
+    )
 
     # nb_content = jupytext.writes(nb, fmt=".ipynb")
     # nb = json.loads(nb_content)
