Add all assignments (excluding test data) to the repo
yoavartzi committed Aug 18, 2024
1 parent b3d5688 commit c944262
Showing 134 changed files with 983,807 additions and 0 deletions.
48 changes: 48 additions & 0 deletions assignments/README.md
@@ -0,0 +1,48 @@
# Setting up assignments & Checklist

- The instruction & assignment .pdf files, the starter code, and an example solution are in the `a{1, 2, 3, 4}` subfolders

## Code

### How to create the assignment in GitHub Classroom, the assignment template repo, and the autograding pipeline?

You can refer to the file on [Setting up autograding in GitHub Classroom](https://github.com/lil-lab/cs5740-assignments/blob/master/scripts/github_autograding.md).

### How to set up the leaderboard?

You can refer to the file on [Setting a cronjob for the leaderboard](https://github.com/lil-lab/cs5740-assignments/blob/master/leaderboard/how_to_automatize_leaderboard_updates.md).

## Grading

- Create the assignment on Gradescope and Canvas (Yoav will take care of it for now)
- Follow the Canvas link to Gradescope to create the assignment

## Assignment deployment checklist

- [ ] Instruction .pdf ready & reviewed
  - [ ] Make sure a milestone is included
- [ ] Report template ready & reviewed
- [ ] Code
  - [ ] Go through the starter code template and solutions & update as needed
  - [ ] Ensure the code is minimal, typing is included, and the docstrings contain the needed information & are clear
  - [ ] Run through every experiment
  - [ ] Autograding: add/modify tests
- [ ] Assignment on GitHub Classroom
  - [ ] Create the assignment according to the instructions
  - [ ] Test the assignment submission pipeline (with a dummy submission from the TAs' accounts)
- [ ] Leaderboard
  - [ ] Create the pipeline in a private leaderboard repo according to the instructions
  - [ ] Test the pipeline in the private leaderboard repo with dummy submissions
  - [ ] The leaderboard script may need to be updated if there are new error cases from the students
  - [ ] The last refresh of the leaderboard may be a few hours earlier (e.g., ~2h) than the actual deadline, so that students have time to get their test results and update their report
  - [ ] After the deadline: make sure to stop the automatic cronjob
- [ ] Rubric
  - [ ] Review & update the rubric
  - [ ] Add the rubric to Gradescope

## Grading sessions

- [ ] Sync with the graders to set a grading session time (better to grade during the session if possible). If not everything can be graded during the session, set a milestone to ensure progress
- [ ] For the grading session: send the instructions, report template, and grading rubric to the graders
  - [ ] Go through the grading rubric to make sure the graders are on the same page
  - [ ] Go through a few assignments with the graders to align the grading criteria
9 changes: 9 additions & 0 deletions assignments/a1/README.md
@@ -0,0 +1,9 @@
This README is about the 2024 version.

- Assignment Overleaf: https://www.overleaf.com/project/66a15468103bd6ea5fe53a7e
- Report template Overleaf: https://www.overleaf.com/project/65ac35d0c96cc0fd13503bee
- Starter repo: https://github.com/cornell-cs5740-sp24/Assignment-1

## Other

- The `assignment_checklist_example.md` file provides some possible items to check (as an example only, non-exhaustive)
Binary file added assignments/a1/a1-assignment-doc.zip
Binary file not shown.
Binary file added assignments/a1/a1-report-template.zip
Binary file not shown.
42 changes: 42 additions & 0 deletions assignments/a1/assignment_checklist_example.md
@@ -0,0 +1,42 @@
# How to check an assignment before it's sent to the students

Updated: 2024, with assignment 1

## Assignment 1

Code part.

### Assignment creation

- Needs to use a template repo from the same organization (not a public repo)

### Running through the assignment

- [ ] accept (via the invitation link) and clone the assignment from GitHub Classroom
- [ ] set up the environment
  - [ ] create a virtual env (e.g. conda) with the required Python version (3.10.x for now)
  - [ ] install the requirements with pip's `--no-cache-dir` flag
- [ ] write/copy solutions to the repo

#### Offline

- [ ] run all pytests (cf. README) and make sure they pass locally
- [ ] check that the models train correctly
- [ ] check that the evaluations (dev/test) are correct

#### Online

- [ ] commit and push the completed assignment, and make sure it passes all the autograding tests (cf. Actions for details)
- [ ] check that we get all the points; fix any issues

#### Leaderboard

- [ ] install the GitHub CLI, authenticate, install the Classroom extension, and clone the student repos (e.g. `gh classroom clone student-repos -a xxxxxx`)
- [ ] install PyGithub (`pip install PyGithub`)
- [ ] create a `leaderboards` repo with the corresponding assignment subfolder + placeholder blank CSVs
- [ ] run the corresponding `leaderboard/*.py` scripts and make sure that they update the CSVs correctly (a minimal sketch of this kind of update follows this list)
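
For illustration only, here is a minimal sketch of the kind of CSV update the `leaderboard/*.py` scripts perform. All paths, file names, and the accuracy-based scoring below are hypothetical assumptions, not the actual pipeline:

```
# Hypothetical sketch: scan cloned student repos, score their test predictions
# against a private labels file, and rewrite a leaderboard CSV. Paths, file
# names, and the metric are assumptions for illustration only.
import csv
from pathlib import Path

CLONED_REPOS = Path("assignment-1-submissions")      # from `gh classroom clone` (assumed)
LABELS_FILE = Path("private/sst2_test_labels.csv")   # hypothetical private labels file
LEADERBOARD_CSV = Path("leaderboards/a1/sst2.csv")   # hypothetical output location


def read_labels(path: Path) -> list[str]:
    """Read one label per row from a single-column CSV."""
    with path.open() as f:
        return [row[0] for row in csv.reader(f) if row]


def score_repo(repo: Path, gold: list[str]) -> float | None:
    """Return the accuracy of a repo's predictions, or None if missing/malformed."""
    pred_file = repo / "results" / "perceptron_sst2_test_predictions.csv"
    if not pred_file.exists():
        return None
    preds = read_labels(pred_file)
    if len(preds) != len(gold):
        return None  # a real script would log this as a new error case for the students
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


def update_leaderboard() -> None:
    gold = read_labels(LABELS_FILE)
    scored = [(repo.name, score_repo(repo, gold))
              for repo in sorted(CLONED_REPOS.iterdir()) if repo.is_dir()]
    rows = [(name, f"{acc:.4f}") for name, acc in scored if acc is not None]
    rows.sort(key=lambda r: float(r[1]), reverse=True)
    LEADERBOARD_CSV.parent.mkdir(parents=True, exist_ok=True)
    with LEADERBOARD_CSV.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["repo", "accuracy"])
        writer.writerows(rows)


if __name__ == "__main__":
    update_leaderboard()
```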

### (Unofficial, just self-reminder) Writeup

- [ ] review the assignment document and check for any errors, typos, inconsistencies, and unclear/ambiguous formulations
- [ ] check the assignment submission template
75 changes: 75 additions & 0 deletions assignments/a1/assignment_rubric.md
@@ -0,0 +1,75 @@
Assignment 1 Rubric
===================

Perceptron Features (16pt)
--------------------------
* -1 per
  * Vague description of how a perceptron feature is computed.
  * No mention at all of how or whether word frequency is used
  * No description of the bag-of-words features is provided.
* -2 per
  * No mention at all of how unknown words are processed
* -3 per (cap of -9)
  * Each dataset should have 3 feature sets in addition to bag of words. Remove three points for each absent feature set (a sketch of one possible feature set follows this list).
  * What doesn’t count as features:
    * Bag-of-words unigrams
    * Specifying presence/absence or count for n-grams
    * Any n-gram with n > 1 counts as one feature
    * Most preprocessing (removal of stopwords/punctuation/URLs/emails, etc.) other than lemmatization/stemming
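
For calibration only, a hedged sketch of one feature set that plausibly counts beyond plain bag-of-words under this rubric (stemmed unigrams, since the preprocessing exclusion above carves out lemmatization/stemming). NLTK and the function signature are assumptions for illustration, not the required student API:

```
# Illustrative only: a "stemmed unigrams" feature set. NLTK's PorterStemmer and
# this signature are assumptions, not the students' required implementation.
from collections import Counter

from nltk.stem import PorterStemmer

_stemmer = PorterStemmer()


def stemmed_unigram_features(tokens: list[str]) -> Counter:
    """Map a tokenized example to counts of stemmed, lower-cased unigrams."""
    return Counter(_stemmer.stem(tok.lower()) for tok in tokens)


# Example: both inflections collapse to the same feature.
print(stemmed_unigram_features(["Running", "runs", "fast"]))
# Counter({'run': 2, 'fast': 1})
```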

Experimental setup (8pt)
------------------------
### Perceptron hyperparameters (2pt)
* -1 per missing or nonsense blank

### MLP implementation (4pt)
* -1 per unclear description of an MLP layer (cannot implement in PyTorch without guesswork; a calibration sketch follows this list)
* -1.5 pt if no nonlinearity
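
For calibration, a minimal sketch of the level of detail a description should support: layer sizes, layer types, and the nonlinearity should all be recoverable. The dimensions and architecture below are hypothetical, not the reference solution:

```
# Hypothetical example only: a report description should let a grader
# reconstruct something at this level of specificity.
import torch
import torch.nn as nn

input_dim, hidden_dim, num_classes = 10_000, 256, 2  # assumed values for illustration


class MLP(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),   # bag-of-words vector -> hidden layer
            nn.ReLU(),                          # nonlinearity (its absence costs 1.5pt above)
            nn.Linear(hidden_dim, num_classes)  # hidden layer -> class logits
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


logits = MLP()(torch.zeros(4, input_dim))  # batch of 4 examples
print(logits.shape)  # torch.Size([4, 2])
```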

### MLP Hyperparameters (2pt)
* -0.5 per missing or nonsense blank
* -0.5 if they do not use the validation set for stopping


SST and Newsgroup Results and Analyses (14pt each)
--------------------------------------------------
### Quantitative results (5pt)
* -1 per
  * Missing or nonsensical ablations for features described in Section 1
  * Missing or nonsensical ablation experiments for MLP
  * Ablations are better than the full model on dev
  * Unclear what the ablations represent
  * Missing test results, or reported test results that are not the final ones (compared to the leaderboard)
* +2 per
  * Places in the top 3 of the leaderboard for any experiment (can be used once)
  * Awarded once across all the leaderboards (e.g., for A1, we’ll give +2 to 6 people: the top 3 of each leaderboard)

### Training loss and validation accuracy plots (3pt)
* -0.5 per
  * Plots are hard to read (e.g., font too small, axes not clearly labeled; a plotting sketch follows this list)
* -1.5 per
  * Missing or nonsensical plot
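
A minimal plotting sketch that satisfies the readability criteria above (labeled axes, legible font sizes). The data arrays and file name are placeholders, not real results:

```
# Illustrative matplotlib sketch: labeled axes and readable font sizes.
# The numbers below are placeholders, not real results.
import matplotlib.pyplot as plt

epochs = list(range(1, 11))
train_loss = [0.9, 0.7, 0.55, 0.45, 0.38, 0.33, 0.30, 0.28, 0.27, 0.26]        # placeholder
val_accuracy = [0.62, 0.68, 0.72, 0.75, 0.77, 0.78, 0.785, 0.79, 0.79, 0.788]  # placeholder

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.plot(epochs, train_loss, marker="o")
ax1.set_xlabel("Epoch", fontsize=12)
ax1.set_ylabel("Training loss", fontsize=12)
ax1.set_title("Training loss", fontsize=13)

ax2.plot(epochs, val_accuracy, marker="o", color="tab:orange")
ax2.set_xlabel("Epoch", fontsize=12)
ax2.set_ylabel("Validation accuracy", fontsize=12)
ax2.set_title("Validation accuracy", fontsize=13)

fig.tight_layout()
fig.savefig("training_curves.png", dpi=200)
```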

### Qualitative Error Analysis (6pt)
* -1 per detail
  * Unclear description of an error class
  * No example shown
* -1 per section
  * No or poor description of error statistics (this includes not reporting statistics for both models when the model type is both)
  * Error category missing for a model type
* -2 per detail
  * If fewer than three error classes are given, take two points off for each missing or invalid class (already added to the rubric)

Batching Benchmarking (6pt)
---------------------------
* -0.5 per
  * An entry is missing or nonsensical in the batching benchmark experiment (a timing sketch follows this list)
* -0.25 if missing units (symbolic)
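
For illustration only, a hedged sketch of how one benchmark entry (e.g., wall-clock time per batch size, with units) could be measured. The placeholder "model" and batch sizes are assumptions, not the assignment's actual benchmark:

```
# Hypothetical sketch of timing one batching-benchmark entry. The "model" here
# is a placeholder matrix multiply; the real benchmark uses the assignment's models.
import time

import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((4096, 512)).astype(np.float32)   # placeholder inputs
weights = rng.standard_normal((512, 2)).astype(np.float32)   # placeholder "model"

for batch_size in (1, 32, 256):                               # hypothetical batch sizes
    start = time.perf_counter()
    for i in range(0, len(data), batch_size):
        _ = data[i:i + batch_size] @ weights                  # one batched forward pass
    elapsed = time.perf_counter() - start
    # Always report units (here: seconds), per the -0.25 deduction above.
    print(f"batch_size={batch_size}: {elapsed:.3f} s")
```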

Autograding (10pt)
------------------
* 5 unit tests, each 2pt. The score from GitHub Classroom can be used directly (an illustrative test sketch follows)
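
An illustrative pytest sketch with the same structure as the autograding tests, runnable with `pytest`. It is self-contained and does not reflect the actual test contents, which live in `tests/` and import the students' code:

```
# Illustrative only: the real tests live in tests/ and import the students' code.
# This self-contained example just shows the pytest structure of one 2pt unit test.


def bag_of_words(tokens: list[str]) -> dict[str, int]:
    """Stand-in helper defined here so the example runs on its own."""
    counts: dict[str, int] = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    return counts


def test_bag_of_words_counts() -> None:
    features = bag_of_words(["the", "movie", "the"])
    assert features == {"the": 2, "movie": 1}


def test_bag_of_words_empty_input() -> None:
    assert bag_of_words([]) == {}
```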

Performance grading (32pt)
--------------------------
* Following the equation in the assignment PDF
29 changes: 29 additions & 0 deletions assignments/a1/starter-repo/.gitignore
@@ -0,0 +1,29 @@
.hypothesis

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
41 changes: 41 additions & 0 deletions assignments/a1/starter-repo/README.md
@@ -0,0 +1,41 @@
# Assignment 1

## Example commands

### Environment

It's highly recommended to use a virtual environment (e.g. conda, venv) for this assignment.

Example of virtual environment creation using conda:
```
conda create -n env_name python=3.10
conda activate env_name
python -m pip install -r requirements.txt
```

### Train and predict commands

Example commands (subject to change, just for inspiration):
```
python perceptron.py -d newsgroups -f feature_name
python perceptron.py -d sst2 -f feature_name
python multilayer_perceptron.py -d newsgroups -f feature_name
```

### Commands to run unittests

Ensure that your code passes the unittests before submitting it.
The commands can be run from the root directory of the project.
```
pytest tests/test_perceptron.py
pytest tests/test_multilayer_perceptron.py
```

### Submission

Ensure that the names of the submission files (in the `results/` subfolder) are as follows (a minimal writer sketch is shown after the list):

- `perceptron_newsgroups_test_predictions.csv`
- `mlp_newsgroups_test_predictions.csv`
- `perceptron_sst2_test_predictions.csv`
- `mlp_sst2_test_predictions.csv`
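
As a sanity check, a minimal sketch of writing predictions to one of these file names. The single-column format and the dummy `predictions` list below are assumptions; follow the format specified in the assignment PDF:

```
# Minimal sketch: write test predictions to the expected file name in results/.
# The single-column format and the dummy predictions are assumptions; use the
# format required by the assignment PDF.
import csv
from pathlib import Path

predictions = ["positive", "negative", "positive"]  # placeholder model outputs

out_dir = Path("results")
out_dir.mkdir(exist_ok=True)
with (out_dir / "perceptron_sst2_test_predictions.csv").open("w", newline="") as f:
    writer = csv.writer(f)
    for label in predictions:
        writer.writerow([label])
```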