Add all assignments (excluding test data) to the repo
Showing 134 changed files with 983,807 additions and 0 deletions.
@@ -0,0 +1,48 @@
# Setting up assignments & Checklist

- The instruction & assignment .pdf files, the starter code, and an example solution are in the `a{1, 2, 3, 4}` subfolders

## Code

### How to create the assignment in GitHub Classroom, the assignment template repo, and the autograding pipeline?

You can refer to the file on [Setting up autograding in GitHub Classroom](https://github.com/lil-lab/cs5740-assignments/blob/master/scripts/github_autograding.md).

### How to set the leaderboard?

You can refer to the file on [Setting a cronjob for the leaderboard](https://github.com/lil-lab/cs5740-assignments/blob/master/leaderboard/how_to_automatize_leaderboard_updates.md).
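
For reference, a leaderboard refresh entry in `crontab` might look roughly like the sketch below; the script name and paths are placeholders, and the linked document describes the actual setup.

```
# sketch only: refresh the leaderboard every hour (script name/path are placeholders)
0 * * * * cd /path/to/leaderboard && python update_leaderboard.py >> cron.log 2>&1
```

Remember to remove the entry (e.g. via `crontab -e`) once the deadline has passed.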

## Grading setup

- Create the assignment on Gradescope and Canvas (Yoav will take care of it for now)
- Follow the Canvas link to Gradescope to create the assignment

## Assignment deployment checklist

- [ ] Instruction .pdf ready & reviewed
  - [ ] Make sure a milestone is included
- [ ] Report template ready & reviewed
- [ ] Code
  - [ ] Go through the starter code template and solutions & update as needed
  - [ ] Ensure the code is minimal, typing is included, and the docstrings contain the needed information & are clear
  - [ ] Run through every experiment
  - [ ] Autograding: add/modify tests
- [ ] Assignment on GitHub Classroom
  - [ ] Create the assignment according to the instructions
  - [ ] Test the assignment submission pipeline (with a dummy submission from the TAs' accounts; see the sketch after this checklist)
- [ ] Leaderboard
  - [ ] Create the pipeline in a private leaderboard repo according to the instructions
  - [ ] Test the pipeline in the private leaderboard repo with dummy submissions
  - [ ] The leaderboard script may need to be updated if there are new error cases from the students
  - [ ] The last refresh of the leaderboard may be a few hours earlier (e.g. ~2h) than the actual deadline, so that students have time to get their test results and update their report
  - [ ] After the deadline: make sure to end the automatic cronjob
- [ ] Rubric
  - [ ] Review & update the rubric
  - [ ] Add the rubric to Gradescope
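
A minimal way to run the dummy-submission test from a TA account (a sketch; the repo URL/name are placeholders and the GitHub CLI is assumed to be installed):

```
# after accepting the Classroom invitation link from a TA account
git clone <dummy-assignment-repo-url> && cd <dummy-assignment-repo>
git commit --allow-empty -m "dummy submission to trigger autograding"
git push
gh run list --limit 1   # confirm the autograding workflow was triggered
gh run watch            # follow the run and check that all tests pass
```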

## Grading

- [ ] Sync with the graders to set a grading session time (better to grade during the session if possible). If not everything can be graded during the session, set a milestone to ensure progress
- [ ] For the grading session: send the instructions, report template, and grading rubric to the graders
- [ ] Go through the grading rubric to make sure the graders are on the same page
- [ ] Go through a few assignments with the graders to align the grading criteria
@@ -0,0 +1,9 @@
This README is about the 2024 version.

- Assignment Overleaf: https://www.overleaf.com/project/66a15468103bd6ea5fe53a7e
- Report template Overleaf: https://www.overleaf.com/project/65ac35d0c96cc0fd13503bee
- Starter repo: https://github.com/cornell-cs5740-sp24/Assignment-1

## Other

- The `assignment_checklist_example.md` file provides some possible items to check (just an example and non-exhaustive)
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,42 @@
# How to check an assignment before it's sent to the students

Updated: 2024, with assignment 1

## Assignment 1

Code part.

### Assignment creation

- Needs to use a template repo from the same organization (not a public repo)

### Running through the assignment

- [ ] accept (via the invitation link) and clone the assignment from GitHub Classroom
- [ ] set up the environment (see the sketch below)
  - [ ] create a virtual env (e.g. conda) with the required Python version (3.10.x for now)
  - [ ] install from requirements with pip's `--no-cache-dir` flag
- [ ] write/copy solutions to the repo
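
Concretely, the environment setup might look like the following (a sketch; the environment name is arbitrary):

```
conda create -n cs5740-a1 python=3.10
conda activate cs5740-a1
pip install --no-cache-dir -r requirements.txt
```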

#### Offline

- [ ] run all pytests (cf. README) and make sure they pass locally
- [ ] check that the models train correctly
- [ ] check that the evaluations (dev/test) are correct

#### Online

- [ ] commit and push the completed assignment, and make sure it passes all the autograding tests (cf. Actions for details)
- [ ] check we get all the points, fix any issues

#### Leaderboard

- [ ] install the GitHub CLI, authenticate, install the Classroom extension, and clone the student repos (ex: `gh classroom clone student-repos -a xxxxxx`); see the sketch below
- [ ] install PyGithub (`pip install PyGithub`)
- [ ] create a `leaderboards` repo with the corresponding assignment subfolder + placeholder blank csvs
- [ ] run through the corresponding `leaderboard/*.py` scripts and make sure that they update the csvs correctly
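
The first two items above might look like this in practice (a sketch; the assignment id is a placeholder):

```
gh auth login                               # authenticate the GitHub CLI
gh extension install github/gh-classroom    # install the Classroom extension
gh classroom clone student-repos -a xxxxxx  # clone all student repos for the assignment
pip install PyGithub
```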

### (Unofficial, just a self-reminder) Writeup

- [ ] review the assignment document, check for any errors, typos, inconsistencies, and unclear/ambiguous formulations
- [ ] check the assignment submission template
@@ -0,0 +1,75 @@
Assignment 1 Rubric
===================

Perceptron Features (16pt)
--------------------------
* -1 per
  * Vague description of how a perceptron feature is computed
  * No mention at all of how or whether word frequency is used
  * No description of bag-of-words features is provided
* -2 per
  * No mention at all of how unknown words are processed
* -3 per (cap of -9)
  * Each dataset should have 3 feature sets in addition to bag of words. Remove three points for each absent feature set.
  * What doesn’t count as features:
    * Bag-of-words unigrams
    * Specifying presence/absence or count for n-grams
    * Any n-gram with n > 1 is one feature
    * Most preprocessing (removal of stopwords/punctuation/urls/emails, etc.) other than lemmatization/stemming

Experimental setup (8pt)
------------------------
### Perceptron hyperparameters (2pt)
* -1 per missing or nonsensical blank

### MLP implementation (4pt)
* -1 per unclear description of an MLP layer (cannot implement in PyTorch without guesswork)
* -1.5 if no nonlinearity

### MLP Hyperparameters (2pt)
* -0.5 per missing or nonsensical blank
* -0.5 if they do not use the validation set for stopping

SST and Newsgroup Results and Analyses (14pt each)
--------------------------------------------------
### Quantitative results (5pt)
* -1 per
  * Missing or nonsensical ablations for features described in Section 1
  * Missing or nonsensical ablation experiments for MLP
  * Ablations are better than the full model on dev
  * Unclear what the ablations represent
  * Missing test results, or reporting non-final results (compared to the leaderboard) on the test set
* +2 per
  * Places in the top 3 of the leaderboard for any experiment (can be used once)
  * Once across all the leaderboards [e.g. for A1, we’ll be giving +2 to 6 people, the 3 at the top of each leaderboard]

### Training loss and validation accuracy plots (3pt)
* -0.5 per
  * Plots are hard to read (e.g., font too small, axes not clear)
* -1.5 per
  * Missing or nonsensical plot

### Qualitative Error Analysis (6pt)
* -1 per detail
  * Unclear description of an error class
  * No example shown
* -1 per section
  * No or poor description of error statistics (includes not reporting statistics for both models if the model type is both)
  * Error category missing for a model type
* -2 per detail
  * If fewer than three error classes are given, two points off for each missing or invalid class (already added to the rubric)

Batching Benchmarking (6pt)
---------------------------
* -0.5 per
  * An entry is missing or nonsensical in the batching benchmark experiment
* -0.25 if missing units (symbolic)

Autograding (10pt)
------------------
* 5 unit tests, each 2pt. Can directly use the score from GitHub Classroom.

Performance grading (32pt)
--------------------------
* Follow the equation in the assignment PDF
@@ -0,0 +1,29 @@
.hypothesis

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
@@ -0,0 +1,41 @@
# Assignment 1

## Example commands

### Environment

It's highly recommended to use a virtual environment (e.g. conda, venv) for this assignment.

Example of virtual environment creation using conda:
```
conda create -n env_name python=3.10
conda activate env_name
python -m pip install -r requirements.txt
```

### Train and predict commands

Example commands (subject to change, just for inspiration):
```
python perceptron.py -d newsgroups -f feature_name
python perceptron.py -d sst2 -f feature_name
python multilayer_perceptron.py -d newsgroups -f feature_name
```

### Commands to run unittests

Ensure that your code passes the unittests before submitting it.
The commands can be run from the root directory of the project.
```
pytest tests/test_perceptron.py
pytest tests/test_multilayer_perceptron.py
```

### Submission

Ensure that the names of the submission files (in the `results/` subfolder) are:

- `perceptron_newsgroups_test_predictions.csv`
- `mlp_newsgroups_test_predictions.csv`
- `perceptron_sst2_test_predictions.csv`
- `mlp_sst2_test_predictions.csv`
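
Before pushing, a quick way to confirm that all four files exist (just a sanity-check sketch, not an official requirement):

```
ls results/perceptron_newsgroups_test_predictions.csv \
   results/mlp_newsgroups_test_predictions.csv \
   results/perceptron_sst2_test_predictions.csv \
   results/mlp_sst2_test_predictions.csv
```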