(optional) assign yourself in "Assignees" over to the right
Try running the notebooks in Google Colab.
See where they break.
Edit the notebook to swap in another dataset, perhaps by loading a HuggingFace dataset and then writing it back out in a format JoeyNMT knows how to use, e.g. creating a train.en and a train.xh file.
Edit: see #200. Maybe we should leave the old JW300 notebooks up and create new ones instead.
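The dataset-swap step above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes the replacement dataset yields examples shaped like `{"translation": {"en": ..., "xh": ...}}`, a layout many HuggingFace machine-translation datasets use (not all do, so check the one you pick), and it writes one sentence per line to `train.en` / `train.xh`. The helper name `write_parallel_files` is made up for illustration and is not part of JoeyNMT or the notebooks.

```python
from pathlib import Path


def write_parallel_files(examples, src_lang, tgt_lang, out_dir, prefix="train"):
    """Write line-aligned plain-text files <prefix>.<src_lang> and <prefix>.<tgt_lang>.

    `examples` is any iterable of dicts with a "translation" entry mapping
    language codes to sentences (an assumed layout, see the note above).
    Returns the number of sentence pairs written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    src_path = out_dir / f"{prefix}.{src_lang}"
    tgt_path = out_dir / f"{prefix}.{tgt_lang}"

    written = 0
    with src_path.open("w", encoding="utf-8") as src_f, \
            tgt_path.open("w", encoding="utf-8") as tgt_f:
        for example in examples:
            pair = example["translation"]
            # Collapse internal whitespace and newlines so that each
            # sentence occupies exactly one line in both files.
            src = " ".join(pair[src_lang].split())
            tgt = " ".join(pair[tgt_lang].split())
            # Skip pairs with an empty side to keep the files aligned.
            if not src or not tgt:
                continue
            src_f.write(src + "\n")
            tgt_f.write(tgt + "\n")
            written += 1
    return written


# Possible usage in a notebook (dataset name and config are placeholders,
# not a recommendation of a specific dataset):
#
#   from datasets import load_dataset
#   ds = load_dataset("<dataset-name>", "<config>", split="train")
#   write_parallel_files(ds, "en", "xh", "data")
```

Skipping empty pairs rather than writing blank lines keeps the two files the same length, which line-aligned parallel corpora require. The same function can be called again with `prefix="dev"` and `prefix="test"` for the other splits.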
The problem
JW300 has been taken down for copyright reasons. At least the following notebooks all rely on it:
https://github.com/masakhane-io/masakhane-mt/blob/master/starter_notebook_from_English_training.ipynb
https://github.com/masakhane-io/masakhane-mt/blob/master/starter_notebook_gdrive_from_English.ipynb
https://github.com/masakhane-io/masakhane-mt/blob/master/starter_notebook_into_English_training.ipynb
A solution (but see #200)
The notebooks need to be updated so that they no longer depend on JW300. Perhaps we could use Tatoeba or FLORES-101, or one of the other machine translation datasets listed at https://huggingface.co/datasets?task_ids=task_ids:machine-translation&sort=downloads