SentencePiece

This is the code for the SentencePiece Demo.

To run the code locally, follow the steps below to install the necessary libraries and get the test dataset we use to train our models. Alternatively, you can run the code from this repo on FloydHub, where the dataset is already configured and ready to go.

Install SentencePiece

pip3 install sentencepiece

Download the corpus

Download the blog corpus, either from within the notebook or from the command line.
Whichever you use, unzip the archive afterwards and note the directory it extracts to, since the code needs that path.

wget http://www.cs.biu.ac.il/~koppel/blogs/blogs.zip
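
If you would rather do the download and unzip step from Python (for example inside the notebook), the following is a minimal sketch using only the standard library. The function names `download_corpus` and `extract_corpus`, the archive name `blogs.zip`, and the target directory `data` are assumptions made for illustration; they are not part of this repo.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the wget command above.
CORPUS_URL = "http://www.cs.biu.ac.il/~koppel/blogs/blogs.zip"


def download_corpus(url=CORPUS_URL, archive_path="blogs.zip"):
    """Download the archive unless it is already on disk (hypothetical helper)."""
    archive = Path(archive_path)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    return archive


def extract_corpus(archive_path, target_dir="data"):
    """Unzip the archive and return the absolute directory it was extracted to."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
    return target.resolve()
```

Calling `extract_corpus(download_corpus())` would fetch the archive into the current directory and return the extraction path, which is the directory you need to remember for the later steps.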