Common Issues
##This is an active project.
Before reading any of these, do a `git pull` and see if it fixes your issue.
###Cannot Load Checkpoint or Sample Data
symptoms
`luajit` failing with `cannot open mode r`, similar to issue #46
solutions
- double check the directory you're loading the file from
- check that the current account has read permissions on that file
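
The two checks above can be scripted. This is only a minimal sketch: `check_readable` is a hypothetical helper name, not part of the project, and the path you pass should be whatever file you are asking luajit to load.

```sh
# Report whether PATH exists and is readable by the current account.
# check_readable is a hypothetical helper, not part of this project.
check_readable() {
  path=$1
  if [ ! -e "$path" ]; then
    echo "missing: $path (working directory is $(pwd))"
    return 1
  fi
  if [ ! -r "$path" ]; then
    echo "not readable by $(id -un): $path"
    return 1
  fi
  echo "ok: $path"
}
```

Run it from the same directory you launch luajit from, since a relative path is resolved against the current working directory.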
###Cannot Generate Text
symptoms
`luajit` failing with `bad argument #1 to 'size'`, similar to issue #44
solutions
- if you specified primetext, verify that each character has already been seen in the training set
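
One way to run that check is to list the characters of the prime text that never occur in the training file. The sketch below assumes plain single-byte (ASCII) text; `missing_chars` is a hypothetical helper, and the file path in the usage comment is only an example location for your training data.

```sh
# Print each distinct character of PRIMETEXT that never occurs in TRAIN_FILE.
# Assumes single-byte (ASCII) text; multi-byte characters are not handled.
# missing_chars is a hypothetical helper, not part of this project.
missing_chars() {
  primetext=$1
  train_file=$2
  printf '%s\n' "$primetext" | fold -w 1 | sort -u |
  while IFS= read -r ch; do
    [ -n "$ch" ] || continue   # skip empty lines
    # -F treats the character literally; -e protects a leading "-"
    grep -qF -e "$ch" "$train_file" || printf '%s\n' "$ch"
  done
}

# Example (path is an assumption, adjust to your data):
#   missing_chars "The meaning of life" data/tinyshakespeare/input.txt
```

If the function prints nothing, every character of the prime text was found in the training file.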
###Fails to start, claiming cunn/cutorch package not found
symptoms
cunn/cutorch package not found
solutions
- verify you have NVIDIA's CUDA runtime installed
- install the cunn or cutorch package with `luarocks install cunn` or `luarocks install cutorch`
- execute the command as root with `sudo`
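
Before reinstalling anything, it can help to see which tools are actually on your `PATH`. This is a rough sketch: `missing_cmds` is a hypothetical helper, and the names `nvcc` (CUDA compiler) and `th` (Torch launcher) are assumptions about a typical setup. Finding `nvcc` only suggests the CUDA toolkit is installed, it does not prove the runtime works.

```sh
# Print the names of the given commands that are not found on PATH.
# missing_cmds is a hypothetical helper, not part of this project.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example (command names are assumptions about a typical setup):
#   missing_cmds nvcc nvidia-smi luarocks th
# If luarocks is present but the packages are not:
#   luarocks install cutorch
#   luarocks install cunn
# Prefix with sudo if the install fails with a permissions error.
```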
###CPU training executes with only one thread
symptoms
only a single core of a multicore system is being heavily used
solutions
- compile OpenBLAS with multithreading support, and compile and link Torch against it.
###loss is exploding, aborting.
symptoms
`train_loss` value slowly decreases then suddenly increases to a much higher value
solutions
###cuda runtime error (2) : out of memory
symptoms
`train.lua` crashes early on, with the aforementioned error
solutions
- lower the `num_layers` parameter, which defaults to 2 if omitted
- lower the `rnn_size` parameter, which defaults to 128 if omitted
- lower the `batch_size` parameter, which defaults to 50 if omitted
- Linux and some Windows users can use `nvidia-smi` to view available memory in the GPU
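
The suggestions above can be tried from the shell roughly as follows. This is a sketch under assumptions: `gpu_memory` and `train_cmd` are hypothetical helpers, the `th` launcher and the single-dash flag spelling are assumed from a typical Torch setup, and the smaller values are arbitrary examples rather than recommended settings.

```sh
# Show used and total GPU memory if nvidia-smi is on PATH.
gpu_memory() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv
  else
    echo "nvidia-smi not found"
  fi
}

# Print a train.lua command line; arguments fall back to the
# defaults listed above (2 layers, rnn_size 128, batch_size 50).
train_cmd() {
  num_layers=${1:-2}
  rnn_size=${2:-128}
  batch_size=${3:-50}
  printf 'th train.lua -num_layers %s -rnn_size %s -batch_size %s\n' \
    "$num_layers" "$rnn_size" "$batch_size"
}

train_cmd           # prints the command with the defaults
train_cmd 1 64 25   # prints a smaller configuration to try
```

Lowering one parameter at a time and checking `gpu_memory` between runs makes it easier to see which setting is responsible for the memory pressure.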