
run_summarization.py isn't computing on GPU (but allocate memory) #18

Open
pltrdy opened this issue Jun 1, 2017 · 15 comments · May be fixed by #38
Comments

@pltrdy

pltrdy commented Jun 1, 2017

Strange fact. I've been trying to train summarization with default parameters and the default dataset.

It turns out that the GPU is detected (2017-06-01 17:57:05.548650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)), and some GPU memory is allocated for the process, but nothing is running on it (load: 0%). It sometimes peaks at around 27% for a very short time.

The GPU is a single GTX 1080 with 8 GB of RAM (4617 MiB allocated by the process).

Any idea?

@pedrobalage

@pltrdy, try changing the following line
https://github.com/abisee/pointer-generator/blob/master/run_summarization.py#L125
to
default_device = tf.device('/gpu:0')
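For context, the line in question hardcodes the default device to the CPU, and the suggested fix swaps it for the GPU. A minimal sketch of the idea (the helper name pick_device is mine, added for illustration; the actual repo code assigns tf.device(...) directly):

```python
# Sketch of the suggested edit in run_summarization.py (TF 1.x style).
# The repo originally pins the default device to '/cpu:0'; the fix
# replaces it with '/gpu:0'. `pick_device` is a hypothetical helper
# used here only to make the two alternatives explicit.

def pick_device(use_gpu: bool) -> str:
    """Return a TF device string: '/gpu:0' when a GPU should be used."""
    return '/gpu:0' if use_gpu else '/cpu:0'

# In the real script this would correspond to replacing:
#   default_device = tf.device('/cpu:0')
# with:
#   default_device = tf.device(pick_device(use_gpu=True))

print(pick_device(True))   # '/gpu:0'
print(pick_device(False))  # '/cpu:0'
```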

@pltrdy
Author

pltrdy commented Jun 1, 2017

Thanks @pedrobalage, I actually removed this block entirely. I don't really understand why the device is hardcoded. Shouldn't it be left at the default (so the user can set/unset CUDA_VISIBLE_DEVICES), or at least be a parameter?
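If the hardcoded device block is removed, GPU visibility can instead be controlled from the environment, as suggested above. A minimal sketch; note that CUDA_VISIBLE_DEVICES must be set before TensorFlow is first imported for it to take effect:

```python
import os

# Restrict the process to GPU 0. An empty string ("") would hide all
# GPUs and force CPU execution. This must happen before
# `import tensorflow`, since TF enumerates devices at import time.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import tensorflow as tf  # imported only after the variable is set

print(os.environ["CUDA_VISIBLE_DEVICES"])  # '0'
```

Equivalently, from the shell: `CUDA_VISIBLE_DEVICES=0 python run_summarization.py ...`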

@pedrobalage

@pltrdy, I didn't understand this decision either. This was the only way it worked for me. I also set CUDA_VISIBLE_DEVICES.

@pltrdy
Author

pltrdy commented Jun 1, 2017

@pedrobalage thanks for sharing it, I hadn't even noticed this line.

@abisee I forked, switched to Python 3 syntax, and removed this line. My Python 3 implementation won't be compatible because of bytes vs. str differences, so I'm not sure a PR would be welcome for that.
https://github.com/pltrdy/pointer-generator

@ioana-blue

Thanks for pointing this out, @pltrdy. Removing that line allows the code to run on the GPU and it's going 20x faster. So you pretty much made my life go by 20 times faster 👍

@tianjianjiang

Another related discussion can be found in #3

@pltrdy
Author

pltrdy commented Jun 1, 2017

@ioana-blue In fact I just complained; then @pedrobalage helped us ;)

@tianjianjiang thanks for referencing it. Interesting to notice that it's the same GPU. I can't figure out how this can work on a K40 (used by the author) but not on a 1080, since there is enough RAM. Maybe the CUDA/TF version has some effect here. No idea.

@tianjianjiang

@pltrdy You're welcome, and I can also confirm that changing it from CPU to GPU can be worse on a K80 and a GT 650M.

@tianjianjiang

BTW, there's a recent post on this topic: https://datascience.stackexchange.com/questions/19220/choosing-between-cpu-and-gpu-for-training-a-neural-network
My GT 650M and K80 experience also generally supports not using a slow GPU with small RAM, and being careful with GPUs under virtualization (e.g. AWS EC2 and Google Cloud).

@abisee
Owner

abisee commented Jun 6, 2017

Hi everyone,

I'm afraid this will be a rather unsatisfactory answer, but the reason that we set the default device to CPU is that our code is derived from the TextSum code, which does the same thing (see this line). Similarly, we have a few

with tf.device("/gpu:0"):

lines in model.py that are derived from similar lines in this file.

Personally, I'm not sure what the rationale for doing this is (I think I tried removing these lines and didn't see a difference). But that's the code we ran for our experiments, so we left it like that for reproducibility, in case it matters.

Edit: When we ran on our K40, the GPU-Util was non-zero so the GPU was being used.

@ioana-blue

I'm new to everything related to this work, but learning. From what I read in the docs and from looking at the code, this is my understanding.

https://www.tensorflow.org/tutorials/using_gpu

explains that with tf.device tries to pin certain computations to a specific device, instead of the default device chosen by TF (which prioritizes GPUs over CPUs, as far as I understand). If the specified device cannot be allocated, you get an exception, UNLESS you use allow_soft_placement=True, which our code does: https://github.com/abisee/pointer-generator/blob/master/util.py#L26

That's why you can see varying behavior depending on where you run the code, what's running concurrently, etc.

Hope this explains some of the mystery.
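The placement rule described above can be modeled as a toy function. This is purely an illustration of the semantics, not TensorFlow's actual implementation; the function and its names are mine:

```python
# Toy model of the device-placement semantics described above:
# a requested device that isn't available raises an error, unless
# soft placement is enabled, in which case execution falls back to
# an available device. Illustration only, not TF internals.

def place(requested, available, allow_soft_placement=False):
    """Return the device an op would run on, or raise if unplaceable."""
    if requested in available:
        return requested
    if allow_soft_placement:
        return available[0]  # fall back to some available device
    raise RuntimeError("Cannot assign a device for operation: " + requested)

# With soft placement, a missing GPU silently falls back to the CPU,
# which would explain a pinned-to-GPU graph quietly running on CPU:
print(place('/gpu:0', ['/cpu:0'], allow_soft_placement=True))  # '/cpu:0'
```

This is consistent with why behavior differs across machines: on a box where '/gpu:0' resolves, the pinning works; elsewhere, soft placement hides the fallback.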

@Spandan-Madan

I'm not entirely sure whether I managed to make it run on the GPU. @ioana-blue @abisee @pltrdy @pedrobalage @tianjianjiang How long is a training step taking you, approximately? For me every training step takes about 1.5 seconds; is that the speed you're getting on the GPU or the CPU?

@pltrdy
Author

pltrdy commented Jul 10, 2017

@Spandan-Madan I'm at around 1.8 sec/step with default parameters on an NVIDIA 1080.
GPU usage might be sub-optimal, since the load isn't always at 100%.
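For anyone comparing sec/step numbers, a generic way to measure it, independent of this repo (the dummy lambda below stands in for the real training call, e.g. a sess.run of the train op):

```python
import time

def time_steps(step_fn, n_steps):
    """Run step_fn n_steps times and return mean seconds per step."""
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    return (time.perf_counter() - start) / n_steps

# Dummy workload standing in for one training step:
mean = time_steps(lambda: sum(range(1000)), n_steps=50)
print("%.6f sec/step" % mean)
```

Averaging over many steps (and skipping the first few, which include graph setup and data-pipeline warm-up) gives a fairer CPU-vs-GPU comparison than eyeballing single steps.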

@eduOS

eduOS commented Jul 16, 2017

It's better to switch to the GPU, since the default setting makes training extremely slow.

falcondai added a commit to falcondai/pointer-generator that referenced this issue Jul 26, 2017
@falcondai falcondai linked a pull request Jul 26, 2017 that will close this issue
@abisee
Owner

abisee commented Aug 16, 2017

Hello again,

Though I'm sure I tried changing default_device several months ago and found no difference, I tried making the change again this week and it did make things faster. So we've now committed that change to the repo.
