Questions about the update() function in EntropyBottleneck & GaussianConditional #311
-
Thank you for your outstanding open-source project. I've been studying it recently. First, I'm a bit unclear about the need to run the update() function once at test time before actual encoding. Simply put, what is this preparation for, and why is update() not needed during training? Second, I noticed that the update() functions for EntropyBottleneck and GaussianConditional are different. Could you briefly explain why? Third, what is the role of scale_table in GaussianConditional?
Replies: 1 comment
-
.update() fills in the _quantized_cdf table, which is used during runtime, but not during training.

During training, we measure the rate using the negative log-likelihoods (NLL) of $\hat{y}$. That gives us a differentiable function for rate. Thus, we don't actually need to losslessly compress and then losslessly decompress, since that's just the identity function anyways. (Also, it would be difficult to get a differentiable function for rate from just the length of a losslessly compressed bitstream.)

During runtime, we are no longer interested in minimizing the rate. (That was the job of training.) We now want an actual bitstream, so this time we do in fact losslessly compress. This requires running the lossless arithmetic coder, which runs faster when we give it a precomputed _quantized_cdf table. We could certainly precompute these CDFs during training, but they're not needed until later, so it's better to do it after the epoch ends, or after training ends.
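To make the two roles of the rate concrete, here is a minimal, self-contained sketch with a toy 5-symbol PMF (not CompressAI's actual implementation; the 16-bit precision below is an assumption modeled on typical range-coder setups):

```python
import math

# Toy PMF over 5 symbols, a stand-in for the learned distribution of y_hat.
pmf = [0.1, 0.2, 0.4, 0.2, 0.1]

# Training-time rate: expected negative log-likelihood in bits.
# This is differentiable w.r.t. whatever parameters produce the PMF,
# so it can be minimized directly -- no actual bitstream is needed.
rate_bits = -sum(p * math.log2(p) for p in pmf)

# Runtime: update() precomputes an integer "quantized CDF" once, so the
# arithmetic coder can do cheap table lookups instead of repeatedly
# evaluating the model's density.
PRECISION = 16  # assumed bit precision for the integer CDF
cdf = [0]
for p in pmf:
    cdf.append(cdf[-1] + round(p * (1 << PRECISION)))
```

This is simplified: a real implementation renormalizes so the CDF sums to exactly 2^PRECISION and typically reserves probability mass for out-of-range symbols.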
The colors represent different distributions in the figure below:

[figure]
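On the scale_table question, a hedged sketch of the idea: GaussianConditional rounds each predicted scale up to an entry of a small fixed table, so update() only has to precompute one CDF per table entry instead of one per arbitrary scale. A log-spaced table might be built like this (the constants are my assumptions, not verified against the library; see compressai's get_scale_table for the real values):

```python
import math

# Assumed defaults for illustration only.
SCALES_MIN, SCALES_MAX, LEVELS = 0.11, 256, 64

# Log-spaced table: covers both narrow and wide Gaussians with
# roughly constant *relative* precision.
log_min, log_max = math.log(SCALES_MIN), math.log(SCALES_MAX)
scale_table = [
    math.exp(log_min + i * (log_max - log_min) / (LEVELS - 1))
    for i in range(LEVELS)
]

def quantize_scale(s):
    # Round a predicted scale *up* to the nearest table entry, so the
    # coder never assumes a narrower distribution than predicted.
    return next(t for t in scale_table if t >= s)
```

Rounding up rather than to the nearest entry is the conservative choice: overestimating the scale costs a little rate, while underestimating it could assign (near-)zero probability to symbols that actually occur.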