This tutorial presents an example of applying an RNN to text classification, using padded and bucketed data to handle sequences of varying lengths efficiently. Some functionalities require running on a CUDA-enabled GPU.
The example is based on sentiment analysis of the [IMDB data](http://ai.stanford.edu/~amaas/data/sentiment/).
To illustrate the benefit of bucketing, two datasets are created:
- `corpus_single_train.rds`: no bucketing, all samples are padded/trimmed to 600 words.
- `corpus_bucketed_train.rds`: samples split into 5 buckets of length 100, 150, 250, 400 and 600.
Below is an example of assigning the bucketed data and labels to the `mx.io.bucket.iter` iterator. This iterator behaves essentially like `mx.io.arrayiter`, except that it pushes samples from the different buckets along with a bucketID identifying the appropriate symbolic graph to use.
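A minimal sketch of that assignment, assuming the mxnet R API and the `corpus_bucketed_train.rds` file created above (`batch.size` and the other argument values are illustrative, not the tutorial's exact settings):

```r
library(mxnet)

# Load the bucketed corpus prepared earlier
corpus_bucketed_train <- readRDS("corpus_bucketed_train.rds")

# Iterator over the buckets: each batch carries a bucketID so that the
# matching symbolic graph can be bound during training
train.data <- mx.io.bucket.iter(buckets = corpus_bucketed_train$buckets,
                                batch.size = 64,
                                data.mask.element = 0,
                                shuffle = TRUE)
```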
Below are the graph representations of a seq-to-one architecture with LSTM cells. Note that the input data is of shape `seq.length X batch.size`, while the RNN operator requires input of shape `hidden.features X batch.size X seq.length`, which requires swapping the axes.
For bucketing, a list of symbols is defined, one for each bucket length. During training, the appropriate symbol is bound to each batch according to the bucketID provided by the iterator.
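A hedged sketch of building that list of symbols, assuming the `rnn.graph` helper from the mxnet R package (argument names and values are illustrative, and `vocab.size` stands for the vocabulary size defined upstream):

```r
# One seq-to-one LSTM graph per bucket length; at training time the
# iterator's bucketID selects which of these symbols gets bound
symbol_buckets <- sapply(names(corpus_bucketed_train$buckets),
                         function(seq.len) {
  rnn.graph(config = "seq-to-one",
            cell.type = "lstm",
            num.rnn.layer = 1,
            num.embed = 2,
            num.hidden = 16,
            num.label = 2,
            input.size = vocab.size,  # vocabulary size, defined upstream
            dropout = 0.5)
})
```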
Since the model attempts to predict sentiment, it's no surprise that the two dimensions onto which each word is projected appear correlated with the words' polarity. Positive words are associated with lower X1 values ("great", "excellent"), while the most negative words appear at the far right ("terrible", "worst"). By representing words of similar meaning with features of similar values, the embedding greatly facilitates the remaining classification task for the network.
Inference on test data
----------------------
The utility function `mx.infer.rnn` has been added to simplify inference on RNNs with bucketed data.
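A sketch of its use, assuming a trained `model` and a `test.data` bucket iterator built the same way as the training iterator above:

```r
# Run the bucketed test set through the trained network
infer <- mx.infer.rnn(infer.data = test.data, model = model, ctx = mx.cpu())
```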