Sentiment Analysis¶
Models¶
The model we use for sentiment analysis is the same one we use for the LSTM language model, except that the last output dimension is the number of sentiment classes instead of the vocabulary size. Because the architectures match, the sentiment analysis model can be initialized from a model pretrained on the language modeling task. You can train the sentiment analysis model either from scratch or from the pretrained language model.
In this model, each source sentence is run through the LSTM cells. The last hidden state at the end of the sequence is then passed into the output projection layer, followed by a softmax, to get the predicted sentiment. If the parameter use_cell_state is set to True, the last cell state at the end of the sequence is concatenated to the last hidden state.
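For intuition, here is a minimal, framework-agnostic sketch of that classification head. This is illustrative NumPy code, not the actual OpenSeq2Seq implementation; the function name sentiment_head and the weights W and b are made up for the example:

```python
import numpy as np

def sentiment_head(hidden_states, cell_states, W, b, use_cell_state=False):
    # Take the LSTM state at the last step of the sequence.
    last = hidden_states[-1]
    if use_cell_state:
        # Optionally append the last cell state to the last hidden state.
        last = np.concatenate([last, cell_states[-1]])
    # Output projection: one logit per sentiment class.
    logits = W @ last + b
    # Softmax over the sentiment classes gives the predicted distribution.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()
```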
The datasets we currently support include SST (Stanford Sentiment Treebank) and IMDB reviews.
| Model description | Config file | Checkpoint |
|---|---|---|
| IMDB | imdb-wkt103.py | Accuracy=? |
| SST | sst-wkt2.py | Accuracy=? |
The model specification and training parameters can be found in the corresponding config file.
Getting started¶
Get data¶
The SST (Stanford Sentiment Treebank) dataset contains 10,662 sentences, half of them positive and half of them negative. These sentences are fairly short, with a median length of 19 tokens. You can download the pre-processed version of the dataset [here](https://github.com/NVIDIA/sentiment-discovery/tree/master/data/binary_sst). The pre-processed dataset contains the files train.csv, valid.csv, and test.csv. The data layer used to process this dataset is called SSTDataLayer.
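As a quick sanity check, you can peek at a few rows of the downloaded CSV files. The snippet below is purely illustrative; the path is a placeholder and the exact column names depend on the download:

```python
import csv

# Assumes the pre-processed SST files were placed under data/binary_sst/.
with open("data/binary_sst/train.csv", newline="") as f:
    reader = csv.DictReader(f)
    for i, row in enumerate(reader):
        print(row)  # each row pairs a sentence with its binary sentiment label
        if i == 2:
            break
```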
The IMDB dataset contains 50,000 labeled samples of much longer length, with a median length of 205 tokens. Half of them are positive and the other half negative. The train set, which contains 25,000 samples, is split into a training set of 24,000 samples and a validation set of 1,000 samples. The data layer used to process this dataset is called IMDBDataLayer. The dataset can be downloaded [here](http://ai.stanford.edu/~amaas/data/sentiment/).
If you want to use a trained language model for this task, make sure that your dataset is processed in the same way as the dataset used to train that language model.
Training¶
Next, let’s create a simple LSTM-based sentiment analysis model by defining a config file for it or using one of the config files defined in example_configs/transfer:
- If you want to use a pretrained language model, specify the location of the pretrained language model using the parameter load_model.
- Change data_root to point to the directory containing the raw dataset used to train your language model, for example, the IMDB dataset downloaded above.
- Change processed_data_folder to point to the location where you want to store the processed dataset. If the dataset has been pre-processed before, the data layer can simply load the data from this location.
- Update other hyperparameters such as the number of layers, number of hidden units, cell type, loss function, learning rate, optimizer, etc. to meet your needs.
- Choose dtype to be "mixed" if you want to use mixed-precision training, or tf.float32 to train only in FP32 (a sketch of what these config entries might look like follows this list).
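For illustration, the edits above typically amount to changing a handful of dictionary entries in the config file. The following is only a sketch with placeholder paths and values; refer to the actual files in example_configs/transfer for the full parameter set:

```python
# Hypothetical excerpt of a sentiment config; paths and values are placeholders.
base_params = {
    "use_horovod": True,            # set to False to train without Horovod
    "num_gpus": 2,                  # number of GPUs you want to use
    "batch_size_per_gpu": 16,       # reduce this if your GPUs run out of memory
    "dtype": "mixed",               # or tf.float32 to train only in FP32
    "load_model": "experiments/lm-wkt103",  # checkpoint of the pretrained language model
    # ... model, optimizer, and loss parameters go here ...
}

data_layer_params = {
    "data_root": "data/aclImdb/",                     # raw dataset, e.g. the IMDB download above
    "processed_data_folder": "data/imdb-processed/",  # where the processed dataset is cached
}
```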
For example, suppose your config file is imdb-wkt2.py. To train without Horovod, update use_horovod to False in the config file and run:
python run.py --config_file=example_configs/transfer/imdb-wkt2.py --mode=train_eval --enable_logs
When training with Horovod, use the following command:
mpiexec --allow-run-as-root -np <num_gpus> python run.py --config_file=example_configs/transfer/imdb-wkt2.py --mode=train_eval --enable_logs
Some things to keep in mind:
- Don’t forget to update num_gpus to the number of GPUs you want to use.
- If your GPUs run out of memory, reduce the batch_size_per_gpu parameter.
Inference¶
Running in the eval mode will evaluate your model on the evaluation set:
python run.py --config_file=example_configs/transfer/imdb-wkt2.py --mode=eval --enable_logs
Running in the infer mode will evaluate your model on the test set:
python run.py --config_file=example_configs/transfer/imdb-wkt2.py --mode=infer --enable_logs
The performance of the model is reported in terms of accuracy and F1 score.
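For reference, accuracy and F1 can be computed from predicted and reference labels as in this small, self-contained example (using scikit-learn, purely for illustration; it is not how OpenSeq2Seq reports metrics internally):

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [1, 0, 1, 1, 0]   # reference sentiment labels (1 = positive, 0 = negative)
y_pred = [1, 0, 0, 1, 0]   # model predictions

print("Accuracy:", accuracy_score(y_true, y_pred))  # fraction of correct predictions
print("F1:", f1_score(y_true, y_pred))              # harmonic mean of precision and recall
```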