.. _distributed_training:

Multi-GPU and Distributed Training
==================================

.. This section will contain information about how to run different models in
   multi-GPU mode (config change) and in Horovod mode (config change + new
   command to run the script). Can also contain some general guidelines for
   what seems to be faster in which cases (e.g. depending on the number of
   GPUs or input-output modality).

OpenSeq2Seq supports two modes of parallel training: a simple multi-tower
approach built on standard distributed TensorFlow, and a
`Horovod-based approach <https://github.com/horovod/horovod>`_.

Standard TensorFlow distributed training
----------------------------------------

For multi-GPU training with the native distributed TensorFlow approach, set
``use_horovod: False`` and specify the number of GPUs with the ``num_gpus``
parameter in the configuration file. To start training, use the ``run.py``
script::

    python run.py --config_file=... --mode=train_eval

Horovod
-------

To use Horovod, set ``use_horovod: True`` in the config and launch the script
with ``mpirun`` or ``mpiexec``::

    mpiexec -np <num_gpus> python run.py --config_file=... --mode=train_eval --use_horovod=True --enable_logs

You can use Horovod both for multi-GPU and for multi-node training.

.. note::
    The ``num_gpus`` parameter will be ignored when ``use_horovod`` is set to
    True. In that case the number of GPUs to use is specified on the command
    line with ``mpirun`` arguments.

.. In general we find it useful to use Horovod mode when ... TODO
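To make the config changes concrete, below is a minimal sketch of the relevant
fragment of a configuration file. The file name, the choice of model class, and
the omitted ``base_params`` entries are illustrative assumptions; only
``use_horovod`` and ``num_gpus`` are the knobs discussed above::

    # my_config.py -- hypothetical file name, for illustration only
    from open_seq2seq.models import Speech2Text  # example model class; assumption,
                                                 # any OpenSeq2Seq model works here

    base_model = Speech2Text

    base_params = {
        # Multi-tower mode: run.py replicates the model across num_gpus towers.
        "use_horovod": False,
        "num_gpus": 4,
        # Flip use_horovod to True for Horovod mode; num_gpus is then ignored
        # and the GPU count comes from the mpirun/mpiexec command line instead.
        # ... remaining model, data and optimizer parameters omitted ...
    }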
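Since the MPI launcher owns process placement in Horovod mode, moving from
multi-GPU to multi-node training only changes the launch command, not the
config. A sketch of a two-node run with four GPUs per node, following the usual
``mpirun`` host-list convention (the host names ``node1`` and ``node2`` are
placeholders)::

    mpirun -np 8 -H node1:4,node2:4 \
        python run.py --config_file=... --mode=train_eval --use_horovod=True --enable_logs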