Megatron utils
average_losses_across_data_parallel_group(losses, with_context_parallel=False)
Reduce a tensor of losses across all GPUs.
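As a rough illustration of what an averaging reduction computes, here is a pure-Python sketch that runs without GPUs or a process group. It assumes, going by the function name only since this page does not show the body, that each loss is summed over the ranks of the data-parallel group and then divided by the group size. `simulate_average_across_ranks` is a hypothetical helper, not part of bionemo, and the `with_context_parallel` flag is not modeled.

```python
from typing import List, Sequence


def simulate_average_across_ranks(per_rank_losses: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in: sum each loss over ranks, then divide by the group size.

    per_rank_losses[r][i] is the i-th loss value as seen on rank r.
    """
    if not per_rank_losses:
        raise ValueError("need at least one rank")
    group_size = len(per_rank_losses)
    num_losses = len(per_rank_losses[0])
    if any(len(rank) != num_losses for rank in per_rank_losses):
        raise ValueError("every rank must report the same number of losses")
    # Step 1 (assumed): a sum all-reduce leaves every rank holding the elementwise sum.
    summed = [sum(rank[i] for rank in per_rank_losses) for i in range(num_losses)]
    # Step 2 (assumed): dividing by the group size turns the sum into a mean.
    return [s / group_size for s in summed]
```

For example, with two ranks reporting `[1.0, 2.0]` and `[3.0, 4.0]`, the sketch returns `[2.0, 3.0]`, which is the value every rank would hold after the reduction under the stated assumption.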
Source code in bionemo/llm/utils/megatron_utils.py
is_only_data_parallel()
Checks to see if you are in a distributed megatron environment with only data parallelism active.
This is useful if you are working on a model, loss, etc. that you know does not yet support Megatron model parallelism: you can check that data parallelism is the only kind of parallelism in use.
Returns:
Type | Description |
---|---|
bool | True if data parallel is the only parallel mode, False otherwise. |
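One way to use such a check is as a guard that fails fast when code without model-parallel support is launched under model parallelism. The sketch below is an assumption about usage, not code from bionemo: `require_only_data_parallel` is a hypothetical helper, and the check is passed in as a callable so the example runs without an initialized Megatron environment.

```python
from typing import Callable


def require_only_data_parallel(check: Callable[[], bool], what: str = "this component") -> None:
    """Hypothetical guard: raise if `check` reports that model parallelism is active.

    `check` is any zero-argument callable returning a bool. In a real job it
    would presumably be `is_only_data_parallel` (an assumption about usage).
    """
    if not check():
        raise NotImplementedError(
            f"{what} only supports data parallelism; disable model parallelism to use it."
        )


# Stand-in predicates so the sketch runs anywhere.
require_only_data_parallel(lambda: True, what="my custom loss")  # passes silently

try:
    require_only_data_parallel(lambda: False, what="my custom loss")
except NotImplementedError as err:
    print(f"guard triggered: {err}")
```

In a real training job the stand-in lambdas would be replaced by `is_only_data_parallel` itself. The import path is assumed to mirror the source file location given on this page.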
Source code in bionemo/llm/utils/megatron_utils.py