AMPLIFY
Model Overview
Training Code Lives in bionemo-recipes
Use bionemo-recipes/models/amplify for checkpoint conversion and model implementation details. For training, use the ESM-2 recipes in bionemo-recipes/recipes with an AMPLIFY model tag such as nvidia/AMPLIFY_120M.
Description
A TransformerEngine-optimized implementation of the AMPLIFY model, a protein language model variant of ESM-2 with modified layer structure and dataset construction. The model is designed for protein sequence understanding and prediction tasks, with variants available at 120M and 350M parameter sizes.
This model is ready for commercial use.
Third-Party Community Consideration
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party's requirements for this application and use case; see the Chandar Lab website.
References
[1] Fournier, Q., Vernon, R. M., van der Sloot, A., Schulz, B., Chandar, S., and Langmead, C. J. "Protein Language Models: Is Scaling Necessary?" bioRxiv 2024.09.23.614603. doi: https://doi.org/10.1101/2024.09.23.614603
Model Architecture
Architecture Type: Transformer
Network Architecture: ESM-2 variant with modified layer structure
Input
Input Type(s): Text (Protein Sequences)
Input Format(s): String
Input Parameters: 1D
Other Properties Related to Input: Protein sequence represented as a string of canonical amino acids.
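To make the input format concrete, here is a minimal pure-Python sketch of validating a protein string against the 20 canonical amino acids and mapping it to integer ids. The vocabulary and `encode` function below are hypothetical illustrations, not AMPLIFY's actual tokenizer (which is distributed with the Hugging Face checkpoint and defines its own vocabulary and special tokens):

```python
# Hypothetical illustration only: AMPLIFY's real tokenizer ships with the
# Hugging Face checkpoint and is not reproduced here.
CANONICAL_AA = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_ID = {aa: i for i, aa in enumerate(CANONICAL_AA)}

def encode(sequence: str) -> list[int]:
    """Map a protein sequence string to a 1D list of integer token ids."""
    sequence = sequence.upper()
    bad = set(sequence) - set(CANONICAL_AA)
    if bad:
        raise ValueError(f"Non-canonical residues: {sorted(bad)}")
    return [AA_TO_ID[aa] for aa in sequence]

ids = encode("MKTAYIAK")  # one integer id per amino acid
```

The key point the sketch captures: the model consumes a 1D sequence of tokens, one per residue, drawn from the canonical amino-acid alphabet.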
Output
Output Type(s): Embeddings (Amino acid and sequence-level)
Output Format: Numeric vector
Output Parameters: 1D
Other Properties Related to Output: Numeric vector with floating-point values corresponding to an embedding for each amino acid in the input protein sequence.
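To relate the two output granularities (per-amino-acid vs. sequence-level), here is a minimal pure-Python sketch, not AMPLIFY code, of reducing per-residue embeddings to a single sequence-level embedding by mean pooling, a common choice for downstream tasks:

```python
def mean_pool(token_embeddings: list[list[float]]) -> list[float]:
    """Average per-residue embedding vectors into one sequence-level vector.

    token_embeddings: one embedding vector (list of floats) per amino acid.
    Returns a single vector with the same dimensionality.
    """
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in token_embeddings) / n for d in range(dim)]

# Toy example: 3 residues, embedding dimension 2.
per_residue = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sequence_embedding = mean_pool(per_residue)  # [3.0, 4.0]
```

In practice the per-residue embeddings come from the model's hidden states; whether to mean-pool, max-pool, or use a special-token embedding for the sequence-level representation is a downstream design choice.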
Software Integration
Runtime Engine(s):
- BioNeMo
- TransformerEngine
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Hopper
Supported Operating System(s):
- Linux
Model Versions
The model is fully compatible with weights distributed via Hugging Face, e.g., chandar-lab/AMPLIFY_120M. See bionemo-recipes/models/amplify for checkpoint conversion utilities and usage examples.
Training & Evaluation
Training Dataset
The model was trained on a curated dataset of protein sequences following similar principles to ESM-2's training data. For more details on the training dataset, see the original AMPLIFY paper.
Inference
Engine: BioNeMo, TransformerEngine
Test Hardware:
- NVIDIA H100
License
AMPLIFY is provided under the Apache 2.0 license.
Pre-training Performance
Use the recipe READMEs for current training entrypoints:
- bionemo-recipes/models/amplify
- bionemo-recipes/recipes/esm2_native_te
- bionemo-recipes/recipes/esm2_accelerate_te
| Model Size | GPUs | Batch Size (per GPU) | Training Step Time (s) |
|---|---|---|---|
| 120M | 16 x NVIDIA H100 | 256 | 0.461 |
| 350M | 32 x NVIDIA H100 | 128 | 0.525 |
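The table implies an aggregate sequence throughput: the global batch size (GPUs times per-GPU batch size) divided by the step time. A quick back-of-the-envelope check using the numbers above:

```python
# Derived arithmetic from the performance table above; the helper function
# name is ours, not part of any BioNeMo API.
def sequences_per_second(gpus: int, per_gpu_batch: int, step_time_s: float) -> float:
    """Aggregate training throughput: global batch size / step time."""
    return gpus * per_gpu_batch / step_time_s

tput_120m = sequences_per_second(16, 256, 0.461)  # ~8,885 sequences/s
tput_350m = sequences_per_second(32, 128, 0.525)  # ~7,802 sequences/s
```

Note that both configurations use the same global batch size (4,096 sequences per step); the 350M model's longer step time accounts for its lower throughput.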
Model Convergence
Model convergence was tracked for the 120M and 350M models, trained on the chandar-lab/UR100P dataset for 1M steps; the final perplexities are summarized below.
Final Perplexities by Model Size
| Model Size | Perplexity at 1M Steps |
|---|---|
| 120M | 4.23 |
| 350M | 3.05 |
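For interpretation, perplexity is the exponential of the mean per-token cross-entropy loss, so the values above map directly to losses in nats per token. This is the standard relation for language models, not an AMPLIFY-specific metric:

```python
import math

def loss_from_perplexity(ppl: float) -> float:
    """Mean per-token cross-entropy loss (nats) implied by a perplexity."""
    return math.log(ppl)

loss_120m = loss_from_perplexity(4.23)  # ~1.44 nats/token
loss_350m = loss_from_perplexity(3.05)  # ~1.12 nats/token
```

Equivalently, perplexity can be read as the effective number of amino acids the model is choosing between at each position, out of the roughly 20-to-30-symbol protein vocabulary.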