AMPLIFY

Model Overview

Training Code Lives in bionemo-recipes

Use bionemo-recipes/models/amplify for checkpoint conversion and model implementation details. For training, use the ESM-2 recipes in bionemo-recipes/recipes with an AMPLIFY model tag such as nvidia/AMPLIFY_120M.

Description

A TransformerEngine-optimized implementation of AMPLIFY, a protein language model derived from ESM-2 with a modified layer structure and dataset construction. The model is designed for protein sequence understanding and prediction tasks, with variants available at 120M and 350M parameter sizes.

This model is ready for commercial use.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the Chandar Lab website.

References

[1] Fournier, Q., Vernon, R. M., van der Sloot, A., Schulz, B., Chandar, S., & Langmead, C. J. Protein Language Models: Is Scaling Necessary? bioRxiv 2024.09.23.614603. doi: https://doi.org/10.1101/2024.09.23.614603

Model Architecture

Architecture Type: Transformer

Network Architecture: ESM-2 variant with modified layer structure

Input

Input Type(s): Text (Protein Sequences)

Input Format(s): String

Input Parameters: 1D

Other Properties Related to Input: Protein sequence represented as a string of canonical amino acids.
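Before inference it can be useful to validate that a sequence contains only canonical amino acids. The sketch below uses a toy residue-to-id mapping purely for illustration; the ids and vocabulary are hypothetical and do not match the actual AMPLIFY tokenizer, which is distributed with the HuggingFace checkpoints.

```python
# Illustrative only: a toy encoder for canonical amino acids.
# The real AMPLIFY tokenizer uses its own vocabulary and special
# tokens; the integer ids below are hypothetical.
CANONICAL_AA = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_ID = {aa: i for i, aa in enumerate(CANONICAL_AA)}

def encode(sequence: str) -> list[int]:
    """Map a protein string to toy integer ids, rejecting non-canonical residues."""
    try:
        return [AA_TO_ID[aa] for aa in sequence.upper()]
    except KeyError as err:
        raise ValueError(f"non-canonical residue: {err.args[0]}") from None

ids = encode("MKTAYIAK")  # one id per residue
```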

Output

Output Type(s): Embeddings (Amino acid and sequence-level)

Output Format: Numeric vector

Output Parameters: 1D

Other Properties Related to Output: Numeric vector with floating-point values corresponding to an embedding for each amino acid in the input protein sequence.
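A sequence-level embedding is commonly obtained by pooling the per-residue embeddings. The sketch below shows mean pooling over random data; the hidden size of 640 is an assumption for illustration and depends on the AMPLIFY variant actually used.

```python
import numpy as np

# Sketch: collapse per-residue embeddings (L x D) into one
# sequence-level vector by mean pooling. Shapes are illustrative;
# the real hidden size depends on the AMPLIFY variant.
rng = np.random.default_rng(0)
per_residue = rng.standard_normal((8, 640))  # 8 residues, D=640 (assumed)

sequence_embedding = per_residue.mean(axis=0)  # shape (640,)
```

In practice, padding tokens should be masked out before averaging so they do not dilute the pooled embedding.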

Software Integration

Runtime Engine(s):

  • BioNeMo
  • TransformerEngine

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Hopper

Supported Operating System(s):

  • Linux

Model Versions

The model is fully compatible with weights distributed via HuggingFace, e.g., chandar-lab/AMPLIFY_120M. See bionemo-recipes/models/amplify for checkpoint conversion utilities and usage examples.
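A minimal loading sketch using the HuggingFace transformers API is shown below. It assumes the chandar-lab checkpoints ship custom model code and therefore need trust_remote_code=True; verify this against the checkpoint's model card before relying on it.

```python
# Sketch: load AMPLIFY weights from HuggingFace. Assumes the
# chandar-lab checkpoints require trust_remote_code=True (custom
# model code); check the model card to confirm.
MODEL_ID = "chandar-lab/AMPLIFY_120M"

def load_amplify(model_id: str = MODEL_ID):
    # Imported lazily so the sketch can be read without
    # transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_amplify()  # downloads weights on first use
```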

Training & Evaluation

Training Dataset

The model was trained on a curated dataset of protein sequences following similar principles to ESM-2's training data. For more details on the training dataset, see the original AMPLIFY paper.

Inference

Engine: BioNeMo, TransformerEngine

Test Hardware:

  • NVIDIA H100

License

AMPLIFY is provided under the Apache 2.0 license.

Pre-training Performance

See the recipe READMEs for current training entrypoints. Measured pre-training step times:

Model Size    GPUs                Batch Size (per GPU)    Training Step Time (s)
120M          16 x NVIDIA H100    256                     0.461
350M          32 x NVIDIA H100    128                     0.525
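As back-of-envelope arithmetic (illustrative, not an official benchmark number), the table's per-GPU batch size, GPU count, and step time imply a global batch size and sequence throughput:

```python
# Illustrative arithmetic from the 120M row of the table above.
gpus, per_gpu_batch, step_time_s = 16, 256, 0.461

global_batch = gpus * per_gpu_batch        # sequences per optimizer step
seqs_per_sec = global_batch / step_time_s  # sequences processed per second
```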

Model Convergence

Model convergence curves are shown below for the 120M and 350M models, trained on the chandar-lab/UR100P dataset for 1M steps.

AMPLIFY Pre-training Training Loss

AMPLIFY Pre-training Validation Loss

AMPLIFY Pre-training Validation Perplexity

Final Perplexities by Model Size

Model Size    Perplexity at 1M Steps
120M          4.23
350M          3.05
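Language model perplexity is conventionally the exponential of the mean per-token cross-entropy loss. Assuming that convention holds here, the reported perplexities can be converted back to loss values (in nats per token):

```python
import math

# Assuming perplexity = exp(mean cross-entropy loss per token),
# recover the loss implied by a reported perplexity.
def loss_from_perplexity(ppl: float) -> float:
    return math.log(ppl)

loss_120m = loss_from_perplexity(4.23)  # nats/token for the 120M model
loss_350m = loss_from_perplexity(3.05)  # nats/token for the 350M model
```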