Geneformer (TransformerEngine-optimized)
This version of the Geneformer model is optimized with NVIDIA's TransformerEngine library. It is based on the original Geneformer model from Theodoris et al. and has identical weights and outputs to within numerical precision.
Geneformer is a foundational transformer model pretrained on a large-scale corpus of single-cell transcriptomes spanning a broad range of human tissues. It is suitable for fine-tuning on a wide range of tasks that take gene expression data as input. For detailed information on the model architecture and training data, please refer to the accompanying paper. You may also be interested in the documentation and examples, which demonstrate how to fine-tune Geneformer models on your tasks of interest.
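As described in the Geneformer paper, each cell is represented not by raw counts but by a rank value encoding: genes are ordered by their (normalized) expression in that cell, and the model consumes the resulting sequence of gene tokens. The sketch below illustrates the ranking step only; it omits the normalization against per-gene medians used in the real preprocessing pipeline, and the gene names and counts are invented for illustration.

```python
import numpy as np

# Hypothetical expression vector for one cell; gene names and counts
# are illustrative, not drawn from the real training corpus.
gene_ids = np.array(["TP53", "GAPDH", "ACTB", "MYC", "CD3E"])
counts = np.array([5.0, 120.0, 80.0, 0.0, 15.0])

def rank_value_encode(gene_ids, counts):
    """Order genes by expression (highest first), dropping unexpressed genes.

    This yields the rank-ordered sequence of gene tokens that a
    Geneformer-style model takes as input, in place of raw counts.
    """
    expressed = counts > 0
    order = np.argsort(-counts[expressed])
    return list(gene_ids[expressed][order])

tokens = rank_value_encode(gene_ids, counts)
# tokens == ["GAPDH", "ACTB", "CD3E", "TP53"]
```

Because only the rank order matters, the encoding is invariant to the overall sequencing depth of the cell, which is one motivation for this input representation.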
Several Geneformer checkpoints of varying sizes are available on the Hub. Larger models generally achieve somewhat better accuracy but require substantially more memory and time to train:
| Checkpoint name | Parameters | Input size (tokens) | Vocabulary | Training data |
|---|---|---|---|---|
| Geneformer-V1-10M | 10M | 2048 | ~25K genes | ~30M cells |
| Geneformer-V2-104M | 104M | 4096 | ~20K genes | ~104M cells |
| Geneformer-V2-316M | 316M | 4096 | ~20K genes | ~104M cells |
| Geneformer-V2-104M_CLcancer | 104M | 4096 | ~20K genes | ~104M + 14M cancer cells |
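The input size in the table is the maximum number of gene tokens per cell a checkpoint can attend to, so cells with more expressed genes must be truncated to the top-ranked genes. A minimal sketch of that bookkeeping, using the sizes from the table (the helper name and dictionary are illustrative, not part of any released API):

```python
# Maximum input length (gene tokens per cell) for each checkpoint,
# taken from the table above.
MAX_INPUT_SIZE = {
    "Geneformer-V1-10M": 2048,
    "Geneformer-V2-104M": 4096,
    "Geneformer-V2-316M": 4096,
    "Geneformer-V2-104M_CLcancer": 4096,
}

def truncate_cell(ranked_gene_tokens, checkpoint):
    """Keep only the top-ranked genes that fit the checkpoint's input window."""
    limit = MAX_INPUT_SIZE[checkpoint]
    return ranked_gene_tokens[:limit]

cell = list(range(3000))  # a cell with 3000 ranked gene tokens
short = truncate_cell(cell, "Geneformer-V1-10M")   # truncated to 2048
full = truncate_cell(cell, "Geneformer-V2-104M")   # fits within 4096
```

Because tokens are rank-ordered, truncation discards only the lowest-expressed genes, which is why the larger 4096-token window of the V2 checkpoints can capture more of each cell's transcriptome.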