Geneformer (TransformerEngine-optimized)
This version of the Geneformer model is optimized with NVIDIA's TransformerEngine library. It is based on the original Geneformer model from Theodoris et al. and has identical weights and outputs to within numerical precision.
Geneformer is a foundational transformer model pretrained on a large-scale corpus of single-cell transcriptomes spanning a broad range of human tissues. It is suitable for fine-tuning on a wide range of tasks that take gene expression data as input. For detailed information on the model architecture and training data, please refer to the accompanying paper. You may also be interested in the documentation and examples, which demonstrate how to fine-tune Geneformer models on your tasks of interest.
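As described in the Geneformer paper, each cell is represented not by raw counts but by a rank value encoding: genes are ordered by their (normalized) expression in that cell, and the model consumes the resulting sequence of gene tokens. The sketch below illustrates the ranking step only; it omits the normalization against per-gene medians used in the real preprocessing pipeline, and the gene names and counts are invented for illustration.

```python
import numpy as np

# Hypothetical expression vector for one cell; gene names and counts
# are illustrative, not drawn from the real training corpus.
gene_ids = np.array(["TP53", "GAPDH", "ACTB", "MYC", "CD3E"])
counts = np.array([5.0, 120.0, 80.0, 0.0, 15.0])

def rank_value_encode(gene_ids, counts):
    """Order genes by expression (highest first), dropping unexpressed genes.

    This yields the rank-ordered sequence of gene tokens that a
    Geneformer-style model takes as input, in place of raw counts.
    """
    expressed = counts > 0
    order = np.argsort(-counts[expressed])
    return list(gene_ids[expressed][order])

tokens = rank_value_encode(gene_ids, counts)
# tokens == ["GAPDH", "ACTB", "CD3E", "TP53"]
```

Because only the rank order matters, the encoding is invariant to the overall sequencing depth of the cell, which is one motivation for this input representation.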
Several Geneformer checkpoints of varying sizes are available on the Hub. Larger models generally achieve somewhat better accuracy but require substantially more memory and time to train:
| Checkpoint name | Parameters | Input size (tokens) | Vocabulary | Training data |
|---|---|---|---|---|
| Geneformer-V1-10M | 10M | 2048 | ~25K genes | ~30M cells |
| Geneformer-V2-104M | 104M | 4096 | ~20K genes | ~104M cells |
| Geneformer-V2-316M | 316M | 4096 | ~20K genes | ~104M cells |
| Geneformer-V2-104M_CLcancer | 104M | 4096 | ~20K genes | ~104M + 14M cancer cells |
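The input size in the table is the maximum number of gene tokens per cell a checkpoint can attend to, so cells with more expressed genes must be truncated to the top-ranked genes. A minimal sketch of that bookkeeping, using the sizes from the table (the helper name and dictionary are illustrative, not part of any released API):

```python
# Maximum input length (gene tokens per cell) for each checkpoint,
# taken from the table above.
MAX_INPUT_SIZE = {
    "Geneformer-V1-10M": 2048,
    "Geneformer-V2-104M": 4096,
    "Geneformer-V2-316M": 4096,
    "Geneformer-V2-104M_CLcancer": 4096,
}

def truncate_cell(ranked_gene_tokens, checkpoint):
    """Keep only the top-ranked genes that fit the checkpoint's input window."""
    limit = MAX_INPUT_SIZE[checkpoint]
    return ranked_gene_tokens[:limit]

cell = list(range(3000))  # a cell with 3000 ranked gene tokens
short = truncate_cell(cell, "Geneformer-V1-10M")   # truncated to 2048
full = truncate_cell(cell, "Geneformer-V2-104M")   # fits within 4096
```

Because tokens are rank-ordered, truncation discards only the lowest-expressed genes, which is why the larger 4096-token window of the V2 checkpoints can capture more of each cell's transcriptome.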