Model 120B Fine-tune

Training Loss: 0.0234
Validation Loss: 0.0312
Learning Rate: 2.4e-5
GPU Memory: 76.2 GB / 80 GB
Throughput: 1,842 tokens/s
Checkpoint Evaluations: 0.031, 0.029, 0.048