Infrastructure

Model: 70B Fine-tune

Training: Epoch 14 / 20 (ETA: 4h 23m)

Training Loss: 0.0234 (-42% per epoch)

Validation Loss: 0.0312 (+2% per epoch)

Learning Rate: 2.4e-5

GPU Memory: 76.2 GB / 80 GB

Throughput: 1,842 tokens/s

Checkpoint Evaluations

Checkpoint  Epoch  Train Loss  Val Loss  BLEU Score  Status
ckpt-014    14     0.0234      0.031     42.8        Current
ckpt-010    10     0.0298      0.029     41.2        Best
ckpt-005    5      0.0512      0.048     35.6        Saved
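
The dashboard does not state how the "Best" status is assigned. One reading consistent with the rows above is that it marks the checkpoint with the lowest validation loss, since ckpt-010 carries that label even though ckpt-014 has the higher BLEU score. The sketch below illustrates that assumed rule in Python; the `Checkpoint` class and the helper names are hypothetical and are not taken from the dashboard or any particular training framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """One row of the checkpoint table (hypothetical container)."""
    name: str
    epoch: int
    train_loss: float
    val_loss: float
    bleu: float


# Values copied from the Checkpoint Evaluations table.
CHECKPOINTS = [
    Checkpoint("ckpt-014", 14, 0.0234, 0.031, 42.8),
    Checkpoint("ckpt-010", 10, 0.0298, 0.029, 41.2),
    Checkpoint("ckpt-005", 5, 0.0512, 0.048, 35.6),
]


def best_by_val_loss(checkpoints):
    """Assumed selection rule: the lowest validation loss wins."""
    return min(checkpoints, key=lambda c: c.val_loss)


def best_by_bleu(checkpoints):
    """Alternative rule for comparison: the highest BLEU score wins."""
    return max(checkpoints, key=lambda c: c.bleu)


if __name__ == "__main__":
    print("Lowest val loss:", best_by_val_loss(CHECKPOINTS).name)
    print("Highest BLEU:   ", best_by_bleu(CHECKPOINTS).name)
```

The two rules disagree on this data: validation loss selects ckpt-010, while BLEU selects ckpt-014. Which criterion applies depends on how the run is configured, so treat the first function as an assumption about the dashboard rather than a description of it.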