Other benchmarks
More details are coming soon!
Supported benchmarks
arena-hard
Note
For now, we use the v1 implementation of arena-hard.
- Benchmark is defined in nemo_skills/dataset/arena-hard/__init__.py
- Original benchmark source is here.