Code
More details are coming soon!
Supported benchmarks
swe-bench
- Benchmark is defined in nemo_skills/dataset/swe-bench/__init__.py
- Original benchmark source is here.
livecodebench
- Benchmark is defined in nemo_skills/dataset/livecodebench/__init__.py
- Original benchmark source is here.
livecodebench-pro
- Benchmark is defined in nemo_skills/dataset/livecodebench-pro/__init__.py
- Original benchmark source is here.
human-eval
- Benchmark is defined in nemo_skills/dataset/human-eval/__init__.py
- Original benchmark source is here.
mbpp
- Benchmark is defined in nemo_skills/dataset/mbpp/__init__.py
- Original benchmark source is here.
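
Every benchmark listed on this page keeps its definition at the same relative location, nemo_skills/dataset/<benchmark name>/__init__.py. The sketch below only illustrates that naming pattern. The helper `benchmark_definition_path` and the `BENCHMARKS` list are hypothetical names introduced for this example and are not part of the nemo_skills package.

```python
from pathlib import PurePosixPath

# Benchmark names exactly as listed on this page.
BENCHMARKS = [
    "swe-bench",
    "livecodebench",
    "livecodebench-pro",
    "human-eval",
    "mbpp",
]


def benchmark_definition_path(name: str) -> PurePosixPath:
    """Return the repository-relative path of a benchmark definition.

    Hypothetical helper for illustration only; it is not a nemo_skills API.
    """
    if name not in BENCHMARKS:
        raise ValueError(f"Unknown code benchmark: {name!r}")
    return PurePosixPath("nemo_skills") / "dataset" / name / "__init__.py"


if __name__ == "__main__":
    # Print the definition file for each benchmark on this page.
    for benchmark in BENCHMARKS:
        print(f"{benchmark}: {benchmark_definition_path(benchmark)}")
```

The paths are relative to the repository root, so the helper can be joined with a local checkout directory to locate the files on disk.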