2025

A Simple Pipeline to Improve Math Reasoning Accuracy

This tutorial walks you through a simplified version of the pipeline that we used to win the AIMO2 Kaggle competition. We will start with the Qwen2.5-14B-Instruct model, which scores only ~10% on the AIME24 benchmark, and improve it to ~30% through a series of NeMo-Skills jobs.

If you’re following along, you’ll need access to either an NVIDIA DGX box with eight NVIDIA A100 (or newer) GPUs or a Slurm cluster with similarly configured nodes. Running all of the commands should take only ~2 hours.