Blog Posts (on matharena.ai)

MathArena Apex: Unconquered Final-Answer Problems
With Flying Colors: Language Models Ace IMC 2025
Not Even Bronze: Evaluating LLMs on IMO 2025

Publications

2025

MathArena: Evaluating LLMs on Uncontaminated Math Competitions
Mislav Balunović, Jasper Dekoninck, Nikola Jovanović, Ivo Petrov, Martin Vechev
NeurIPS Datasets and Benchmarks 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov, Jasper Dekoninck, Lyuben Baltadzhiev, Maria Drencheva, Kristian Minchev, Mislav Balunović, Nikola Jovanović, Martin Vechev
AI4Math@ICML 2025
The Open Proof Corpus: A Large-Scale Study of LLM-Generated Mathematical Proofs
Jasper Dekoninck, Ivo Petrov, Kristian Minchev, Mislav Balunović, Martin Vechev, Miroslav Marinov, Maria Drencheva, Lyuba Konova, Milen Milenov Shumanov, Kaloyan Tsvetkov, Nikolay Drenchev, Lazar D. Todorov, Kalina Nikolova, Nikolay Georgiev, Vanesa Kalinkova, Margulan Ismoldayev
AI4Math@ICML 2025
MathConstruct: Challenging LLM Reasoning with Constructive Proofs
Mislav Balunović*, Jasper Dekoninck*, Nikola Jovanović, Ivo Petrov, Martin Vechev
ICML 2025 (* Equal contribution)

2024

Constraint-Based Synthetic Data Generation for LLM Mathematical Reasoning
Timofey Fedoseev, Dimitar I. Dimitrov, Timon Gehr, Martin Vechev
Workshop on Mathematical Reasoning, NeurIPS 2024