Blog Posts (on matharena.ai)

MathArena Apex: Unconquered Final-Answer Problems
With Flying Colors: Language Models Ace IMC 2025
Not Even Bronze: Evaluating LLMs on IMO 2025

Publications

2025

MathConstruct: Challenging LLM Reasoning with Constructive Proofs
Mislav Balunović*, Jasper Dekoninck*, Nikola Jovanović, Ivo Petrov, Martin Vechev
ICML 2025 (* equal contribution)
The Open Proof Corpus: A Large-Scale Study of LLM-Generated Mathematical Proofs
Jasper Dekoninck, Ivo Petrov, Kristian Minchev, Mislav Balunović, Martin Vechev, Miroslav Marinov, Maria Drencheva, Lyuba Konova, Milen Milenov Shumanov, Kaloyan Tsvetkov, Nikolay Drenchev, Lazar D. Todorov, Kalina Nikolova, Nikolay Georgiev, Vanesa Kalinkova, Margulan Ismoldayev
arXiv 2025
MathArena: Evaluating LLMs on Uncontaminated Math Competitions
Mislav Balunović, Jasper Dekoninck, Nikola Jovanović, Ivo Petrov, Martin Vechev
arXiv 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov, Jasper Dekoninck, Lyuben Baltadzhiev, Maria Drencheva, Kristian Minchev, Mislav Balunović, Nikola Jovanović, Martin Vechev
arXiv 2025

2024

Constraint-Based Synthetic Data Generation for LLM Mathematical Reasoning
Timofey Fedoseev, Dimitar I. Dimitrov, Timon Gehr, Martin Vechev
Workshop on Mathematical Reasoning, NeurIPS 2024