About Me
Hi, I'm Mark! 👋
I am about to start my PhD at EPFL, where I'll be part of the Theory of Machine Learning Lab, advised by Prof. Nicolas Flammarion. Previously, I completed my Master's at Saarland University, where I worked with Prof. Michael Hahn.
I am amazed by the unreasonable effectiveness of Deep Learning. Somehow, it seems that you can get incredibly capable models by throwing data at gradient descent at scale (plus a lot of engineering). This felt like magic to me when I first started studying ML, and it still does. Seeing through the spell is the main goal of my research: I want to understand how neural networks (and especially LLMs, arguably the most impressive of them) work so well.
More concretely, I am currently interested in expressivity (what can models express in principle?), interpretability (what do they learn in practice?), and training dynamics (how and why do they learn it?). Two recent projects in this spirit: one on the limits of Chain-of-Thought expressivity, and another on the mechanisms of feature development in LLMs.
When I'm not staring at my Weights & Biases dashboards, I enjoy a range of (rather stereotypical) hobbies: reading books, traveling by train, by bike, and on foot, and playing board games. Recently I also got into trail running, and I'm very excited about challenging myself on the alpine trails of Switzerland.
I'm always happy to meet new friends and collaborators! Feel free to drop me an email or message me on X about anything, and if you're in Lausanne, let's grab coffee ☕️☕️
News
September 2025 — Starting a new chapter: a PhD at EPFL! I will do my first semester project in the Theory of Machine Learning Lab.
August 2025 — I've defended my Master's thesis: "The Mechanisms of Learning World Models in Self-Supervised Transformers." So now I officially hold an MSc degree!
July 2025 — We have a short paper with preliminary results from our new project: "On the Emergence of 'Useless' Features in Next Token Predictors." It was presented as a spotlight paper at the ICML 2025 Workshop on Assessing and Evaluating World Models!
February 2025 — In our new preprint, we study how long chains of thought need to be for transformers to solve different algorithmic problems. Update: accepted to ICML 2025!
August 2024 — Honored and excited to receive an ACL 2024 Best Paper Award for "Why are Sensitive Functions Hard for Transformers?"!
May 2024 — Our paper "Why are Sensitive Functions Hard for Transformers?" was accepted to ACL 2024! My talk about it at the FLaNN seminar: link. Twitter thread: link.
April 2024 — Had a great time at ALPS 2024 winter school! Magnificent mountains, great lectures, and insightful discussions.
February 2024 — A new preprint is out! We give a theoretical explanation for why Transformers struggle to learn sensitive functions such as Parity, and we present empirical evidence supporting our reasoning. arXiv link
November 2023 — I am joining the Department of Language Science and Technology at Saarland University as a research assistant! I will be working on LLM interpretability under the supervision of Michael Hahn.
October 2023 — I have arrived in Saarbrücken to start my Master's studies at Saarland University. Extremely happy to finally be here!
June 2023 — Today I successfully defended my Bachelor's thesis! I now officially hold a B.S. in Computer Science.
May 2023 — I took part in EACL 2023, held in scenic Dubrovnik, Croatia. A big thanks to all the organizers for their excellent work!
January 2023 — Excited to share that our paper "Vote’n’Rank: Revision of Benchmarking with Social Choice Theory" got accepted to EACL 2023!
November 2022 — This month, I have started working as a Large Language Model Developer at Yandex.