✦ Magnus Saebo

MSCS @ Columbia University · AI & Agentic Systems

About

Greetings, traveler. I am a Computer Science master's student at Columbia University and a Research Fellow at the Supervised Program for Alignment Research. My work focuses on the performance and alignment of agentic AI systems: how small models can become expert coders, how they drift from their goals under pressure, and how they can be optimized at test time.

Before Columbia, I spent nearly three years as a Machine Learning Engineer at Leidos, working on model compression for edge hardware, distributed learning security, and MLOps. I started my journey at Cornell University, where I earned a B.A. in Mathematics and Computer Science.

Magnus Saebo
Class MSCS '26
Guild Columbia University
Specialty AI & Agentic Systems

Publications

SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning
J. Peng*, M. Saebo*, T. Zhong, Y. Cheng, J. Yang, B. Ray, S. Chen, Y. Ding
Proposes Repository-Centric Learning, a paradigm that prioritizes deep repository understanding over broad task exposure, enabling compact SWE-Spot-4B models to outperform larger open-weight alternatives and match efficiency-focused commercial models with fewer training samples and lower inference costs.
Asymmetric Goal Drift in Coding Agents Under Value Conflict
M. Saebo, S. Gibson, T. Crosse, A. Menon, E. Jang, D. Cruz
Introduces a framework built on OpenCode to measure how coding agents violate explicit system-prompt constraints over time. Demonstrates that agents drift asymmetrically, more readily violating constraints that oppose strongly held values such as security, with drift driven by value alignment, adversarial pressure, and accumulated context.
Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals
A. Menon, M. Saebo, T. Crosse, S. Gibson, E. Jang, D. Cruz
Investigates goal drift in state-of-the-art LM agents within a simulated stock-trading environment. Shows that while models are largely robust to direct adversarial pressure, they inherit drift when conditioned on prefilled trajectories from weaker agents, with only GPT-5.1 maintaining consistent resilience.
Duel-Evolve: Pairwise Preference Black-Box Optimization of LLM Responses
S. Karlekar*, C. Zheng*, N. Beltran-Velez*, M. Saebo*, S. Yu, M. Kucer, J. Bowlan, D. Blei
An evolutionary optimizer that replaces external scalar rewards with pairwise preferences from the same LLM, aggregated via a Bayesian Bradley-Terry model. Achieves 20% higher accuracy on MathBench and a 13% improvement on LiveCodeBench over existing iterative methods, while requiring no reward model or ground-truth labels.
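For the curious, here is a minimal sketch of the pairwise-preference aggregation idea: candidate "strengths" are fit to judge duels with a Bradley-Terry model, using a MAP point estimate with a Gaussian prior as a stand-in for the paper's full Bayesian treatment. This is illustrative only, not the Duel-Evolve implementation; the duel outcomes and hyperparameters below are made up.

```python
# Bradley-Terry aggregation of pairwise preferences (illustrative sketch).
# Model: P(i beats j) = sigmoid(s_i - s_j); fit strengths s by MAP with a
# Gaussian prior via gradient ascent.
import numpy as np

def fit_bradley_terry(n_candidates, comparisons, prior_var=1.0, lr=0.1, steps=500):
    """comparisons: list of (winner, loser) index pairs, e.g. from an LLM judge."""
    s = np.zeros(n_candidates)
    for _ in range(steps):
        grad = -s / prior_var  # gradient of the log Gaussian prior
        for w, l in comparisons:
            p = 1.0 / (1.0 + np.exp(-(s[w] - s[l])))  # P(w beats l)
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        s += lr * grad
    return s  # higher strength = preferred more often under the model

# Toy usage: candidate 2 wins most duels, so it receives the highest strength.
duels = [(2, 0), (2, 1), (1, 0), (2, 0)]
print(fit_bradley_terry(3, duels))
```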
BlindFL: Segmented Federated Learning with Fully Homomorphic Encryption
E. Gronberg*, L. d'Aliberti*, M. Saebo*, A. Hook
A federated learning framework where clients encrypt and send subsets of local model updates using fully homomorphic encryption, providing protection against gradient inversion attacks while reducing computational overhead.
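A toy sketch of the segmentation idea follows: each client shares only its assigned slice of the model update, encrypted before it leaves the device. This is not the BlindFL protocol itself; the segmentation scheme is illustrative, and encrypt_vector is a placeholder stub standing in for a real FHE library (e.g., a CKKS implementation).

```python
# Segmented, encrypted update sharing (illustrative sketch, not BlindFL).
import numpy as np

def encrypt_vector(values):
    # Placeholder: a real deployment would call an FHE library here.
    return {"ciphertext": values.copy()}

def client_payload(local_update, client_id, num_segments):
    """Select this client's segment of the flattened update and encrypt it."""
    segments = np.array_split(np.arange(local_update.size), num_segments)
    idx = segments[client_id % num_segments]
    return idx, encrypt_vector(local_update[idx])

# Toy usage: three clients each contribute one encrypted segment of the update.
rng = np.random.default_rng(0)
update = rng.normal(size=12)
for cid in range(3):
    idx, ct = client_payload(update, cid, num_segments=3)
    print(cid, idx, ct["ciphertext"].round(2))
```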
Nonlinear Dynamical Systems are Scalable and Efficient Physical Neural Networks
M. Saebo*, T. King*, M. Anderson, T. Onodera, P. McMahon · Under review at npj Unconventional Computing
Derives physically constrained model compression and modular scaling techniques for oscillator-based neural networks, achieving a 3x improvement in parameter efficiency and higher accuracy than other physical neural network approaches.

Repos

Coming Soon

Resume