Ideas in Motion

A visual journey through my research: figures, results, and moments from the work. Much of this grew out of collaborations with brilliant peers, and I've learned as much from those people as from the research itself. For a full list of publications, visit my Google Scholar.

Representation Learning Enables Scalable Multitask Deep Reinforcement Learning
Preprint 2026

Paper
Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions
Preprint 2026

Paper
Stable Deep RL via Isotropic Gaussian Representations
ICML 2026 Spotlight

Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

Paper
Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
ICLR 2026

Paper
Asymmetric Proximal Policy Optimization
ICLR 2026

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Paper
Grounding Computer Use Agents on Human Demonstrations
ICLR 2026

Paper
Tricks or Traps? A deep dive into RL for LLM reasoning
ICLR 2026

Paper
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation
ICLR 2026

Paper
A mechanistic analysis of looped reasoning language models
Preprint 2026

Paper
Do Enterprise Systems Need Learned World Models?
Preprint 2026

Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics

Paper
Stable Gradients for Stable Learning at Scale in Deep RL
NeurIPS 2025 Spotlight

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Paper
The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep RL
ICML 2025

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

Paper
The Impact of On-Policy Parallelized Data Collection on Deep RL Networks
ICML 2025

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

Paper
Mitigating Plasticity Loss in Continual RL by Reducing Churn
ICML 2025

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

Paper
Measure gradients, not activations! Enhancing neuronal activity in deep RL
NeurIPS 2025

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

Paper
Generating Creative Chess Puzzles
NeurIPS 2025

Paper
Evaluating In Silico Creativity: An Expert Review of AI Chess Compositions
NeurIPS 2025

Paper
Trajectory balance with asynchrony
NeurIPS 2025

Trajectory balance with asynchrony: Decoupling exploration and learning for fast, scalable LLM post-training

Paper
Include: Evaluating multilingual language understanding with regional knowledge
ICLR 2025 Spotlight

Paper
All languages matter: Evaluating LMMs on culturally diverse 100 languages
CVPR 2025 Spotlight

Paper
Neuroplastic Expansion in Deep Reinforcement Learning
ICLR 2025

Paper
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
ICLR 2025 Spotlight

Paper
Adaptive Computation Pruning for the Forgetting Transformer
COLM 2025

Paper
Learning structured spatiotemporal tasks with xLSTM under uncertainty
Workshop 2025

Learning structured spatiotemporal tasks with xLSTM under uncertainty: A multi-task approach

Paper
Recursive self-aggregation unlocks deep thinking in large language models
Preprint 2025

Paper
A Comedy of Estimators: On KL Regularization in RL Training of LLMs
Preprint 2025

Paper
Mixtures of Experts Unlock Parameter Scaling for Deep RL
ICML 2024 Spotlight

Paper
In value-based deep RL, a pruned network is a good network
ICML 2024

Paper
On the consistency of hyper-parameter selection in value-based deep RL
RLC 2024

Paper
JaxPruner: A concise library for sparsity research
CPAL 2024

Paper
Mixture of Experts in a Mixture of RL settings
RLC 2024

Paper
Small batch deep reinforcement learning
NeurIPS 2023

Paper
Bigger, Better, Faster: Human-level Atari with human-level efficiency
ICML 2023

Paper
Probabilistic multi-modal depth estimation based on camera-LiDAR sensor fusion
Journal 2023

Probabilistic multi-modal depth estimation based on camera–LiDAR sensor fusion

Paper
Revisiting Rainbow: Promoting more insightful and inclusive deep RL research
ICML 2021

Paper
Lifting the veil on hyper-parameters for value-based deep RL
Workshop 2021

Lifting the veil on hyper-parameters for value-based deep reinforcement learning

Paper
Quantification of operating reserves with high penetration of wind power
Journal 2020

Quantification of operating reserves with high penetration of wind power considering extreme values

Paper
Exploiting the potential of deep RL for classification tasks
Workshop 2019

Exploiting the potential of deep RL for classification tasks in high-dimensional and unstructured data

Paper
Probabilistic Perception System for Object Classification Based on Camera-LiDAR Sensor Fusion
Workshop 2019

Paper
An integrated OPF dispatching model with wind power and demand response
Journal 2019

An integrated OPF dispatching model with wind power and demand response for day-ahead markets

Paper
Divide and conquer: An accurate machine learning algorithm to process split videos
Workshop 2019

Divide and conquer: An accurate machine learning algorithm to process split videos on a parallel processing infrastructure

Paper
Evaluación del Rendimiento de Modulos Solares Híbridos
IEEE 2018

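Evaluación del Rendimiento de Modulos Solares Híbridos (FV/T) Para el Abastecimiento Energético de Autoclaves Hospitalarias [Performance Evaluation of Hybrid Solar Modules (PV/T) for the Energy Supply of Hospital Autoclaves]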

Paper
Impacts of demand response under wind power uncertainty in network-constrained electricity markets
IEEE 2018

Paper
Quantification of operating reserves with wind power in day-ahead dispatching
IEEE 2018

Paper
Network Topological Notions for Power Systems Security Assessment
Journal 2018

Paper