Ideas in Motion

A visual journey through my research: figures, results, and moments from the work. Much of this grew out of collaborations with brilliant peers, and I've learned as much from those people as from the research itself. For a full list of publications, see my Google Scholar profile.

Representation Learning Enables Scalable Multitask Deep Reinforcement Learning

Preprint 2026

Paper

Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions

Preprint 2026

Paper

Stable Deep RL via Isotropic Gaussian Representations

ICML 2026 Spotlight

Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

Paper

Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents

ICLR 2026

Paper

Asymmetric Proximal Policy Optimization

ICLR 2026

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Paper

Grounding Computer Use Agents on Human Demonstrations

ICLR 2026

Paper

Tricks or Traps? A deep dive into RL for LLM reasoning

ICLR 2026

Paper

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

ICLR 2026

Paper

A mechanistic analysis of looped reasoning language models

Preprint 2026

Paper

Do Enterprise Systems Need Learned World Models?

Preprint 2026

Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics

Paper

Stable Gradients for Stable Learning at Scale in Deep RL

NeurIPS 2025 Spotlight

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Paper

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep RL

ICML 2025

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

Paper

The Impact of On-Policy Parallelized Data Collection on Deep RL Networks

ICML 2025

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

Paper

Mitigating Plasticity Loss in Continual RL by Reducing Churn

ICML 2025

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

Paper

Measure gradients, not activations! Enhancing neuronal activity in deep RL

NeurIPS 2025

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

Paper

Generating Creative Chess Puzzles

NeurIPS 2025

Paper

Evaluating In Silico Creativity: An Expert Review of AI Chess Compositions

NeurIPS 2025

Paper

Trajectory balance with asynchrony

NeurIPS 2025

Trajectory balance with asynchrony: Decoupling exploration and learning for fast, scalable LLM post-training

Paper

Include: Evaluating multilingual language understanding with regional knowledge

ICLR 2025 Spotlight

Paper

All languages matter: Evaluating LMMs on culturally diverse 100 languages

CVPR 2025 Spotlight

Paper

Neuroplastic Expansion in Deep Reinforcement Learning

ICLR 2025

Paper

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

ICLR 2025 Spotlight

Paper

Adaptive Computation Pruning for the Forgetting Transformer

COLM 2025

Paper

Learning structured spatiotemporal tasks with xLSTM under uncertainty

Workshop 2025

Learning structured spatiotemporal tasks with xLSTM under uncertainty: A multi-task approach

Paper

Recursive self-aggregation unlocks deep thinking in large language models

Preprint 2025

Paper

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Preprint 2025

Paper

Mixtures of Experts Unlock Parameter Scaling for Deep RL

ICML 2024 Spotlight

Paper

In value-based deep RL, a pruned network is a good network

ICML 2024

Paper

On the consistency of hyper-parameter selection in value-based deep RL

RLC 2024

Paper

JaxPruner: A concise library for sparsity research

CPAL 2024

Paper

Mixture of Experts in a Mixture of RL settings

RLC 2024

Paper

Small batch deep reinforcement learning

NeurIPS 2023

Paper

Bigger, Better, Faster: Human-level Atari with human-level efficiency

ICML 2023

Paper

Probabilistic multi-modal depth estimation based on camera-LiDAR sensor fusion

Journal 2023

Paper

Revisiting Rainbow: Promoting more insightful and inclusive deep RL research

ICML 2021

Paper

Lifting the veil on hyper-parameters for value-based deep RL

Workshop 2021

Lifting the veil on hyper-parameters for value-based deep reinforcement learning

Paper

Quantification of operating reserves with high penetration of wind power

Journal 2020

Quantification of operating reserves with high penetration of wind power considering extreme values

Paper

Exploiting the potential of deep RL for classification tasks

Workshop 2019

Exploiting the potential of deep RL for classification tasks in high-dimensional and unstructured data

Paper

Probabilistic Perception System for Object Classification Based on Camera-LiDAR Sensor Fusion

Workshop 2019

Paper

An integrated OPF dispatching model with wind power and demand response

Journal 2019

An integrated OPF dispatching model with wind power and demand response for day-ahead markets

Paper

Divide and conquer: An accurate machine learning algorithm to process split videos

Workshop 2019

Divide and conquer: An accurate machine learning algorithm to process split videos on a parallel processing infrastructure

Paper

Evaluación del Rendimiento de Módulos Solares Híbridos

IEEE 2018

Evaluación del Rendimiento de Módulos Solares Híbridos (FV/T) Para el Abastecimiento Energético de Autoclaves Hospitalarias

Paper

Impacts of demand response under wind power uncertainty in network-constrained electricity markets

IEEE 2018

Paper

Quantification of operating reserves with wind power in day-ahead dispatching

IEEE 2018

Paper

Network Topological Notions for Power Systems Security Assessment

Journal 2018

Paper