A Pontryagin Perspective on Reinforcement Learning

Onno Eberhard · Claire Vernade · Michael Muehlebach

2025 Seventh Annual Learning for Dynamics & Control Conference (L4DC 2025)
Oral · Best Paper Nomination

Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning, in which a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman’s equation from dynamic programming, our work builds on Pontryagin’s principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, demonstrating remarkable performance compared to existing baselines.
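To make the open-loop idea concrete, the following is a minimal sketch of optimizing a fixed action sequence with a Pontryagin-style backward costate recursion. It is not one of the three algorithms from the paper: it assumes fully known linear dynamics x_{t+1} = A x_t + B u_t and a quadratic cost, and all names (`rollout`, `pontryagin_gradient`, `optimize_open_loop`, the matrices, the step size) are hypothetical choices made for illustration.

```python
import numpy as np

def rollout(A, B, x0, actions):
    """Simulate x_{t+1} = A x_t + B u_t and return the states x_0, ..., x_T."""
    xs = [x0]
    for u in actions:
        xs.append(A @ xs[-1] + B @ u)
    return np.array(xs)

def total_cost(xs, actions, Q, R):
    """Quadratic cost: 0.5 * sum_t x_t' Q x_t (t = 0..T) + 0.5 * sum_t u_t' R u_t."""
    state_cost = 0.5 * sum(x @ Q @ x for x in xs)
    action_cost = 0.5 * sum(u @ R @ u for u in actions)
    return state_cost + action_cost

def pontryagin_gradient(A, B, Q, R, xs, actions):
    """Gradient of the cost with respect to every action, via a backward costate pass."""
    T = len(actions)
    grad = np.zeros_like(actions)
    lam = Q @ xs[T]  # terminal costate lambda_T
    for t in reversed(range(T)):
        # Here lam holds lambda_{t+1}; the gradient at step t is R u_t + B' lambda_{t+1}.
        grad[t] = R @ actions[t] + B.T @ lam
        # Costate recursion: lambda_t = Q x_t + A' lambda_{t+1}.
        lam = Q @ xs[t] + A.T @ lam
    return grad

def optimize_open_loop(A, B, Q, R, x0, T, lr=0.05, iters=500):
    """Plain gradient descent on the action sequence itself (no state feedback)."""
    actions = np.zeros((T, B.shape[1]))
    for _ in range(iters):
        xs = rollout(A, B, x0, actions)
        actions -= lr * pontryagin_gradient(A, B, Q, R, xs, actions)
    return actions

# Illustrative double-integrator setup (assumed values, not taken from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
x0 = np.array([1.0, 0.0])

actions = optimize_open_loop(A, B, Q, R, x0, T=20)
print(total_cost(rollout(A, B, x0, actions), actions, Q, R))
```

The learned object here is the sequence `actions` rather than a state-dependent policy, which is the distinction the abstract draws between open-loop and closed-loop approaches. The backward pass computes the exact gradient for this linear-quadratic toy problem only; handling unknown or nonlinear dynamics is outside the scope of this sketch.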

@inproceedings{eberhard-2025-pontryagin,
  title = {A Pontryagin Perspective on Reinforcement Learning},
  author = {Eberhard, Onno and Vernade, Claire and Muehlebach, Michael},
  booktitle = {Proceedings of the Seventh Annual Learning for Dynamics \& Control Conference},
  pages = {233--244},
  year = {2025},
  series = {Proceedings of Machine Learning Research},
  volume = {283},
  url = {https://proceedings.mlr.press/v283/eberhard25a.html}
}

PDF · PMLR · Code · Slides

Also presented at the ICML 2024 Workshop on Foundations of Reinforcement Learning and Control (OpenReview).

The video below shows the open-loop behavior learned by our model-free method on two MuJoCo tasks.
