Talk: Optimizing the Learning Trajectory of Reinforcement Learning Agents

On Wednesday, April 22, at 11 a.m., in room 2077 of ICEx, Théo Vincent, a Ph.D. student at the Technical University of Darmstadt and at DFKI, the German institute for Artificial Intelligence, will give the talk “Optimizing the Learning Trajectory of Reinforcement Learning Agents”. Attendance is free and no prior registration is required.

Abstract: Reinforcement learning is a powerful tool for solving complex sequential decision-making problems. Most advances in reinforcement learning focus on designing a learning algorithm that remains fixed during the training process. In this talk, I will present a new perspective on the training process, in which the algorithm can change to adapt itself to the learning situation. I will then explain how this vision materializes by adapting the hyperparameters of the reinforcement learning agent to its learning pace. Next, I will present how the agent’s performance can be increased by anticipating its learning trajectory. The idea of optimizing the learning trajectory opens up new possibilities for designing reinforcement learning algorithms that achieve a satisfactory outcome in a single training run.

Bio: Théo Vincent is a Ph.D. student at the Technical University of Darmstadt and at DFKI, the German institute for Artificial Intelligence, where he works on off-policy Reinforcement Learning methods under the supervision of Jan Peters. Before his Ph.D., Théo graduated from the MVA program at ENS Paris-Saclay. He also worked on Computer Vision problems at a Parisian lab (Saint-Venant lab) and at a Swedish start-up (Signality), and did an internship in biostatistics at Harvard Medical School.
