Title: Contributions to Safe Reinforcement Learning and Degradation Tolerant Control Design
Abstract: This thesis develops an off-policy safe Reinforcement Learning (RL) approach for the regulation and tracking problems of continuous-time nonlinear systems that are affine in the control input. A novel approach is proposed that ensures system stability and safety during all phases: initialization, exploration, and exploitation. During the initialization and exploration phases, stability and safety are guaranteed through quadratic programming with a control Lyapunov function (CLF) and a control barrier function (CBF). During exploitation, the safety of the learned policy is ensured by augmenting the cost function with reciprocal CBFs, thereby balancing performance optimization against safety. In addition, this thesis addresses actuator degradation by introducing an RL-based degradation-tolerant controller. The objectives are twofold: ensuring system stability despite degradation, and decelerating the degradation rate so as to complete missions and extend actuator life. This is achieved by imposing constraints on degradation rates using CBFs. Finally, a cyclic off-policy algorithm is developed, enabling iterative exploration and exploitation across multiple learning cycles. This allows continuous updates of the neural network weights with recent information on degradation levels, ensuring that the learned policy effectively stabilizes the system while accounting for degradation effects.
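The CLF-CBF quadratic program mentioned in the abstract can be illustrated on a toy problem. The sketch below is not taken from the thesis: it assumes a scalar single-integrator system (x_dot = u) with the hypothetical barrier h(x) = x_max - x, for which the QP-based CBF safety filter min (u - u_nom)^2 subject to h_dot + alpha*h >= 0 admits a closed-form solution.

```python
def cbf_filter(u_nom, x, x_max=1.0, alpha=5.0):
    # Barrier h(x) = x_max - x defines the safe set {x : h(x) >= 0}.
    # For x_dot = u, the CBF condition h_dot + alpha*h >= 0 reduces
    # to -u + alpha*(x_max - x) >= 0, i.e. u <= alpha*h, so the QP
    # min (u - u_nom)^2  s.t.  u <= alpha*h  has the closed form:
    h = x_max - x
    return min(u_nom, alpha * h)

def simulate(x0=0.0, u_nom=2.0, dt=0.01, steps=500):
    # Forward-Euler rollout of the filtered closed loop: the nominal
    # (unsafe) input u_nom would drive x past x_max; the filter
    # overrides it only near the boundary of the safe set.
    traj = [x0]
    x = x0
    for _ in range(steps):
        x = x + dt * cbf_filter(u_nom, x)
        traj.append(x)
    return traj

traj = simulate()
print(max(traj))  # remains strictly below x_max = 1.0
```

In higher dimensions the same constraint is affine in u, so the filter is a standard QP solvable online; the thesis combines it with a CLF constraint to certify stability as well as safety during exploration.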
Reviewers:
– Antoine GIRARD (Professor, Université Paris Saclay)
– Bayu JAYAWARDHANA (Professor, Engineering and Technology Institute Groningen)
Examiners:
– Dalil ICHALAL (Professor, Université d’Evry)
– Bahare KIUMARSI (Assistant Professor, Michigan State University)
– Kyriakos VAMVOUDAKIS (Professor, Georgia Institute of Technology)
Supervisors:
– Mayank Shekhar JHA (Maître de Conférences, Université de Lorraine)
– Didier THEILLIOL (Professor, Université de Lorraine)