Ph.D. Project: Stability analysis of optimal control with infinite-horizon discounted cost
Dates:  2016/12/01 - 2019/11/30  
Student:  Mathieu GRANZOTTO  
Manager(s) CRAN:  Jamal DAAFOUZ, Romain POSTOYAN  
Full reference:  Artificial intelligence abounds in optimal control algorithms. Their purpose is to generate sequences of control inputs for dynamical systems that minimize a given cost function, which may model the energy of the system, for instance. These methods are applicable to large classes of nonlinear discrete-time systems and have proved their efficiency in numerous applications. Exploiting these AI optimal control algorithms in control theory is very promising. Nevertheless, an important point remains to be clarified: stability. Indeed, these works concentrate on optimality and most of the time ignore the stability of the controlled system.

The objective of the Ph.D. thesis is to study the stability of discrete-time nonlinear systems controlled by such algorithms. The potential impact is significant, as this would create a bridge between AI and control theory. We will study infinite-horizon cost functions that are discounted, meaning that the stage cost is weighted by an exponentially decreasing factor. This type of cost is often considered in dynamic programming [B12], reinforcement learning [BBDSE10], and optimistic planning [LV06], and is convenient for synthesis and optimality analysis. On the other hand, the discount factor is a source of difficulties when investigating stability.

We have recently proposed a general approach, based on Lyapunov theory, to analyze the stability of nonlinear systems when the control input is optimal; see [PBND14; PBND]. The candidate will have to extend these results to the case where the sequence of control inputs is generated by a near-optimal algorithm. We will focus on given algorithms, such as the one in [M14]. We will then study the impact of stability on optimality, how stability can be used to improve the algorithms in terms of computation time, and whether the required assumptions can be relaxed.

The Ph.D. thesis will be supervised by Prof. Jamal Daafouz and Dr. Romain Postoyan from CRAN (Université de Lorraine, CNRS, UMR 7039, Nancy, France), and the work will be done in collaboration with Lucian Busoniu (Cluj-Napoca Technical University, Romania) and Dragan Nesic (The University of Melbourne, Australia).

References
[B12] D. P. Bertsekas, “Dynamic Programming and Optimal Control”, volume 2, 4th edition, Athena Scientific, Belmont, U.S.A., 2012.
[BBDSE10] L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, “Reinforcement Learning and Dynamic Programming Using Function Approximators”, Automation and Control Engineering, Taylor & Francis CRC Press, 2010.
[LV06] S. M. LaValle, “Planning Algorithms”, Cambridge University Press, New York, U.S.A., 2006.
[M14] R. Munos, “The optimistic principle applied to games, optimization and planning: towards foundations of Monte-Carlo tree search”, Foundations and Trends in Machine Learning, 7(1):1–130, 2014.
[PBND14] R. Postoyan, L. Busoniu, D. Nesic, and J. Daafouz, “Stability of infinite-horizon optimal control with discounted cost”, IEEE Conference on Decision and Control (CDC), Los Angeles, U.S.A., 2014.
[PBND] R. Postoyan, L. Busoniu, D. Nesic, and J. Daafouz, “A comprehensive stability analysis of infinite-horizon optimal control with discounted cost”, submitted for journal publication.

Keywords:  Stability, Lyapunov, nonlinear systems, optimal control, artificial intelligence  
Department(s): 
