CRAN - Campus Sciences
BP 70239 - 54506 VANDOEUVRE Cedex
Tel: +33 (0)3 72 74 52 90
Ph.D. Project: Stability analysis of optimal control with infinite-horizon discounted cost
Dates : 2016/12/01 - 2019/11/30
Student: Mathieu GRANZOTTO
Manager(s) CRAN: Jamal DAAFOUZ , Romain POSTOYAN
Full reference: Optimal control algorithms abound in artificial intelligence (AI). Their purpose is to
generate sequences of control inputs for dynamical systems that minimize a given cost function,
which may model the energy of the system, for instance. These methods apply to large classes of
nonlinear discrete-time systems and have proved their efficiency in numerous applications.
Exploiting AI optimal control algorithms in control theory is therefore very promising. Nevertheless,
an important point remains to be clarified: stability. Indeed, these works concentrate on optimality
and most of the time ignore the stability of the controlled system.

The objective of the Ph.D. thesis is to study the stability of discrete-time nonlinear systems
controlled by such algorithms. The potential impact is significant as this would create a bridge
between AI and control theory. We will study infinite-horizon cost functions which are discounted,
meaning that the stage cost is weighted by an exponentially decreasing factor. This type of cost is
often considered in dynamic programming [B12], reinforcement learning [BBDSE10], and optimistic
planning [LV06], and is convenient for synthesis and optimality analysis. On the other hand,
the discount factor is a source of difficulties when investigating stability.
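
To make the role of the discount factor concrete, here is a minimal illustrative sketch in Python. It is not taken from [PBND14; PBND] or from the thesis. It assumes a scalar linear system x(k+1) = a*x(k) + b*u(k) with the discounted cost sum over k of gamma^k * (q*x(k)^2 + r*u(k)^2), where gamma in (0, 1) is the discount factor. The symbols a, b, q, r, gamma and both function names are hypothetical choices made for this example. In this special case the optimal value function has the form p*x^2 and can be computed by value iteration; the closed-loop pole a - b*k then indicates whether the optimal input stabilizes the system.

```python
def discounted_lq_gain(a, b, q, r, gamma, iters=10_000, tol=1e-12):
    """Value iteration for the scalar discounted linear-quadratic problem.

    Assumed setting (illustrative only):
        x(k+1) = a*x(k) + b*u(k)
        cost   = sum_k gamma**k * (q*x(k)**2 + r*u(k)**2)
    Returns (p, k) where the optimal value function is p*x**2 and the
    optimal feedback is u = -k*x.
    """
    p = 0.0
    for _ in range(iters):
        # Bellman update after minimizing over u in closed form.
        p_next = q + gamma * a * a * r * p / (r + gamma * b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = gamma * a * b * p / (r + gamma * b * b * p)
    return p, k


def closed_loop_pole(a, b, q, r, gamma):
    """Pole of the closed loop x(k+1) = (a - b*k)*x(k) under the optimal gain."""
    _, k = discounted_lq_gain(a, b, q, r, gamma)
    return a - b * k


if __name__ == "__main__":
    # Open-loop unstable plant (a = 2); all values are assumptions.
    for gamma in (0.1, 0.5, 0.99):
        pole = closed_loop_pole(2.0, 1.0, 1.0, 1.0, gamma)
        verdict = "stable" if abs(pole) < 1.0 else "unstable"
        print(f"gamma={gamma}: closed-loop pole={pole:.3f} ({verdict})")
```

With the assumed values a = 2 and b = q = r = 1, the optimal closed loop is unstable for a small discount factor such as 0.1 and stable for one close to 1 such as 0.99, although the discounted cost is finite in both cases. This is the kind of gap between optimality and stability that the project targets.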

We have recently proposed a general approach, based on Lyapunov theory, to analyze the stability
of nonlinear systems when the control input is optimal; see [PBND14; PBND]. The candidate will
have to extend these results to the case where the sequence of control inputs is generated by a
near-optimal algorithm. We will focus on specific algorithms, such as the one in [M14]. We will
then study the impact of stability on optimality, and how stability can be used to improve the
algorithms in terms of computation time, as well as the possibility to relax the required

The Ph.D. thesis will be supervised by Prof. Jamal Daafouz and Dr. Romain Postoyan from CRAN
(Université de Lorraine, CNRS, UMR 7039 in Nancy, France), and the work will be done in
collaboration with Lucian Busoniu (Cluj-Napoca Technical University, Romania) and Dragan Nesic
(The University of Melbourne, Australia).

[B12] D. P. Bertsekas, “Dynamic Programming and Optimal Control”, volume 2, 4th edition,
Athena Scientific, Belmont, U.S.A., 2012.
[BBDSE10] L. Busoniu, R. Babuska, B. De Schutter, and D. Ernst, “Reinforcement Learning and
Dynamic Programming Using Function Approximators”, Automation and Control Engineering. Taylor
& Francis CRC Press, 2010.
[LV06] S. M. LaValle, “Planning Algorithms”, Cambridge University Press, New York, U.S.A., 2006.
[M14] R. Munos, “The optimistic principle applied to games, optimization and planning: towards
foundations of Monte-Carlo tree search”, Foundations and Trends in Machine Learning,
7(1):1–130, 2014.
[PBND14] R. Postoyan, L. Busoniu, D. Nesic and J. Daafouz, “Stability of infinite-horizon optimal
control with discounted cost”, CDC (IEEE Conference on Decision and Control), Los Angeles: U.S.A.,
[PBND] R. Postoyan, L. Busoniu, D. Nesic and J. Daafouz, “A comprehensive stability analysis of
infinite-horizon optimal control with discounted cost”, submitted for journal publication.
Keywords: Stability, Lyapunov, nonlinear systems, optimal control, artificial intelligence
Automatic Control-Identification Diagnosis