Date of Award
Doctor of Philosophy (PhD)
Mechanical and Civil Engineering
Hector M. Gutierrez
Anand Balu Nellippallil
This work investigates adaptive optimal control problems for continuous-time linear and nonlinear systems. A novel adaptive dynamic programming method, termed hybrid iteration (HI), is presented. In contrast to the existing adaptive dynamic programming methods, policy iteration (PI) and value iteration (VI), HI does not require an initial admissible control policy, as PI does; at the same time, HI converges to the optimal control policy in far fewer learning iterations and less CPU time than VI, owing to its guaranteed quadratic rate of convergence, which is much faster than the sub-linear rate of VI. The method is first investigated for continuous-time and discrete-time linear systems, and statistical analysis together with simulation results illustrates the effectiveness of the proposed HI algorithm. Afterwards, to demonstrate the practicality of the proposed HI method, a class of continuous-time nonlinear systems is also considered,
wherein it is shown and proved that, even in the nonlinear setting, the HI method still outperforms the PI and VI methods. In addition, this work is extended to multi-agent systems, where the HI method is used to learn the optimal control policy that solves the cooperative optimal output regulation problem. The practicality of the HI method is further demonstrated by applying it to islanded modern microgrids with inverter-based resources (IBRs). Moreover, since this work is mainly concerned with adaptive optimal control in information-limited environments, the robust optimal control problem for a class of continuous-time, partially linear, interconnected systems is also considered. In addition to the dynamic uncertainties resulting from the interconnected dynamics, unknown bounded disturbances are taken into account throughout the learning process, with both the system dynamics and the disturbances assumed unknown. These challenges render the online data, collected from trajectories of the underlying dynamic system, imperfect and incomplete. As a result, traditional data-driven control techniques, such as adaptive dynamic programming (ADP) and robust ADP, struggle to approximate the optimal control policy from such imperfect data. A novel data-driven robust policy iteration method is proposed to solve these robust optimal control problems. Without relying on knowledge of the system dynamics, the external disturbances, or the complete state, the proposed method requires access only to the input and partial state information. Based on the small-gain theorem and the notions of strong unboundedness observability and input-to-output stability, it is guaranteed that the learned robust optimal control gain is stabilizing and that the solution of the closed-loop system is uniformly ultimately bounded despite the dynamic uncertainties and unknown external disturbances.
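The hybrid-iteration idea summarized above, value-iteration-style updates on the value matrix until a stabilizing gain emerges, followed by quadratically convergent policy-iteration (Kleinman) steps, can be sketched for a model-based continuous-time LQR problem. This is a minimal illustrative sketch only: the double-integrator system, step size, and tolerances below are assumptions for demonstration, not taken from the dissertation, and the dissertation's data-driven variants learn without the model matrices used here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Illustrative double-integrator plant (an assumption for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
Rinv = np.linalg.inv(R)

def is_stabilizing(K):
    """Check that A - B K is Hurwitz (all eigenvalues in the open left half-plane)."""
    return bool(np.all(np.linalg.eigvals(A - B @ K).real < 0))

# Phase 1 (VI-like): integrate the differential Riccati flow from P = 0
# until the induced gain K = R^{-1} B^T P is stabilizing. No admissible
# initial policy is needed, which is the advantage over pure PI.
P = np.zeros((2, 2))
eps = 0.01  # assumed integration step size
K = Rinv @ B.T @ P
for _ in range(10_000):
    K = Rinv @ B.T @ P
    if is_stabilizing(K):
        break
    P = P + eps * (A.T @ P + P @ A + Q - P @ B @ Rinv @ B.T @ P)

# Phase 2 (PI / Kleinman): Newton steps with quadratic convergence,
# which is the advantage over pure VI near the solution.
for _ in range(50):
    Ak = A - B @ K
    # Lyapunov equation: Ak^T P + P Ak + Q + K^T R K = 0
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    K_new = Rinv @ B.T @ P
    if np.linalg.norm(K_new - K) < 1e-10:
        K = K_new
        break
    K = K_new

# Compare against the algebraic Riccati equation solution.
P_star = solve_continuous_are(A, B, Q, R)
```

In this sketch the VI phase only needs to run long enough to produce a stabilizing gain, after which the Newton-type PI phase takes over; this two-phase structure is what trades away both PI's admissible-initial-policy requirement and VI's slow terminal convergence.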
Qasem, Omar Ghassan, "Improving the Learning Efficiency of Adaptive Optimal Control in Information-Limited Environments" (2023). Theses and Dissertations. 1238.
Available for download on Monday, May 06, 2024