A Reinforcement Learning strategy for Satellite Attitude Control

Fiocchi, Leonardo (2021) A Reinforcement Learning strategy for Satellite Attitude Control. [Master's degree thesis], Università di Bologna, Degree Programme in Automation engineering / ingegneria dell’automazione [LM-DM270], full-text document not available
The full text is not available, by the author's choice. (Contact the author)


In recent years, space missions for both scientific and commercial purposes have increased substantially. A growing number of spacecraft have flexible multibody structures and are subject to liquid volume changes, fuel consumption, and other effects that alter the parameters of the spacecraft's model. Moreover, varying disturbances such as the gravity-gradient torque due to Earth's gravitational field, aerodynamic torque, and others may have unwanted effects on the satellite's dynamics. These uncertainties in the model and in the description of the environment make it difficult to set up an exact mathematical model, which in turn makes attitude control tasks even more difficult. For these reasons, data-driven approaches have been introduced to alleviate the shortcomings of model-based control methodologies, or even to overcome them completely. This thesis addresses these issues by introducing and implementing a data-driven approach that bridges optimal control and reinforcement learning building blocks. The method is developed for discrete-time, time-varying linear systems. Its distinguishing feature is that no prior knowledge of the system parameters is required. Although no a priori information is used, the results show that the algorithm converges to the optimal control law that a controller with full and precise knowledge of the system would apply. While the algorithm and the learning process use a quaternion representation, a custom animation of the algorithm and system results in Euler angles is provided to better evaluate the performance of the solution.
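The abstract mentions that learning is done with quaternions while results are displayed as Euler angles. As a rough illustration of that conversion step only, here is a minimal sketch in Python. The function name `quaternion_to_euler`, the scalar-first component ordering (w, x, y, z), and the aerospace ZYX (yaw-pitch-roll) sequence are all assumptions made for this example; the record does not state which conventions the thesis uses.

```python
import math


def quaternion_to_euler(w, x, y, z):
    """Convert a quaternion to (roll, pitch, yaw) in radians.

    Assumptions (not taken from the thesis): scalar-first ordering
    (w, x, y, z) and the ZYX yaw-pitch-roll rotation sequence.
    """
    # Normalise defensively so small numerical drift does not distort angles.
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-norm quaternion has no attitude interpretation")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm

    # Roll: rotation about the body x-axis.
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))

    # Pitch: rotation about the body y-axis. Clamp the sine to [-1, 1]
    # so rounding errors near +/-90 degrees cannot break asin.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)

    # Yaw: rotation about the body z-axis.
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))

    return roll, pitch, yaw
```

For example, `quaternion_to_euler(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))` describes a pure 90-degree yaw under these assumed conventions. Euler angles are easier to read in an animation, but they become singular at a pitch of plus or minus 90 degrees, which is one common reason to keep the control and learning in quaternions.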

Document type
Degree thesis (Master's degree)
Thesis author
Fiocchi, Leonardo
Thesis supervisor
Thesis co-supervisor
Degree programme
Degree programme regulations
Keywords
reinforcement learning, approximate dynamic programming, satellite attitude control, policy iteration, cubesat, optimal control
Thesis defence date
28 May 2021
