A deep reinforcement learning approach based on policy gradient for mobile robot navigation

Pianazzi, Enrico (2022) A deep reinforcement learning approach based on policy gradient for mobile robot navigation. [Master's degree thesis], Università di Bologna, Degree programme in Automation engineering / ingegneria dell’automazione [LM-DM270], Full-text document not available
The full text is not available by the author's choice. (Contact the author)

Abstract

Reinforcement learning is a model-free technique for solving decision-making problems by learning the best behavior for a specific task in a given environment. This thesis focuses on state-of-the-art reinforcement learning methods and their application to mobile robot navigation and control. Our work is inspired by recent developments in deep reinforcement learning and by the ever-growing need for complex control and navigation capabilities in autonomous mobile robots. We propose a reinforcement learning controller based on an actor-critic approach to navigate a mobile robot in an initially unknown environment. The task is to navigate the robot from a random initial point on the map to a fixed goal point, while staying within the environment limits and avoiding obstacles along the path. The agent has no initial knowledge of the environment's characteristics, including the positions of the goal and the obstacles. The adopted algorithm is Deep Deterministic Policy Gradient (DDPG), which can handle continuous states and inputs because it uses neural networks in the actor-critic architecture and the policy gradient to update the neural network that represents the control policy. The learned controller directly outputs velocity commands to the robot, basing its decisions on the robot's position alone, without the need for additional sensory data. The robot is simulated as a unicycle kinematic model. We present a Python implementation of the learning algorithm and the robot simulation that solves the goal-reaching task while avoiding obstacles, with a success rate above 95%.
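
For readers unfamiliar with the unicycle kinematic model mentioned in the abstract, the standard textbook form takes a linear velocity v and an angular velocity ω as inputs and evolves the pose (x, y, θ) as ẋ = v cos θ, ẏ = v sin θ, θ̇ = ω. The following is a minimal sketch of one simulation step under stated assumptions, not the thesis implementation (whose full text is not available): the function name `unicycle_step`, the forward-Euler integration scheme, the time step `dt`, and the heading wrap-around are all illustrative choices.

```python
import math


def unicycle_step(x, y, theta, v, omega, dt=0.1):
    """Advance a unicycle pose by one forward-Euler step (illustrative sketch).

    x, y   : position in the plane
    theta  : heading angle in radians
    v      : linear velocity command
    omega  : angular velocity command
    dt     : integration time step (assumed value, not from the thesis)
    """
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    theta_next = theta + omega * dt
    # Wrap the heading to the interval [-pi, pi) to keep it bounded.
    theta_next = (theta_next + math.pi) % (2 * math.pi) - math.pi
    return x_next, y_next, theta_next


# Usage: drive straight along the x-axis for ten steps at unit speed.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = unicycle_step(*pose, v=1.0, omega=0.0)
```

In a setup like the one the abstract describes, the velocity pair (v, ω) would be the continuous action produced by the learned controller at each step; how the thesis discretizes time or bounds the commands is not stated in this record.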

Document type
Master's degree thesis (Laurea magistrale)
Thesis author
Pianazzi, Enrico
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
reinforcement learning, deep learning methods, autonomous navigation, mobile robot navigation, collision avoidance
Thesis defence date
21 March 2022
URI
