Deep Reinforcement Learning and sub-problem decomposition using Hierarchical Architectures in partially observable environments

Sovrano, Francesco (2018) Deep Reinforcement Learning and sub-problem decomposition using Hierarchical Architectures in partially observable environments. [Master's degree thesis (Laurea magistrale)], Università di Bologna, Degree programme in Computer Science (Corso di Studio in Informatica) [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.


Reinforcement Learning (RL) is based on the Markov Decision Process (MDP) framework, but not all problems of interest can be modeled as MDPs, because some of them have non-Markovian temporal dependencies. One of the solutions proposed in the literature to handle such problems is Hierarchical Reinforcement Learning (HRL). HRL takes inspiration from hierarchical planning in the artificial intelligence literature and is an emerging sub-discipline of RL, in which RL methods are augmented with prior knowledge about the high-level structure of behavior in order to decompose the underlying problem into simpler sub-problems. The high-level goal of this thesis is to investigate the advantages that an HRL approach may have over a simple RL approach. To this end, we study problems of interest that are rarely tackled by means of RL, namely Sentiment Analysis, Rogue and Car Controller, and show how the ability of RL algorithms to solve them in a partially observable environment is affected by the use, or not, of generic hierarchical architectures based on RL algorithms of the Actor-Critic family. In particular, we claim that our work on Sentiment Analysis is highly innovative for RL and results in state-of-the-art performance; as far as the author knows, RL approaches have only rarely been applied to the domains of computational linguistics and sentiment analysis. Furthermore, our work on the well-known video game Rogue is probably the first example of a Deep RL architecture able to explore Rogue dungeons and fight its monsters, achieving a success rate of more than 75% on the first game level. Finally, our work on Car Controller allowed us to make some interesting considerations on the nature of some components of the policy gradient equation.

Document type
Degree thesis (Laurea magistrale, master's degree)
Thesis author
Sovrano, Francesco
Thesis supervisor
Degree programme
Degree programme regulations (Ordinamento CdS)
Keywords
Deep Reinforcement Learning, Hierarchical Reinforcement Learning, Partially Observable Markov Decision Process, Sentiment Analysis, Car Controller, Rogue
Thesis defence date
17 October 2018
