Cantini, Giulia
(2020)

*FLATLAND: A study of Deep Reinforcement Learning methods applied to the vehicle rescheduling problem in a railway environment.*
[Master's thesis], Università di Bologna, Degree Programme in Informatica [LM-DM270]


## Abstract

Reinforcement Learning studies how agents should take sequences of actions in an environment in order to maximize a numerical reward signal. Combining this learning process with neural networks has given rise to Deep Reinforcement Learning (DRL), which is now applied in many domains, from video games to robotics and self-driving cars.
This work investigates DRL approaches applied to Flatland, a multi-agent railway simulation in which the main task is to plan and reschedule train routes so as to optimize traffic flow within the network. The tasks introduced in Flatland are based on the Vehicle Rescheduling Problem, for which determining an optimal solution is an NP-complete problem in combinatorial optimization, and for which determining acceptably good solutions using heuristics and deterministic methods is not feasible in realistic railway systems.
In particular, we analyze two tasks: the navigation of a single agent inside a map, in which the agent must travel from a starting position to a target station in the minimum number of time steps, and the generalization of this task to a multi-agent setting, which introduces the new issue of conflict avoidance and resolution between agents.
To solve the problem we developed specific observations of the environment that capture the information the network needs. The network is trained with Deep Q-Learning and its variants to learn the best action for each agent, that is, the action leading to the solution that maximizes the total reward.
The positive results obtained on small environments suggest several interpretations and possible future developments, and show that Reinforcement Learning has the potential to address the problem from a new perspective.
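
As a minimal illustration of the Q-learning update that Deep Q-Learning approximates with a neural network, the sketch below uses a lookup table instead of a network on a hypothetical one-dimensional track. The environment, the reward values, the hyperparameters, and the function names `train_q_table` and `greedy_policy` are assumptions made for illustration; they are not taken from the thesis or from the Flatland library.

```python
import random


def train_q_table(n_cells=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical one-dimensional track.

    States are cell indices 0..n_cells-1 and the target is the last cell.
    Actions: 0 = move left, 1 = move right. Each step that does not reach
    the target is rewarded with -1; reaching the target is rewarded with 0
    and ends the episode (assumed reward scheme, for illustration only).
    """
    rng = random.Random(seed)
    target = n_cells - 1
    q = [[0.0, 0.0] for _ in range(n_cells)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection (ties favour moving right).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition; the left wall blocks movement.
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == target
            reward = 0.0 if done else -1.0

            # Q-learning update: move Q(s, a) towards r + gamma * max_a' Q(s', a').
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the greedy action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_q_table()))
```

In a deep variant, the table `q` would be replaced by a network that maps an observation to one value per action, while the update target `r + gamma * max Q(s', a')` keeps the same form.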

Document type

Degree thesis
(Master's degree)

Thesis author

Cantini, Giulia

Thesis supervisor

Thesis co-supervisor

School

Degree programme

Track

Curriculum C: Systems and networks

Degree programme regulations

DM270

Keywords

reinforcement learning, multi agent reinforcement learning, deep learning, neural networks, deep q networks, vehicle rescheduling problem

Thesis defence date

19 March 2020

URI