Development of Reinforcement Learning Algorithms for Non-cooperative Target Localization and Tracking

Bertozzi, Enrico (2020) Development of Reinforcement Learning Algorithms for Non-cooperative Target Localization and Tracking. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree programme in Electronics and Telecommunications Engineering for Energy [LM-DM270] - Cesena, full-text document not available

The full text is not available, by the author's choice.

Abstract

This thesis addresses the problem of using a swarm of agents to find the placement that gives the best localization performance for a target node in a wireless sensor network scenario. Localization is based simply on the received signal strength indicator (RSSI) and trilateration. The accuracy of the localization process is measured with the geometric dilution of precision (GDOP). Trilateration is performed by three mobile anchors, which in this work are assumed to be drones. The anchors are free to move in an environment represented by a grid, and each drone occupies one grid cell. Five actions are allowed: each agent can move one cell north, south, east, or west, where possible, or remain in its current position. Localization is performed on a target node positioned arbitrarily in the environment. Each time the drones move, they receive a reward that depends on the estimated distance from the target and on the GDOP. This lets the drones determine whether the action taken in a particular cell was a good choice. Three algorithms are proposed and implemented. The first, called 'multi-agent Q-learning', is used in small gridworlds. Each action executable in a cell is assigned a value, called a q-value, that indicates how useful that action is for reaching the final goal. The tested scenarios include environments both with and without obstacles. The second is a deep reinforcement learning approach that extends the problem to larger environments: using neural networks, an 'actor-critic' algorithm is implemented in which the action is sampled from a probability distribution. Finally, the two algorithms are combined in a hybrid technique that allows trilateration to be performed even on mobile targets.
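The abstract does not state how the GDOP is computed, so the following is only a minimal sketch under stated assumptions: a 2D grid, range-based measurements, and the common definition GDOP = sqrt(trace((H^T H)^-1)), where each row of H is the unit vector from the target to one anchor. The function name `gdop_2d` and the use of the true target position (rather than an RSSI-based estimate) are illustrative choices, not taken from the thesis.

```python
import math

def gdop_2d(anchors, target):
    """Hypothetical helper: GDOP of a 2D range-based fix.

    anchors: iterable of (x, y) anchor positions (e.g. grid cells).
    target:  (x, y) position of the node being localized.
    """
    # Accumulate the symmetric 2x2 matrix H^T H = [[a, b], [b, d]],
    # where each row of H is the unit vector from target to an anchor.
    a = b = d = 0.0
    for x, y in anchors:
        dx, dy = x - target[0], y - target[1]
        r = math.hypot(dx, dy)
        if r == 0.0:
            raise ValueError("anchor coincides with target")
        ux, uy = dx / r, dy / r
        a += ux * ux
        b += ux * uy
        d += uy * uy
    det = a * d - b * b
    if det <= 1e-12:
        # Anchors are collinear with the target: geometry is degenerate.
        return math.inf
    # For a 2x2 matrix, trace(inverse) = (a + d) / det.
    return math.sqrt((a + d) / det)

# Three anchors spread evenly (120 degrees apart) around the target.
angles = (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)
spread = [(math.cos(t), math.sin(t)) for t in angles]
good = gdop_2d(spread, (0.0, 0.0))

# Three anchors lined up on one side of the target.
bad = gdop_2d([(1, 0), (2, 0), (3, 0)], (0, 0))
```

Under these assumptions, evenly spread anchors give a low GDOP while collinear anchors give an unbounded one, which is consistent with a reward that pushes the drones toward a well-spread formation around the target.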

Document type

Degree thesis (Laurea magistrale)

Thesis author

Bertozzi, Enrico

Thesis supervisor

Giorgetti, Andrea

Thesis co-supervisor

Testi, Enrico

School

Engineering and Architecture

Degree programme