On Authorship Attribution

Calarota, Gabriele (2021) On Authorship Attribution. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Computer Science [LM-DM270]
Available under license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 (CC BY-NC-ND 4.0)

Abstract

Authorship attribution is the process of identifying the author of a given text; from a machine learning perspective, it can be framed as a classification problem. The literature describes many classification methods, each of which depends on feature extraction techniques. In this thesis, we explore information retrieval techniques such as Doc2Vec, together with other feature selection and extraction techniques for a given text, in combination with different classifiers. The main purpose of this work is to lay the foundations of feature extraction techniques in authorship attribution. Finally, we compare our results with related work and show how we improved, to the best of our knowledge, the results on a particular dataset that is well known in this field.

Document type
Degree thesis (Laurea magistrale)
Thesis author
Calarota, Gabriele
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Curriculum
Curriculum A: Software Techniques (Tecniche del software)
Degree programme regulations
DM270
Keywords
authorship attribution, machine learning, SVM, Reuters corpus, GDELT, Amazon food reviews, TPOT, supervised learning, The Guardian newspaper
Thesis defense date
18 March 2021
URI
