Pratiksha, Pratiksha
(2023)
Text anonymization.
[Master's degree thesis], Università di Bologna, Degree Programme in
Artificial intelligence [LM-DM270], Full-text document not available
The full text is not available by the author's choice.
Abstract
In today's data-driven world, data privacy and protection have become paramount. This thesis explores data anonymization and privacy preservation, with a particular focus on the development of a user-centric text anonymization system. As data and information sharing proliferate, safeguarding sensitive information is not just a best practice but a necessity.
The research begins by examining traditional anonymization techniques and then turns to Differential Privacy (DP), a framework that provides provable privacy guarantees for machine learning algorithms. DP is a central component of this thesis, so its core principle is summarized here.
Differential Privacy operates on the principle that the inclusion or exclusion of any single data point should not substantially change the results derived from a dataset. It achieves this by adding carefully calibrated noise during data analysis. In effect, DP provides a mathematical layer of privacy protection that makes it exceedingly difficult for an adversary to determine whether a specific individual's data is part of the dataset.
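The principle above is usually stated formally as follows: a randomized mechanism M is ε-differentially private if, for any two datasets D and D' that differ in one record and any set of outputs S, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S]. As an illustration only, the sketch below applies the standard Laplace mechanism to a counting query. The abstract does not say which mechanism the thesis uses, so the choice of the Laplace mechanism, the function names `laplace_noise` and `dp_count`, and the parameter `epsilon` are all assumptions made for this example.

```python
import random
from typing import Callable, Iterable, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_count(
    records: Iterable[dict],
    predicate: Callable[[dict], bool],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a noisy count of records matching `predicate`.

    Hypothetical helper for illustration. Adding or removing one
    record changes a count by at most 1, so the sensitivity is 1 and
    the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Made-up toy data, not taken from the thesis.
    people = [{"age": 30}, {"age": 45}, {"age": 52}, {"age": 19}]
    # Smaller epsilon means more noise and stronger privacy.
    print(dp_count(people, lambda p: p["age"] >= 30, epsilon=0.5))
```

A smaller `epsilon` widens the noise distribution, which strengthens the privacy guarantee at the cost of accuracy. This trade-off is one reason a system might let users pick the level of protection per scenario.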
The implemented text anonymization system actively involves users in the decision-making process. Users specify where privacy-enhancing techniques should be applied and confirm detected sensitive data patterns. The system also lets users choose between classic anonymization techniques and the more advanced DP, since different data scenarios may call for different levels of privacy protection.
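The user-in-the-loop idea described above can be sketched as a detect, confirm, replace pipeline. The example below is a minimal illustration under stated assumptions, not the thesis implementation. The keywords mention Presidio, but this sketch deliberately uses only simple standard-library regular expressions as a stand-in detector. The patterns, the `Finding` type, and the `detect` and `anonymize` functions are hypothetical names introduced here.

```python
import re
from typing import Callable, List, NamedTuple


class Finding(NamedTuple):
    """A detected span of possibly sensitive text (hypothetical type)."""
    entity_type: str
    start: int
    end: int
    text: str


# Deliberately simple, assumed patterns; a real detector would be
# far more thorough.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def detect(text: str) -> List[Finding]:
    """Return all pattern matches, ordered by position in the text."""
    findings = [
        Finding(label, m.start(), m.end(), m.group())
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)


def anonymize(text: str, confirm: Callable[[Finding], bool]) -> str:
    """Replace each finding the user confirms with a type placeholder.

    `confirm` stands in for the interactive step: it receives a
    finding and returns True if that span should be anonymized.
    """
    pieces = []
    cursor = 0
    for finding in detect(text):
        if finding.start < cursor:
            continue  # skip spans overlapping one already replaced
        if not confirm(finding):
            continue  # user rejected this detection; keep the text
        pieces.append(text[cursor:finding.start])
        pieces.append(f"<{finding.entity_type}>")
        cursor = finding.end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    sample = "Contact alice@example.com or +39 051 2099999."
    # Confirm everything:
    print(anonymize(sample, lambda f: True))
    # Confirm only e-mail addresses:
    print(anonymize(sample, lambda f: f.entity_type == "EMAIL"))
```

Passing the confirmation step as a callback keeps detection separate from the user's decision, so the same pipeline could be driven by a command-line prompt, a graphical interface, or a fixed policy.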
Document type
Degree thesis
(Master's degree)
Thesis author
Pratiksha, Pratiksha
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
text anonymization, data anonymization, presidio, differential privacy, structured data anonymization
Thesis defence date
21 October 2023
URI