D'Amicantonio, Giacomo
(2021)
Improvements to knowledge distillation of deep neural networks.
[Laurea magistrale], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]
Abstract
One of the main problems in the field of Artificial Intelligence is the efficiency of neural network models. In the past few years, it seemed that most tasks involving such models could simply be solved by designing larger, deeper models and training them on larger datasets for longer. This approach requires better-performing, and therefore more expensive and energy-consuming, hardware, and it will have an increasingly significant environmental impact as those models are deployed at scale.
In 2015, G. Hinton, J. Dean and O. Vinyals presented Knowledge Distillation (KD), a technique that leverages the logits produced by a big, cumbersome model to guide the training of a smaller model. The two networks are called "Teacher" and "Student", by analogy between the large model, which already holds a lot of knowledge, and the small model, which has yet to learn everything. They showed that it is possible to extract useful knowledge from the teacher's logits and use it to obtain a student that performs better than the same model trained on its own.
This thesis provides an overview of the current state of the art in Knowledge Distillation, analyses some of the most interesting approaches, and builds on them to exploit very confident logits more effectively. Furthermore, it provides experimental evidence of the importance of also using the smaller logit entries and of correcting the teacher's mistaken predictions during the distillation process.
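For context, the following is a minimal sketch of the standard distillation loss from Hinton et al. (2015) that the abstract refers to, written in PyTorch. The function name, the temperature T and the weight alpha are illustrative assumptions, not the specific improvements studied in this thesis.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Weighted sum of a softened KL term and standard cross-entropy."""
    # Soften both output distributions with temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between the softened distributions, scaled by T^2 as in the original paper.
    distill = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Ordinary cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce

# Illustrative usage with random tensors standing in for real model outputs.
if __name__ == "__main__":
    student_logits = torch.randn(8, 10)   # batch of 8, 10 classes
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    print(kd_loss(student_logits, teacher_logits, labels))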
Document type
Degree thesis
(Laurea magistrale)
Thesis author
D'Amicantonio, Giacomo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Knowledge Distillation, Neural Networks, Computer Vision, ResNet, Logits
Thesis defence date
8 October 2021
URI