
Design, Implementation and Evaluation of Parallel Solutions for a Nested Explainability Algorithm

Cortesi, Gabriel (2023) Design, Implementation and Evaluation of Parallel Solutions for a Nested Explainability Algorithm. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Computer Engineering (Ingegneria informatica) [LM-DM270]


Full-text documents available:

PDF document (Thesis)
Available under license: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
Download (4MB)

Abstract

In Machine Learning and Data Science, the need for performance keeps growing as workloads become more complex. Parallelization over multiple cores and machines (clusters) is often employed to significantly improve performance. This work specifically considers the explainability algorithm GLEAMS (Global & Local ExplainAbility of black-box Models through Space partitioning) and the poor performance of its sequential Python implementation. GLEAMS is a post-hoc, model-agnostic explainability technique that gives a global understanding of the original model by recursively partitioning the input space into non-overlapping cells, each featuring a local linear approximation of the black-box model. The purpose of this work is the analysis, development, implementation and testing of a parallel distributed solution for the sequential GLEAMS explainability algorithm. The algorithm poses interesting parallelization challenges, such as a recursive binary tree and nested parallelism. The nested nature of the parallelism is especially relevant because of the complexity it introduces and the poor support that existing Python frameworks and solutions offer for it. Multiple solutions were designed and implemented; this thesis describes the steps taken in their development, justifies the choices made, explains how they work, illustrates their differences and extensively analyses their performance. In particular, this work proposes an asyncio-based approach, in combination with the Ray framework, as a practical solution to many of the limitations encountered with the current state of nested parallelism support in Python. Additionally, some theoretical and more general approaches and solutions inspired by other languages are proposed and discussed.
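The shape of the problem described in the abstract (a recursively built binary tree whose nodes each contain further parallel work) can be illustrated with a small sketch. This is not the thesis implementation: it uses only the standard-library asyncio module with worker threads instead of Ray, works on a one-dimensional input space, and every name in it (`Cell`, `fit_line`, `build`, `leaves`, the tolerance and depth parameters) is a hypothetical choice made for illustration. Threads here only show the structure of the nesting; under the GIL they do not give real CPU parallelism.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Cell:
    """One cell of the partition with its local linear approximation (hypothetical)."""
    lo: float
    hi: float
    slope: float
    intercept: float
    left: Optional["Cell"] = None
    right: Optional["Cell"] = None


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


async def build(f: Callable[[float], float], lo: float, hi: float,
                tol: float = 1e-2, depth: int = 0,
                max_depth: int = 8, n: int = 16) -> Cell:
    """Recursively partition [lo, hi] until a line fits f within tol."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    # Inner level of parallelism: query the black-box model at all
    # sample points concurrently, each call in a worker thread.
    ys = await asyncio.gather(*(asyncio.to_thread(f, x) for x in xs))
    slope, intercept = fit_line(xs, ys)
    err = max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))
    cell = Cell(lo, hi, slope, intercept)
    if err > tol and depth < max_depth:
        mid = (lo + hi) / 2
        # Outer level of parallelism: both subtrees are built concurrently,
        # and each of them launches its own inner tasks (the nesting).
        cell.left, cell.right = await asyncio.gather(
            build(f, lo, mid, tol, depth + 1, max_depth, n),
            build(f, mid, hi, tol, depth + 1, max_depth, n),
        )
    return cell


def leaves(cell: Cell) -> List[Cell]:
    """Return the leaf cells from left to right."""
    if cell.left is None or cell.right is None:
        return [cell]
    return leaves(cell.left) + leaves(cell.right)


if __name__ == "__main__":
    tree = asyncio.run(build(lambda x: x * x, 0.0, 1.0))
    for c in leaves(tree):
        print(f"[{c.lo:.4f}, {c.hi:.4f}]  y ~ {c.slope:.4f} * x + {c.intercept:.4f}")
```

Because coroutines suspend at each `await` instead of blocking a worker, a parent node never occupies a thread while it waits for its children, which is one reason an asyncio-style design suits nested task trees. In a Ray-based setting the thread calls would presumably be replaced by remote tasks, but that mapping is an assumption and is not taken from the thesis.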


Document type

Degree thesis (Master's degree, Laurea magistrale)

Thesis author

Cortesi, Gabriel

Thesis supervisor

Loreti, Daniela

Thesis co-supervisor

Visani, Giorgio

School

Engineering and Architecture (Ingegneria e Architettura)

Degree programme