Carrino, Giuseppe
(2025)
Frugality in second-order optimization: floating-point approximations for Newton's Method.
[Laurea magistrale], Università di Bologna, Corso di Studio in Artificial intelligence [LM-DM270], restricted-access document.
Full text available: PDF document (Thesis), accessible only to institutional users of the University.
Available under licence: unless broader permissions are granted by the author, the thesis may be freely consulted, and a copy may be saved and printed strictly for personal study, research, and teaching purposes; any direct or indirect commercial use is expressly forbidden. All other rights to the material are reserved.
Abstract
Minimizing loss functions is at the core of Machine Learning training. While first-order methods dominate in practice, higher-order approaches such as Newton's method offer higher accuracy and faster convergence, but are often avoided because of their high computational cost. In this work, we analyze the propagation of finite-precision errors in Newton's step and propose a convergence theorem for mixed-precision Newton optimizers, including "Quasi" and "Inexact" variants. The theorem not only guarantees convergence for problems that are not too ill-conditioned, but also makes it possible to predict in advance the accuracy the optimizers can achieve. Experiments on logistic regression show that our method outperforms Adam on the Australian dataset while being less computationally expensive than the classical Newton method.
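The following minimal sketch illustrates the general idea of a mixed-precision Newton step on logistic regression; the particular precision split (Hessian assembled in float32, solve and update in float64), the damping term, and all function names are illustrative assumptions and do not reproduce the optimizer developed in the thesis.

```python
import numpy as np

def mixed_precision_newton_step(X, y, w, low=np.float32, high=np.float64, damping=1e-8):
    """One Newton step for logistic regression in which the Hessian is
    assembled in reduced precision and the linear solve is carried out in
    full precision (illustrative sketch, not the thesis implementation)."""
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-(X @ w)))          # sigmoid probabilities (high precision)
    g = X.T @ (p - y) / n                       # gradient of the logistic loss
    # Assemble the Hessian X^T diag(p*(1-p)) X / n in low precision.
    Xl = X.astype(low)
    s = (p * (1.0 - p)).astype(low)
    H = (Xl.T * s) @ Xl / low(n)
    # Promote back and solve the Newton system in high precision.
    H = H.astype(high) + damping * np.eye(d, dtype=high)
    step = np.linalg.solve(H, g.astype(high))
    return w - step
```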
Additionally, a generalization of the Gauss-Newton method, named GN_k, is presented; it formalizes the idea of using a subset of the second-order residual derivatives in the computation of the Gauss-Newton step, instead of discarding them all, to reduce computational cost without sacrificing accuracy. To solve the linear system arising in GN_k, a variant of the CGLS iterative solver, called CGLS_k, is introduced. Both the linear and non-linear solvers have been tested on a set of problems, confirming the soundness of their implementation and theoretical foundation.
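For a least-squares objective f(w) = ½‖r(w)‖², the exact Hessian is ∇²f = JᵀJ + Σᵢ rᵢ ∇²rᵢ; classical Gauss-Newton keeps only JᵀJ, while GN_k, as described above, retains a subset of the second-order terms. The sketch below shows that idea in its simplest form; the selection rule (largest |rᵢ|) and the function names are illustrative assumptions, not the thesis's definition.

```python
import numpy as np

def gn_k_hessian(J, r, residual_hessians, k):
    """Gauss-Newton-like Hessian approximation that keeps the exact
    second-order term of only k residuals. Classical Gauss-Newton is
    recovered with k = 0, the full Newton Hessian with k = len(r).
    The selection rule (largest |r_i|) is an illustrative assumption."""
    H = J.T @ J                              # first-order (Gauss-Newton) part
    keep = np.argsort(-np.abs(r))[:k]        # residuals whose curvature is kept
    for i in keep:
        H = H + r[i] * residual_hessians[i]  # add r_i * Hess(r_i) for the kept subset
    return H
```

A CGLS-style iterative solver would then be applied to the resulting linear system; the details of CGLS_k are specific to the thesis and are not reproduced here.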
Document type: Degree thesis (Laurea magistrale)
Thesis author: Carrino, Giuseppe
Thesis supervisor:
Thesis co-supervisor:
School:
Degree programme: Artificial intelligence [LM-DM270]
Degree programme regulations: DM270
Keywords: optimization, Newton's method, finite precision, Error Analysis, Logistic Regression, Machine Learning
Thesis defence date: 7 October 2025
URI: