Real-Time exercise recognition and repetition counting from monocular gym video

Cridlig, Nicolas Ivan (2026) Real-Time exercise recognition and repetition counting from monocular gym video. [Master's thesis], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)


Abstract

This thesis presents the design, development, and evaluation of a real-time system that classifies gym exercises and counts repetitions from a single monocular camera, without wearable sensors or dedicated hardware beyond a compute backend. While pose estimation, activity recognition, and repetition counting have each been studied independently, no existing system integrates all three into a real-time pipeline that operates on commodity security cameras under realistic gym conditions. The pipeline chains three models. MediaPipe extracts 2D body landmarks from monocular RGB video, a Temporal Convolutional Network (TCN) classifies exercises across 16 categories from joint-angle time series, and RepNet counts repetitions in a class-agnostic manner. We compare five classification approaches, from scikit-learn baselines to a custom TCN. The TCN achieves 95.82% test accuracy and, unlike the baselines, produces temporally stable predictions suitable for real-time display. The complete pipeline processes each frame in 82 ms on a consumer laptop and is deployed as a containerized web application. In end-to-end evaluation on 81 ground-truth exercise sets, the system detects 72.8% of exercises (F1 = 83.7%), correctly classifies 88.1% of detected events, and counts repetitions with a mean absolute error of 2.1. A dual-camera experiment reveals that viewpoint is the primary limiting factor, with detection dropping from 70% to 40% when the camera moves from an elevated angle to a frontal position. The system currently processes one user at a time, with identity detection and multi-person tracking left as future work.
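To make the "joint-angle time series" input concrete, the sketch below shows one common way to turn 2D landmarks into an angle per frame: take three landmarks (for example shoulder, elbow, wrist) and compute the angle at the middle one. This is an illustrative assumption, not the thesis's actual feature extraction; the function names, the landmark names, and the dictionary-per-frame layout are all hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at vertex b, formed by 2D points a-b-c.

    Each point is an (x, y) pair, e.g. normalized image coordinates
    of a pose landmark. Returns NaN if either limb has zero length.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        return float("nan")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))


def angle_series(frames, triplet):
    """Build one joint-angle time series from a sequence of frames.

    frames:  list of dicts mapping a (hypothetical) landmark name to (x, y).
    triplet: three landmark names; the angle is measured at the middle one.
    """
    first, vertex, last = triplet
    return [joint_angle(f[first], f[vertex], f[last]) for f in frames]


# Two synthetic frames: a bent elbow, then a fully extended arm.
frames = [
    {"shoulder": (0.0, 1.0), "elbow": (0.0, 0.0), "wrist": (1.0, 0.0)},
    {"shoulder": (-1.0, 0.0), "elbow": (0.0, 0.0), "wrist": (1.0, 0.0)},
]
elbow_angles = angle_series(frames, ("shoulder", "elbow", "wrist"))
```

Stacking several such series (elbows, knees, hips, and so on) gives a multichannel signal over time, which is the kind of input a temporal convolutional classifier consumes. Angles are also insensitive to the subject's position and scale in the image, which is one plausible reason to prefer them over raw coordinates.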

Document type
Master's thesis
Thesis author
Cridlig, Nicolas Ivan
Degree programme regulations
DM270
Keywords
exercise recognition, repetition counting, pose estimation, temporal convolutional network, real-time inference, monocular video
Thesis defence date
26 March 2026