Empowering On-Device Training: Leveraging Inference Accelerators for Enhanced Training Efficiency

Mateusz Piechocki; Alessandro Capotondi; Marek Kraft; Marko Bertogna

Scientific Information System of the Poznań University of Technology




Chapter


Title

Empowering On-Device Training: Leveraging Inference Accelerators for Enhanced Training Efficiency

Authors

Mateusz Piechocki (WARiE) [1] [2.2] [SzD]
Alessandro Capotondi
Marek Kraft (WARiE) [2] [2.2] [P]
Marko Bertogna

[1] Faculty of Control, Robotics and Electrical Engineering, Poznań University of Technology
[2] Institute of Robotics and Machine Intelligence, Faculty of Control, Robotics and Electrical Engineering, Poznań University of Technology
[SzD] doctoral school student
[P] employee

Scientific discipline (Law 2.0)

[2.2] Automation, electronics, electrical engineering and space technologies

Year of publication

2024

Chapter type

abstract

Publication language

English

Keywords

on-device training
hardware acceleration
AI accelerator
continual learning

Abstract

On-device training is essential in practical continual learning (CL) applications. However, current methods predominantly focus on memory optimization, whereas computing power and time budget are the primary concerns for most real-world systems. This study therefore uses a high-performance AI co-processor, developed primarily to speed up deep learning inference, to accelerate on-device training. In the experiments, the parameters of layers at various levels of the model architecture were frozen, transferred to an AI chip, and processed there, while the adaptive stage of the model, trained with backpropagation, was computed on the CPU. The proposed approach achieved up to 17× faster training than a CPU-only process, without performance degradation. The findings highlight the potential of advanced hardware accelerators in addressing the computational challenges of on-device CL, paving the way for more efficient and practical deployment in resource-constrained settings.
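
The split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the accelerator is simulated by an ordinary Python function, the frozen weights, the toy dataset, and all names (`FROZEN_WEIGHTS`, `frozen_forward`, `train_head`) are hypothetical, and computing the frozen features once before training is an assumption of this sketch rather than something the abstract states. The point it shows is only that frozen layers need a forward pass, while gradients are computed solely for the small trainable part.

```python
import math
import random

# Hypothetical frozen weights: four fixed ReLU units over a 2-D input.
# In the setting of the abstract, this part would run on the AI accelerator.
FROZEN_WEIGHTS = ((1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0))


def frozen_forward(x):
    """Forward-only pass through the frozen layers (no gradients needed)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def train_head(features, labels, lr=0.5, epochs=200):
    """Train a logistic-regression head with batch gradient descent (CPU side)."""
    n, n_feat = len(features), len(features[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # d(cross-entropy)/dz
            for i, fi in enumerate(f):
                grad_w[i] += err * fi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(f, w, b):
    """Classify one feature vector with the trained head."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


# Toy data (assumed for illustration): label is 1 when x0 + x1 > 0.
rng = random.Random(0)
inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
labels = [1 if x[0] + x[1] > 0 else 0 for x in inputs]

# Frozen features are computed once; only the head is updated afterwards.
features = [frozen_forward(x) for x in inputs]
w, b = train_head(features, labels)
accuracy = sum(predict(f, w, b) == y for f, y in zip(features, labels)) / len(labels)
print(f"training accuracy of the head: {accuracy:.2f}")
```

In this sketch the cost of each training epoch depends only on the size of the head, because the frozen part never appears inside the training loop. How much of a real model can be frozen and offloaded, and what speed-up follows, depends on the architecture and the hardware.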

Pages (from - to)

211 - 214

Book

20th International Summer School on Advanced Computer Architecture and Compilation for High-performance Embedded Systems, ACACES 2024 : Poster Abstracts

Presented on

20th International Summer School on Advanced Computer Architecture and Compilation for High-performance Embedded Systems (ACACES 2024), 17.07.2024, Fiuggi, Italy

