Sub Linear Gradient Estimation Algorithms For Training Massive Scale Sparse Models

V. Sujitha; Dr. T.V. Ambuli; Dr. Baskaran Kuppusamy; Utkal Khandelwal; Dr. K. Vidhya; Ganesa Murthy A

Authors

V. Sujitha, Assistant Professor/CSE(CS), New Prince Shri Bhavani College of Engineering and Technology, Chennai, India.
Dr. T.V. Ambuli, Associate Professor & Head, Department of Commerce, Faculty of Science and Humanities, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India.
Dr. Baskaran Kuppusamy, Scientist, Central Research Laboratory, Meenakshi Medical College Hospital & Research Institute, Meenakshi Academy of Higher Education and Research, Chennai, India.
Utkal Khandelwal, Institute of Business Management, GLA University, Mathura, India.
Dr. K. Vidhya, Professor, Civil Engineering, Mahendra Engineering College, Namakkal, India.
Ganesa Murthy A, Librarian, Library and Information Science, Vels Institute of Science, Technology and Advanced Studies (VISTAS), Pallavaram, Chennai, Tamil Nadu, India.

Keywords:

Sub-linear gradient estimation, Sparse model training, Carbon-aware optimization, Communication efficiency, Decentralized learning.

Abstract

Training massive-scale sparse models on decentralized platforms faces three main difficulties: computational burden, limited communication bandwidth, and high energy consumption. Classical methods such as gradient descent require many passes over the dataset and exchange large numbers of parameters, with costs that scale linearly or super-linearly in the size of the model. This increases the carbon footprint of such distributed computations. To address this challenge, this paper presents a new sub-linear gradient estimation approach for training massive-scale sparse models in energy-aware edge networks. Experiments were conducted in a distributed simulation setup, using real-life edge IoT performance datasets to monitor training accuracy and energy efficiency. The results indicate that the sub-linear approach reduces average communication costs by 42.6% and cumulative carbon emissions by 38.4% relative to full-gradient optimization methods. Importantly, the approach achieves these efficiency gains while maintaining high classification performance, with only a marginal reduction of 0.75% in model accuracy. This study shows that sub-linear approaches can be adopted in pursuit of carbon-neutral AI training across massive, resource-constrained network architectures.
