Curvature-Aware Stochastic Gradient Descent Algorithms for Non-Convex Landscape Navigation

Dr. I. Ambika; Dr. R. Udayakumar; Dr. C. Mahiba; Muhayyo Muminjonova; Fazliddin Temirov; Asal Kasimova

Authors

Dr. I. Ambika, Associate Professor, Department of Computer Science and Engineering, Jain University, Bengaluru, India.
Dr. R. Udayakumar, Professor & Director, Kalinga University, India.
Dr. C. Mahiba, Assistant Professor, Department of Computer Science and Engineering, Jain University, Bangalore, India.
Muhayyo Muminjonova, Senior Lecturer, Department of Primary Education Pedagogy, Jizzakh State Pedagogical University, Uzbekistan.
Fazliddin Temirov, Researcher, Samarkand State Medical University, Samarkand, Uzbekistan.
Asal Kasimova, Associate Professor, Department of International Public Law, Tashkent State Transport University, Tashkent, Uzbekistan.

Keywords:

Curvature-Aware SGD, Non-Convex Optimization, Adaptive Learning Rate, Hessian-Vector Approximation, Convergence Acceleration, Deep Learning, Optimization Algorithms

Abstract

Optimization algorithms in machine learning often struggle to navigate highly complex non-convex loss surfaces, where saddle points, sharp minima, and plateaus lead to convergence to poor solutions. This paper develops a new algorithm, Curvature-Aware Stochastic Gradient Descent (CA-SGD), which uses curvature estimates obtained from Hessian-vector product approximations to adapt the optimization step size to the geometry of the local landscape. The method balances computational tractability against geometry-aware updates, thereby improving efficiency. CA-SGD was tested on both synthetic and real-world benchmarks, including the Rosenbrock function and the MNIST dataset. The experimental results show that CA-SGD outperforms SGD, RMSProp, and Adam: it achieved the lowest Rosenbrock loss (0.82) and the highest MNIST accuracy (98.5%), and required the fewest iterations (650). These results suggest that CA-SGD can provide efficient solutions to high-dimensional optimization problems. Future work will apply CA-SGD to deep neural networks and meta-learning, and will investigate Hessian approximations.
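The abstract does not state the CA-SGD update rule, so the following is only a minimal sketch of the general idea it describes: estimate curvature along the gradient direction with a Hessian-vector product, then shrink the step size where that curvature is large. The finite-difference Hessian-vector product, the scaling rule base_lr / (1 + beta * |kappa|), and all function and parameter names (hvp, curvature_aware_step, base_lr, beta) are assumptions made for illustration, not the authors' method. The sketch uses the exact Rosenbrock gradient rather than a stochastic mini-batch gradient.

```python
import math


def rosenbrock(p, a=1.0, b=100.0):
    """Rosenbrock function f(x, y) = (a - x)^2 + b * (y - x^2)^2."""
    x, y = p
    return (a - x) ** 2 + b * (y - x * x) ** 2


def rosenbrock_grad(p, a=1.0, b=100.0):
    """Analytic gradient of the Rosenbrock function."""
    x, y = p
    return [-2.0 * (a - x) - 4.0 * b * x * (y - x * x),
            2.0 * b * (y - x * x)]


def hvp(grad_fn, p, v, eps=1e-5):
    """Approximate H(p) @ v by a central difference of gradients.

    This avoids forming the Hessian: it costs two gradient evaluations.
    """
    plus = grad_fn([pi + eps * vi for pi, vi in zip(p, v)])
    minus = grad_fn([pi - eps * vi for pi, vi in zip(p, v)])
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(plus, minus)]


def curvature_aware_step(grad_fn, p, base_lr=0.01, beta=1.0):
    """One gradient step with a curvature-scaled learning rate.

    Assumed rule (hypothetical, not from the paper):
        lr = base_lr / (1 + beta * |u^T H u|),  u = g / ||g||
    Returns the new point and the learning rate that was used.
    """
    g = grad_fn(p)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(p), base_lr  # stationary point: nothing to update
    u = [gi / norm for gi in g]
    # Directional curvature along the normalized gradient direction.
    kappa = sum(ui * hi for ui, hi in zip(u, hvp(grad_fn, p, u)))
    lr = base_lr / (1.0 + beta * abs(kappa))
    return [pi - lr * gi for pi, gi in zip(p, g)], lr
```

Calling curvature_aware_step(rosenbrock_grad, [-1.5, 2.0]) repeatedly, feeding each returned point back in, gives a simple optimization loop. Taking the absolute value of the curvature means the step also shrinks in strongly negative-curvature regions; that is one possible design choice among several.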
