Self-Explaining Neural Architecture Algorithms Via Integrated Gradient Attribution

Authors

  • Dr. R. Arivukkodi, Assistant Professor, Department of Computer Science, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamil Nadu, India.
  • Nuthanakanti Bhaskar, Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, India.
  • Mohmad Ahmed Ali, Associate Professor, Department of CSE, CMR Institute of Technology, Hyderabad, Telangana, India.
  • M. Anitha, Assistant Professor, Department of Mathematics, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamil Nadu, India.
  • Yusufjanov Ulugbek Javlon Ugli, Turan International University, Namangan, Uzbekistan.
  • Dr. Rajvir Saini, Assistant Professor, Kalinga University, Naya Raipur, Chhattisgarh, India.

Keywords:

Integrated Gradients, Explainable AI, Self-Explaining Neural Networks, Feature Attribution, Neural Architecture Interpretability, Saliency Maps, Deep Learning Transparency.

Abstract

Explainability remains one of the major bottlenecks to deploying deep neural networks in critical use cases such as clinical diagnostics, self-driving cars, and judicial assistance. Although several techniques for computing post-hoc attributions exist, they operate independently of the network architecture and cannot guarantee that the resulting attributions reflect the network's actual reasoning process. In this work, we present the SEIG paradigm, a way to incorporate the computation of attributions directly into both the forward and backward passes of a neural architecture, allowing each network block to output its own certified explanation along with its prediction. Our method builds on the axiomatic framework of Integrated Gradients and adds two key components: layer-wise aggregation of attributions, and an architecture decision map showing the influence of individual neurons, attention heads, and convolutional kernels. On two popular benchmark tasks, ImageNet classification (ResNet-50) and IMDB sentiment analysis (BERT-base), our algorithm obtains a faithfulness score of 91.7% and a human alignment coefficient of 0.93, 9.1 percentage points above the state-of-the-art Integrated Gradients technique. We also show that the method yields an inference speedup of 1.4× compared to competing approaches and outputs human-interpretable attribution certificates that satisfy the complete set of Integrated Gradients axioms.
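
For readers unfamiliar with the baseline technique the abstract builds on, the following is a minimal sketch of standard Integrated Gradients and its completeness axiom (attributions sum to the difference between the model output at the input and at the baseline). It is not the SEIG method itself. The toy sigmoid model, the weights `w`, the all-zeros baseline, and the function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def model(x, w):
    """Toy differentiable model (hypothetical): sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def model_grad(x, w):
    """Analytic gradient of `model` with respect to the input x."""
    p = model(x, w)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients.

    Averages the input gradient along the straight path from `baseline`
    to `x`, then scales it by (x - baseline) feature by feature.
    """
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1] sub-intervals
    delta = x - baseline
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += model_grad(baseline + a * delta, w)
    avg_grad /= steps
    return delta * avg_grad

# Illustrative values only (assumptions, not data from the paper).
w = np.array([0.8, -1.5, 0.3])
x = np.array([1.0, 0.5, -2.0])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w)

# Completeness axiom: attributions should sum to f(x) - f(baseline),
# up to the numerical error of the Riemann approximation.
completeness_gap = abs(attributions.sum() - (model(x, w) - model(baseline, w)))
print("attributions:", attributions)
print("completeness gap:", completeness_gap)
```

A small completeness gap indicates that the path integral is well approximated; increasing `steps` reduces the gap at the cost of more gradient evaluations.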

Published

2026-06-01

How to Cite

Arivukkodi, D., Bhaskar, N., Ali, M. A., Anitha, M., Ugli, Y. U. J., & Saini, D. R. (2026). Self-Explaining Neural Architecture Algorithms Via Integrated Gradient Attribution. International Journal of Artificial Intelligence and Machine Learning, 6(4s), 642–649. Retrieved from https://www.svedbergopen.com/index.php/ijaiml/article/view/496