Deep Convolutional Neural Networks with Attention Mechanisms for Multi-Scale Feature Extraction in Complex Image Classification Tasks
Keywords:
Deep learning, CNN, Attention Mechanism, Multi-Scale Feature Extraction, Image Classification, CBAM, ResNet, Confusion Matrix.
Abstract
Complex image classification has become an urgent research problem in computer vision, driven by the growing need for reliable visual recognition in medical imaging, autonomous systems, intelligent surveillance, and industrial inspection. Conventional convolutional neural networks (CNNs) are effective feature learners, but they are often limited in their ability to capture discriminative multi-scale spatial features and fine-grained contextual features in complex image data. These limitations can reduce classification robustness, especially under variation in object size, texture, illumination, and background complexity. To address these issues, this study proposes a deep convolutional neural network that combines channel attention with multi-scale feature extraction to improve image classification performance. The proposed architecture uses a ResNet50-based backbone with a channel attention module and a multi-scale feature fusion block to dynamically emphasize informative features and suppress redundant feature responses. The model was trained and tested under a standardized protocol on benchmark image classification datasets such as CIFAR-10 and CIFAR-100. Experimental results show that the proposed framework achieves higher classification accuracy, precision, recall, and F1-score than conventional CNN backbones, including VGG16, DenseNet121, and a baseline ResNet. Confusion matrix analysis further confirms improved per-class prediction and lower misclassification rates.
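As an illustration of how the reported metrics relate to a confusion matrix, the following minimal sketch derives accuracy and per-class precision, recall, and F1-score from a matrix of counts. The function name `per_class_metrics` and the 3-class matrix are hypothetical and chosen only for illustration; they are not results from this study.

```python
def per_class_metrics(cm):
    """Return (accuracy, [(precision, recall, f1), ...]) for a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return accuracy, metrics


# Made-up 3-class confusion matrix (illustrative counts, not experimental data).
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

accuracy, metrics = per_class_metrics(cm)
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)

print(f"accuracy = {accuracy:.4f}, macro F1 = {macro_f1:.4f}")
for k, (p, r, f1) in enumerate(metrics):
    print(f"class {k}: precision = {p:.2f}, recall = {r:.2f}, F1 = {f1:.2f}")
```

Off-diagonal entries show which class pairs are confused with each other, which is why a per-class reading of the matrix can reveal changes in misclassification that a single accuracy figure hides.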
The main contribution of this study is the successful combination of attention-guided feature refinement with hierarchical multi-scale representation learning, which yields more discriminative feature extraction and more robust classification in challenging visual recognition tasks.
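The combination of channel attention and multi-scale fusion described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation used in this study: the channel attention follows the CBAM formulation (a shared bottleneck MLP applied to average-pooled and max-pooled channel descriptors), the multi-scale block simply pools at several scales and concatenates, and the ordering (fusion first, then attention), the scales, the reduction ratio, and the random weights are all hypothetical.

```python
import numpy as np


def multi_scale_fusion(x, scales=(1, 2, 4)):
    """Average-pool x (C, H, W) at each scale, upsample back, concatenate channels.

    Assumes H and W are divisible by every entry of `scales`.
    """
    c, h, w = x.shape
    branches = []
    for s in scales:
        # Non-overlapping s x s average pooling via reshape.
        pooled = x.reshape(c, h // s, s, w // s, s).mean(axis=(2, 4))
        # Nearest-neighbour upsampling back to (C, H, W).
        branches.append(pooled.repeat(s, axis=1).repeat(s, axis=2))
    return np.concatenate(branches, axis=0)


def channel_attention(x, w1, w2):
    """CBAM-style channel attention for a feature map x of shape (C, H, W).

    w1 (C // r, C) and w2 (C, C // r) are the weights of the shared MLP.
    Returns the reweighted feature map and the per-channel gate.
    """
    avg_desc = x.mean(axis=(1, 2))  # global average pooling, shape (C,)
    max_desc = x.max(axis=(1, 2))   # global max pooling, shape (C,)

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # bottleneck with ReLU

    logits = shared_mlp(avg_desc) + shared_mlp(max_desc)
    gate = 1.0 / (1.0 + np.exp(-logits))     # sigmoid, one weight per channel
    return x * gate[:, None, None], gate


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))  # stand-in for a backbone feature map

fused = multi_scale_fusion(features)         # 3 scales x 8 channels = 24 channels
reduction = 4                                # hypothetical reduction ratio r
w1 = 0.1 * rng.standard_normal((fused.shape[0] // reduction, fused.shape[0]))
w2 = 0.1 * rng.standard_normal((fused.shape[0], fused.shape[0] // reduction))

refined, gate = channel_attention(fused, w1, w2)
print(fused.shape, refined.shape, gate.shape)
```

In a trained network the MLP weights would be learned, so the gate would scale down channels that contribute little to the classification loss. With the random weights used here, the example only demonstrates the data flow and tensor shapes.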




