A Hybrid Deep Learning Framework For Multimodal Rice Growth Stage Classification Using Image And IoT Sensor Fusion

Riki Ruli A. Siregar; Kudang Boro Seminar; Sri Wahjuni; Edi Santosa

Authors

Riki Ruli A. Siregar, School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia; Faculty of Energy Telematics, Institut Teknologi PLN, Jakarta, Indonesia.
Kudang Boro Seminar, Department of Engineering and Biosystems, IPB University, Bogor, Indonesia.
Sri Wahjuni, School of Data Science, Mathematics, and Informatics, IPB University, Bogor, Indonesia.
Edi Santosa, Department of Agronomy and Horticulture, Faculty of Agriculture, IPB University, Bogor, Indonesia.

Keywords:

Multimodal data fusion, Spatio-temporal analysis, CNN-LSTM, Rice phenology classification, IoT-based vertical farming

Abstract

IoT-based vertical farming is a promising way to increase rice production in limited urban spaces, but reliable, automated monitoring of crop phenology remains an open challenge. Previous studies have generally relied on unimodal data or simulated environments, which limits accuracy and discriminative power, especially for growth phases with similar visual characteristics. This research develops a multimodal deep learning framework for precise classification of rice growth phases in an operational IoT-based vertical farming system. The proposed method integrates RGB canopy images with temporal environmental sensor data (temperature, humidity, light intensity, soil moisture, and pH) through end-to-end spatio-temporal feature fusion: one of several CNN architectures (MobileNet, ResNet-50, VGG-19, or Xception) is combined with an LSTM branch before the classification stage. Evaluation was conducted on a real-world, expert-annotated dataset spanning eight rice growth phases, measured in days after planting. Experimental results show that the proposed model significantly outperforms unimodal and CNN-only approaches, achieving a macro-average F1 score of 0.96 with the VGG-19–LSTM variant and maintaining performance above 0.89 in the most challenging intermediate growth phases, where visual information alone is insufficient. In addition, the lightweight MobileNet–LSTM model maintains high accuracy while supporting real-time inference, making it a potential candidate for deployment on edge computing devices in operational vertical farming systems.
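The fusion scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN backbone is replaced by a simple global-average-pooling stand-in, the LSTM width (`HIDDEN`), the image and sequence sizes, and all weights are hypothetical, and the weights are random rather than trained. It only shows the data flow, in which an image feature vector and the final LSTM state of the sensor sequence are concatenated before a softmax classifier over eight growth phases.

```python
import numpy as np

NUM_CLASSES = 8   # eight growth phases, as stated in the abstract
NUM_SENSORS = 5   # temperature, humidity, light intensity, soil moisture, pH
HIDDEN = 16       # hypothetical LSTM width (assumption)
IMG_FEATS = 3     # hypothetical: one pooled feature per RGB channel


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def image_features(image):
    """Stand-in for a CNN backbone: global average pooling over height and width."""
    return image.mean(axis=(0, 1))


def lstm_last_state(seq, W, U, b):
    """Run a single-layer LSTM over seq of shape (T, NUM_SENSORS); return the final hidden state."""
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for x in seq:
        z = W @ x + U @ h + b                # stacked gate pre-activations, shape (4 * HIDDEN,)
        i, f, o, g = np.split(z, 4)          # input, forget, output gates and candidate
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g                    # cell state update
        h = o * np.tanh(c)                   # hidden state update
    return h


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def classify(image, seq, params):
    """Feature-level fusion: concatenate both branches, then apply a linear softmax classifier."""
    fused = np.concatenate([
        image_features(image),
        lstm_last_state(seq, params["W"], params["U"], params["b"]),
    ])
    return softmax(params["V"] @ fused + params["d"])


def init_params(rng):
    """Random, untrained weights, used only to make the sketch runnable."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * HIDDEN, NUM_SENSORS)),
        "U": rng.normal(0.0, 0.1, (4 * HIDDEN, HIDDEN)),
        "b": np.zeros(4 * HIDDEN),
        "V": rng.normal(0.0, 0.1, (NUM_CLASSES, IMG_FEATS + HIDDEN)),
        "d": np.zeros(NUM_CLASSES),
    }


rng = np.random.default_rng(0)
params = init_params(rng)
image = rng.random((32, 32, 3))        # hypothetical RGB canopy image
seq = rng.random((24, NUM_SENSORS))    # hypothetical 24-step sensor window
probs = classify(image, seq, params)   # probability per growth phase
```

In a real system the pooling stand-in would be replaced by a pretrained backbone such as MobileNet or VGG-19, and both branches would be trained jointly; the concatenation step is the part this sketch is meant to make concrete.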
