Disease Detection of Vegetables using Ensemble Machine Learning Classifier and Deep Learning with the Aid of Feature Words

Trisha Sarkar; Sadia Hossain; Imdadul Islam

doi:10.5815/ijem.2026.03.25

Journal: International Journal of Engineering and Manufacturing @ijem

Issue: No. 3, Vol. 16, 2026.

Free access

This paper presents a comparative study of vegetable disease detection. Four machine learning classifiers (kernel support vector machine, random forest, decision tree, and eXtreme gradient boosting) are combined into a stronger ensemble classifier and compared against two deep learning classifiers, a convolutional neural network (CNN) and long short-term memory (LSTM), to decide whether the deep learning classifiers individually work better than the ensemble when features are extracted with the bag of features method. Vegetable diseases are the main reason for the global crisis in agricultural production: they damage food quality and reduce production. These diseases must be detected, but doing so manually is a challenging task, and various algorithms can be used to identify them instead. Recently, deep learning has demonstrated notable success in precision agriculture for identifying vegetable diseases. In this paper, vegetable diseases are detected using three techniques: the Ensemble Machine Learning Classifier (kernel support vector machine, random forest, decision tree, and eXtreme gradient boosting), CNN, and LSTM. Using the Bag of Features feature extractor, we extract 500 feature words from each vegetable dataset. The ensemble classifier and the two deep learning algorithms then classify the samples into healthy and disease-affected categories, and a confusion matrix is generated for each. The performance metrics (precision, recall, F1-score, and accuracy) are computed from the confusion matrix. Soft voting over the individual machine learning classifiers gives the ensemble accuracy for each dataset. Finally, the ensemble method is compared with the two deep learning algorithms by accuracy.
For the cauliflower dataset, the Ensemble Machine Learning Classifier achieves an accuracy of 83%, CNN 94.51%, and LSTM 92.95%. For the potato dataset, the ensemble method achieves 89%, CNN 89.47%, and LSTM 84.35%.
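The soft-voting and confusion-matrix steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The probability values, the label encoding (0 = healthy, 1 = diseased), and the function names soft_vote, confusion_matrix, and metrics_from_cm are assumptions made for illustration only. Each of four hypothetical classifiers is assumed to output class probabilities per sample; soft voting averages them and picks the most probable class.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> int:
    """Average the class probabilities of several classifiers and
    return the index of the class with the highest mean probability.
    Ties resolve to the lowest class index."""
    n_classes = len(prob_sets[0])
    avg = [sum(p[c] for p in prob_sets) / len(prob_sets) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def metrics_from_cm(cm: List[List[int]], positive: int = 1) -> Dict[str, float]:
    """Precision, recall, and F1 for the `positive` class, plus overall accuracy."""
    n = len(cm)
    tp = cm[positive][positive]
    fp = sum(cm[r][positive] for r in range(n) if r != positive)
    fn = sum(cm[positive][c] for c in range(n) if c != positive)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": correct / total}


# Made-up probabilities [p_healthy, p_diseased] from four hypothetical
# classifiers (e.g. SVM, random forest, decision tree, XGBoost) for five
# samples. These numbers are illustrative, not results from the paper.
sample_probs = [
    [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4], [0.3, 0.7]],
    [[0.9, 0.1], [0.7, 0.3], [0.6, 0.4], [0.8, 0.2]],
    [[0.6, 0.4], [0.7, 0.3], [0.4, 0.6], [0.55, 0.45]],
    [[0.8, 0.2], [0.3, 0.7], [0.9, 0.1], [0.6, 0.4]],
    [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.45, 0.55]],
]
y_true = [1, 0, 1, 0, 1]  # assumed ground-truth labels

y_pred = [soft_vote(probs) for probs in sample_probs]
cm = confusion_matrix(y_true, y_pred, n_classes=2)
scores = metrics_from_cm(cm, positive=1)

print("predictions:", y_pred)
print("confusion matrix:", cm)
print("metrics:", scores)
```

In the fourth sample one classifier votes "diseased" with probability 0.7, yet the averaged probability still favours "healthy". This is the behaviour that distinguishes soft voting from taking any single classifier's decision.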

Convolutional Neural Network, Long Short-Term Memory, Accuracy of Detection, Soft Voting, Confusion Matrix

Short URL: https://sciup.org/15020508

IDR: 15020508 | DOI: 10.5815/ijem.2026.03.25