E-Chars74k: An Extended Scene Character Dataset with Augmentation Insights and Benchmarks
Authors: Payel Sengupta, Tauseef Khan, Ayatullah Faruk Mollah
Journal: International Journal of Image, Graphics and Signal Processing @ijigsp
Article in issue: 6, vol. 17, 2025.
Free access
Semantic understanding of camera-captured scene text images is an important problem in computer vision. Scene character recognition is the pivotal task in this problem, and deep learning is currently the most promising approach. However, the limited sample size of scene character datasets appears to be a major hindrance to training deep networks. In this paper, we present (i) various augmentation techniques for increasing the sample size of such datasets, along with associated insights, (ii) an extended version of the popular Chars74k dataset (herein referred to as E-Chars74k), and (iii) the benchmark performance on the developed E-Chars74k dataset. Experiments on various sets of data, such as digits, alphabets and their combination, covering both ordinary and in-the-wild scenarios, show a significant performance gain (a 20%-30% increase in scene character recognition accuracy). Notably, in all these experiments a deep convolutional neural network with two conv-pool pairs is trained on the same training-test partition, so that results can be compared on an equal footing.
Keywords: Scene Character Recognition, Deep Learning, CNN, Augmentation, Chars74k Dataset
Short URL: https://sciup.org/15020036
IDR: 15020036 | DOI: 10.5815/ijigsp.2025.06.08
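
The abstract does not list the specific augmentation techniques, so the following is only a minimal illustrative sketch of how a small character dataset could be enlarged. It assumes grayscale images stored as nested lists of 0..255 pixel values and uses two generic transforms, small translations and additive Gaussian noise. The function names (`shift_image`, `add_noise`, `augment`) and the parameter defaults are hypothetical and are not taken from the paper.

```python
import random


def shift_image(img, dx, dy, fill=0):
    """Translate a 2D grayscale image by (dx, dy) pixels, padding with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            # Pixels shifted outside the frame are dropped.
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clamp each pixel to 0..255."""
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


def augment(img, copies, max_shift=2, sigma=8.0, seed=0):
    """Return `copies` randomly shifted and noised variants of `img`.

    The transform set and the default ranges are assumptions made for
    illustration; they are not the settings used for E-Chars74k.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    samples = []
    for _ in range(copies):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        samples.append(add_noise(shift_image(img, dx, dy), sigma, rng))
    return samples
```

Calling `augment(img, 5)` on one character image yields five label-preserving variants of the same size. Mirror flips are deliberately left out of this sketch, since flipping can turn one character into another (for example "b" and "d").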