A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives

We consider the problem of recognizing a string object that appears in several frames of a video stream. To maximize output accuracy, we combine the recognition results obtained from the individual frames. To this end, we introduce a model of a string recognition result that takes into account the estimates of alternative results of per-character classification, and we propose an algorithm that combines string recognition results according to this model. The algorithm was evaluated on the MIDV-500 dataset of document images. The experimental results show that the proposed algorithm achieves high recognition accuracy by analyzing several images, and that using the estimates of alternative per-character classification results yields higher accuracy than combining strings that contain only the final alternative for each character.
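
The abstract does not spell out the combination rule, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes that each frame yields a list of per-character cells, where a cell maps candidate characters to membership estimates that sum to 1. The names `Cell`, `EMPTY`, `cell_dist`, `merge`, `combine`, `combine_all` and `decode`, the half-L1 distance between cells, and the running-average weighting are all assumptions made for this example.

```python
from typing import Dict, List

Cell = Dict[str, float]      # candidate character -> membership estimate
EMPTY: Cell = {"": 1.0}      # hypothetical "no character" cell used for gaps


def cell_dist(a: Cell, b: Cell) -> float:
    """Half the L1 distance between two cells (assumed metric, in [0, 1])."""
    keys = set(a) | set(b)
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


def merge(a: Cell, wa: float, b: Cell, wb: float) -> Cell:
    """Weighted average of the estimates in two cells."""
    keys = set(a) | set(b)
    return {k: (wa * a.get(k, 0.0) + wb * b.get(k, 0.0)) / (wa + wb) for k in keys}


def combine(acc: List[Cell], n_seen: int, new: List[Cell]) -> List[Cell]:
    """Align `new` to the accumulated result `acc` (built from `n_seen`
    frames) with an edit-distance style dynamic program, then average
    the aligned cells. Unmatched cells are paired with EMPTY."""
    n, m = len(acc), len(new)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + cell_dist(acc[i - 1], EMPTY)
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + cell_dist(EMPTY, new[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + cell_dist(acc[i - 1], new[j - 1]),
                cost[i - 1][j] + cell_dist(acc[i - 1], EMPTY),
                cost[i][j - 1] + cell_dist(EMPTY, new[j - 1]),
            )
    # Backtrack along one optimal path, merging cells as we go.
    out: List[Cell] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + cell_dist(acc[i - 1], new[j - 1]):
            out.append(merge(acc[i - 1], n_seen, new[j - 1], 1))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + cell_dist(acc[i - 1], EMPTY):
            out.append(merge(acc[i - 1], n_seen, EMPTY, 1))
            i -= 1
        else:
            out.append(merge(EMPTY, n_seen, new[j - 1], 1))
            j -= 1
    out.reverse()
    return out


def combine_all(frames: List[List[Cell]]) -> List[Cell]:
    """Fold the per-frame results into one combined result."""
    acc = frames[0]
    for k, frame in enumerate(frames[1:], start=1):
        acc = combine(acc, k, frame)
    return acc


def decode(cells: List[Cell]) -> str:
    """Take the top candidate of each cell; the empty candidate adds nothing."""
    return "".join(max(c, key=c.get) for c in cells)


# Toy data (invented for illustration): the second character is "O",
# but two of the three frames rank "0" slightly higher.
frames = [
    [{"A": 0.9, "4": 0.1}, {"0": 0.55, "O": 0.45}],
    [{"A": 0.8, "4": 0.2}, {"0": 0.51, "O": 0.49}],
    [{"A": 0.7, "4": 0.3}, {"O": 0.90, "0": 0.10}],
]
print(decode(combine_all(frames)))  # prints "AO"
```

In this toy input, a majority vote over the top candidates alone would give "A0", because two frames narrowly prefer "0". Averaging the full per-character estimates lets the one confident frame outweigh the two uncertain ones, which is the kind of effect the abstract attributes to using per-character alternatives.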

Keywords: recognition in video stream, mobile OCR, recognition algorithms

Short URL: https://sciup.org/147232962

IDR: 147232962   |   DOI: 10.14529/mmp190307

Research article