Maximal coordinate discrepancy as accuracy criterion of image projective normalization for optical recognition of documents
Authors: Konovalenko I.A., Kokhan V.V., Nikolaev D.P.
Section: Programming
Published in: vol. 13, no. 3, 2020.
Open access
Projective normalization (a special case of orthocorrection and perspective correction) is routinely applied to photographs of documents before their optical recognition. Inaccuracies in normalization can lead to recognition errors. A number of normalization accuracy criteria have been proposed in the literature, but their correspondence to recognition quality has not been investigated. In this paper, for the case of a document with a fixed structure, we justify a uniform probabilistic model of recognition errors, in which the probability of correctly recognizing a character drops abruptly to zero as the coordinate discrepancy of that character increases. For this model, we prove that the normalization accuracy criterion equal to the maximal coordinate discrepancy over the text fields of a document is monotonically related to the probability of correctly recognizing the entire document. We also show that the problem of computing the maximal coordinate discrepancy does not reduce to the closest known problem, namely the linear-fractional programming problem. Finally, we obtain, for the first time, an analytical solution to the problem of computing the maximal coordinate discrepancy on a union of polygons.
Keywords: orthocorrection, perspective correction, image projective normalization, optical character recognition, accuracy criteria, coordinate discrepancy, nonlinear programming
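To make the criterion in the abstract concrete, the following is a minimal numerical sketch, not the analytical solution the paper reports. It assumes that the coordinate discrepancy of a point is the Euclidean distance between its images under a reference projective transform and an estimated one, and it approximates the maximum over a union of polygons by sampling polygon edges (so it yields a lower bound on the true maximum, and it assumes the maximum lies on the boundary). The names `apply_homography`, `sample_boundary`, and `max_discrepancy` are hypothetical and chosen only for illustration.

```python
import numpy as np


def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 projective matrix H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    # Divide by the third homogeneous coordinate to return to the plane.
    return mapped[:, :2] / mapped[:, 2:3]


def sample_boundary(polygon, per_edge=200):
    """Sample points on every edge of a polygon given as (K, 2) vertices.

    Each edge contributes its start vertex plus evenly spaced interior points,
    so all vertices are included exactly once.
    """
    poly = np.asarray(polygon, dtype=float)
    nxt = np.roll(poly, -1, axis=0)
    t = np.linspace(0.0, 1.0, per_edge, endpoint=False)[:, None, None]
    pts = (1.0 - t) * poly[None] + t * nxt[None]
    return pts.reshape(-1, 2)


def max_discrepancy(H_ref, H_est, polygons, per_edge=200):
    """Estimate the maximal coordinate discrepancy over a union of polygons.

    Assumption: discrepancy is the Euclidean distance between the images of
    the same point under H_ref and H_est. Only sampled boundary points are
    checked, so the result is a lower bound on the true maximum.
    """
    best = 0.0
    for poly in polygons:
        pts = sample_boundary(poly, per_edge)
        diff = apply_homography(H_ref, pts) - apply_homography(H_est, pts)
        best = max(best, float(np.linalg.norm(diff, axis=1).max()))
    return best


# Illustrative text fields (made-up coordinates) and a slightly perturbed transform.
fields = [
    [(0, 0), (100, 0), (100, 20), (0, 20)],
    [(0, 40), (60, 40), (60, 60), (0, 60)],
]
H_ref = np.eye(3)
H_est = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5], [1e-4, 0.0, 1.0]])
print(max_discrepancy(H_ref, H_est, fields))
```

Under the step-like error model described in the abstract, comparing such a maximum against a tolerance would indicate whether every text field stays within the region where recognition can still succeed; the tolerance itself is not specified in this section.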
Short URL: https://sciup.org/147235021
IDR: 147235021 | UDC: 004.932.2 | DOI: 10.14529/mmp200304