Effective practices of using spatial models in document image classification

Free access

This paper presents a new approach to modelling the structure of document images for classification tasks. Each document image is treated as a realization of a stochastic point process, and estimates of the properties of that process are used to describe the document structure. The main objective is to determine the type of a new document using a nonparametric classification method. A method for classifying functional properties of point processes, based on the concept of statistical depth, is proposed. Practical issues of experimentation are considered. Modelling on real data showed the effectiveness of the proposed approach.
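The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the functional property is a naive Ripley's K estimate without edge correction, the depth is the L2 depth 1 / (1 + mean distance) on discretised curves, and the final rule is the simple maximum-depth rule on the DD-plot coordinates rather than the α-procedure named in the keywords. The point patterns are synthetic stand-ins for points extracted from document images, and all function names are hypothetical.

```python
import numpy as np

def k_function(points, radii, area=1.0):
    """Naive Ripley's K estimate (no edge correction) evaluated on a grid of radii."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_dist = dist[~np.eye(n, dtype=bool)]  # distances for ordered pairs i != j
    counts = (pair_dist[None, :] <= radii[:, None]).sum(axis=1)
    return area * counts / (n * (n - 1))

def l2_depth(curve, sample):
    """L2 depth of a discretised curve with respect to a sample of curves."""
    return 1.0 / (1.0 + np.linalg.norm(sample - curve, axis=1).mean())

def dd_point(curve, classes):
    """Depth of the curve with respect to each class: its DD-plot coordinates."""
    return {label: l2_depth(curve, sample) for label, sample in classes.items()}

def classify(curve, classes):
    """Maximum-depth rule: assign the class in which the curve lies deepest."""
    depths = dd_point(curve, classes)
    return max(depths, key=depths.get)

# Synthetic training data (assumption): two "document types" on the unit square.
rng = np.random.default_rng(0)
radii = np.linspace(0.02, 0.2, 10)
centres = np.array([[0.2, 0.2], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.8, 0.8]])

def uniform_pattern():
    # 50 points scattered uniformly at random
    return rng.random((50, 2))

def clustered_pattern():
    # 10 points tightly scattered around each of the 5 fixed centres
    return np.repeat(centres, 10, axis=0) + rng.normal(0.0, 0.02, size=(50, 2))

classes = {
    "uniform": np.array([k_function(uniform_pattern(), radii) for _ in range(20)]),
    "clustered": np.array([k_function(clustered_pattern(), radii) for _ in range(20)]),
}

# Classify one new pattern of each kind.
new_clustered = k_function(clustered_pattern(), radii)
new_uniform = k_function(uniform_pattern(), radii)
label_clustered = classify(new_clustered, classes)
label_uniform = classify(new_uniform, classes)
```

The maximum-depth rule corresponds to separating the DD-plot by its diagonal. A separator learned in the depth space, such as the α-procedure, would replace `classify` while leaving `dd_point` unchanged.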


Keywords: documents with flexible structure, classification, spatial point process, reproducible point patterns, depth, dd-plot, α-procedure

Short URL: https://sciup.org/147242590

IDR: 147242590   |   DOI: 10.14529/mmp230404

Scientific article