U-net-bin: hacking the document image binarization contest
Authors: Bezmaternykh Pavel Vladimirovich, Ilin Dmitrii Alexeevich, Nikolaev Dmitry Petrovich
Journal: Computer Optics
Section: Image processing, pattern recognition
Issue: Vol. 43, No. 5, 2019.
Free access
Image binarization remains a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track state-of-the-art techniques for historical document binarization. In this work we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method that uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and to examining the contest ground truth, and provide multiple insights into its construction (so-called hacking). This led to a more accurate statement of the historical document binarization problem with respect to the challenges one can face in open-access datasets. A Docker container with the final network, along with all the supplementary data used in the training process, has been published on GitHub.
Keywords: historical document processing, binarization, DIBCO, deep learning, U-Net architecture, training dataset augmentation, document analysis
Short URL: https://sciup.org/140246518
IDR: 140246518 | DOI: 10.18287/2412-6179-2019-43-5-825-832
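As a rough illustration of how a segmentation-style network such as the one summarized in the abstract can yield a binary image, the sketch below runs a per-tile predictor over a page and thresholds its per-pixel ink probabilities. Everything in it is an assumption made for illustration and is not taken from the paper: the tiling scheme, the 0.5 threshold, and the names `binarize_by_tiles`, `predict_tile`, and `dark_is_ink` are hypothetical, and `dark_is_ink` is a trivial placeholder, not a trained U-Net.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale rows, values in [0, 1]


def binarize_by_tiles(
    image: Image,
    predict_tile: Callable[[Image], Image],
    tile: int = 128,
    threshold: float = 0.5,
) -> List[List[int]]:
    """Run `predict_tile` on non-overlapping tiles and threshold the result.

    `predict_tile` stands in for a trained network (hypothetical): it takes
    a tile and returns per-pixel ink probabilities of the same shape.
    Output pixels are 0 for ink and 255 for background.
    """
    height, width = len(image), len(image[0])
    result = [[255] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Cut out one tile; tiles at the border may be smaller.
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            probs = predict_tile(patch)
            for dy, prob_row in enumerate(probs):
                for dx, p in enumerate(prob_row):
                    if p >= threshold:
                        result[top + dy][left + dx] = 0
    return result


def dark_is_ink(patch: Image) -> Image:
    """Placeholder 'model': darker pixels get a higher ink probability."""
    return [[1.0 - value for value in row] for row in patch]


page = [
    [0.9, 0.8, 0.1],
    [0.2, 0.95, 0.7],
]
binary = binarize_by_tiles(page, dark_is_ink, tile=2)
```

In practice `predict_tile` would wrap a forward pass of the trained network; the placeholder only makes the control flow runnable.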