Artificial intelligence system for classifying complex structure documents
Authors: Butenko Ekaterina A., Zadorozhniy Alexandr M., Lyubovinkina Natalya J., Potemkina Snezhana V.
Journal: Online scientific publication "System Analysis in Science and Education" @journal-sanse
Article in issue: 1, 2023.
Free access
The paper presents a method for restoring the logical coherence of texts obtained by applying Optical Character Recognition (OCR) to scanned copies of business documents, with the aim of classifying those documents. The method has two stages. First, the areas of interest are preliminarily segmented by a deep convolutional neural network (CNN) with the YOLO architecture. The resulting layout information makes it possible to restore the logical coherence of the document text. Second, the same approach is applied to match each attribute name with its value for one common layout, in which the attributes are presented as two columns: a column of names and a column of values. The method successfully solves the tasks of document classification and key attribute extraction in the context of an electronic document management system.
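To make the second stage more concrete, the following is a minimal sketch of pairing names with values in a two-column layout. It is not the authors' implementation: the abstract describes a CNN-based approach, whereas this sketch substitutes a simple geometric heuristic (largest vertical overlap between boxes). The names `Box`, `vertical_overlap`, and `pair_columns` are hypothetical, and the sketch assumes an earlier step has already produced OCR text fragments with bounding boxes and split them into a name column and a value column.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Hypothetical OCR text fragment with its bounding box in pixels."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Length of the vertical extent shared by two boxes (0.0 if none)."""
    return max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))


def pair_columns(
    names: List[Box], values: List[Box]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each name with the value box that overlaps it most vertically.

    Assumption: names and values were already separated into two columns.
    A name with no overlapping value is paired with None.
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    # Process names top to bottom so the output follows reading order.
    for name in sorted(names, key=lambda b: b.y0):
        best = max(values, key=lambda v: vertical_overlap(name, v), default=None)
        if best is not None and vertical_overlap(name, best) > 0:
            pairs.append((name.text, best.text))
        else:
            pairs.append((name.text, None))
    return pairs


if __name__ == "__main__":
    # Illustrative, made-up fragments; boxes are given out of reading order.
    names = [Box("Date", 10, 50, 100, 70), Box("Invoice No.", 10, 20, 100, 40)]
    values = [Box("INV-001", 120, 22, 220, 42), Box("2023-01-15", 120, 51, 220, 69)]
    print(pair_columns(names, values))
```

A heuristic like this only works when rows are roughly horizontal and each value sits on the same line as its name; skewed scans or multi-line values would need the learned segmentation that the abstract describes.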
Artificial intelligence system, document segmentation, deep learning convolutional neural network, electronic document management
Short URL: https://sciup.org/14127898
IDR: 14127898