Classification of symbols in shorthand documents: basic, superscript and subscript

Free access

When decoding historical shorthand documents, the relative position of a symbol influences its meaning. We distinguish three positions: basic, superscript, and subscript. The article compares two algorithms for classifying symbols by position, one based on a single approximation method and the other on a double approximation method. The parameters of each algorithm are chosen experimentally on a validation set so that the summarized F-measure is maximized. The validation set is created automatically by identifying lines and then determining the type of each symbol. Performance is measured in terms of accuracy, precision, recall, F-measure, and summarized F-measure. Judged by the summarized F-measure, the double approximation algorithm achieves the best result.
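The evaluation measures named above can be illustrated with a minimal sketch. It computes accuracy and per-class precision, recall, and F-measure for the three position classes. The abstract does not define the summarized F-measure, so the function `summarized_f_measure` below is a hypothetical stand-in that assumes it is the mean of the per-class F-measures; all function names and the sample labels are illustrative, not taken from the article.

```python
from collections import Counter

# The three symbol positions distinguished in the abstract.
CLASSES = ("basic", "superscript", "subscript")


def accuracy(true_labels, predicted_labels):
    """Fraction of symbols whose predicted position matches the true one."""
    pairs = list(zip(true_labels, predicted_labels))
    if not pairs:
        raise ValueError("no labels to evaluate")
    return sum(truth == guess for truth, guess in pairs) / len(pairs)


def per_class_scores(true_labels, predicted_labels, classes=CLASSES):
    """Return {class: (precision, recall, f_measure)} for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = {}
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[c] = (precision, recall, f_measure)
    return scores


def summarized_f_measure(scores):
    """ASSUMPTION: mean of per-class F-measures; the source gives no formula."""
    return sum(f for _, _, f in scores.values()) / len(scores)


# Illustrative labels only, not data from the article.
truth = ["basic", "basic", "superscript", "subscript"]
guess = ["basic", "superscript", "superscript", "subscript"]
scores = per_class_scores(truth, guess)
print(accuracy(truth, guess), summarized_f_measure(scores))
```

Under this assumed definition, parameter tuning would amount to evaluating `summarized_f_measure` on the validation set for each candidate parameter setting and keeping the setting with the highest value.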


Shorthand document, symbol classification algorithm, superscript and subscript symbols, approximation method

Short URL: https://sciup.org/14750753

IDR: 14750753

Research article