Methods of information extraction from text
Author: Ermakova Liana Magdanovna
Journal: Bulletin of Perm University. Mathematics. Mechanics. Computer Science @vestnik-psu-mmi
Section: Computer Science. Information Systems
Article in issue: 1 (9), 2012.
Free access
This article reviews existing methods for extracting information from text data. Methods for extracting entities and relations are considered. Methods for automatically constructing ontologies from a corpus are described, along with ways to evaluate them. Particular attention is paid to techniques for extracting entities and relations from open domains, to the processing of named entities, and to the identification of facts localized in time.
Information retrieval, information extraction, entity, relation, named entities, temporal facts
Short URL: https://sciup.org/14729776
IDR: 14729776 | UDC: 025.4.03
Methods of information extraction from text
The article presents an overview of existing methods of information extraction from text data. Entity and relation extraction are considered. Techniques for automatic ontology construction from text corpora, as well as evaluation metrics, are studied. Special attention is paid to open-domain extraction, named entity recognition and temporal facts.
References for Methods of information extraction from text
- Weikum G. Knowledge Harvesting from Web Sources//RuSSIR/EDBT 2011. Saint Petersburg. 2011.
- Etzioni O., Banko M., Cafarella M.J. Machine Reading//Proceedings of AAAI. 2005.
- Banko M. Open Information Extraction for the Web. Washington: University of Washington. 2009.
- Banko M. et al. Open Information Extraction from the Web//Communications of the ACM - Surviving the data deluge, New York, 51, № 12. 2008.
- Wu F., Weld D.S. Open Information Extraction using Wikipedia//Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 2010. P. 118-127.
- Miller G.A., Beckwith R., Fellbaum C. Introduction to WordNet: An On-line Lexical Database. 1993.
- Suchanek F.M., Kasneci G., Weikum G. YAGO: A Core of Semantic Knowledge//WWW 2007/Track: Semantic Web. 2007.
- Wu F., Weld D.S. Automatically refining the Wikipedia infobox ontology//WWW. 2008.
- Milne D., Witten I.H. An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links//Proceedings of AAAI. 2008. P. 25-30.
- McHale M. A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity//Proceedings of COLING/ACL Workshop on Usage of WordNet in Natural Language Processing Systems. 1998. P. 115-120.
- Landauer T.K., Foltz P.W., Laham D. Introduction to Latent Semantic Analysis//Discourse Processes, № 25. 1998. P. 259-284.
- Gabrilovich E., Markovitch S. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis//Proceedings of the 20th International Joint Conference on Artificial Intelligence. 2007. P. 1606-1611.
- Niemann E., Gurevych I. The People's Web Meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet//International Conference on Computational Semantics. 2011.
- Richardson M., Domingos P. Markov Logic Networks//Machine Learning, MA, 62, № 1-2. 2006.
- Kundu G., Roth D., Samdani R. Constrained Conditional Models For Information Fusion//Proceedings of the 14th International Conference on Information Fusion, 5-8 July 2011. P. 1-8.
- Sutton C., McCallum A. An Introduction to Conditional Random Fields for Relational Learning//L. Getoor and B. Taskar, editors, Introduction to Statistical. 2006.
- Suchanek F.M., Sozio M., Weikum G. SOFIE: A Self-Organizing Framework for Information Extraction//WWW. 2009.
- Hearst M.A. Automatic acquisition of hyponyms from large text corpora//COLING '92 Proceedings of the 14th conference on Computational linguistics. 1992.
- Brin S. Extracting patterns and relations from the World-Wide Web//Proceedings of the 1998 International Workshop on Web and Databases. 1998.
- Agichtein E., Gravano L. Snowball: Extracting Relations from Large Plain-Text Collections//Proceedings of the 5th ACM International Conference on Digital Libraries (DL). 2000.
- Mintz M. et al. Distant supervision for relation extraction without labeled data//Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP. 2009.
- Ling X., Weld D.S. Temporal Information Extraction//AAAI. 2010.
- Dylla M., Sozio M., Theobald M. Resolving Temporal Conflicts in Inconsistent RDF Knowledge Bases. 2011.
- Ratinov L., Roth D. Design Challenges and Misconceptions in Named Entity Recognition//Proceedings of the Thirteenth Conference on Computational Natural Language Learning. 2009.
- Manning C. The Art of Loss Functions//Natural Language Processing Blog. 2006.
- Manning C. Doing Named Entity Recognition? Don't optimize for F1//Natural Language Processing Blog. 2006.