Algorithmic procedures of identifying advertisement texts in mass media discourse
Authors: Kamensky M., Bredikhin S.
Journal: Вестник Волгоградского государственного университета. Серия 2: Языкознание @jvolsu-linguistics
Section: Intercultural communication and comparative study of languages
Article in issue: No. 1, Vol. 24, 2025.
Free access
The article presents an algorithm for identifying advertisement blocks in mass media content and classifying a given text as either an advertisement or an informative text. The procedure is automated with the aid of intelligent semantic and syntactic analysis systems. The GATE corpus manager is used as the development environment, and the ANNIE Gazetteer, JAPE Transducer, and Java Regexp Annotator serve as the principal processing resources. The ANNIE Gazetteer enables automated identification of the lexical units most typical of advertisements, as well as various lexical and syntactic markers of advertisement content. The JAPE Transducer is used to build rules that identify an array of lexical and syntactic means of psychological influence. Lexical repetitions of proper nouns are identified with a regular expression for the Java Regexp Annotator processing resource. The list of tokens used as advertisement content markers is identified and described. It is noted that lexical and syntactic means of manipulative influence dominate in advertisement texts. The findings indicate a significant difference in the ratio of search results between advertisements and informative texts when advertisements are identified automatically by formal markers. This demonstrates the effectiveness of natural language processing systems in identifying messages with explicit and implicit advertisement content, determining the discursive type of media texts, and classifying them as either informative texts or advertisements.
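As a rough illustration of the two formal checks named in the abstract (a marker word list and a regular expression for repeated proper nouns), the sketch below uses Python's `re` module rather than GATE. The marker list, the pattern, the threshold, and all function names are assumptions made for illustration only; they are not the authors' gazetteer, JAPE grammar, or regular expression.

```python
import re

# Hypothetical marker list standing in for a gazetteer (not taken from the article).
AD_MARKERS = {"discount", "sale", "buy", "free", "offer", "exclusive"}

# A capitalized word that occurs again later in the text.
# Assumption: capitalization is used as a crude proxy for "proper noun",
# so repeated sentence-initial words would also be matched.
REPEATED_NAME = re.compile(r"\b([A-Z][a-z]+)\b(?=.*\b\1\b)", re.DOTALL)


def marker_hits(text):
    """Count tokens that appear in the hypothetical marker list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in AD_MARKERS)


def repeated_names(text):
    """Return capitalized words that occur at least twice, sorted."""
    return sorted(set(REPEATED_NAME.findall(text)))


def looks_like_ad(text, min_hits=2):
    """Flag a text when both formal signals are present (assumed rule)."""
    return marker_hits(text) >= min_hits and bool(repeated_names(text))
```

In this sketch a text is flagged only when marker words and a repeated capitalized name co-occur; the threshold `min_hits` is an arbitrary choice and would need tuning on real data.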
Automated analysis system, semantic-and-syntactic analyzer, mass media, corpus analysis, manipulative discourse, advertisement content, automated search algorithms
Short URL: https://sciup.org/149148703
IDR: 149148703 | DOI: 10.15688/jvolsu2.2025.1.6