A hybrid approach to generating adjective polarity lexicon and its application to Turkish sentiment analysis

Author: Rahim Dehkharghani

Journal: International Journal of Modern Education and Computer Science @ijmecs

Published in: Vol. 10, No. 11, 2018.

Free access

Many approaches to sentiment analysis benefit from polarity lexicons. Existing methods for building such lexicons fall into two categories: (1) lexicon-based approaches, which use lexical resources such as dictionaries and WordNet, and (2) corpus-based approaches, which use a large corpus to extract semantic relations among words. Adjectives play an important role in polarity lexicons because they are better polarity estimators than other parts of speech. Turkish, like other non-English languages, suffers from a shortage of polarity resources. This work proposes a hybrid approach to building an adjective polarity lexicon that combines lexicon-based and corpus-based methods, and evaluates it on Turkish. The resulting accuracies in classifying adjectives as positive, negative, or neutral range from 71% to 91%.


Sentiment analysis, Polarity Lexicons, Adjectives

Short URL: https://sciup.org/15016806

IDR: 15016806   |   DOI: 10.5815/ijmecs.2018.11.02

Research article