Twitter Sentiment Analysis and Visualization Using Apache Spark and Elasticsearch

  • Authors

    • Maragatham G
    • Shobana Devi A
    2018-07-20
    https://doi.org/10.14419/ijet.v7i3.12.16049
  • Sentiment analysis of Twitter data has received increasing attention recently. The platform's key feature is immediate communication with other users in an easy, fast, and user-friendly way. Sentiment analysis is the process of identifying and classifying the opinions or sentiments expressed in source text. A huge volume of data is available on the web, and more is generated every second as web technology grows and advances. The Internet has become a leading platform for sharing opinions, exchanging ideas, and learning online. Social networking sites such as Facebook and Twitter have become popular places for people to share their views and pass on messages about topics around the world. Through tweets, status updates, and blog posts, social media produces a tremendous amount of opinion-rich data. This user-generated sentiment data is very useful for understanding the opinion of the general public. Compared with general sentiment analysis, Twitter sentiment analysis is much more difficult because of slang and misspellings. Twitter permits a maximum of 140 characters per message. The two approaches most commonly used for text analysis are the knowledge-based approach and the machine learning approach. In this paper, we analyze Twitter posts using the machine learning approach. Performing sentiment analysis in a specific domain makes it possible to identify the effect of domain information on sentiment classification. We classified the tweets as positive or negative and extracted different people's opinions about that specific domain. We also developed a novel method for sentiment learning using the Spark coreNLP framework.
Our method exploits the hashtags and emoticons inside a tweet as sentiment labels and then classifies diverse sentiment types in a parallel and distributed manner.
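
The idea of using hashtags and emoticons as sentiment labels can be illustrated with a minimal Python sketch. It is not the authors' implementation: the marker lists, the function names `distant_label` and `strip_markers`, and the majority-vote tie rule are all assumptions made for illustration, and the sketch runs on a single machine rather than on Spark.

```python
from typing import Optional

# Hypothetical seed markers; the paper does not list the ones it actually uses.
POSITIVE_MARKERS = {":)", ":-)", ":d", "#happy", "#love"}
NEGATIVE_MARKERS = {":(", ":-(", ":'(", "#sad", "#fail"}
ALL_MARKERS = POSITIVE_MARKERS | NEGATIVE_MARKERS


def distant_label(tweet: str) -> Optional[str]:
    """Assign a noisy sentiment label from emoticons and hashtags.

    Returns "positive" or "negative" when one kind of marker outnumbers
    the other, and None when the tweet has no markers or the counts tie.
    """
    tokens = tweet.lower().split()  # naive whitespace tokenization
    positive = sum(token in POSITIVE_MARKERS for token in tokens)
    negative = sum(token in NEGATIVE_MARKERS for token in tokens)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return None


def strip_markers(tweet: str) -> str:
    """Remove the label-bearing markers so a classifier cannot simply memorize them."""
    return " ".join(t for t in tweet.split() if t.lower() not in ALL_MARKERS)


if __name__ == "__main__":
    tweets = [
        "Loving the new phone :) #happy",
        "Flight delayed again :( #fail",
        "Just landed in Chennai",
    ]
    for tweet in tweets:
        print(distant_label(tweet), "|", strip_markers(tweet))
```

Because each tweet is labeled independently of the others, a function of this kind could in principle be applied to partitions of a tweet collection in parallel, which is consistent with the distributed setting the abstract describes.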

  • How to Cite

    G, M., & Devi A, S. (2018). Twitter Sentiment Analysis and Visualization Using Apache Spark and Elasticsearch. International Journal of Engineering & Technology, 7(3.12), 314-321. https://doi.org/10.14419/ijet.v7i3.12.16049