A novel approach: big data analysis based on multi-view data visualization using clustering similarity measure

  • Abstract

    In big data, data visualization is a notable concept for representing high-dimensional data so that it can be analyzed effectively. Data visualization has three main properties: (i) it characterizes data without losing data patterns; (ii) it allows attributes to be changed without altering the data patterns; and (iii) it covers both structured and unstructured data attributes for data examination. Various types of data visualization exist to support data analysis, including topic-based, attribute-based, audio-based, and text-based data visualization on different data sets. Parallel coordinates are an efficient and effective data visualization tool for analyzing multi-attribute, high-dimensional data. The tool is based on 5Ws sending and receiving density visualization, and it reads data patterns and attributes while reducing the overlap between data patterns. The parallel measure is a labeling property that characterizes the relationships among objects in a data set across different pairs of attributes. To improve the parallel coordinate tool so that it supports multi-attribute object relations, we propose and implement a novel method, the Similarity Measure Centered with Multi Viewpoint (SMCMV) approach, together with related clustering approaches to represent data. Using multiple viewpoints, we obtain an assessment-based similarity index for data visualization and present a theoretical analysis based on multi-attribute presentation. Our experimental results on real-time document evaluation show that the proposed similarity measure gives the best data representation in visualization among the well-known clustering approaches compared.
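
    The abstract does not define the SMCMV measure, so the following is only a minimal sketch of the general multi-viewpoint idea, under stated assumptions. It assumes one common formulation in which the similarity of two objects is the average dot product of their difference vectors taken from several reference points (viewpoints), rather than from the origin alone as in cosine similarity. The function name `multi_viewpoint_similarity` and the choice of viewpoints are hypothetical and are not taken from the paper.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_viewpoint_similarity(
    d_i: Vector, d_j: Vector, viewpoints: Sequence[Vector]
) -> float:
    """Average of (d_i - d_h) . (d_j - d_h) over all viewpoints d_h.

    Assumption: this mirrors a commonly used multi-viewpoint similarity,
    not necessarily the SMCMV measure of the paper. With the origin as
    the only viewpoint it reduces to the ordinary dot product.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint is required")
    total = 0.0
    for d_h in viewpoints:
        # Look at both objects from the viewpoint d_h.
        diff_i = [x - h for x, h in zip(d_i, d_h)]
        diff_j = [x - h for x, h in zip(d_j, d_h)]
        total += dot(diff_i, diff_j)
    return total / len(viewpoints)


if __name__ == "__main__":
    a = [1.0, 0.0]
    b = [1.0, 0.0]
    # Single viewpoint at the origin: same as the dot product.
    print(multi_viewpoint_similarity(a, b, [[0.0, 0.0]]))
    # Two viewpoints: the score now depends on where the viewpoints lie.
    print(multi_viewpoint_similarity(a, b, [[0.0, 0.0], [0.0, 1.0]]))
```

    In a clustering setting, the viewpoints would typically be objects assumed to lie outside the cluster that contains the two objects being compared; that choice is left to the caller in this sketch.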



  • Keywords

    Data Visualization; Parallel Co-Ordinate; Multivariate Attributes; Clustering Methods; Similarity Measure; Multi Viewpoint.





Article ID: 19458
DOI: 10.14419/ijet.v7i4.19458

Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.