Top-K high utility item set identification in big data by MUP-growth the evolutionary approach with less time constraints

  • Authors

    • Tejaswini K. Thorat
    • Amol D. Potgantwar
    https://doi.org/10.14419/ijet.v7i4.21693
  • High utility itemset mining is a prominent data mining technique for finding the itemsets with high utility in a transactional dataset. Because it supports many kinds of analysis, it has been adopted in diverse application domains, ranging from network data to medical records. Data is now generated in huge volumes from many sources, and various algorithms have been proposed to handle such data and to identify high utility itemsets. This research evaluates the MUP-Growth (Multithreaded Utility Pattern Growth) algorithm, which addresses the high utility itemset mining problem in the big data domain with low execution time. Information about high utility itemsets is maintained in a tree data structure known as the UP-Tree (Utility Pattern Tree). In this paper, we propose a new framework for mining top-k high utility itemsets (HUIs), where k is the desired number of HUIs to be mined. The performance of the proposed algorithm is measured on different datasets and compared with the previous approach. Experimental evaluation shows that the proposed algorithm outperforms the previous approach in terms of execution time. Finally, based on this research, the paper gives directions for future research on extending applications in the area of pattern mining by selecting a proper combination of these technologies.
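
    To make the top-k problem statement concrete, the following is a minimal Python sketch of what "top-k high utility itemsets" means. It is not MUP-Growth and does not build a UP-Tree: it enumerates every candidate itemset by brute force, which is only feasible for toy inputs. The data, the profit table, and the function names (`utility`, `top_k_huis`) are assumptions made for illustration. The utility of an itemset in a transaction is taken as the sum of quantity times unit profit over its items, counted only when the transaction contains every item of the itemset.

    ```python
    from itertools import combinations
    import heapq

    # Hypothetical toy data: each transaction maps item -> purchased quantity.
    TRANSACTIONS = [
        {"a": 1, "b": 2},
        {"a": 2, "c": 1},
        {"b": 1, "c": 3},
    ]
    # Hypothetical unit profit of each item.
    PROFIT = {"a": 5, "b": 2, "c": 1}


    def utility(itemset, transactions, profit):
        """Total utility of `itemset` over all transactions that contain it."""
        total = 0
        for t in transactions:
            if all(item in t for item in itemset):
                total += sum(t[item] * profit[item] for item in itemset)
        return total


    def top_k_huis(transactions, profit, k):
        """Return the k itemsets with the highest utility as (utility, itemset) pairs.

        Brute-force enumeration of all non-empty itemsets (exponential in the
        number of items); a sketch of the problem definition, not of the
        tree-based mining algorithm described in the paper.
        """
        items = sorted({item for t in transactions for item in t})
        scored = (
            (utility(candidate, transactions, profit), candidate)
            for size in range(1, len(items) + 1)
            for candidate in combinations(items, size)
        )
        # Keep only itemsets that actually occur, then pick the k best.
        return heapq.nlargest(k, (pair for pair in scored if pair[0] > 0))


    if __name__ == "__main__":
        for score, itemset in top_k_huis(TRANSACTIONS, PROFIT, k=2):
            print(itemset, score)
    ```

    Asking for the k best itemsets directly avoids having to guess a minimum utility threshold in advance, which is the usual motivation for the top-k formulation.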

  • How to Cite

    Thorat, T. K., & Potgantwar, A. D. (2018). Top-K high utility item set identification in big data by MUP-growth the evolutionary approach with less time constraints. International Journal of Engineering & Technology, 7(4), 3412-3417. https://doi.org/10.14419/ijet.v7i4.21693