Hash based Approach for Mining Frequent Item Sets from Transactional Databases

 
 
 
  • Abstract


    Frequent itemset mining has become a popular way to extract hidden patterns from transactional databases. Among the several approaches, the Apriori algorithm is the basic one and follows a candidate generate-and-test strategy. Although it is an efficient level-wise approach, it has two limitations: (i) several passes over the database are required to check the support of candidate itemsets, and (ii) it tends to produce many candidate itemsets and is sensitive to variations in the minimum threshold. A novel approach is proposed to tackle these limitations: one-pass Hash-based Frequent Itemset Mining (HFIM) for deriving frequent patterns. HFIM maintains candidate itemsets dynamically, independent of the minimum threshold, which limits the number of scans over the database to one. In this paper, HFIM is compared with Apriori on standard datasets. The results show that HFIM outperforms Apriori on large databases.
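
    The abstract does not give HFIM's internals, so the following is only a minimal sketch of the general idea it describes: counting itemsets in a hash table during a single scan, so that any minimum threshold can be applied afterwards without rescanning. The function names `build_itemset_table` and `frequent_itemsets`, the `max_size` cap, and the toy transactions are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def build_itemset_table(transactions, max_size=3):
    """Scan the database once and count every itemset of up to max_size items.

    max_size is a hypothetical cap added here to bound the number of
    subsets enumerated per transaction; it is not taken from the paper.
    """
    table = Counter()  # hash table: itemset (tuple) -> support count
    for transaction in transactions:
        # Deduplicate and sort so the same itemset always yields the same key.
        items = sorted(set(transaction))
        for k in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, k):
                table[itemset] += 1
    return table


def frequent_itemsets(table, min_support):
    """Filter stored counts by an absolute support threshold (no rescan)."""
    return {itemset: count for itemset, count in table.items()
            if count >= min_support}


# Toy data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

table = build_itemset_table(transactions)

# The same table answers queries for different thresholds.
print(frequent_itemsets(table, min_support=3))
print(frequent_itemsets(table, min_support=2))
```

    Because the counts are stored independently of the threshold, changing `min_support` only re-filters the table. The trade-off in this sketch is memory: enumerating subsets grows quickly with transaction length, which is why the assumed `max_size` cap is there.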

     


  • Keywords


    Frequent Itemset Mining, Apriori Algorithm, minimum threshold.



 


Article ID: 19214
 
DOI: 10.14419/ijet.v7i3.34.19214




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.