Frequent item set mining using normalized FP-growth algorithm
Keywords:Item Setmining, FP Growth Algorithm, Association Rule Mining.
As the volume of data and its storage schemes are increasing drastically, the knowledge discovery from these huge volume of heterogeneous and high dimension data emerges as an essential process. Number of algorithms for data association analysis has been introduced considering time and main memory requirements. However this algorithms get completed when the items and records grows extremely high. In this paper we have created a data structure that can be reused without modifying the schema. So the aim of this work is to make an efficient association rule mining independent of the algorithm being selected.
Our data structure make data access much faster by simplifying and reorganizing them by implementing shuffling strategy using hamming distance and inverted index mapping. In this work we explore our algorithmâ€™s efficiency by using many datasets containing millions of records in it. We increased the runtime orders of the magnitude and reduced the auxiliary and main memory requirements.
 R. Agrawal and R. Srikant, â€œFast Algorithms for Mining Association Rules in Large Databases,â€ Proc 20th Intâ€™l Conf. Very Large Data Bases (VLDB), pp. 487-499, 1994.
 R. Agrawal and R. Srikant, â€œPrivacy-Preserving Data Mining,â€ Proc. ACM SIGMOD Conf., pp. 439-450, 2000. https://doi.org/10.1145/342009.335438.
 D. Beaver, S. Micali, and P. Rogaway, â€œThe Round Complexity of Secure Protocols,â€ Proc. 22nd Ann. ACM Symp. Theory of Computing (STOC), pp. 503-513, 1990. https://doi.org/10.1145/100216.100287.
 M. Bellare, R. Canetti, and H. Krawczyk, â€œKeying Hash Functions for Message Authentication,â€ Proc. 16th Ann. Intâ€™l Cryptology Conf. Advances in Cryptology (Crypto), pp. 1-15, 1996. https://doi.org/10.1007/3-540-68697-5_1.
 J.C. Benaloh, â€œSecret Sharing Homomorphisms: Keeping Shares of a Secret Secret,â€ Proc. Advances in Cryptology (Crypto), pp. 251-260, 1986.
 J. Brickell and V. Shmatikov, â€œPrivacy-Preserving Graph Algorithms in the Semi-Honest Model,â€ Proc. 11th Intâ€™l Conf. Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 236-252, 2005. https://doi.org/10.1007/11593447_13.
 D.W.L. Cheung, J. Han, V.T.Y. Ng, A.W.C. Fu, and Y. Fu, â€œA Fast Distributed Algorithm for Mining Association Rules,â€ Proc. Fourth Intâ€™l Conf. Parallel and Distributed Information Systems (PDIS), pp. 31-42, 1996. https://doi.org/10.1109/PDIS.1996.568665.
 D.W.L Cheung, V.T.Y. Ng, A.W.C. Fu, and Y. Fu, â€œEfficient Mining of Association Rules in Distributed Databases,â€ IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, Dec. 1996. https://doi.org/10.1109/69.553158.
 T. ElGamal, â€œA Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms,â€ IEEE Trans. Information Theory, vol. IT-31, no. 4, July 1985. https://doi.org/10.1007/3-540-39568-7_2.
 A.V. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke, â€œPrivacy Preserving Mining of Association Rules,â€ Proc. Eighth ACM SIGKDD Intâ€™l Conf. Knowledge Discovery and Data Mining (KDD), pp. 217-228, 2002. https://doi.org/10.1145/775047.775080.
 R. Fagin, M. Naor, and P. Winkler, â€œComparing Information without Leaking It,â€ Comm. ACM, vol. 39, pp. 77-85, 1996. https://doi.org/10.1145/229459.229469.
 M. Freedman, Y. Ishai, B. Pinkas, and O. Reingold, â€œKeyword Search and Oblivious Pseudorandom Functions,â€ Proc. Second Intâ€™l Conf. Theory of Cryptography (TCC), pp. 303-324, 2005. https://doi.org/10.1007/978-3-540-30576-7_17.
 M.J. Freedman, K. Nissim, and B. Pinkas, â€œEfficient Private Matching and Set Intersection,â€ Proc. Intâ€™l Conf. Theory and Applications of Cryptographic Techniques (EUROCRYPT), pp. 1-19, 2004. https://doi.org/10.1007/978-3-540-24676-3_1.
 O. Goldreich, S. Micali, and A. Wigderson, â€œHow to Play Any Mental Game or a Completeness Theorem for Protocols with Honest Majority,â€ Proc. 19th Ann. ACM Symp. Theory of Computing (STOC), pp. 218-229, 1987.