Elite Sequence Mining of Big Data using Hadoop Mapreduce

P. Amarendra Reddy; O Ramesh

doi:10.14419/ijet.v7i4.10.20696

Received date: October 1, 2018

Accepted date: October 1, 2018

Published date: October 2, 2018

DOI:

https://doi.org/10.14419/ijet.v7i4.10.20696

Keywords:

Big data, MapReduce, SVD, LSI.

Abstract

Text mining can handle unstructured information. In the proposed work, text is extracted from a PDF document and converted to plain text format; the document is then tokenized and serialized. Document clustering and categorization are performed by finding similarities between documents stored in the cloud. Similar documents are identified using the Singular Value Decomposition (SVD) method in Latent Semantic Indexing (LSI), and are then grouped together as a cluster. A comparative study is carried out between LFS (Local File System) and HDFS (Hadoop Distributed File System) with respect to rate and dimensionality. The system has been evaluated on real documents and the results are tabulated.
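
The tokenize, SVD/LSI, and grouping steps named in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: it assumes NumPy is available, skips PDF extraction and Hadoop entirely, and uses a simple greedy threshold grouping as a stand-in for whatever clustering the paper applies. The function names, the toy documents, the rank k = 2, and the 0.9 threshold are all illustrative assumptions.

```python
import re
import numpy as np


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Build a raw term-count matrix with one row per term, one column per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in tokenize(doc):
            matrix[index[tok], j] += 1.0
    return matrix, vocab


def lsi_embed(matrix, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Each row of the result is one document in the reduced space.
    return (np.diag(s[:k]) @ vt[:k]).T


def cosine_similarity(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    unit = vectors / norms
    return unit @ unit.T


def group_similar(similarity, threshold):
    """Greedy grouping (an assumption): each unlabeled document starts a group
    and absorbs later unlabeled documents whose similarity meets the threshold."""
    n = similarity.shape[0]
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, n):
            if labels[j] == -1 and similarity[i, j] >= threshold:
                labels[j] = next_label
        next_label += 1
    return labels


# Toy corpus (hypothetical): two documents about Hadoop, two about SVD.
docs = [
    "hadoop mapreduce cluster job",
    "mapreduce job on hadoop cluster",
    "singular value decomposition matrix",
    "matrix decomposition singular value",
]
matrix, vocab = term_document_matrix(docs)
embedding = lsi_embed(matrix, k=2)
labels = group_similar(cosine_similarity(embedding), threshold=0.9)
print(labels)
```

In this sketch the documents are compared in the reduced k-dimensional space rather than on raw term counts, which is the core idea of LSI. A distributed version would have to split the matrix construction and the decomposition across MapReduce jobs, which this example does not attempt.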

