Survey of duplicate detection using progressive detection techniques

  • Authors

    • K Venkatraman
    • A Akila
    2018-03-01
    https://doi.org/10.14419/ijet.v7i1.9.9757
  • Progressive Sorted Neighborhood Method, Progressive Blocking, Duplicate Detection
  • Managing data is an important task in the real world; common data is represented and used in every field. Duplicate data is consequently processed and displayed in many scenarios. The proposed work uses two techniques: the Progressive Sorted Neighbourhood Method (PSNM) and Progressive Blocking (PB). PSNM splits the input into keywords and checks the similarity of the resulting data, so that the output corresponds exactly to the input. PB filters out irrelevant information: keyword-based indexing and entry-level filtering of the input are applied according to user requirements.
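
The progressive idea behind PSNM can be illustrated with a minimal sketch. It follows the commonly described form of the technique (sort the records by a key, then compare neighbours at growing rank distance so the most likely duplicates surface first), not necessarily the exact procedure of this paper. The function name, the `max_window` and `threshold` parameters, and the use of `difflib` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def progressive_sorted_neighborhood(records, key, max_window=4, threshold=0.8):
    """Yield likely duplicate pairs, closest sort-order neighbours first.

    Hypothetical helper: the name, parameters and similarity measure are
    illustrative assumptions, not taken from the paper.
    """
    # Sort record indices by a sorting key so similar records end up adjacent.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))

    # Progressive step: compare neighbours at rank distance 1 first, then 2,
    # and so on, so that stopping early still returns the best candidates.
    for distance in range(1, max_window):
        for pos in range(len(order) - distance):
            a = records[order[pos]]
            b = records[order[pos + distance]]
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                yield a, b, round(score, 2)


names = ["John Smith", "Jon Smith", "Alice Brown", "Alice Browne", "Bob Stone"]
pairs = list(progressive_sorted_neighborhood(names, key=str.lower))
for first, second, score in pairs:
    print(first, "~", second, score)
```

Because the function is a generator, a caller can stop consuming results once a time budget runs out and still keep the pairs found so far, which is the point of a progressive approach.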

  • How to Cite

    Venkatraman, K., & Akila, A. (2018). Survey of duplicate detection using progressive detection techniques. International Journal of Engineering & Technology, 7(1.9), 171-172. https://doi.org/10.14419/ijet.v7i1.9.9757