Record linkage and deduplication using traditional blocking


  • G Somasekhar
  • SeshaSravani K
  • Keerthi P
  • Sai Sandeep G





Blocking, Blocking Key, Blocking Key Value, Deduplication, Record Linkage, Traditional Blocking.


Record linkage and deduplication are two processes used to match records. Records are matched in order to identify and remove duplicates, which can strongly influence the results of data mining and data processing. When matching is performed within a single database, it is called deduplication; when it is performed across several databases, it is called record linkage. In this paper we also discuss an indexing technique called traditional blocking, which is used to remove non-matching pairs and thereby reduces the number of record pairs that must be compared.
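As an illustration of how traditional blocking reduces the number of comparisons, the following is a minimal sketch in Python and not the implementation used in this paper. It assumes a hypothetical blocking key (the first three letters of a `surname` field) and made-up sample records; the function names `blocking_key_value` and `candidate_pairs` are likewise illustrative. Records that share the same blocking key value are placed in the same block, and only pairs within a block are kept as candidates for comparison.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key_value(record):
    # Hypothetical blocking key: first three letters of the surname, lower-cased.
    return record["surname"][:3].lower()


def candidate_pairs(records):
    # Group record identifiers by their blocking key value (BKV).
    blocks = defaultdict(list)
    for rec_id, record in records.items():
        blocks[blocking_key_value(record)].append(rec_id)

    # Only records inside the same block are paired for comparison.
    pairs = set()
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Made-up sample data for a single database (the deduplication case).
records = {
    1: {"surname": "Smith", "city": "Leeds"},
    2: {"surname": "Smithe", "city": "Leeds"},
    3: {"surname": "Jones", "city": "York"},
    4: {"surname": "Jonson", "city": "York"},
    5: {"surname": "Brown", "city": "Bath"},
}

pairs = candidate_pairs(records)
# Number of pairs an exhaustive comparison would need: n * (n - 1) / 2.
total_pairs = len(records) * (len(records) - 1) // 2

print(sorted(pairs))            # [(1, 2), (3, 4)]
print(len(pairs), total_pairs)  # 2 10
```

In this toy example, blocking leaves 2 candidate pairs instead of the 10 that comparing every record with every other record would require. The choice of blocking key governs the trade-off: a coarser key yields larger blocks and more comparisons, while a finer key may place true duplicates in different blocks so that they are never compared.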


