College of Physical and Mathematical Sciences

An Evaluation of Name, Location, and Date Comparison Metrics for Record Linkage - Yao Huang Lin

Personal Information
Primary Presenter First Name: Yao
Primary Presenter Last Name: Huang Lin
Abstract Information
Department: Computer Science
Faculty Advisor: Christophe Giraud-Carrier
Title of Abstract: An Evaluation of Name, Location, and Date Comparison Metrics for Record Linkage

Record linkage is the process of identifying approximately duplicate entities in datasets and determining whether two such entities in fact refer to the same real-world entity. Entities in record linkage commonly have name, location, and date attributes. Comparing two entities can frequently be accomplished by resolving each attribute pair into a similarity score using a comparison metric. Advanced metrics have been created by feeding the outputs of a wide variety of such metrics into an artificial neural network that outputs a regression score. For example, combining a date comparison metric based on the difference in days between two dates with one that compares the similarity of the dates' strings pairs a metric of statistical significance with a metric that scores data entry errors. In this research, we compare simple and aggregate similarity metrics for dates, names, and locations on a large, post-blocking genealogical database.
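
The date example above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this research: the function names, the `scale_days` parameter, and the fixed `weight` are all hypothetical, and a plain weighted average stands in for the artificial neural network that the abstract describes as the aggregator.

```python
from datetime import date
from difflib import SequenceMatcher


def day_difference_similarity(a: date, b: date, scale_days: float = 365.0) -> float:
    """Score in (0, 1] that shrinks as the gap in days grows.

    scale_days is an assumed tuning constant, not a value from the abstract.
    """
    gap = abs((a - b).days)
    return 1.0 / (1.0 + gap / scale_days)


def string_similarity(a: date, b: date) -> float:
    """Compare the ISO-formatted date strings character by character.

    A single mistyped digit changes only one character, so the score
    stays high even when the two dates are years apart.
    """
    return SequenceMatcher(None, a.isoformat(), b.isoformat()).ratio()


def aggregate_date_similarity(a: date, b: date, weight: float = 0.5) -> float:
    """Blend the two scores with a fixed weight.

    Assumption: this weighted average is only a stand-in for a learned
    aggregator such as a neural network.
    """
    return (weight * day_difference_similarity(a, b)
            + (1.0 - weight) * string_similarity(a, b))


if __name__ == "__main__":
    recorded = date(1850, 3, 12)
    mistyped = date(1858, 3, 12)  # one digit differs, roughly eight years apart
    print(day_difference_similarity(recorded, mistyped))
    print(string_similarity(recorded, mistyped))
    print(aggregate_date_similarity(recorded, mistyped))
```

For a pair such as 1850-03-12 and 1858-03-12, the day-difference score is low while the string score stays high, which is the kind of disagreement an aggregate metric can use to separate a likely data entry error from a genuinely different date.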

Copyright © 2009. Brigham Young University. All Rights Reserved.