2-1.4 Various strengths of data.

Depending on how valuable a field is (and there are measures for this, e.g., distinctiveness of the possible data values, reliability of the data, etc.), its vote could rate anywhere from very heavily against linkage, through abstention, to very heavily in favor of linkage (the agreement being non-coincidental). There may also be measures of the extent to which the data in the field agrees, that is, partial agreement. Some differences in form, such as abbreviations, usually reflect no difference in meaning (what the value might refer to). Other differences, such as "Nancy" for "Ann", may be made to affect the strength of the vote. When all the votes are in, their combined strength should correspond to a probability that the pair being compared is a match. If that probability comes above a particular threshold, the pair counts as linked. Partial agreement is implemented by populating an additional identifying field (naturally value dependent), which may agree (fully) when the original field disagrees (fully).
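The voting scheme described above might be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the weights, the prior, the threshold, the nickname table, and all names (FIELD_WEIGHTS, NAME_ROOTS, given_name_root, and the functions) are hypothetical, and turning the summed votes into a probability with a logistic function is one possible choice that the text does not prescribe.

```python
import math

# Hypothetical per-field votes (log-odds style); the numbers are illustrative only.
FIELD_WEIGHTS = {
    "surname":         {"agree": 4.0, "disagree": -3.0},
    "given_name":      {"agree": 1.0, "disagree": -1.0},
    "given_name_root": {"agree": 2.0, "disagree": -1.5},  # additional identifying field
    "birth_year":      {"agree": 1.5, "disagree": -2.5},
}

# Hypothetical nickname table: maps a variant to the form it usually stands for.
NAME_ROOTS = {"nancy": "ann", "annie": "ann"}


def with_derived_fields(rec):
    """Populate the additional field that carries partial agreement."""
    out = dict(rec)
    name = rec.get("given_name")
    if name is not None:
        key = name.strip().lower()
        out["given_name_root"] = NAME_ROOTS.get(key, key)
    return out


def field_vote(field, a, b):
    """Vote of one field; a missing value on either side is an abstention (0.0)."""
    if a is None or b is None:
        return 0.0
    weights = FIELD_WEIGHTS[field]
    return weights["agree"] if a == b else weights["disagree"]


def match_probability(rec_a, rec_b, prior_log_odds=-5.0):
    """Sum all votes with an assumed prior and map the total to a probability."""
    a, b = with_derived_fields(rec_a), with_derived_fields(rec_b)
    total = prior_log_odds + sum(
        field_vote(f, a.get(f), b.get(f)) for f in FIELD_WEIGHTS
    )
    return 1.0 / (1.0 + math.exp(-total))


def is_linked(rec_a, rec_b, threshold=0.8):
    """The pair counts as linked when the probability reaches the threshold."""
    return match_probability(rec_a, rec_b) >= threshold
```

With these illustrative numbers, "Nancy" against "Ann" disagrees on given_name but agrees on the derived given_name_root field, so the pair scores lower than an exact match yet higher than "Mary" against "Ann", which disagrees on both.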

Probabilistic Record Linkage Principle of Standardization

Another example of partial agreement is the approximate year of an event. The year may have a precision of ±3 (its standard year range), so that it agrees partially with any year in that range, or with any approximate year whose precision range overlaps its own. How close the years are would then represent the degree of agreement.
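The year comparison above could be sketched like this. The function names and the linear fall-off with distance are assumptions made for illustration. The text says only that closeness represents a degree of agreement, not how that degree is computed.

```python
def year_range(year, precision=0):
    """Inclusive range covered by a year with the given +/- precision."""
    return (year - precision, year + precision)


def year_agreement(year_a, year_b, prec_a=0, prec_b=0):
    """Degree of agreement in [0, 1]; 0.0 when the two ranges do not overlap.

    Assumption: agreement falls off linearly with the distance between
    the two central years, reaching 0 just past the point where the
    ranges stop overlapping.
    """
    lo_a, hi_a = year_range(year_a, prec_a)
    lo_b, hi_b = year_range(year_b, prec_b)
    if hi_a < lo_b or hi_b < lo_a:
        return 0.0  # ranges are disjoint: no agreement at all
    span = prec_a + prec_b  # largest distance at which the ranges still overlap
    return 1.0 - abs(year_a - year_b) / (span + 1)
```

For instance, under this sketch an approximate 1850 (±3) agrees fully with an exact 1850, partially with an exact 1853, and not at all with an exact 1854.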