2-1.4 Various strengths of data.
Depending on how valuable the field is (and there are measures for this, e.g., distinctiveness of
possible data values, reliability of the data, etc.), its vote could rate anywhere from very heavily
against linkage, through abstention, to very heavily in favor of linkage (non-coincidental). There may
also be measures of the extent to which the data in the field agrees (partial agreement). Some
differences in form, such
as abbreviations, usually reflect no difference in meaning (what the value might refer to). Other
differences, such as "Nancy" for "Ann", may be made to affect the strength of the vote. When
all the votes are in, their combined strength should correspond to a probability that the pair being
compared is a match. If that probability comes above a particular threshold, the comparison counts as
linked. Implementation of partial agreement takes the form of populating an additional identifying
field (naturally value dependent), which may agree (fully) when the original field disagrees (fully).
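
The voting scheme described so far can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the method the text prescribes: the text does not say how
votes are combined, so the sketch assumes each vote is a log-odds weight, that weights add, and that a
logistic function turns the total into a probability. The names NAME_GROUPS, WEIGHTS, add_name_group,
field_vote, match_probability and is_linked, and every numeric value, are hypothetical.

```python
import math

# Hypothetical lookup: variant given names mapped to a shared group key.
NAME_GROUPS = {"ann": "ANN", "anne": "ANN", "nancy": "ANN"}

# Hypothetical vote strengths (log-odds) per field: (agree, disagree).
WEIGHTS = {
    "given_name": (4.0, -3.0),
    "given_name_group": (2.0, 0.0),  # derived field; disagreement abstains
    "surname": (5.0, -4.0),
}

def add_name_group(record):
    """Return a copy of the record with the derived name-group field populated."""
    out = dict(record)
    name = record.get("given_name")
    if name is not None:
        group = NAME_GROUPS.get(name.strip().lower())
        if group is not None:
            out["given_name_group"] = group
    return out

def field_vote(field, a, b):
    """Vote of one field: positive favors linkage, negative opposes, 0 abstains."""
    if a is None or b is None:
        return 0.0  # missing data abstains
    agree, disagree = WEIGHTS[field]
    return agree if a.strip().lower() == b.strip().lower() else disagree

def match_probability(rec_a, rec_b, prior_log_odds=0.0):
    """Sum the votes (plus an assumed prior) and convert log-odds to a probability."""
    rec_a, rec_b = add_name_group(rec_a), add_name_group(rec_b)
    total = prior_log_odds + sum(
        field_vote(f, rec_a.get(f), rec_b.get(f)) for f in WEIGHTS
    )
    return 1.0 / (1.0 + math.exp(-total))

def is_linked(rec_a, rec_b, threshold=0.9):
    """The comparison counts as linked when the probability exceeds the threshold."""
    return match_probability(rec_a, rec_b) > threshold
```

With these made-up weights, "Nancy Hale" against "Ann Hale" disagrees on the given name but agrees on
the derived group field, which is enough to lift the pair over the threshold; "Mary Hale" against
"Ann Hale" gets no such help and stays below it.
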
Another example is the approximate year of an event. The year may have a precision of ±3 (its
standard year range), so that it agrees partially with any year in that range, or with any
approximate year whose precision range overlaps its own. How close the years are would then
represent a degree of agreement.
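
The approximate-year comparison might look like the following Python sketch. The overlap test follows
the description above; the linear fall-off used for the degree of agreement is an assumption chosen
for simplicity, and the names ApproxYear, ranges_overlap and year_agreement are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ApproxYear:
    year: int
    precision: int = 0  # 0 = exact year; 3 = approximate, i.e. year ± 3

    @property
    def low(self):
        return self.year - self.precision

    @property
    def high(self):
        return self.year + self.precision

def ranges_overlap(a, b):
    """True when the two precision ranges share at least one year."""
    return a.low <= b.high and b.low <= a.high

def year_agreement(a, b):
    """Degree of agreement in [0, 1].

    1.0 for identical years, 0.0 when the ranges do not overlap, and
    (by assumption) a linear fall-off with distance in between.
    """
    if not ranges_overlap(a, b):
        return 0.0
    gap = abs(a.year - b.year)
    reach = a.precision + b.precision
    return 1.0 - gap / (reach + 1)
```

Under this assumed scoring, an approximate 1850 (±3) compared with an exact 1852 agrees to degree
0.5, while an exact 1854 falls outside the range and does not agree at all.
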