Section 6-3 ADJUSTMENTS FOR VALUE DEPENDENCE



Preliminary adjustment calculations.   A chain of three fully dependent fields results in eight combinations:
1 = [   ], i.e., no fields (dis)agree;
2 = [ 1 ], i.e., only the most dependent field (dis)agrees;
3 = [ 2 ], i.e., only the less dependent field (dis)agrees;
4 = [ 3 ], i.e., only the independent field (dis)agrees;
5 = [ 1, 2 ], i.e., the dependent fields (dis)agree;
6 = [ 1, 3 ], i.e., only the most dependent and independent fields (dis)agree;
7 = [ 2, 3 ], i.e., the less dependent and independent fields (dis)agree;
8 = [ 1, 2, 3 ], i.e., all three fields (dis)agree.
As pointed out in 4-19, the adjusted reliability and coincidence are non-zero for only four of these: 1 = [   ], 4 = [ 3 ], 7 = [ 2, 3 ], and 8 = [ 1, 2, 3 ]. We also pointed out that when two records are compared, their various comparisons are grouped into four (dis)agreement cases; the adjusted values for these cases are referred to as ar0, ar1, ar2, ar3 and ac0, ac1, ac2, ac3. These values are arranged in Table 2 as part of the raw data needed before probability calculations involving value dependent fields can begin.
Case   Combination Present   Adjusted Reliability, Eq. (4.3)   Adjusted Coincidence, Eq. (4.4)
0      1                     ar0 = 1 – r12                     ac0 = 1 – c12
1      4                     ar1 = r12r22                      ac1 = c12c22
2      7                     ar2 = r22r32                      ac2 = c22c32
3      8                     ar3 = r32                         ac3 = c32
Table 2 — Value Dependence Adjustments
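The enumeration of combinations for a three-field chain can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the filtering rule (a combination survives only if, once a field is included, every less dependent field down to the independent one is included as well) is an assumption inferred from the four surviving combinations 1, 4, 7, and 8, not a rule stated in the text.

```python
from itertools import combinations

def chain_combinations(n_fields=3):
    """List every combination of fields 1..n_fields, smallest combinations first.

    For n_fields = 3 the order matches the numbering 1..8 used in the text.
    """
    fields = range(1, n_fields + 1)
    combos = []
    for size in range(n_fields + 1):
        combos.extend(combinations(fields, size))
    return combos

def is_chain_suffix(combo, n_fields=3):
    """ASSUMPTION (inferred from combinations 1, 4, 7, 8): a combination
    survives only if it is a run of fields ending at the independent field,
    which is field n_fields."""
    expected = tuple(range(n_fields - len(combo) + 1, n_fields + 1))
    return tuple(combo) == expected

combos = chain_combinations(3)
# Pair each surviving combination with its 1-based number from the text.
surviving = [(number, combo)
             for number, combo in enumerate(combos, start=1)
             if is_chain_suffix(combo)]

for case, (number, combo) in enumerate(surviving):
    print(f"case {case}: combination {number} = {list(combo)}")
```

Under that assumption the four surviving combinations line up with cases 0 to 3 of Table 2.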

Using adjusted values.   The adjusted reliability and coincidence are associated with the particular combination of fields making up the comparison weight. Combinations present in the constructive value dependence chain are also presence dependent. Equation 3.1 for blocking recall must use the adjusted presence and the adjusted reliability, whose values are determined by the combination involved. The only valid combinations are those sub-chains all of whose fields are blocking or weighting fields. Some choices may make the adjusted presence for a blocking field very small or make it disappear altogether. Such a choice may be valid, but it may reduce recall drastically.
recall = apki × arki    (6.9)

Similarly, equation 3.12 for the blocking precision uses the adjusted reliability and coincidence. Here, of course, precision only makes sense when the field is present, and the measure applies only to comparisons within blocks.
precision = {1 + (G ÷ NU)} ÷ {1 + (G ÷ NU) + [(acki ÷ arki) × Ntotal]}    (6.10)
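Equations 6.9 and 6.10 can be transcribed directly into code. The sketch below is a minimal illustration, not a definitive implementation. The function and parameter names are hypothetical, the symbols G, NU, and Ntotal are passed through exactly as they appear in equation 6.10 without further interpretation, and the numeric inputs are invented purely to exercise the arithmetic.

```python
def blocking_recall(adjusted_presence, adjusted_reliability):
    """Equation 6.9: recall = apki x arki."""
    return adjusted_presence * adjusted_reliability

def blocking_precision(g, n_u, adjusted_coincidence, adjusted_reliability, n_total):
    """Equation 6.10, transcribed term by term.

    g, n_u and n_total stand for G, NU and Ntotal in the equation.
    """
    if n_u == 0 or adjusted_reliability == 0:
        # Guard added for this sketch only: the formula divides by both values.
        raise ValueError("NU and the adjusted reliability must be non-zero")
    ratio = g / n_u
    penalty = (adjusted_coincidence / adjusted_reliability) * n_total
    return (1 + ratio) / (1 + ratio + penalty)

# Invented illustrative inputs, not taken from the text.
example_recall = blocking_recall(adjusted_presence=0.8, adjusted_reliability=0.9)
example_precision = blocking_precision(
    g=100, n_u=1000,
    adjusted_coincidence=0.001, adjusted_reliability=0.9,
    n_total=900,
)
print(f"recall = {example_recall:.3f}, precision = {example_precision:.3f}")
```

With these inputs the bracketed term in the denominator evaluates to about 1, so the precision comes out near 1.1 ÷ 2.1. This shows how a small ratio of adjusted coincidence to adjusted reliability can still dominate once it is multiplied by Ntotal.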

The adjusted reliability enters into the probability (y) of equation 6.4 that the comparison weight will be matched, and the adjusted coincidence enters into the probability (z) of equation 6.5 that the comparison weight will be unmatched.

Number of calculations involved.   In the simple model there are three ways to interpret Ri and Ci, resulting in possibly 3^n weights to be calculated, where n is the number of fields being weighted. Where there are no presence or value dependencies, this also expresses the maximum number of probability calculations: one set for the matched and one set for the unmatched. Our example includes the case where the presence of every field was in a single presence dependence chain. This gives the minimum number of probabilities to calculate without value dependence. Accordingly the number of probability calculations, i.e., the number of weights (Ncw), would be:
where n = Nfields    (6.11)

When there are multiple presence dependence chains, each chain is shorter than n and the chain lengths total n. Each chain contributes its own number of calculations as a factor of the total, which never exceeds 3^n. Suppose there were eight fields in four dependence chains with lengths of 1, 2, 2, and 3. In this case there are in all 3^8 or 6561 combinations. The presence dependence chains, however, reduce the number of weights to 3 × 7 × 7 × 15 or 2205. Value dependence chains reduce the number of calculations further. Even when the presence dependence allows a combination, the value dependence may nevertheless preclude agreement in the face of disagreement in the dependent field. The reduction has the effect of changing the 3^n in equation 6.11 to 2^n for fields in the chain. Value specificity multiplies the number of weight calculations by the number of partitions in the field type spectrum. This applies only to the calculation of weights (x) and the unmatched probabilities (z), i.e., about 2/3 of the most numerous calculations.
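The arithmetic of the eight-field example can be checked with a short Python sketch. The per-chain factors 3, 7, and 15 for chain lengths 1, 2, and 3 come from the text. The closed form 2^(k+1) − 1 used below is only an inferred pattern that happens to reproduce those three factors, so it should be read as an assumption rather than as the source's formula. The function names are likewise hypothetical.

```python
from math import prod

def max_weights(n_fields):
    """Upper bound with no dependence: three interpretations per field."""
    return 3 ** n_fields

def chain_factor(length):
    """ASSUMPTION: 2**(length + 1) - 1 reproduces the factors 3, 7, 15 that
    the text gives for chain lengths 1, 2, 3. Inferred, not stated there."""
    return 2 ** (length + 1) - 1

def weights_with_chains(chain_lengths):
    """Multiply the per-chain factors to get the total number of weights."""
    return prod(chain_factor(length) for length in chain_lengths)

chain_lengths = [1, 2, 2, 3]       # the four chains of the example
n_fields = sum(chain_lengths)      # eight fields in all

print(max_weights(n_fields))               # 6561
print(weights_with_chains(chain_lengths))  # 2205
```

The two printed values match the 6561 combinations and the 2205 weights quoted for the example.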