3-2.4 Estimating numbers of duplicate n-tuples.
Based on the duplication rates for each
sample, it is possible to estimate the expected number of duplicate pairs, triples, and
quadruples. For example, when there are 10227 unique records, we would expect 591 (0.0578
times 10227) to be in duplicate groups. Of these, 34 (0.0578 times 591) are in groups
having more than two members, and 2 (0.0578 times 34) are in groups with more than
three members. The estimates in Table 2 are obtained by subtracting out the numbers in overlapping classes:
2 in quadruples, 34 minus 2 = 32 in triples, and 591 minus 34 = 557 in pairs.
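
The arithmetic above can be reproduced with a short sketch. The function name and its parameters are hypothetical, introduced only for illustration. The sketch assumes that the same duplication rate is applied at every step and that each intermediate count is rounded to a whole number of records before the next step, as the worked figures suggest.

```python
def duplicate_group_estimates(unique_records, dup_rate, max_size=4):
    """Estimate how many records fall in duplicate groups of each size.

    Hypothetical helper: applies dup_rate repeatedly, rounding at each
    step, then subtracts overlapping classes to get exclusive counts.
    """
    # cumulative[i] = records in groups with at least (i + 2) members
    cumulative = []
    current = unique_records
    for _ in range(max_size - 1):
        current = round(dup_rate * current)
        cumulative.append(current)

    # Subtract the next (nested) class from each class. The last class is
    # left as is, so it counts groups with max_size or more members.
    exclusive = {}
    for i, size in enumerate(range(2, max_size + 1)):
        nested = cumulative[i + 1] if i + 1 < len(cumulative) else 0
        exclusive[size] = cumulative[i] - nested
    return exclusive


estimates = duplicate_group_estimates(10227, 0.0578)
print(estimates)  # {2: 557, 3: 32, 4: 2}
```

For 10227 unique records and a rate of 0.0578, this gives 557 records in pairs, 32 in triples, and 2 in quadruples, matching the figures in the text.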