5-4.6 Fitting to the decay curve as a model.

A formula commonly used to model growth and decay in science and economics appears to fit our personal name data quite well. In those applications the formula describes change (c) over time (t) from a starting value (y0). In our case we look instead at how relative frequency changes from rank to rank (r). The most general form of the relationship is given in equation 5.5, where e = 2.718..., the base of the natural logarithms.

$$f(r) = a \times e^{b(r - 1)} + c \tag{5.5}$$
We choose to rank the relative frequencies from greatest to least, and so the constant b is negative. When r = 1 the exponential term equals 1, so a + c is the modeled relative frequency of the most common field value; when c is close to zero, a alone approximates it. Each spelling of a name is called a type. When we take the token count of each type and divide by the total number of tokens of all types, we arrive at a relative frequency for that type. Knowing the relative frequencies of the various names, we can find the best fit by choosing appropriate values for a, b, and c.
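The procedure just described can be sketched in Python as follows. This is a minimal illustration, not the method used to produce the results in this chapter. It assumes a least-squares criterion for "best fit", and it assumes the squared error has a single minimum in b over the search range, which is what the golden-section search requires. The function names, the search bounds for b, and the sample names are all hypothetical. For any fixed b the model is linear in a and c, so those two constants are solved in closed form and only b is searched.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Token count of each type divided by the total, ranked greatest to least."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return sorted((n / total for n in counts.values()), reverse=True)


def model(r, a, b, c):
    """Equation 5.5: f(r) = a * e^(b(r - 1)) + c."""
    return a * math.exp(b * (r - 1)) + c


def fit_a_c(freqs, b):
    """For a fixed b, solve the linear least-squares problem for a and c.

    Returns (a, c, sum of squared errors). Needs at least two ranks and b != 0.
    """
    n = len(freqs)
    x = [math.exp(b * k) for k in range(n)]  # k = r - 1 for ranks r = 1..n
    mean_x = sum(x) / n
    mean_y = sum(freqs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, freqs))
    a = sxy / sxx
    c = mean_y - a * mean_x
    sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, freqs))
    return a, c, sse


def fit_decay(freqs, b_low=-5.0, b_high=-1e-3, iterations=100):
    """Golden-section search over negative b (assumed bounds), minimising squared error."""
    ratio = (math.sqrt(5) - 1) / 2
    lo, hi = b_low, b_high
    for _ in range(iterations):
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        if fit_a_c(freqs, m1)[2] < fit_a_c(freqs, m2)[2]:
            hi = m2  # minimum lies in [lo, m2]
        else:
            lo = m1  # minimum lies in [m1, hi]
    b = (lo + hi) / 2
    a, c, _ = fit_a_c(freqs, b)
    return a, b, c


if __name__ == "__main__":
    # Hypothetical tokens: each distinct spelling is one type.
    names = ["Smith"] * 50 + ["Smyth"] * 20 + ["Smithe"] * 9 + ["Smythe"] * 5 + ["Smit"] * 3
    freqs = relative_frequencies(names)
    a, b, c = fit_decay(freqs)
    print("a =", a, "b =", b, "c =", c)
```

With real data the fitted curve can be compared rank by rank against the observed relative frequencies by evaluating `model(r, a, b, c)` for r = 1, 2, 3, and so on.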
