5-4.6 Fitting to the decay curve as a model.
A formula commonly used to model growth and decay in science and economics appears to model our
personal name data quite well. In those applications the formula describes change (c) over time (t)
from a starting value (y0). In our case we look instead at differences in relative frequency from
rank to rank (r). In its most general form the relationship is given in equation 5.5, where
e = 2.718 . . . is the base of the natural logarithms.
$$f(r) = a \times e^{b(r-1)} + c \tag{5.5}$$
We rank the relative frequencies from greatest to least by convention, and so the constant b is
negative. When r = 1 the exponential term equals 1 and f(1) = a + c, so a + c is the modeled relative
frequency of the most common field value (a alone when c is negligible). Each spelling of a name
is called a type. When we take the token count of each type and divide by the total number of tokens
of all types, we arrive at a relative frequency for that type. Knowing the relative frequencies of the
various names, we can find the best fit by choosing appropriate values for a, b, and c.
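
The fitting step can be illustrated with a minimal sketch. This is one possible procedure, not necessarily the one used to produce the results here. It relies on the fact that, for a fixed b, the model is linear in a and c, so those two follow from ordinary least squares, while b is chosen by a simple grid search over negative values. The function names, the grid range for b, and the token counts are all hypothetical and serve only as illustration.

```python
import math


def relative_frequencies(token_counts):
    """Divide each type's token count by the total number of tokens,
    then rank the results from greatest to least."""
    total = sum(token_counts)
    return sorted((n / total for n in token_counts), reverse=True)


def fit_linear_part(freqs, b):
    """For a fixed b, find the least-squares a and c in
    f(r) = a * exp(b * (r - 1)) + c, with ranks r = 1, 2, ..., len(freqs).
    Returns (a, c, sum of squared errors)."""
    n = len(freqs)
    x = [math.exp(b * k) for k in range(n)]  # k stands for r - 1
    sx, sy = sum(x), sum(freqs)
    sxx = sum(v * v for v in x)
    sxy = sum(v * y for v, y in zip(x, freqs))
    det = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / det
    c = (sy - a * sx) / n
    sse = sum((a * v + c - y) ** 2 for v, y in zip(x, freqs))
    return a, c, sse


def fit_decay(freqs, b_lo=-5.0, b_hi=-1e-3, steps=2000):
    """Grid search over negative b (the range is an assumption);
    a and c are solved exactly at each candidate b.
    Returns (a, b, c, sum of squared errors) for the best candidate."""
    if len(freqs) < 3:
        raise ValueError("need at least three ranked frequencies")
    best = None
    for i in range(steps + 1):
        b = b_lo + (b_hi - b_lo) * i / steps
        a, c, sse = fit_linear_part(freqs, b)
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


# Hypothetical token counts for eight spellings (types) of one name.
counts = [120, 75, 48, 30, 19, 12, 8, 5]
freqs = relative_frequencies(counts)
a, b, c, sse = fit_decay(freqs)
print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}, SSE = {sse:.2e}")
```

The grid search keeps the sketch dependency-free. A general-purpose nonlinear least-squares routine could be substituted for it where one is available.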