Section 7: Variation


[31 Aug 2006]

Names normally occur with multiple variations in spelling. There are many reasons for this variation. One of the most common arose in times past when writing materials were in short supply. The inventive clerk could save space by reducing commonly occurring names to a few distinctive characters. Besides this, before about 1650 there was no standard way to spell most names, even among the well educated. The clerk would apply his own experience and imagination to writing down the name. Sometimes there were also copying errors, arising from the variations and uncertainties in reading, writing, pronouncing, or placing the name in its linguistic context. The need to abbreviate is one source of shortened forms. The colloquial speech of a community or family may also give rise to them. Contriving a familiar or pet form may further alter the name or its short form, either by changing its pronunciation or by adding a diminutive affix.

If we want to measure the variation within a name group (spellings of the same name), it is desirable to devise a distance function. Given two spellings (s) as parameters, such a function will return the degree to which they differ from each other. To make this simple we will first require that the difference (d) between two spellings of the same name be expressed as a proportion (0 ≤ d < 1.0). No difference (both spellings are identical) would score zero. Presumably the function would be expected to return one, or something greater, where the two spellings belong to different names. Secondly we will assume that some spellings of a name differ from each other to a greater degree and some to a smaller degree. In that case, for each spelling there will presumably always be at least one other spelling that displays a minimum non-zero difference from it.
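As a rough illustration of such a distance function, the following Python sketch scores two spellings by their Levenshtein edit distance divided by the length of the longer spelling. The choice of metric, the function names, and the sample spellings are all assumptions made for illustration; the text above does not prescribe a particular measure, and this one works on the written characters only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def spelling_distance(s1: str, s2: str) -> float:
    """Difference d as a proportion: 0.0 for identical spellings,
    1.0 when every character position must change.
    Assumption: letter case is ignored."""
    a, b = s1.lower(), s2.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return edit_distance(a, b) / longest


def nearest_variant(spelling: str, group: list[str]) -> str:
    """Return the spelling in group with the smallest non-zero distance."""
    others = [s for s in group if spelling_distance(spelling, s) > 0]
    return min(others, key=lambda s: spelling_distance(spelling, s))


# Hypothetical name group, chosen only to exercise the functions.
group = ["Smith", "Smyth", "Smythe", "Smithson"]
print(spelling_distance("Smith", "Smyth"))
print(nearest_variant("Smith", group))
```

This particular measure never exceeds one, so it cannot by itself return "something greater" for unrelated names. In practice a threshold on d, chosen by inspection, would have to decide where one name group ends and another begins.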

There are several phases or dimensions of the communication process that influence the spelling of a name. It is therefore reasonable for the distance function to analyze the spelling along one of these dimensions. An analysis whose fundamental elements are based on the written language alone may be called an orthographic analysis. The handwriting style of the clerk may introduce ambiguity of interpretation, especially if the reader is not familiar with it. An appropriate name for this second dimension of analysis is the iconic or optigraphic. The way the name itself is pronounced in the language is a third, phonological dimension. In addition, the choice of affix or other grammatical device may or may not affect whether the spelling exhibiting it belongs to the same name group as the spelling without it. This fourth dimension is the morphological dimension of analysis. To remind us of these four dimensions of analysis, the following illustrations use the distinctive fonts and colors shown in figure 1.

Figure 1 illustrates first that spellings are considered to be strings of characters concatenated together. If desirable, concatenation can be considered a function. Likewise the process of going from a particular spelling to some representation of that spelling in each dimension of analysis can also be expressed using a function. In these higher dimensions, the elements of analysis may be given their own set of characters to represent them. A grapheme is normally defined as a character, or a particular set of characters, that represents a particular sound. The sounds of the language are the phonemes. This definition ties the grapheme tightly to the phonological dimension of analysis. We extend it slightly to allow it to refer also to a character, several characters, or part of a character that presents a particular image to the eye. By analogy I have coined the term iconeme for each of the distinctive images that make up a style of handwriting.

The two interpretations of grapheme for our purposes are illustrated in figure 2. In the interpretation based on sound, the phoneme /f/ is shown between forward slashes, as is customary in the field of acoustic phonetics. The p-grapheme has at least three forms, with a function (P) to produce the phoneme from any of them and its inverse (P⁻¹) to yield any of the three graphemes when given the phoneme /f/. In an analogous fashion we set the iconeme between reverse slashes. The letter “M” stands for a single minim (the count is indicated by the superscript), while the “A” represents a single ascender. These are just a few of the elements needed to describe spellings in the Viewex code for the Spencerian handwriting.
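The function P and its inverse can be pictured as a many-to-one lookup and its reversal. The short Python sketch below is only an illustration: the three graphemes "f", "ff", and "ph" are assumed examples of spellings for /f/ and are not taken from figure 2, and the names P and P_inverse are hypothetical stand-ins for the functions described above.

```python
# Assumed example table: three graphemes that can all spell the phoneme /f/.
# These entries are illustrative and are not read from figure 2.
GRAPHEME_TO_PHONEME = {
    "f": "/f/",
    "ff": "/f/",
    "ph": "/f/",
}


def P(grapheme: str) -> str:
    """Map a p-grapheme to the phoneme it represents (many-to-one)."""
    return GRAPHEME_TO_PHONEME[grapheme]


def P_inverse(phoneme: str) -> list[str]:
    """Map a phoneme back to every grapheme that can represent it.
    The inverse is one-to-many, so it returns a sorted list."""
    return sorted(g for g, p in GRAPHEME_TO_PHONEME.items() if p == phoneme)


print(P("ph"))
print(P_inverse("/f/"))
```

Because P is many-to-one, its inverse cannot return a single grapheme; returning the whole set leaves the choice among the forms to a later step of the analysis.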