Symbolizing genealogical structures.  

Depending on its contents, a genealogical source may be viewed as a set of various structures. For example, a marriage entry in a parish register involves as components the structures of a date, a locality, a bride, and a groom. Such composition may be symbolized in the form of an expression called a definition. A definition is a symbolic expression composed of a definiendum (the part to be defined) and a definiens (the part defining). In our symbolic language we choose to use the symbol “→” to separate the two parts of the definition, so that its expression has the form: definiendum → definiens.

The source (label) and its various components may be symbolized in an ad hoc way by letters of the alphabet. As a definiendum the source is symbolized by certain letters. The definiens is then a set of further sets of letters. Each of these sets may in turn be found as a definiendum in other definitions. For example, we may want to define a “personal name” as a combination of a “first name,” a “middle name,” and a “surname.” Symbolizing these elements in a definition we could have:

PN → FN + MN + SN.

In this way we are able to diagram large structures by defining (using rules of formation) their various parts as composed of smaller structures. We can express this method of symbolizing mathematically by using still other symbols for these concepts of components and definition. To describe definitions in this abstract sense we may speak of components (C) in the following form:

Ci → Ci1 + Ci2 + … + Cin.

In other words, a definition consists of a component followed by an arrow followed by a series of other components each separated by a plus sign.
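
As an illustration only, a definition of this kind could be held in a small data structure. The following Python sketch is one possible encoding and not part of the notation itself: the names `Definition` and `parse_definition` are hypothetical, and the ASCII `->` stands in for the arrow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """One definition: a definiendum and the components of its definiens."""
    definiendum: str
    definiens: tuple


def parse_definition(text):
    """Parse a string such as 'PN -> FN + MN + SN' into a Definition."""
    left, right = text.split("->")
    # Components of the definiens are separated by plus signs.
    parts = tuple(p.strip() for p in right.split("+"))
    return Definition(left.strip(), parts)


name_rule = parse_definition("PN -> FN + MN + SN")
print(name_rule.definiendum, name_rule.definiens)
```

Keeping the definiens as an ordered tuple preserves the order of the components, which matters for structures such as names.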

Take another example. Suppose we want to describe the structure of a death record index that consists of a list of names with a certificate number for each one. One possible symbolization of this definition might take the form:

DRI → PN1 + CN1 + … + PNn + CNn.

When there is a long list or series of identical structures, it is useful to abbreviate the expression of their definition. Let us expand our notational system to include what is commonly used in mathematics — the sigma notation. Generally it expresses a series of expressions with the index beginning at a designated point and ending at another designated point. In our above definition we have an index to designate distinct elements beginning at one and ending at some point n:

DRI → Σ (i = 1 to n) ( PNi + CNi ).

Here the sigma no longer means “sum” but now means simply “series.”
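
The reading of sigma as “series” can be sketched in a few lines of Python. This is only an illustration under assumed names (`expand_series` and `death_record_index` are hypothetical): it writes out the indexed components for the index values one through n, in the order given by the death record index definition.

```python
def expand_series(components, n):
    """Write out the series of `components` for index values 1 through n."""
    terms = []
    for i in range(1, n + 1):
        for component in components:
            # Attach the index to each component, e.g. PN1, CN1.
            terms.append(f"{component}{i}")
    return terms


def death_record_index(n):
    """Spell out the death record index definition for n entries."""
    return "DRI -> " + " + ".join(expand_series(["PN", "CN"], n))


print(death_record_index(3))
# DRI -> PN1 + CN1 + PN2 + CN2 + PN3 + CN3
```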

To illustrate the construction of definitions in our symbolic language further, consider our example of “personal name” again. Suppose that certain sources give such a name the form of “first name” plus “surname,” omitting the “middle name.” This would call for another definition:

PN → FN + SN.

It is often convenient, if not more expressive, to give all the definitions for “personal name” in a single rule. A convention for doing this is to put parentheses around the portion of the fuller definition that does not appear in the other, thus:

PN → FN ( MN ) SN.

The meaning of such an abbreviation is that “middle name” may or may not be a part of a “personal name” occurring between the “first name” and the “surname.”

Continuing with our example, consider the case where a component may be defined in two completely distinct ways. Our “middle name” above is such a case: it may take the form of a first name (MN → FN) or of a surname (MN → SN). An abbreviatory convention for writing those two rules as one is to place curly brackets around the alternatives:

MN → { FN, SN }.

We have now reduced four definitions to two rules that to an extent describe the structure of personal names like John Brown, John George Brown, or John Smith Brown. Several additions or alterations suggest themselves to describe even more possibilities, but this should serve to illustrate the basic construction of our definitions.
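
To make the two abbreviatory conventions concrete, the following Python sketch expands them back into plain definitions. It is a minimal illustration and makes two assumptions: an optional component is written as a parenthesized string, and, as the example names suggest, the alternatives for a middle name are a first name or a surname. The function names are hypothetical.

```python
from itertools import product


def expand_optional(parts):
    """Expand parenthesized (optional) parts into every plain definiens."""
    choices = []
    for part in parts:
        if part.startswith("(") and part.endswith(")"):
            # An optional component is either present or absent.
            choices.append([part[1:-1].strip(), None])
        else:
            choices.append([part])
    return [tuple(p for p in combo if p is not None)
            for combo in product(*choices)]


def expand_alternatives(alternatives):
    """Expand curly-bracket alternatives into one definiens per alternative."""
    return [(alt,) for alt in alternatives]


plain_definitions = {
    "PN": expand_optional(["FN", "( MN )", "SN"]),
    "MN": expand_alternatives(["FN", "SN"]),
}

for definiendum, bodies in plain_definitions.items():
    for body in bodies:
        print(definiendum, "->", " + ".join(body))
```

Run as written, the sketch lists four plain definitions, two for PN and two for MN.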

Up to this point we have shown some of what is involved in describing various structures by defining their parts. Is it possible to symbolize the structures themselves? Why can’t we write rules of formation to give these structures directly? For our “personal name” we might have the rules:

1. PN → FN + FN + SN
2. PN → FN + SN + SN
3. PN → FN + SN

Certainly this can be done. But besides its being tedious to list all the various possible patterns, we miss the generalization of “middle name.” Somehow we feel that such intermediate structures belong conceptually as separate components, especially when we think of some of their non-formal dimensions. Suppose we needed to state the rule that middle names were more frequently used in a certain time period and among a certain social class. The middle name must then be symbolized as an entity separate from the symbolization of the various forms it may take.

For this reason we will derive our structural description from our definitions. Let us illustrate how this may take place. Suppose we have the following definitions in our source description:

C0 → C1 + … + Ci + … + Cm
Ci → Ci1 + … + Cin.

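We will say that the following structure may be derived from these two rules:

[Tree diagram: C0 at the root with branches ending in C1, …, Ci, …, Cm; from Ci, further branches ending in Ci1, …, Cin.]
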
More generally we must say that given a structure where a component ends a branch, branches ending with other components may be extended from it, provided there is a rule defining that component as a series of those other components.

Let this suffice as a first approximation of a way in which genealogical structures may be symbolized. Definitions are devised to describe the various parts of the structure, and the symbolization of the structure as a tree is derived from them. From our rules for personal names:

1. PN → FN ( MN ) SN
2. MN → { FN, SN }

we can derive these three structures:

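1. [Tree diagram: PN with branches ending in FN and SN.]
2. [Tree diagram: PN with branches to FN, MN, and SN; MN with a branch ending in FN.]
3. [Tree diagram: PN with branches to FN, MN, and SN; MN with a branch ending in SN.]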

which describe some of the conceptual content of John Smith (1), John George Smith (2), and John Brown Smith (3).
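
The derivation of trees from definitions can be sketched in Python as follows. This is only one possible reading of the procedure, under stated assumptions: the rules for personal names are written out as plain definitions (with a middle name taking the form of a first name or a surname), no component is defined in terms of itself, and the bracket rendering such as `PN[FN, SN]` is a hypothetical stand-in for the tree diagrams. The names `RULES`, `derive`, and `show` are likewise illustrative.

```python
from itertools import product

# Plain definitions: each component maps to its alternative definientia.
RULES = {
    "PN": [("FN", "MN", "SN"), ("FN", "SN")],
    "MN": [("FN",), ("SN",)],
}


def derive(component, rules):
    """Return every tree derivable from `component` as nested tuples."""
    if component not in rules:
        # No rule defines this component, so the branch ends here.
        return [(component, ())]
    trees = []
    for body in rules[component]:
        # Each child may itself be extended in several ways.
        child_options = [derive(child, rules) for child in body]
        for children in product(*child_options):
            trees.append((component, children))
    return trees


def show(tree):
    """Render a tree in bracket form, e.g. PN[FN, SN]."""
    label, children = tree
    if not children:
        return label
    return f"{label}[{', '.join(show(c) for c in children)}]"


for tree in derive("PN", RULES):
    print(show(tree))
```

The sketch yields the same three structures, though it lists the two trees containing a middle name before the one without.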