Symbolizing genealogical sources.  

We are now ready to take a specific genealogical source, examine its structure, and determine what additional rules we need and how the symbolic language develops.

Here is an entry from a marriage register in the parish of Apples in Switzerland:

1677 à (=at) Bussy le (=the) 15.9br (=Nov, i.e., the 9th month) ont este espouzée (=was married) Samuel Fazan le jeune (=the young one) d’ (=of) Apples et (=and) Marie Françoise Voizin d’Aulbonne.

We might now say that we have here a set of structures that need definitions. There are the “date,” “place,” and the identifications of a “groom” and a “bride.”

ME → DT + PL + GM + BD.

Note, however, that elements of the date appear in various places in the structure, that the localities are specified in an abbreviated way, and that spellings are archaic. In some respects we must say that the actual structure of this source is derived from an idealized structure (by transformation), which we now describe.

An idealized form of a date may conform to the following definition:

DT → DAY + MO + YR.

What is the consequence of further defining these elements? We find rather quickly that the definition of one depends heavily on the definition of another. For example, a “day” is an integer between one and thirty in certain months and something else in other months. We need to incorporate into our language a definition that takes account of the context — the way adjoining elements have been defined.

One way to add the control of context into our definitions is with an additional symbol. Say we have the following definitions:

C0 → C1 + … + Ci + Cj + … + Cm
Ci → Ci1 + … + Cin / _____ Cj.

We will then be able to adjust derivation to say that the following structure is derivable from these two rules:

.

Or, more generally, we must say that structures are involved in the way given in the last section, but here with reference to the context.
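The effect of a contextual restriction can be illustrated with a minimal sketch. It assumes that a structure is represented as a flat list of symbol names and that the context of a rule is the single symbol immediately to the right; the function name `apply_rule` and the concrete symbols C21, C22, and C4 are hypothetical and do not come from the rules above.

```python
def apply_rule(symbols, target, expansion, right_context=None):
    """Rewrite each `target` as `expansion`, but only where the symbol that
    immediately follows it equals `right_context` (None = no restriction)."""
    result = []
    for i, sym in enumerate(symbols):
        following = symbols[i + 1] if i + 1 < len(symbols) else None
        if sym == target and (right_context is None or following == right_context):
            result.extend(expansion)  # the rule applies in this context
        else:
            result.append(sym)        # symbol is copied unchanged
    return result


# Hypothetical instance of the two rules with m = 3, i = 2, j = 3.
c0 = apply_rule(["C0"], "C0", ["C1", "C2", "C3"])
derived = apply_rule(c0, "C2", ["C21", "C22"], right_context="C3")

# The same rule does not apply when C2 is followed by some other symbol.
blocked = apply_rule(["C1", "C2", "C4"], "C2", ["C21", "C22"], right_context="C3")
```

Here `derived` is the list C1, C21, C22, C3, whereas `blocked` is returned unchanged because the required right-hand context is absent.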

Continuing with the definition of “date” let us say that we would have such definitions as these for use with this particular marriage register (notice the French month names and the treatment of leap year):

DAY → […] / _____ […]
DAY → […] / _____ Feb + CMYR
DAY → […] / _____ Feb + LPYR
DAY → […] / _____ […]

Again we might suggest some ways that these rules could be condensed even further by introducing an ordering convention, but this is an area that would require further investigation.
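The dependence of DAY on its context (the month, and for February the kind of year) can be sketched as a small check. This is only an illustration under stated assumptions: months are given as numbers rather than as the French names of the register, and because the text does not say which calendar the register follows, the leap-year rule is left as a parameter. The function names are hypothetical.

```python
THIRTY_DAY_MONTHS = {4, 6, 9, 11}  # April, June, September, November


def is_leap_year(year, gregorian=True):
    """Leap-year test; the calendar in force is an assumption of the caller."""
    if gregorian:
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    return year % 4 == 0  # Julian rule: every fourth year


def max_day(month, year, gregorian=True):
    """Largest admissible DAY in the context of the given MO and YR."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    if month == 2:
        return 29 if is_leap_year(year, gregorian) else 28
    return 30 if month in THIRTY_DAY_MONTHS else 31


def is_valid_date(day, month, year, gregorian=True):
    """True if DAY + MO + YR is a well-formed date under the rules above."""
    return 1 <= month <= 12 and 1 <= day <= max_day(month, year, gregorian)


# The date of the marriage entry: 15 November 1677.
entry_ok = is_valid_date(15, 11, 1677)
```

The four cases in `max_day` correspond to the four contextual definitions of DAY: the long months, February in a common year, February in a leap year, and the thirty-day months.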

Note here also that the contextually restricted definition for DAY implies suitable definitions for MO and YR:

MO → […]
YR → […]

In other records we might find the need for more complicated rules. Suppose a feast day is given for the date. We might then need rules such as these:

DT → FD + YR.
FD → […]

together with all the various forms of fixed and movable dates that occur.

We also need context-sensitive rules to define the “place” of our marriage entry. The genealogist usually specifies places in terms of superior jurisdictions, so that a “village” is part of a “parish,” which is part of a “county” (here a canton), which is part of a “country.” The gazetteer does for places much the same thing as a calendar does for dates. In our example we might need the following definitions:

PL → VLG + PRSH + CNTY + CTRY
CTRY → […]
CNTY → CANTON / _____ "Switzerland"
CANTON → […]
PRSH → […] / _____ Vaud
VLG → […] / _____ Apples

Further rules can also be supplied until we are satisfied with the description of the structure of our marriage entry as given in figure 2.
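The role of the gazetteer in supplying the superior jurisdictions can be sketched as a simple lookup. The miniature gazetteer below is hypothetical and contains only the names that appear in the entry and in the rules above (Bussy within Apples, Apples within Vaud, Vaud within Switzerland); a real gazetteer would also have to distinguish different places that share a name.

```python
# Hypothetical miniature gazetteer: place -> (level, superior jurisdiction).
GAZETTEER = {
    "Bussy": ("VLG", "Apples"),
    "Apples": ("PRSH", "Vaud"),
    "Vaud": ("CNTY", "Switzerland"),
    "Switzerland": ("CTRY", None),
}


def expand_place(name, gazetteer=GAZETTEER):
    """Expand an abbreviated place into its full chain of jurisdictions,
    from the place itself up to the country."""
    chain = []
    current = name
    while current is not None:
        if current not in gazetteer:
            raise KeyError(f"place not in gazetteer: {current}")
        level, superior = gazetteer[current]
        chain.append((level, current))
        current = superior
    return chain


# The abbreviated "à Bussy" of the entry, expanded to VLG + PRSH + CNTY + CTRY.
place = expand_place("Bussy")
```

Each step of the loop plays the part of one contextual rule: a village is accepted only in the context of its parish, a parish only in the context of its canton, and so on.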

One type of problem that occurs with our definitions in the form so far developed may be illustrated by the definition of “first name.” Typically the first name of a male is chosen from a relatively small set of proper names for the most part quite distinct from the set from which a name for a female is chosen. In figure 2 we see that the FN connected below GM is masculine, but the FN’s connected below BD are feminine. Suppose we change our definitions to carry this feature along:

GM → PNM + […]
BD → PNF + […]
PNM → FNM ( FNM ) SN
PNF → FNF ( FNF ) SN.

Such a revision loses the significant expression of the fact that these entities for the first name share the same basic structure. The particular letters of the alphabet that we have chosen are not sufficient to show their similarity; the choice was completely ad hoc. We will have to make our language more expressive, more powerful, by allowing for the subcategorization of such symbols. How classification in general might be done will not be treated in this article.
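To make the idea of subcategorization concrete, here is one possible sketch, offered only as an illustration and not as the classification scheme that this article leaves open. It assumes that a symbol consists of a basic category plus a set of features, so that the masculine and feminine first names remain visibly the same kind of entity. The class `Symbol` and the feature names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """A grammar symbol: a basic category refined by optional features."""
    category: str
    features: frozenset = frozenset()

    def same_category(self, other):
        """True if both symbols share the basic structure (category)."""
        return self.category == other.category


# FNM and FNF written as subcategories of the single category FN.
FN_MASC = Symbol("FN", frozenset({"masculine"}))
FN_FEM = Symbol("FN", frozenset({"feminine"}))
SN = Symbol("SN")

# Names from the marriage entry, classified by hand for this illustration.
LEXICON = {
    "Samuel": FN_MASC,
    "Marie": FN_FEM,
    "Françoise": FN_FEM,
}
```

With this representation `FN_MASC` and `FN_FEM` are distinct symbols, yet `FN_MASC.same_category(FN_FEM)` holds, which is the similarity that the ad hoc names FNM and FNF fail to express.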

Before considering how our symbolization of a source fits into the symbolization of the research process, let us point out some of the elements of the source that will eventually have to be symbolized, but which in the interest of brevity we have neglected. In a parish register we usually have not only a “marriage register” consisting of a series of “marriage entries,” but also a “baptismal register” and a “burial register,” each likewise a series of entries. Here again we have idealized the structure, since in some cases the entries are not separated. Other elements that need to be considered in the research process are the “jurisdiction” and the “religious denomination” of the record. We can approximate these structures as follows:

PR → JR + RD + BPR + MGR + BRR
BPR → Σ BPEi
MGR → Σ MEi
BRR → Σ BREi

To add the temporal limits, which often go into the definition of a register of this kind, we might consider an adjustment to our sigma-notation. It might also be appropriate to define “jurisdiction” and “religious denomination” further by means of rules that allow subcategorization.
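The structure of the parish register given above can be mirrored in a small data structure. This is a minimal sketch under simplifying assumptions: each entry is held as an uninterpreted string, each register is an ordered list of entries, and the field names are hypothetical. The denomination of the Apples register is not stated in the text, so it is left as a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParishRegister:
    """PR: jurisdiction, denomination, and three series of entries."""
    jurisdiction: str                                    # JR
    denomination: str                                    # RD
    baptisms: List[str] = field(default_factory=list)    # BPR, series of BPE
    marriages: List[str] = field(default_factory=list)   # MGR, series of ME
    burials: List[str] = field(default_factory=list)     # BRR, series of BRE


# The denomination is a placeholder; the text does not give it.
register = ParishRegister(jurisdiction="Apples", denomination="unknown")
register.marriages.append(
    "1677 à Bussy le 15.9br ont este espouzée Samuel Fazan le jeune "
    "d'Apples et Marie Françoise Voizin d'Aulbonne."
)
```

A fuller version would replace the strings by the parsed structures of the individual entries and would add the temporal limits of each register.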