3.1 Name string. A researcher may want to query a data store of individuals identified by name. To access the data, the program accepts a name string as a query. The program makes assumptions about the culture of the name and about whether the query represents an official form of the name or an alias. With this information the program passes the string to the tokenizer and parser, which assign a phrase structure to the string. They do this by checking each token against its entry in the LEXICON, which gives the most likely category of that token. In this process the parser may discover that the string combines several versions of the name. The culture determines which PS-RULES the parser applies.
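The tokenizing and lexicon lookup step described above can be sketched roughly as follows. This is a minimal illustration only: the toy LEXICON contents, the category labels (GN, PARTICLE, SN_ROOT, COMMA, UNKNOWN), and the function names tokenize and tag are assumptions made for the example, not the program's actual data or interface.

```python
# Hypothetical toy lexicon: lowercased token -> most likely category.
# The entries and category labels are illustrative assumptions.
LEXICON = {
    "guido": "GN",        # given name
    "dela": "PARTICLE",   # nobiliary particle
    "mare": "SN_ROOT",    # root of surname
    ",": "COMMA",
}


def tokenize(name_string):
    """Split a name string on whitespace, treating commas as separate tokens."""
    return name_string.replace(",", " , ").split()


def tag(name_string):
    """Pair each token with its most likely category from the lexicon."""
    return [
        (token, LEXICON.get(token.lower(), "UNKNOWN"))
        for token in tokenize(name_string)
    ]


print(tag("Guido dela MARE"))
print(tag("MARE, Guido dela"))
```

A parser would then match the resulting category sequence against the phrase-structure rules selected for the assumed culture; that step is omitted here.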

In general the PS-RULES characterize three classes of personal names according to the order of their constituent elements: 1) monotypic, i.e., single-word epithets; 2) eurotypic, i.e., two main categories, with the surname category second; 3) sinotypic, i.e., the same two categories, with the surname category first. The LEXICON can also provide certain culturally dependent display forms, such as abbreviations. For alphabetizing, the program may also require additional name string versions of a eurotypic name, for example:

1. Guido dela MARE: given name + nobiliary particle + root of surname (normal spoken order)
2. DELAMARE, Guido: single-word surname + comma + given name (standard 1)
3. DELA MARE, Guido: nobiliary particle + root of surname + comma + given name (standard 2)
4. MARE, Guido dela: root of surname + comma + given name + nobiliary particle (European standard)

These forms are additional aliases of the person. These particular aliases are directly derivable from the name pieces in the lexical entries.
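As a rough sketch of how such aliases could be derived from the name pieces, the function below rebuilds the four forms of the example from a given name, a nobiliary particle, and a surname root. The function name display_forms and its signature are hypothetical; only the four output patterns come from the example above.

```python
def display_forms(given, particle, root):
    """Build the four alphabetizing forms of a eurotypic name.

    The arguments stand for the name pieces held in the lexical entries;
    the function itself is an illustrative assumption.
    """
    surname_root = root.upper()
    particle_upper = particle.upper()
    return [
        f"{given} {particle} {surname_root}",           # normal spoken order
        f"{particle_upper}{surname_root}, {given}",     # standard 1
        f"{particle_upper} {surname_root}, {given}",    # standard 2
        f"{surname_root}, {given} {particle}",          # European standard
    ]


for form in display_forms("Guido", "dela", "Mare"):
    print(form)
```

Because every form is computed from the same three pieces, the aliases need not be stored separately, provided the lexical entries identify the particle and the surname root.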

Here is a second example — this one a monotypic name of the South Pacific. The ancestor’s name was Ikatauirangi. His father’s name was Maroro. The form used for name submission asked for two kinds of names: given name (GN) and surname (SN). It was natural, therefore, for some of the descendants to record the name as eurotypic. But this could have happened in many ways. It actually did happen in the first four of the following ways:

1. Te Ikatauirangi MARORO: honorific particle and name as GNs, father’s name as SN
2. Te IKATAUIRANGI: honorific particle as GN, name as SN
3. Te Ikatauirangi: honorific particle and name as GNs, no SN
4. Te Ika-Tau-I-Rangi: meaningful parts of name written as separate morphemes
5. Ikatauirangi: honorific particle omitted, name as GN, no SN

The fourth occurrence illustrates the need to standardize the use of hyphenation. The last version (never found) is probably the most strictly accurate.
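One possible way to standardize hyphenation is sketched below: hyphenated tokens are joined into a single word with an initial capital, so that the third and fourth forms above become identical. This is only an assumption about what the standardization might look like. The function name normalize_hyphenation is hypothetical, and the rule would have to be restricted to cultures where hyphens merely separate morphemes, since in other cultures a hyphen is a genuine part of the name.

```python
def normalize_hyphenation(name_string):
    """Join hyphen-separated morphemes of each token into one word.

    Illustrative assumption: hyphens only mark morpheme boundaries,
    so removing them does not lose information.
    """
    tokens = []
    for token in name_string.split():
        if "-" in token:
            joined = "".join(token.split("-"))
            # Keep an initial capital and lowercase the rest.
            token = joined[:1].upper() + joined[1:].lower()
        tokens.append(token)
    return " ".join(tokens)


print(normalize_hyphenation("Te Ika-Tau-I-Rangi"))
```

Tokens without hyphens pass through unchanged, so capitalized surname forms such as IKATAUIRANGI are left as they are.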