3.1 Name string. A researcher may want to query a data store of individuals identified by name. To access the data, the program accepts a name string as a query. The program makes assumptions about the culture of the name and about whether the query represents an official form of the name or an alias. With this information the program passes the string to the tokenizer and parser, which assign a phrase structure to the string. They do this by checking each token against its entry in the LEXICON, which gives the most likely category of each string found. In this process the program may discover that several versions of the name have been combined in one string. The culture determines which PS-RULES the parser applies.
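The token lookup described above can be sketched as follows. This is a minimal illustration, not the program's actual implementation: the lexicon contents, the category labels (GN, PARTICLE, SN_ROOT, HONORIFIC, COMMA, UNKNOWN), and the function names are all assumptions made for the example.

```python
# Hypothetical lexicon: lowercased token -> most likely category.
# Entries and category labels are illustrative assumptions.
LEXICON = {
    "guido": "GN",        # given name
    "dela": "PARTICLE",   # nobiliary particle
    "mare": "SN_ROOT",    # root of surname
    "te": "HONORIFIC",    # honorific particle
}


def tokenize(name):
    """Split a name string on whitespace, keeping commas as separate tokens."""
    return name.replace(",", " , ").split()


def categorize(name, lexicon=LEXICON):
    """Pair each token with its most likely category from the lexicon."""
    result = []
    for token in tokenize(name):
        if token == ",":
            category = "COMMA"
        else:
            # Lookup ignores case; unknown tokens are flagged, not guessed.
            category = lexicon.get(token.lower(), "UNKNOWN")
        result.append((token, category))
    return result


for token, category in categorize("MARE, Guido dela"):
    print(token, category)
```

A parser would then match the resulting category sequence against the PS-RULES selected for the assumed culture; that step is not shown here.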
In general the PS-RULES characterize three classes of personal names according to the order of their constituent elements: 1) monotypic, i.e., single-word epithets; 2) eurotypic, i.e., two main categories, with the surname category second; 3) sinotypic, i.e., the same two categories with the surname category first. The LEXICON can also provide certain culturally dependent display forms, such as abbreviations. For alphabetizing, the program may also require additional name string versions of a eurotypic name, for example:
1. | Guido dela MARE | given name + nobiliary particle + root of surname | normal spoken order |
2. | DELAMARE, Guido | single word surname + comma + given name | standard 1 |
3. | DELA MARE, Guido | nobiliary particle + root of surname + comma + given name | standard 2 |
4. | MARE, Guido dela | root of surname + comma + given name + nobiliary particle | European standard |
These forms are additional aliases of the person. These particular aliases are directly derivable from the name pieces in the lexical entries.
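As a rough sketch of how these four forms could be derived from the name pieces, the function below rebuilds each of them from a given name, a nobiliary particle, and a surname root. The function name, its parameters, and the dictionary keys are assumptions for illustration; the capitalization conventions are simply copied from the four examples.

```python
def alias_forms(given, particle, root):
    """Build the four display forms from three name pieces.

    Hypothetical helper: the surname root is upper-cased in every form,
    as in the examples above.
    """
    surname_root = root.upper()
    return {
        "normal spoken order": f"{given} {particle} {surname_root}",
        "standard 1": f"{(particle + root).upper()}, {given}",
        "standard 2": f"{particle.upper()} {surname_root}, {given}",
        "European standard": f"{surname_root}, {given} {particle}",
    }


for label, form in alias_forms("Guido", "dela", "Mare").items():
    print(f"{label}: {form}")
```

Called with the pieces Guido, dela, and Mare, the sketch reproduces the four strings listed in the examples.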
Here is a second example, this one a monotypic name from the South Pacific. The ancestor's name was Ikatauirangi. His father's name was Maroro. The form used for name submission asked for two kinds of names: given name (GN) and surname (SN). It was natural, therefore, for some of the descendants to record the name as eurotypic. This could have been done in many ways; it was actually done in the first four of the following ways:
1. | Te Ikatauirangi MARORO | honorific particle and name as GNs, father's name as SN |
2. | Te IKATAUIRANGI | honorific particle as GN, name as SN |
3. | Te Ikatauirangi | honorific particle and name as GNs, no SN |
4. | Te Ika-Tau-I-Rangi | meaningful parts of name written as separate morphemes |
5. | Ikatauirangi | omit honorific particle, name as GN, no SN |
The fourth occurrence illustrates the need to standardize the use of hyphenation. The last version (never found) is probably the most strictly accurate.
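One possible way to bring such variants together is to compute a comparison key for each recorded form. The sketch below is only an illustration under stated assumptions: it removes hyphens, drops the honorific particle, and ignores case. Which hyphenation convention should be the standard is not decided here, and the function name and the list of honorifics are hypothetical.

```python
def comparison_key(name, honorifics=("te",)):
    """Return a key for matching variant spellings of the same name.

    Assumptions for this sketch: hyphens are removed, honorific
    particles are dropped, and case is ignored.
    """
    tokens = name.replace("-", "").split()
    kept = [t.lower() for t in tokens if t.lower() not in honorifics]
    return " ".join(kept)


variants = [
    "Te Ikatauirangi MARORO",
    "Te IKATAUIRANGI",
    "Te Ikatauirangi",
    "Te Ika-Tau-I-Rangi",
    "Ikatauirangi",
]
for variant in variants:
    print(variant, "->", comparison_key(variant))
```

Under these assumptions, versions 2 through 5 share a single key. Version 1 still differs, because the father's name was recorded as a surname; resolving that requires knowing that the name is monotypic, which a string-level key cannot supply.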