2.2 Character set. Some languages use alphabets that require diacritical marks, but names in those languages are often recorded with equipment incapable of reproducing them. Name forms written without the correct alphabet, or transliterated into another alphabet, can still serve as standards, so long as the character set is designated. Specifying this attribute allows names spelled in non-romanized alphabets, and names in romanized alphabets with diacritical marks, to belong to the same name groups as the versions without such alphabets and marks.
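As a rough illustration of how a designated character set lets marked and unmarked spellings share a name group, the following Python sketch derives a grouping key by removing combining marks. The function names and the character-set labels ("latin-diacritics", "latin-plain") are hypothetical and are not part of any standard described here. The sketch covers only romanized alphabets with diacritics; non-romanized scripts would need a transliteration table, which is not shown.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(name: str) -> str:
    """Return the name with combining marks removed (e.g. u-umlaut becomes u)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def group_key(name: str) -> str:
    """Hypothetical grouping key: unmarked, case-insensitive spelling."""
    return strip_diacritics(name).casefold()


def group_names(records):
    """Group (spelling, character_set) pairs that share a grouping key.

    The character-set label is kept beside each spelling, so the original
    form and its designation are never lost by the grouping step.
    """
    groups = defaultdict(list)
    for spelling, character_set in records:
        groups[group_key(spelling)].append((spelling, character_set))
    return dict(groups)
```

Under these assumptions, ("Müller", "latin-diacritics") and ("Muller", "latin-plain") fall into the same group, while each record still carries its own character-set designation.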

Figure 1 illustrates some of the classes of names relating to character sets. Notice that the Chinese character for this particular surname has two forms, traditional and simplified. It is interesting that the first implementation of Unicode used a sans serif form of the glyph, whereas the simplified version is implemented in a font with a more formal and more accurate (serif) style. The chart includes a reconstruction of an old pronunciation of the name. The spelling is a conventional one that may well be represented by one of the names in green, which are all Anglicized versions from genealogists who have traced different lineages of this family name. The provisional class of all these spellings is “transcription.” The spellings in light blue are standard romanizations for the two main languages of China. The superscript numbers are important in phonetic transcriptions: they indicate the distinctive tone of each syllable, but are often not preserved in the Anglicized version. It is quite likely that some of the spellings are associated with other systems of romanization that I have not yet studied.
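The role of the superscript tone numbers can be sketched in Python as follows. This is only an illustration, assuming that a transcription is stored as letters followed by an optional superscript digit; the function names are hypothetical, and the syllable "ma¹" used in the note after the block is a generic example, not one of the spellings in Figure 1.

```python
import re
import unicodedata

# One syllable: a run of letters followed by an optional tone number.
_SYLLABLE = re.compile(r"([^\W\d_]+)(\d*)")


def split_tones(transcription: str):
    """Return (syllable, tone) pairs; tone is None when no number is given."""
    # NFKC maps superscript digits to ordinary ASCII digits.
    flat = unicodedata.normalize("NFKC", transcription)
    return [
        (base, int(tone) if tone else None)
        for base, tone in _SYLLABLE.findall(flat)
    ]


def without_tones(transcription: str) -> str:
    """Drop the tone numbers, as many Anglicized spellings do."""
    flat = unicodedata.normalize("NFKC", transcription)
    return re.sub(r"\d+", "", flat)
```

For example, split_tones("ma¹") yields the syllable "ma" with tone 1, whereas without_tones("ma¹") yields "ma" alone. Keeping the tone as a separate field would let a toneless Anglicized spelling be matched to a romanization without discarding the phonetic information.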