LSDI - Targeted List Services



ETHNIC AND RELIGIOUS ENCODING SYSTEM

 

Humans are social animals. Individuals form groups; groups form cultures, and cultures evolve into civilizations. Names have created a unique means of identifying and categorizing individuals.

Name classifications were based on similar factors within similar groups. These factors included recognizable physical features and character traits. Individuals with similar attributes were usually clustered within specific geographic areas. As they moved from rural to urban environments, separate groups were formed, each having its own societal structure and belief system.

As groups became more unified, ethnic, religious, and minority distinctions evolved as each group perceived itself as the “we” group and all others as the “they” group.

List Service Direct’s Ethnic and Religious Encoding System utilizes this historical concept as one component of its process. Knowing that each group has a distinct culture and a distinct “world view”, our process has created a rule and exception based program that incorporates the idea that each group has last and first names that will be unique to that group. By applying specific criteria in a specific order, the ethnic, religious, and minority identity of the individual can be ascertained.

The accuracy of this identification is further enhanced by applying a geographical analysis, based on census tract data, of the name within the ethnic, religious or minority group.

Our Ethnic and Religious Encoding System is NOT A SURNAME BASED SYSTEM. Rather, it is a revolutionary new process that allows the marketer or researcher to select over 130 ethnic, religious, and minority groups from any list.

Our ethnic encoding system analyzes both an individual's first and last name and applies, in a specific order, ethno-linguistic and geocentric rules to both the surname prefix and suffix. It thereby identifies the specific ethnic, religious, and minority status of individuals, even an individual with a multiethnic surname.

The LSDI ethnic encoding system consists of a set of irrevocably entwined computer programs and data files as follows:

  • A unique first name file by ethnicity
  • A non-unique surname file by ethnicity
  • A series of two to five character prefix rules by ethnicity
  • A series of two to five character suffix rules by ethnicity
  • A series of codes to identify the ethnic, religious and minority
    status of an individual
  • A geocentric reference table
  • A complex series of computer programs that analyze the individual's
    names using the system's data.

Another exclusive feature of our system is its ability to recognize hyphenated and misspelled names, which will be correctly coded because of the prefix and suffix rules.

Hyphenated names will be captured using our first name and surname tables in conjunction with the prefix and suffix rules that apply to them.

In order to understand and appreciate our system, it is necessary to trace the onomastic variables that are found within the process. These variables include ethnic heritage descriptions, locational identifiers, and ethnic life form and individual trait descriptors.


Ethnic Heritage Descriptions

An ethnic heritage descriptor alludes to the parentage of the individual. Each ethnicity and language has a different way of expressing this within the first or the last name.

List Service Direct’s Ethnic and Religious Encoding System has used these descriptors (either suffixes or prefixes appended to the first or last name) to accurately identify particular names unique to particular ethnic groups. Below are several examples that illustrate the way ethnic heritage descriptors may be used.

In the Finnish ethnicity, the suffix “NEN” means “offspring of”. In Welsh, the original prefix “AP” (since shortened to “P” when combined with a first name) means “offspring of”. Thus, PROBERT is Welsh for “offspring of Robert”.

The suffix “UCCI” means “descendant of” in Italian, while the Turks use the suffix “BASHI” to mean “father of”.

Prefixes play an important role in identifying some Irish names. “Grandson of” is implied in the prefix “O”, while the prefix “MC” means “son of”. To designate “Uncle”, the Burmese use the prefix “U”.

It is important to remember that the use of these name endings and beginnings does not alone guarantee accuracy. These components are only a part of our process.

 

Ethnic Locational Identifiers

During the Dark and Middle Ages it was essential that an individual could be traced to his country of origin or his geographic location within a country. This information immediately identified the individual as friend or foe. One method that was adopted was to add an identifier to his name in the form of a prefix or suffix.

Geographic locators are important to the ethnic identifier process as well. In addition to the suffixes and prefixes and the rules derived therein, our system also incorporates actual geographic coordinates in the U.S. to determine ethnic, religious and minority group clusters. This improves the accuracy of our system.

Below are some examples of ethnic locational identifiers and the popular name “myths” they refute.

In the Finnish ethnicity, the surname suffixes “OLA”, “YLA”, and “KOSKI” mean upper, lower, and middle respectively. “KOSKI” refutes the popular notion that all names ending in “SKI” are Polish, and “OLA” proves that not all names ending in a vowel are Italian.

The French use the prefixes “DU”, “DE”, “DELAS”, and “DES” to designate “from”, while the Romanians use “ANU” and “EANU” to convey the same meaning.

Italian names ending in “DDA” and “DDO” show that the individual is from Sardinia.

Ethnic locational identifiers help our ethnic system to correctly determine the ethnic origin of all groups, including Italian and Polish (see the Finnish example above). Other systems currently in use do not have this ability and are far less accurate.

 

Ethnic Life Form & Individual Trait Descriptors

This category reveals the humorous and sometimes cruel side of human nature. Created as a means of classifying individuals by their physical attributes and likeness to animals (sometimes not flattering), these descriptors offer an unusual method of identifying individuals by ethnic group.

Our system uses these (as well as all other descriptors) to help build its rule and exception based system. These rules allow our process to capture names that might be eliminated or inaccurately identified using surname systems, and other programs. Our system captures and assigns these individuals to their correct ethnic group.

The Italians provide us with many examples of both life form and individual trait descriptors: “FUZZO” (“curly”), “MANCINI” (“left handed”), “LAGO” (“tall”), and “FASANO” (“pheasant”). Less flattering are “BOCCACIO” (“ugly mouth”), “IZZO” (“snail”), and “MUSSOLINI” (“gnats”).

 


Religious Affiliation

Our system has a code that determines religious affiliation. However, the
process cannot distinguish denominations, sects and splinter groups within individual religions. For example, it cannot accurately determine who is Baptist or Calvinist within the Protestant group. Nor can it select Hasidic
Jewish groups from the Jewish population at large.

Religious affiliations are determined by geographic locators and ethnic
group identifiers. Yes, we will include some atheists and agnostics within
the groups. However, our percentage of accuracy still holds.

We are constantly utilizing new technology and developing new rules,
exceptions, and criteria that will allow our system to maintain its high level
of accuracy.

 

African-American

Our system differs from conventional approaches in that it goes well beyond knowing which areas have high concentrations of Afro-Americans. (Most compiled lists generate this select based on the “neighborhood” approach; i.e., if you live in a high-percentage “nonwhite” zip code you must be African-American.)

Our process identifies African based Afro-American names with its unique
first name and surname tables. Individuals identified in this manner may
reside anywhere in the United States, not just in African-American clusters.

In addition, our system identifies Afro-Americans with non African based
but unique first names anywhere in the United States. Sheneka Brinter living in Conway, Arkansas is African American. So is Amarta Azubuike.

As a further safeguard, our system looks within the African-American
clusters and eliminates all non-black ethnicities, qualifying only those individuals with commonly borrowed ethnic names and certain Islamic names.

The system continually refines the selection criteria to ensure that the name identified as African-American will be African-American; not just a “could be” but an “is”.

 

Hispanic

Our system identifies Hispanic individuals by unique last and first names
using rules and exceptions that apply to these names. Geographic mapping confirms the locations of this population.

Our process will identify Hispanics in non-Hispanic areas. For example, in Conway, Arkansas we identified Juanita Beene as Hispanic NOT by zip code cluster and NOT by last name but by FIRST NAME. John Martinez was identified NOT by first name or zip cluster but LAST NAME.

Surname based systems cannot identify non-Hispanics with Hispanic
surnames. Our ethnic encoding system can and does.

There are many multiethnic names (e.g. Delgado) which could be Hispanic but could also be another ethnicity (e.g. Italian). Our system can separate the multiethnic name into its proper component ethnicities by using first names where possible. The remainder are stored with the multiethnic uncoded class until they are verified as being Hispanic using first name indicators. Hispanic women who marry individuals with non-Hispanic surnames are identified by our system’s unique first name table. Quite often, Hispanics marrying Hispanics leads to hyphenated names. Our system identifies these and some misspelled names with its ethno-linguistic rules.
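The interplay between first names and surnames described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual LSDI program: the table contents, the function name `resolve`, the "Unknown" fallback, and the exact order of the checks are all hypothetical.

```python
# Hypothetical miniature tables; the real files are far larger.
UNIQUE_FIRST_NAMES = {"JUANITA": "Hispanic", "FUMIHIKO": "Japanese"}
SURNAME_CANDIDATES = {
    "MARTINEZ": {"Hispanic"},
    "DELGADO": {"Hispanic", "Italian"},  # a multiethnic surname
}
UNCODED = "Multiethnic uncoded"


def resolve(first_name: str, surname: str) -> str:
    """Code one person from first name and surname (assumed order of checks)."""
    first = first_name.strip().upper()
    last = surname.strip().upper()
    first_code = UNIQUE_FIRST_NAMES.get(first)
    candidates = SURNAME_CANDIDATES.get(last, set())

    # A unique first name is treated as decisive, whatever the surname.
    if first_code is not None:
        return first_code
    # A surname tied to exactly one ethnicity codes the person on its own.
    if len(candidates) == 1:
        return next(iter(candidates))
    # A multiethnic surname with no first-name evidence stays uncoded.
    if len(candidates) > 1:
        return UNCODED
    return "Unknown"
```

With these toy tables, Juanita Beene is coded Hispanic by first name, John Martinez is coded Hispanic by last name, and John Delgado stays in the multiethnic uncoded class until further evidence is available.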

Surname based systems are simply that: systems that use only the last name of an individual to infer that individual's ethnic, religious, or minority status. Our system takes this idea and expands on it. Thus, our Hispanic names are HISPANIC Hispanic names; not Portuguese, Italian, and other names based on conventional wisdom.

 

Japanese

Almost all Japanese names are comprised of descriptive components put together. Hence, the names are almost musical due to their repetitive vowel sounds.

Although the Japanese surname is the easiest Oriental name to distinguish (especially since they were not influenced by the Chinese), most surname based systems have included all Asians in a category known as “Oriental”. Therefore, the Koreans, who not only share certain surnames with the Chinese but often introduce Chinese qualifiers to their names, are mixed with the Japanese and other Asian ethnic groups.

Our system has separate and unique prefix and suffix rules and exceptions for all ethnic groups including those representing the continent of Asia. Also, we have an extensive Japanese surname and first name table.

These features allow us to identify Japanese in traditionally non-Japanese areas, such as Mark Tanaka and Junko Takahashi in Conway, Arkansas, and also to identify Japanese women who have non-Japanese surnames.

It is important to remember that each ethnic group within the Asian community considers themselves to be mutually exclusive of the others. Japanese wish to be identified as such and not confused with or considered as other Orientals. Other systems either overlook this fact or have not developed components within their programs to allow proper ethnic identifications.

Our system considers each ASIAN ethnic group as a separate and identifiable selection. This has been attained by creating rules and exceptions based on the study of the history and development of surnames and first names within the culture of each country. This allows the user of our system to select a particular ethnic group, such as Japanese, from the larger Asian or Oriental category.

 

Completeness and Accuracy

In order to correctly determine the accuracy of our data and the algorithms contained within, we contracted with a national market research company to conduct a major telephone study, which took place in March of 1999. Sample size was determined by the research company to ensure that the resulting data would be at the 95% confidence level. We set up quotas by major ethnicity (Hispanic, African American, Asian and "Other") in an effort to make sure each was properly represented in the study. A total of 1,566 telephone interviews were conducted. A telephone methodology was chosen as opposed to a mail study because we felt we would be able to reach a larger number of respondents via telephone more quickly and at less cost than with a mail study.
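To give a feel for what a 95% confidence level implies for a sample of this size, the sketch below computes the usual normal-approximation margin of error for a proportion. It is an illustration, not the research company's method: the function name `margin_of_error`, the worst-case proportion of 0.5, and the assumption of simple random sampling are ours.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    n is the number of completed interviews, p the assumed proportion
    (0.5 is the worst case), and z the critical value (1.96 for 95%).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Whole-sample figure for the 1,566 interviews reported above.
whole_sample = margin_of_error(1566)
```

For the full 1,566 interviews this works out to roughly plus or minus 2.5 percentage points. Because the study used quotas by ethnicity, the margin for any single group would be wider; the per-group interview counts are not given here, so they are not computed.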

The sample for the study was pulled from our national database using a random nth selection in order to get a statistically valid cross section of the database. Each piece of sample was assigned a sample number. This number was used after the data was tabulated to cross match the data from each individual completed interview with the data for that record contained in our database. In other words, if a respondent indicated that they are Hispanic in the survey, we would look at that respondent's data record in our database to see if the data matched as a way of checking accuracy. This extra step was done in addition to the standard data tabulations that were completed for the study.
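The sampling and cross-matching steps can be sketched as follows. This is a minimal illustration under our own assumptions, not the research company's procedure: the function names, the fixed seed, and the dictionary layout keyed by sample number are hypothetical.

```python
import random


def nth_selection(records: list, sample_size: int, seed: int = 0) -> list:
    """Systematic sample: every nth record after a random starting point."""
    step = max(1, len(records) // sample_size)
    start = random.Random(seed).randrange(step)
    return records[start::step][:sample_size]


def accuracy(database: dict, responses: dict) -> float:
    """Share of completed interviews whose self-reported ethnicity
    matches the code stored under the same sample number."""
    matches = sum(
        1 for number, answer in responses.items() if database[number] == answer
    )
    return matches / len(responses)


# Toy data keyed by sample number (values are illustrative only).
database = {1: "Hispanic", 2: "Asian", 3: "Other", 4: "Hispanic"}
responses = {1: "Hispanic", 2: "Asian", 4: "Other"}  # number 3 did not respond
```

In this toy example two of the three completed interviews agree with the stored code, so `accuracy(database, responses)` is two thirds. Cooperation would be computed separately as completed interviews divided by sample records attempted.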

Our findings indicated that different ethnicities produced different levels of cooperation and accuracy. Please see the chart below, which reflects cooperation and accuracy by major ethnicity.


ETHNICITY          COOPERATION   ACCURACY
HISPANIC           48%           94%
ASIAN              39%           86%
AFRICAN AMERICAN   47%           90%
OTHER              46%           92%

 

Descriptions and Explanation of Usage

On the following pages are summary level counts by ethnicity which depict the actual record counts our ethnic system stores and utilizes when analyzing an individual's full name and address.

For each ethnicity there are columns for the number of onomastic rules that apply to that ethnicity, the number of unique first names applicable to that ethnicity, and the number of surnames stored for that ethnicity.

ONOMASTIC RULES (Prefix & Suffix Rules)

There are 1,157 onomastic rules currently implemented in our ethnic system. Each rule reaching implementation level was hypothesized and tested to ensure validity. Many hypothesized rules were not implemented as they were found to be only partially valid. Implemented rules apply to the examination of the prefix and suffix of a surname. When an individual's ethnicity cannot be determined by looking at the whole name, its component parts, the prefix and suffix, are analyzed and matched against the rule files in a specific order. The order is governed by length of argument, i.e., search the five character suffix before the four character, then the three character, etc.

Thus, all names ending in “KOSKI” that are not found on the surname file or matched against a unique first name file will be coded as Finn because of the onomastic rule. Other names ending in “SKI”, but not “KOSKI”, will, after not being coded with first and last name examination, result in the individual being coded Polish.

Our system does not require all Polish names ending with “SKI” to be on its surname file, nor does it require all Finnish names ending in “KOSKI” to be on the surname file. There are many advantages to this open ended approach.

The handling of misspelled names, hyphenated names, and names new to this country are a few. Our onomastic rules allow our process to outperform surname based systems in all three of the above cases.
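The longest-argument-first search can be sketched as below. This is a simplified illustration, not the production rule engine: the handful of rules shown are taken from examples in this document, while the function names, the choice to try suffixes before prefixes, and the treatment of hyphenated names by splitting on the hyphen are assumptions.

```python
from typing import List, Optional

# A few illustrative rules; the real rule files hold far more.
SUFFIX_RULES = {"KOSKI": "Finn", "SKI": "Polish", "NEN": "Finn", "UCCI": "Italian"}
PREFIX_RULES = {"MC": "Irish"}


def match_rules(surname: str) -> Optional[str]:
    """Try suffix rules, then prefix rules, five characters down to two."""
    name = surname.strip().upper()
    for length in range(5, 1, -1):  # 5, 4, 3, 2: longest argument first
        if name[-length:] in SUFFIX_RULES:
            return SUFFIX_RULES[name[-length:]]
    for length in range(5, 1, -1):
        if name[:length] in PREFIX_RULES:
            return PREFIX_RULES[name[:length]]
    return None  # no rule applies


def match_hyphenated(surname: str) -> List[Optional[str]]:
    """Apply the rules separately to each part of a hyphenated surname."""
    return [match_rules(part) for part in surname.split("-") if part]
```

Because the five character “KOSKI” rule is tried before the three character “SKI” rule, a name such as Lehtikoski is coded Finn while Kowalski falls through to Polish, and neither name needs to be on the surname file.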

UNIQUE FIRST NAME FILE

The key operative word here is “unique”. While “Anthony” is a name very commonly used in Italian families, it is not unique to Italian families. Hence, there is no Anthony in the unique first name file. Nor are “Juan” or “Pablo” to be found in the unique first name file.

Because of the assimilation process that has occurred in the U.S.A. there is not a single unique first name stored under English. Our system currently recognizes 21,601 unique first names that can be pegged to a specific ethnicity.

While there are no absolutes, the chance that a person with a first name “Fumihiko” is other than Japanese is statistically irrelevant. Likewise, a person with the Igbo first name “Ogochukwu” is statistically unlikely to be other than from Africa or African American, even if their last name is “Smith”.

SURNAME FILE

Our system currently has 129,761 surnames on its surname file. Where surnames are useful due to numerous variations in prefix and suffix spellings, such as in Italian, there are a correspondingly large number of surnames compiled for that ethnicity. There are over 18,000 on file for Italian. Where a large proportion of individuals can be determined by either unique first names or onomastic rules, there are fewer names needed. For instance, in Japanese there are only about 2,500 surnames on file, but there are 182 onomastic rules and another 500 plus unique first names.

 

ETHNICITY         ONOMASTIC RULES   UNIQUE 1ST NAMES   SURNAMES
English           54                0                  12,688
Scot              3                 71                 3,628
Dane              1                 71                 608
Swede             71                129                1,279
Norw              7                 58                 927
Finn              52                165                1,732
Icelandic         3                 1                  108
Dutch             84                71                 6,768
Belgian           0                 4                  632
German            78                27                 13,188
Austrian          0                 0                  580
Hungarian         27                204                1,720
Czech             4                 24                 1,504
Slovak            1                 6                  160
Irish             21                13                 3,792
Welsh             4                 48                 268
French            45                34                 8,634
Italian           112               130                18,767
Spanish           65                1,776              10,064
Portuguese        6                 148                562
Polish            36                152                4,824
Estonian          1                 17                 184
Latvian           3                 106                188
Lithuanian        8                 67                 492
Ukrainian         6                 65                 748
Georgian          5                 6                  124
Byelorus          0                 0                  120
Armenian          1                 1                  908
Russian           42                0                  6,304
Turk              1                 189                340
Greek             57                168                3,448
Persian           0                 94                 668
Moldavian         0                 0                  20
Bulgarian         1                 140                568
Romanian          9                 147                972
Albanian          0                 9                  11
Native American   0                 400                39
Slovene           0                 15                 48
Croatian          0                 66                 523
Serbian           4                 42                 1,123
Bosnian           0                 1                  68
Azerb             0                 1                  19
Kazakh            0                 4                  51
Afghan            0                 24                 3
Pakistani         0                 4                  56
Bangladesh        0                 0                  10
Indonesian        0                 15                 62
Indian            84                1,228              2,472
Burmese           0                 27                 13
Mongol            0                 41                 59
Chinese           1                 2,576              964
Korean            0                 4,596              616
Japanese          182               760                3,284
Thai              55                860                1,148
Malay             0                 2                  18
Laotian           1                 220                528
Khmer             10                36                 236
Vietnamese        0                 1,288              632
Sri Lanka         0                 11                 20
Uzbek             0                 1                  28
Misc Orient       0                 6                  16
Jewish            8                 2,976              6,428
Arab              83                492                3,770
Egyptian          0                 2                  68
Ruandan           0                 0                  23
Tonga             0                 1                  2
Senegal           0                 4                  14
Sudanese          0                 0                  2
Moroccan          0                 3                  67
Afric-Am          4                 224                680
Kenyan            0                 160                144
Nigerian          0                 304                236
Ghana             0                 109                35
Zambia            0                 0                  20
Zaire             0                 5                  17
Surinam           0                 0                  4
Mozambique        0                 0                  3
Ivory Coast       0                 7                  23
Bhutanese         0                 0                  3
Ethiopian         3                 169                560
Ugandan           0                 260                31
Botswana          0                 0                  3
Cameroon          0                 1                  16
Zimbabwe          0                 58                 28
Congo             0                 0                  3
Cent Af Rep       0                 0                  1
Togo              0                 1                  1
Bahrain           0                 0                  1
Qatar             0                 0                  1
Guyana            0                 0                  0
Tibetan           0                 1                  1
Fiji              0                 0                  1
Swaziland         0                 0                  3
Namibian          0                 0                  3
Burundi           0                 0                  8
Tanzania          0                 41                 19
Gambian           0                 0                  3
Somalia           0                 0                  2
Macedonia         0                 0                  4
Chad              0                 0                  3
Gabonese          0                 0                  2
Angola            0                 0                  2
Chech             0                 7                  23
Kirghiz           0                 2                  2
Tajik             0                 0                  2
Algerian          0                 2                  34
Philippine        0                 6                  8
Lesotho           0                 3                  7
Tunisian          0                 0                  16
Hawaiian          0                 69                 1,440
Madagascar        0                 2                  7