The sci.lang FAQ: 8

8 How are present-day languages related?


[--Scott DeLancey]

This is an incomplete list of some of the world's language families. More detailed classifications can be found in Voegelin and Voegelin, Classification And Index Of The World's Languages (1977), and M. Ruhlen, A Guide To The World's Languages (1987). (Note: Ruhlen's classification recognizes a number of higher-order groups which most linguists regard as speculative.)

A language family is a group of languages that have been proven to have descended from a common ancestral language. Branches of families likewise represent groups of languages with a more recent common ancestor. For example, English, Dutch, and German have a common ancestor, which we label Proto-West-Germanic, and thus belong to the West Germanic branch of Germanic. Icelandic and Norwegian are descended from Proto-North-Germanic, the ancestor of a separate branch of Germanic. All the Germanic languages have a common ancestor, Proto-Germanic; farther back, this ancestor was descended from Proto-Indo-European, as were the ancestors of the Italic, Slavic, and other branches.
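The branching described above can be pictured as a tree in which each language points to its immediate ancestor. The following is only a minimal sketch, not a standard tool: the `PARENT` table and the function names `lineage` and `common_ancestor` are hypothetical, and the table contains nothing beyond the relationships named in the paragraph above.

```python
# Hypothetical parent table: each entry maps a language to its immediate
# ancestor, using only the relationships named in the text above.
PARENT = {
    "English": "Proto-West-Germanic",
    "Dutch": "Proto-West-Germanic",
    "German": "Proto-West-Germanic",
    "Icelandic": "Proto-North-Germanic",
    "Norwegian": "Proto-North-Germanic",
    "Proto-West-Germanic": "Proto-Germanic",
    "Proto-North-Germanic": "Proto-Germanic",
    "Proto-Germanic": "Proto-Indo-European",
}


def lineage(language):
    """Return the language followed by its known ancestors, nearest first."""
    chain = [language]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def common_ancestor(a, b):
    """Return the nearest node shared by both lineages, or None."""
    seen = set(lineage(a))
    for node in lineage(b):
        if node in seen:
            return node
    return None


print(common_ancestor("English", "German"))     # Proto-West-Germanic
print(common_ancestor("English", "Icelandic"))  # Proto-Germanic
```

A language that is absent from the table has no recorded ancestors, so comparing it with any other language returns None. This mirrors the situation of languages that have not been shown to be related to any others.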

Not all languages are known to be related to each other. It is possible that they are related but the evidence of relationship has been lost; it's also possible they arose separately. It is likely that some of the families listed here will eventually turn out to be related to one another.

While low-level close relationships are easy to demonstrate, higher-order classification proposals must rely on more problematic evidence and tend to be controversial. Recently linguists such as Joseph Greenberg and Vitalij Shevoroshkin have attracted attention both in linguistic circles and in the popular press with claims of larger genetic units, such as Nostratic (comprising Indo-European, Uralic, Altaic, Dravidian, and Afroasiatic) or Amerind (to include all the languages of the New World except Na-Dene and Eskimo-Aleut). Most linguists regard these hypotheses as having a grossly insufficient empirical foundation, and argue that comparisons at that depth are not possible using available methods of historical linguistics. For more see question 22.

This list isn't intended to be exhaustive, even for families like Germanic and Italic. Nor is it the last word on what's a 'language'; see question 12.

Note: English is not descended from Latin.
English is a Germanic language with a lot of Latin vocabulary, borrowed from French in the Middle Ages.



INDO-EUROPEAN:
Italic
Celtic
Hellenic: Greek (ancient and modern)
Slavic: Russian, Bulgarian, Polish, Czech, Serbo-Croatian, etc. (but not Rumanian or Albanian)
Baltic: Lithuanian and Latvian
Indo-Iranian
Albanian: Albanian
Armenian: Armenian
Tokharian: Tokharian (an extinct language of NW China)
Hittite: Hittite (extinct language of Turkey)

AFROASIATIC:
Semitic: Arabic, Hebrew (not Yiddish; see above), Aramaic, Amharic and other languages of Ethiopia
Chadic: languages of northern Africa, e.g. Hausa
Cushitic: Somali, other languages of eastern Africa
Egyptian: Ancient Egyptian
Berber: languages of North Africa

NIGER-KORDOFANIAN: includes most of the languages of sub-Saharan Africa. Most of the languages are in the Niger-Congo branch; the most widely known subgroup of N-C is Bantu (Swahili, Zulu, Xhosa, etc.)

URALIC: Finnish, Estonian, Saami (Lapp), Hungarian, and several languages of central Russia

MONGOL: Mongolian, Buryat, Kalmuck, etc.
TURKIC: Turkish, Azerbaijani, Kazakh, and other languages of Central Asia
TUNGUSIC: Manchu, Juchen, Evenki, Even, Oroch, and other languages of northeastern Asia

Some linguists group these three families together as ALTAIC. Rather more controversially, some add Korean and Japanese to this group.
It has been claimed that URALIC and ALTAIC are related (as URAL-ALTAIC), but this idea is not widely accepted.

DRAVIDIAN: languages of southern India, including Tamil, Telugu, etc.

SINO-TIBETAN:
Sinitic: Chinese (several 'dialects', or arguably distinct languages: Mandarin, Wu (Shanghai), Min (Hokkien [Fujian], Taiwanese), Yue (Cantonese), Hakka, Gan, Xiang)
Tibeto-Burman: Tibetan, Burmese, various languages of Burma, China, India, and Nepal

AUSTRO-ASIATIC:
Mon-Khmer: Vietnamese, Khmer (Cambodian), and various minority and tribal languages of Southeast Asia
Munda: tribal languages of eastern India


AUSTRONESIAN: most of the languages in this family fall in a branch called Malayo-Polynesian.

JAPANESE: A number of linguists argue that Japanese is ALTAIC; others, that it is most closely related to Austronesian, or that it represents a mixture of Austronesian and ALTAIC elements.

TAI-KADAI: Thai, Lao, and other languages of southern China and northern Burma. Possibly related to Austronesian.
An outdated hypothesis that Tai is part of SINO-TIBETAN is still often found in reference works and introductory texts.

AUSTRALIA: the Aboriginal languages of Australia are conservatively classified into 26 families, the largest being PAMA-NYUNGAN, consisting of about 200 languages originally spoken over 80-90% of Australia.

A large number of language families are found in North and South America. There are numerous proposals which group these into larger units, some of which will probably be demonstrated in time. To date no New World language has been proven to be related to any Old World family. The larger North American families include:

ESKIMO-ALEUT: two Eskimo languages and Aleut.

ATHAPASKAN: most of the languages of Alaska and northwestern Canada, as well as Navajo and Apache. Eyak (in Alaska) is related to Athapaskan; some linguists put these together with Tlingit and Haida in a NA-DENE family.

ALGONQUIAN: languages of most of Canada and the northeastern U.S., including Cree, Ojibwa, Cheyenne, and Blackfoot

IROQUOIAN: the languages of NY state (Mohawk, Onondaga, etc.) and Cherokee

SIOUAN: includes Dakota/Lakhota and other languages of the Plains and southeastern U.S.

MUSKOGEAN: Choctaw, Alabama, Creek, Mikasuki (Seminole), and other languages of the southeastern U.S.

UTO-AZTECAN: a large family in Mexico and the southwestern U.S., including Nahuatl (Aztec), Hopi, Comanche, Paiute, etc.

SALISH: languages of Washington and British Columbia

HOKAN: languages of California and Mexico; a controversial grouping

PENUTIAN: languages of California and Oregon; also controversial

Work on documentation and classification of South American languages still has a long way to go. Generally recognized families include:

ARAWAKAN, TUCANOAN, TUPI-GUARANI (including Guarani, a national language of Paraguay), CARIBAN, ANDEAN (including Quechua and Aymara)

LANGUAGE ISOLATES: A number of languages around the world have never been successfully shown to be related to any others, in at least some cases because any related languages have long been extinct. The most famous isolate is Basque, spoken in northern Spain and southern France; it is apparently a survival from before the Indo-Europeanization of Europe.
