The Turkic Languages in a Nutshell

A revised taxonomic description
with comment and illustrations
based upon linguistic and historical analysis


Special appreciation to Yusuf B. Gürsey for reviewing this web page
and providing many valuable remarks and corrections at sci.lang



Version 8.0

04/2009 (first online) > 10/2009 (major update) > 11/2010 (classification rearranged) >
10-12/2011 (minor corrections) > 03-04/2012 (corrections, fonts changed, classification update, English transcription remarks, songs, references added) > 05/2012 (Chulym, Khwarezmian, Nogai, Kumyk, Karaim, Sibir Tatar, Baraba added or rewritten)
> 07/2012 (corrections, Sakha maps added)
> 02/2013 (spelling and editing corrections) > 01/2014 (paragraphs added, font and layout adjusted to increase readability, some controversial parts removed, editing corrections)




The origins of the Bulgaro-Turkic languages

The migration of the Turkic peoples
A draft of the Bulgaric and Turkic migration
from 1000 BCE to 1000 CE,
an older version (2008)

The spread and migration of Turkic tribes and languages between the 6th and the 11th century
Historically attested
later migrations of Turkic peoples
between 500 and 1200 CE (2012)

The Turkic languages form a closely related phylogenetic language group further related to Mongolic and Tungusic languages [see, for instance, Hugjiltu (1995)[5] and herein (2009-12)[3](2012)[5a]], within the proposed Altaic family [e.g. Starostin (1991)[8]].

Whereas Turkic languages is a generally accepted term, another correct name for a taxon comprising the Turkic and Bulgaric languages could be Bulgaro-Turkic because of the early separation of the Bulgaric branch from the rest of the stem; consequently, Bulgaric and Turkic can rather be used as names of the two sibling branches, even though this terminology is far less common.

According to the present glottochronological study,[2] the Bulgaric languages apparently branched off from Turkic at a rather early period, most likely c. 1100-900 BC, which is considerably earlier than normally cited elsewhere.[10][10a][10b] The discrepancy can be attributed to the use of Starostin's apparently incorrect glottochronological formulas in other studies, although the exact date cannot be calculated with precision due to the taxonomical uniqueness of Chuvash and possible lexicostatistical fluctuations.
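For orientation only, the general idea behind glottochronological dating can be illustrated with the classic Swadesh relation t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates and r is the assumed retention rate per millennium. This is a minimal sketch and not the formula used in the present study[2] or in Starostin's work; the function name and the default rate of 0.86 (a value conventionally quoted for the 100-word list) are assumptions made for illustration.

```python
import math


def swadesh_divergence_time(shared_fraction, retention_rate=0.86):
    """Classic Swadesh estimate of divergence time, in millennia.

    shared_fraction: share of cognates retained in both lists (0 < c <= 1).
    retention_rate:  assumed share of the list retained by one language
                     over one millennium (0 < r < 1); 0.86 is only a
                     conventional default, not a value from this study.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    # Two lineages diverge independently, hence the factor of 2.
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))
```

Because the result depends heavily on the assumed retention rate, even small changes in r shift the estimated date by centuries, which is one reason why dates obtained with different formulas may disagree.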

The location of the Proto-Bulgaro-Turkic Urheimat is still controversial; however, there are reasons to believe that it may be confined to an area in northern Kazakhstan southwest of the Irtysh River /eer-TISH/, including its tributaries Tobol /taw-BAWL/ and Ishim /ee-SHIM/. This conclusion can be drawn from the position of the Bulgaro-Turkic center-of-gravity and the corroborative geolexical analysis, though an alternative and more traditional hypothesis places it near Mongolia.[4]

The geolexical analysis based on the materials collected in SIGTY, Lexis (2002)[9] suggests that the Proto-Bulgaro-Turkic people lived in an open habitat with deciduous groves (birch, willow, aspen, linden); occasional marshland; freshwater and saline lakes with various fish, waterfowl and small mammal fauna, particularly beavers. Terms denoting taiga or desert ecozones have not been preserved. The Proto-Bulgaro-Turkic people were well acquainted with crop cultivation (millet, barley, spelt, possibly flax), cattle and horse breeding, dairy products, metal working (bronze, copper and precious metals), horse harnessing and riding, as well as probably the wheeled cart.

The active spread and diversificational migration of Proto-Bulgaric and Proto-Turkic apparently began between 900 and 200 BCE, which matches the onset of the Iron Age in West Siberia and could be connected with the widespread introduction of equestrianism and iron weapons, though most details of this process are still hypothetical.


The geographic tree of the Turkic languages

The geographical tree
of Turkic languages (2012) (clickable)

The phylogenetic tree (dendrogram) of the Turkic languages

The glottochronological tree
of Turkic languages (2012)

On the present classification of Bulgaro-Turkic languages

Turkology is probably one of the oldest branches of historical linguistics, given that the earliest sketch of Turkic dialects was drawn by Mahmud al-Kashgari c. 1073, years before the first Crusade. There were many previous attempts to build a consistent classification of Turkic languages [see, for instance, Baskakov's review (1969)[7] for historiographic details]. The most prominent classifications were those of Rémusat (1820), Balbi (1847), Berezin (1848, 1857), Ilminskiy (1861), Vámbéry (1885), Radloff (1882), Katanov (1894), Aristov (1896), Müller (1896), Foy (1903), Korsh (1910), Winkler (1921), Samoylovich (1922), Rahmati (1922), Bogoroditskiy (1934), Ligeti (1934), Batmanov (1947), Räsänen (1949), Malov (1951), Baskakov (1952, 1969, 1988), Benzing (1959), Menges (1959), Tekin (1980), Johanson (1998), Schoening (1999), Dyachok (2001), Anna Dybo (2006), Mudrak (2002, 2009), ASJP (2009). Accordingly, a slightly different version has been published about every five years for the past two centuries or so. Whereas some of these classifications were just superficial attempts without much justification, others were part of a lifetime work (e.g. Radloff, Baskakov).

Baskakov's classical classification,[6][7] first presented in 1952 (then republished in 1969, 1988), was widely accepted in Soviet/Russian Turkology at least until the 2000s, and seems to have affected even some of the western approaches. It did not include any lexicostatistical studies, however, and most of its conclusions were based upon phonological and some grammatical observations alone. In his books, Baskakov used expressions like "a complex system isogloss" by which he apparently understood a vague conglomeration of linguistic traits, which marks his classification as rather phenetic in nature.
As to other recent works, Anna Dybo's research (2006)[10a] is purely lexicostatistical, based on Swadesh-100, whereas Oleg Mudrak's classification (2002, 2009)[10b] is phono-morphostatistical.

The present taxonomic system has been rebuilt nearly from scratch, and is not directly based on any previous classifications; consequently, it may differ from earlier works in several aspects. It attempts to take a closer look at the phonological, grammatical, and lexical features of the Turkic languages, as well as their known geography, history and archaeology. Speaking in biological terms, it can also be seen as an attempt at a cladistic phylogeny which tries to build a taxonomic system based on shared innovations.

All the linguistic argumentation and other theoretical studies concerning the present classification are provided in The Internal Classification and Migration of Turkic Languages (2009-2012), a separate online article. The lexicostatistical research with possible dates can be found in The Lexicostatistics and Glottochronology of the Turkic Languages (2009-2012).

The present taxonomic description hardly addresses any obsolete languages, for which no lexical data were found either because of access difficulties or the nearly complete absence of historical evidence (e.g. "Hunnic"); therefore, by no means should this study be viewed as exhaustive. The total number of modern Turkic ethnicities may exceed 50, especially if all the large dialect-languages and notable historical ethnic groups with individual self-appellations are counted, so it is difficult to mention and describe all of them. Consequently, the present series of articles has mostly been focused on getting all the major subgroups together in the proper order, something that was particularly hard to accomplish considering the close proximity of most Turkic sub-branches and their posterior interaction.

It should also be noted that this particular page was inspired by the comprehensive work on the numerals of the world conducted by Mark Rosenfelder.

The nine nouns listed below were carefully chosen to visually demonstrate the maximum phonological differences across the Turkic languages, unlike the numbers which simply run from 1 to 10. Font colors tend to mark phonologically similar lexemes, except for black, which stands for "unclassified", and gray, which marks an "internal lexical replacement or borrowing". One should not pay much attention to the colors; these are mostly auxiliary and were used to analyze the material at the initial stage, but were not removed afterwards, since they still help to visually pick up similar phonetic elements.



On the mutual proximity of Turkic languages

The lexicostatistical proximity map
of Turkic languages (2012) (clickable)

A frequently asked question concerns the mutual intelligibility between Turkish and other Turkic languages. The question has been explored, for instance, by Talat Tekin (1979).[22] Of course, no two languages can be entirely "mutually intelligible", let alone the subjectivity of this concept, so by mutual intelligibility we understand mutual lexical proximity under standardized conditions. In any case, it turns out that Turkish is pretty much a western language and therefore is rather distant from other Turkic subgroups. Of the major Turkic languages, it exhibits close proximity only to Azeri and some of the lesser Seljuk languages (such as Gagauz, to which it is particularly close), sharing with them most grammar and vocabulary (cf., say, the relatedness between Spanish and Portuguese). There is much less mutual intelligibility with Turkmen than one could expect from their common Oghuz descent. On the other hand, Uzbek and Uyghur, despite being even more geographically distant, still share lots of familiar Old Turkic, Persian and Arabic words with Turkish and can be learned with some effort as any two comparable in-group languages, cf. for instance English and Danish.

The intelligibility of Turkish with the languages that had limited contact with Oghuz tribes and the Perso-Arabic world, such as Kazakh and Kyrgyz, let alone the languages located east of the Irtysh River line or beyond the Altay Mountains, is extremely poor or zero. For instance, speaking just one of the Oghuz languages, it is hardly possible to understand anything but a few words in Kazakh without preparation. However, many similar words and typical idioms (for instance, the local variants of var/bar/pur "there is" and yok/jok/s'uk "there is not", to name just one of the most frequently used pairs) can be picked up even as far as Sakha and Chuvash, whereas the fundamentals of basic grammatical structure and many morphological suffixes are largely similar in all the Turkic languages.

Using the meticulous lexicostatistical study of 215-word Swadesh lists,[2] it is possible to make certain conclusions concerning the actual mutual proximity of the Turkic languages (see the clickable map above). Outside of (1) Chuvash and (2) Sakha, which have been known for centuries for their independent positions, there are several internal lexical clusters or intelligibility islands: (3) Oghuz-Seljuk, (4) Great-Steppe, (5) Altay-Khakas, (6) Tuvan, (7) Yugur (Yugur is not measured herein because of the scarcity of lexical materials but it is clearly different), although (3a) Turkmen and (4a) Karachay-Balkar likewise seem to be rather detached from the rest.

Note that in real speech, the value for the subjective intelligibility will normally be much lower than the figures in the map obtained for the standardized lexical lists. For instance, 50% in the diagram will approach zero in the fluent idiomatic speech of a native speaker, because of many additional effects. On the other hand, the abundance of shared Arabic, Persian or Russian borrowings may at times contribute to the intelligibility in formal speech even between distant languages.
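As a rough illustration of what a lexical proximity figure of this kind means, the sketch below computes the percentage of meanings in two standardized word lists that belong to the same cognate class. It is only a minimal sketch under stated assumptions: the function name, the data layout (a mapping from meaning to cognate-class label) and the toy data are hypothetical and are not taken from the study.[2]

```python
def lexical_proximity(list_a, list_b):
    """Percentage of comparable meanings that share a cognate class.

    Each argument maps a meaning (e.g. "water") to a cognate-class label.
    Meanings missing from either list, or labelled None, are skipped.
    """
    comparable = [
        meaning for meaning in list_a
        if list_a[meaning] is not None and list_b.get(meaning) is not None
    ]
    if not comparable:
        raise ValueError("the two lists have no comparable meanings")
    shared = sum(1 for meaning in comparable
                 if list_a[meaning] == list_b[meaning])
    return 100.0 * shared / len(comparable)


# Hypothetical toy data: the labels are arbitrary placeholders,
# not real cognacy judgments for any Turkic language.
language_x = {"one": "A", "two": "A", "water": "A", "stone": "B", "dog": None}
language_y = {"one": "A", "two": "A", "water": "C", "stone": "B", "dog": "A"}

proximity = lexical_proximity(language_x, language_y)
```

A figure obtained this way describes only the overlap of basic vocabulary in list form, which is why it overstates what a listener would actually understand in running speech.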

A note on the Silk Road and the Central Asian Bridge

One can better understand the migration of Turkic languages after becoming familiar with the geography of the Silk Road and the concept of the *Central Asian Bridge. During the Middle Ages, people could not use flying carpets. Any kind of travel or ethnic migration could only proceed along narrow, geographically suitable pathways extending between deserts and mountain ranges and forming a natural, permanent network of migration routes. Basically, in Central Asia, a considerable part of this network became known as the Silk Road. The Silk Road is often considered merely from the economic perspective, although it also played a critical military, cultural, demographic, and linguistic role, being a unique, vital artery which conveyed and maintained life in Eurasia for many generations. The Huns, the Turks, the Mongols, the Gipsies, whoever passed through Central Asia, could only travel along this natural migratory system; consequently, the distribution and classification of peoples in Asia is in fact nearly predetermined by the geographical structure of its routes and adjacent areas. That is especially true of the Turkic, Mongolic, and Iranian peoples who have lived by and off the Silk Road for hundreds of years. The Silk Road was also a stream of genes running in both directions that contributed to the exchange of human DNA in Eurasia. It also carried infections, such as plague, in both directions, and brought tea, paper, the compass, gunpowder, and other inventions to Europe, causing it to rise from the Middle Ages into the era of art, reason, technology, as well as fierce firearm warfare.


A note on clan societies

The social structure of Turkic (and other Eurasian) tribes has been based on the system of patrilineal clans. In Europe, the clan structure has been well-known for the Celtic tribes [cf. Scottish Gaelic clann, Old Irish cland "tribe, offspring", also cf. semantically similar English kin, Old English cynn "relatives, family"]. In many ways, clans and their names worked in the same way as modern European surnames, which are apparently nothing but remnants of the Indo-European clan structure.

Until the 20th century and sometimes even later, the clans dictated many rules and laws of social living. Each man was supposed to know his family tree down to the 7th generation (as in the case of the Bashkirs and Kazakhs) or at least to the 4th (Altayans). Each clan had a guardian spirit that could be interacted with through a shaman (kam) and some specific sacrifices and practices. A clan often had a legendary progenitor, whose story had been passed down in oral tradition, and who had often been connected to a totem animal.[23b][25] Moreover, a clan often possessed a cattle brand (tamga), which apparently is historically similar to the coats of arms used in European families. We assume herein that the Turkic clan structure can be seen as a model for many societies of the Bronze and Iron Age, including Indo-European.

Naturally, clan members were considered brothers and sisters who had many social responsibilities and could not intermarry either entirely (e.g. Altayans) or until a certain generation. Even today many members of Turkic societies regard themselves as part of a large social family, as opposed to the much more individualistic Western worldview. Marriages were often arranged by parents at a very early age, sometimes even at the cradle, with a member of a specific neighboring clan. The memory of cradle or children's marriages seems to be still reflected in modern life when we say that "people are destined for each other". Marriage customs varied, however. In other cases, for instance, the young man could choose his bride, and the marriage was accompanied by paying the bride price (qalïn) to the bride's family. Furthermore, at least judging by Genghis Khan's story,[23a] in the case of the Mongols, wives and concubines could be obtained by force as war trophies. Alien clans could also be integrated into a local society, which explains why we find, for instance, Kipchak clans as far apart as the Altai Mountains and the Black Sea, and which also explains why people with different DNA haplogroups could be part of a society speaking the same language.

The names of Turkic languages and clan names often seem to be connected. As shown in [On the origins of Turkic ethnonymy],[1] the name of the strongest and richest clan was often passed to the confederacy of clans, and sometimes, after a thousand years or so, to the name of a language. Taking the example of the Smith family name in English, we could make a reconstruction of a certain male, apparently a blacksmith, that lived in England during a certain unknown period before the 10th century, and if the English clan structure were fully developed, the English language could presently be called something like "Smithish" or "Smithonian". Sometimes, such language naming was done almost deliberately in the course of the 20th century; for instance, the failure to realize that the word Kypchak functioned basically in the same way as a family clan name resulted in its sweeping extrapolation in Baskakov's classification [see below]. Moreover, in practice, the Smith family name was probably reinvented and readopted many times, so not all the Smiths are related to each other; by the same token, not everyone who is called a Tatar or Kypchak necessarily has anything to do with the original progenitor of the Tatars or Kypchaks. In many cases, trying to find the original meanings of Turkic ethnonyms seems to be quite pointless, since usually they should not contain any more information than, say, such English surnames as Archer, Hawkins or Green, so unreasonable ethnonymic guessing is a constant source of errors and folk etymologies.

As Radloff explained in the 1860s,[23b] the 19th-century Kazakh social structure, which is apparently a typical representation of early Turkic societies in general, was built in the following way. At the base of the social pyramid, there were 6-10 families forming an aul, awul /ah-(W)OOL/ (a village) that used a similar geographic pattern of migration throughout the year. The head of the awul was usually the oldest and the richest man, and most of the other members were personally related to him. At a winter camp (qïshlïq), several awuls formed a larger gathering, where the judicial power belonged to a bey, the richest alderman that was able to settle any conflicts or disputes between different auls. Several clan subdivisions of this type formed a full clan, where the internal matters were usually settled by a council of beys. At times, a group could branch off from the rest of the old clan and receive the name of its new ruling bey leader, thus forming a new clan. Finally, to defend from external enemies or to invade them and capture their pastures, cattle or slaves, a number of clans could be united into a horde (ordu) "an army", headed by an electable khan. The rulers and the ruling clans were known as ak sök "white bone", whereas the common people were called kara kalk "black people" or kara sök "black bone".


Notes on transcription

The UTF encoding, let alone the IPA signs, has been avoided herein right from the beginning for reasons of compatibility; consequently, the present system of transcription and transliteration may initially seem slightly unusual.

/ö/, /ü/ are used as in Turkish or German.

/ï/ is a back high vowel similar to the Russian <bI> letter or the Turkish <I> vowel. A special note should be made on the pronunciation of /ï/ for English speakers, since the information in tends to be misleading. The closest match of /ï/ is the short English /i/ in kit, din; however, /ï/ is a back vowel with the tongue being pushed much further into the throat, which creates a rather peculiar acoustic effect, distantly similar to the sound in cut or done. This vowel does not exist in English and it cannot be directly compared to a schwa in about, ago, since a schwa is a mid central vowel, and the /ï/ phoneme is supposed to be high-back. This sound seems to be a Eurasian areal phenomenon, so in addition to Turkic, it also exists in Mongolic, Korean, Slavic and many other neighboring languages. In the English spelling, it is usually denoted as <y>, e.g. Kyrgyz /kir-GIZ, keh-r-GEHZ/.

// is mostly a schwa as in about, but in some languages it may denote a different sound.

/N/ is the nasal /ng/.

/x/ is usually a velar <kh> similar to the Russian <x> or the Spanish <j>.

/sh/ as in English; /zh/ as in treasure but usually less palatalized.

// (in Bashkir, Turkmen) as in this; /ß/ as in thump.

/s'/ (in Chuvash) is a palatalized form of /s/ similar to a soft /s/ in Russian.

/d'/ is a palatalized /d/ in Altay Turkic similar to the very light pronunciation of <j> in English.

/J/ or /j/ is a sound similar to the <j> in Jack in English.

/q/ and /G/ are respectively voiceless and voiced deep velars (or even uvulars). Note that <q> is the traditional way to denote the voiceless "throaty" velar in English, usually of Arabic (cf. "Quran"), or Turkic origin (cf. "Nissan Qashqai"). Even though this sound must have been the original Proto-Turkic phoneme, it seems to have been falling out of use throughout the Turkic history, being slowly replaced by /k/ and /g/ from Russian, Greek and other Indo-European languages. In other words, the /k/:/q/ distinction is in fact often non-phonemic: the /q/ is usually pronounced in /qa/, /qu/, /qo/, /qï/, but is moved forward allophonically in /ke/ and /ki/. Moreover, the younger Russian-influenced speakers may replace /q/ by /k/ entirely or attenuate it in all these cases.

/*P/, /*B/ (in Tuvan, Tofa, Proto-Turkic) is a way to denote reconstructed phonemes probably intermediate between /p/ and /b/ as in Mandarin or some Mongolic languages.

/D-/ (in Yugur, Tuvan) is a phoneme usually intermediate between /t/ and /d/ as in Mandarin.

/*-D-/ (in Old Turkic, intervocal) is a reconstructed phoneme that was probably similar either to the Spanish intervocal -d- or the interdental English // [uncertain].

/*S/ (in Proto-Bulgaro-Turkic) is a reconstructed phoneme with much surrounding controversy, probably similar either to the palatalized /s'/ as in Chuvash or the Japanese /sh/ or the soft Russian /sch'/ or even the English /j/.

/*R/ (in Proto-Turkic) is a reconstructed trill, probably a mixture of /r/ and /z/ as in Czech.

/*L/ (in Proto-Bulgaro-Turkic) is a reconstructed palatalized lateral fricative apparently similar to the one in modern Khalkha Mongolian, essentially a mixture of /l/ and /s/.

/*H/ marks intense aspiration or a similar reconstructed phoneme.

/ ' / after vowels (in Chuvash) marks stress; after consonants it marks softness.

The pronunciation of certain other phonemes may in fact be unconfirmed, unattested or unknown.

The Turkic languages do not have any clearly defined rules for the dynamic stress as the European languages do, so the stress seems to vary depending on the intonation, but separate Turkic words are normally pronounced stressed on the final syllable, e.g. usually Tatar /tah-TAR/ not /TAH-ter/ as in English.
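The allophonic distribution described in the note on /q/ and /G/ above (the back variant before /a/, /u/, /o/, /ï/ and the fronted variant before /e/, /i/) can be restated as a small lookup. This is only a sketch of that one statement: the function name is hypothetical, and only the six vowels explicitly named in the note are covered, with other vowels rejected rather than guessed.

```python
# Vowels named in the transcription note; no other vowels are assumed here.
BACK_VOWELS = {"a", "u", "o", "ï"}
FRONT_VOWELS = {"e", "i"}


def velar_allophone(next_vowel):
    """Return "q" before a back vowel and "k" before a front vowel."""
    if next_vowel in BACK_VOWELS:
        return "q"
    if next_vowel in FRONT_VOWELS:
        return "k"
    raise ValueError(f"vowel not covered by the note: {next_vowel!r}")


# The choice follows mechanically from the vowel, which is why the
# /k/:/q/ distinction is described as often non-phonemic.
syllables = [velar_allophone(v) + v for v in ("a", "u", "o", "ï", "e", "i")]
```

Speakers who replace /q/ by /k/ entirely, as mentioned in the same note, would simply return "k" in both branches.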


Attempts at the Proto-Bulgaro-Turkic reconstruction

Any kind of reconstruction of a proto-language is more of an art than an exact science, so inevitably it should be taken with a grain of salt. As one should understand perfectly well, there is no such thing as the correct or generally accepted reconstruction; all of them are merely artificial approximations that normally cause much unsubstantiated argument among different authors, and in many cases are unfalsifiable. Consequently, the work of Starostin's team, typically cited for Proto-Turkic, cannot be viewed as ultimate reality, either. For the same reason, there was some disagreement between Yusuf Gürsey and me (2009-10) on a number of issues in Proto-Turkic, e.g. the problem of the initial S*- vs. y*, the initial t-/d-, b-/m- controversy, the final -q in Chuvash, etc. In any case, the following brief reconstruction has been performed to the best of our knowledge and according to the guidelines in the introduction to the main article.[1]


dry leaf sleep horn liver house
*xurGux *Sl-bar-
*uDu- *mïR, *muR *bar, *bawr *e:B




The Bulgaric peoples were first attested west of the Southern Ural /YOO-ral[26]/, near the lower and middle course of the Volga River and in the Ponto-Caspian Steppes during the fall of the Roman Empire, but the details of their origin are still obscure.

In any case and for all practical purposes, one should keep in mind that the difference between Bulgaric and Turkic is fairly significant, and they should rather be viewed as separate taxonomic groupings. Herein, we consistently reserve the term Turkic (Proper) to refer only to the languages outside Bulgaric, using Bulgaro-Turkic as the most general term.

Volga Bulgaric

Bulgars /BOOL-gars[26]/ were a subgroup of Turkic-related nomads that first appeared near the Caucasus mountains c. 350 AD and then c. 475 AD, on the Danube /DAN-yoob[26]/. They seem to have contributed to the creation of several medieval kingdoms: (0) the short-lived Old Great Bulgaria (632-671 AD), founded by Khan Kubrat in the Pontic Steppe which led to the formation of the other three affiliate states, ruled by Khan Kubrat's sons:
(1) Volga Bulgaria (670-1236 AD) along the middle course of the Volga River, which finally gave rise to present-day Chuvashia /choo-VUSH-eeya, chuh-VUSH-eeya/;
(2) Danube Bulgaria (670-864 AD), which gave rise to the modern Slavic-speaking Bulgaria;
and finally (3) the Khazar Khaganate /hah-ZAR, kah-ZAR/ (650-969 AD) near the Caspian Sea, which was famous for its Judaism but which finally vanished almost entirely.

The Bulgaric languages are only poorly attested in historical records. The Volga Bulgar and Danube Bulgar languages are only known from a few inscriptions written with Greek and Arabic characters or Old Turkic runes. Khazar is only known from the inscription oqurum "I have read" and the name of the city of Sar-kel "The White Home or Tower". Therefore, the only surviving remnant of Bulgaric languages is Modern Chuvash descending from the language of Volga Bulgaria.

Volga Bulgars
A Danube Bulgarian

Chuvash ura
xrl tip
my pver
kil prr kk,
pllk ltt

Chuvash /chah-VAHSH, chuh-VUSH/, cf. Russophone pronunciation /choo-VUSH/, is still spoken in the Chuvash Republic (capital: Cheboksary /chebok-SAH-reh/, cf. the Chuvash original: Shupashkar /shoo-pahsh-KAR/) and is believed to be a direct descendant of the language of Volga Bulgaria (ancient capitals: Bolghar and Bilar, the latter being a large city about 2 miles across).

Volga Bulgaria was founded c. 670, near the confluence of the Volga and Kama /KAH-ma[26]/. It consisted of many small towns found by archaeologists. Commanding the middle Volga, this state controlled trade between northern Europe and Persia, and was similar in this respect to the Kievan Rus that controlled the Dnepr (Dniepr) River /NEE-per[26]/. Volga Bulgaria was Islamized in 922 after being visited by the Arab writer and diplomat Ibn-Fadlan. Curiously, his famous account inspired a modern book, whose plot was used to make The 13th Warrior movie starring Antonio Banderas.

Volga Bulgaria was destroyed during the Tatar-Mongol invasion in 1236. Consequently, Middle Chuvash has been strongly affected by Tatar. Today, the "Devil's Tower" in the Yelabuga /ye-LAH-booga/ town on the Kama River (left fig. below) is one of the few standing remnants of this long gone civilization, although the later buildings in Bolghar from the 13th and 14th centuries (right fig. below) also preserve its spirit. In 1552, the Russians seized Kazan /kuh-ZAN[26]/ further affecting the Chuvash language and culture.

In any case, the standalone position of Chuvash among other Turkic languages is rather indisputable; much of the Chuvash lexical core is quite archaic, and it can be seen as a valuable data source for the purposes of Bulgaro-Turkic reconstruction.

There are 1.04 million speakers of Chuvash (2010),[24d] but most of them are bilingual in Russian.

As an example, listen to this very lovely folk song (mp3) in Chuvash with an English translation — note certain Slavic features in music and phonology.

Note that most of the music clips below have been chosen because of their unusual, enthralling or typically local tunes and lyrics and are recommended for listening as part of this ethnographic study.

Chuvash and Volga Bulgars
(1) Chuvash traditional dress (left);
(2) the reconstruction of the Bolghar City (right);
(3) the original Volga Bulgar tower in Yelabuga near the Kama river (left below);
(4) the restored buildings dating from the Golden Horde period (right below)


Turkic (Proper)


The map of the Altai Sayan Mountain System
The topographic map of
the Altai-Sayan Mountains,
based on maps from (clickable)

The supertaxon that excludes any Bulgaric languages is named herein Turkic (Proper). It is also sometimes confusingly known as Common Turkic, which may have misleading associations with Proto-Turkic or even certain modern Turkic conlangs.

The late homeland of Proto-Turkic Proper was evidently located near the Altai-Sayan Mountains /al-TY[26]/, /sah-YAHN[26]/, most likely near northwestern ridges of the Altai between 900 BC and 300 BC. This conclusion can be drawn from the following evidence: (1) the historical distribution of the early Turkic tribes and the result of backtracking their migration vectors; (2) the location of the center-of-gravity of the maximum language diversification area. The date above is inferred from a meticulous glottochronological analysis.[2] Similar hypotheses about the Altai localization of the Proto-Turkic peoples were suggested, in fact, at least as early as the 19th century.[25]

This Proto-Turkic period seems to match the onset of the Iron Age in West Siberia, when iron weapons and horse riding became very common, which might have contributed to the active spread of the early Turkic dialects. The glottochronologically determined time depth of the Proto-Turkic split, therefore, seems to be greater than that of Slavic or Romance (c. 1600 years ago, c. 400 AD) but more or less similar to that of Germanic.

It seems that initially there existed three main early Proto-Turkic dialects: (1) Eastern, that moved towards Lake Baikal thus forming Proto-Yakutic, (2) Central, that initially stayed near the Altai, and (3) Southern, that migrated into Dzungaria and Mongolia.

Despite considerable separation between these earliest branches, some of the Turkic languages within the internal subgroups may still retain a great deal of mutual intelligibility due to their recent diversification, common borrowings or posterior contacts.



(1) Eastern Turkic Languages

The map of the Altai Sayan Mountain System
Possible reconstructed migrations
of Proto-Yakutic (clickable)

The Eastern Turkic languages form a major grouping that includes only two known representatives: Sakha (Yakut) and Dolgan (the northern offshoot of Sakha), which can also be collectively named Yakutic.

Note that the name Sakha /sah-KAH, SAH-kah/ is the original self-appellation, whereas Yakut /ya-KOOT[26]/ seems to be a Russophone exonym, but the two words are often used interchangeably.

The drastic discrepancy that sets Yakutic apart from all other Turkic languages has been well recognized since the 19th century. Generally, there is little doubt that the Yakutic subgroup should be viewed as an important, early-splitting branch of the Turkic languages. Most glottochronological studies [e.g. Dyachok (2001)[10] and herein (2009-12)[2]] imply a very early separation of Proto-Yakutic from the main Turkic stem, though the exact date is not very clear (somewhere between c. 200-300 and 900 BC).

On the other hand, there seem to be certain common features that the Yakutic supertaxon shares with the Altai-Sayan languages. After thorough consideration in this work, these features have been attributed to the secondary contact between the Proto-Altai-Sayan and Proto-Yakutic languages, which must have occurred along the Yenisei River soon after the initial Proto-Turkic split.


Subgroup 1:

The Lena migrants

Essentially, the Yakuts are a Turkic group that formed as an outcome of migration along the Lena River (Anglophone: /LEE-nah/, Russophone: /LEH-nah[26]/, incidentally, hence also the pseudonym of Lenin). This Lena migration has led to a large-scale distribution of Yakutic settlers that presently spread from the area of Yakutsk City all the way to the Arctic Ocean.

The Yakutic branch seems to be highly deviant in many respects, having little to do with its closest neighbors, Tuvan or Khakas. Sakha and Dolgan share many Russian and Mongolic cultural lexical borrowings, and much of their vocabulary seems to come from an unknown source, though they still retain many important archaic Turkic features as well.

  Sakha (Yakut) warriors
Sakha warriors (staged)
Lena River, Yakutia
A village along the Lena

Any details concerning the early Proto-Yakutic migration are inevitably hypothetical; however, the present study[1] attempted to create a general outline of the subject. Before the beginning of the common era, Proto-Yakutic must have moved from the Minusinsk Depression in the Altai Mountains towards Lake Baikal by following the upper reaches of the Yenisei River, which has its source in Mongolia near Lake Khövsgöl. Then, Proto-Yakutic tribes must have continued down the Irkut River until they reached the western shore of Lake Baikal /by-KAHL[26]/, where the sources of the Lena are located. There, on the western and southern shores of Baikal, the Proto-Yakuts apparently formed a tribal confederacy known as Kurykan /koo-reh-KAHN/, which existed between the 6th and 10th centuries, according to archaeological evidence and some scanty Chinese and Gökturk historical records.

The further migration down the Lena was a much later event, most likely (but not necessarily) connected with the notorious upheaval of the 13th century, when the Proto-Sakha could have been expelled from their Baikal habitat by the invading Buryats or other Mongolic tribes. This is supported by the evidence of a genetic bottleneck that most Proto-Sakha must have gone through[12a], and which may document an ancient holocaust, implying that most of the Proto-Yakuts were largely exterminated during that period. Survivors fled along the Upper Lena towards the present-day area of Yakutsk. This downstream migration along the Lena must have been relatively effortless in terms of geographic constraints.

The remote corners of the Lena basin were reached only after the introduction of firearms in the 17th century, whereas many distant areas of the taiga remain uninhabited up to this day.


Sakha (Yakut) ataq sulus khl kura:naq sebirdeq utuy- muos bar Jie, d'ie bi:r ikki s trt bies alta sette aGs toGus uon

Yakut /yah-KOOT/ (the usual name in Russian), or Sakha /sah-KHAH, sa-HA/ (self-appellation) is spoken along the Lena basin in the Sakha (Yakutia) Republic of Russia (capital: Yakutsk /yah-KOOTSK/), which is the world's largest subnational governing body by area. Though it looks large on the map, the region is in fact covered with dense taiga and is sparsely populated, while most life is concentrated along its many rivers.

Historically, the northern Yakuts were largely hunters, fishermen and reindeer herders, while the southern Yakuts raised cattle and horses. The city of Yakutsk (originally Lensky Ostrog "The Lena Fortress") was founded in 1632, when this territory was annexed by Russia. Religion: originally, Tengriism, then Orthodoxy. C. 450 000 speakers (2010),[24d] but most are bilingual in Russian.

  A Sakha girl
The Sakha Beauty Contest
Oymyakon, Yakutia
Oymyakon [OY-meh-KON], the Pole of Cold
Yakutsk in winter
Yakutsk City in winter

Dolgan atak hulus khl kura:nak hebirdek utuy- muos bar   bi:r ikki s trt bies alta hette agis togus uon

Dolgan /dol-GAHN/ is the northernmost offshoot of Yakutic, spoken near the Taymyr /ty-MIR/ Peninsula and other extremely sparsely populated areas of the northern tundra. Dolgan shows evident Evenk influence and can apparently be regarded as Sakha superimposed on a local Evenk substratum. According to Ubryatova (1985), the main researcher of Dolgan, it separated from Sakha before the end of the 16th century. There are c. 7000 Dolgans (2002), of whom fewer than 80% are actual native speakers.
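The closeness of the Sakha and Dolgan rows above can be illustrated with a minimal sketch. It assumes three rewrite rules (word-initial s > h, word-final q > k, G > g) that are read off the two rows only; they are not stated anywhere in this description, and the q/k and G/g differences may partly reflect transcription conventions rather than real sound changes. Only word forms that survive intact in both rows are used, and the function names are hypothetical.

```python
# Word pairs copied from the Sakha and Dolgan rows above.
# Only forms that are intact in both rows are included.
PAIRS = [
    ("ataq", "atak"),
    ("sulus", "hulus"),
    ("kura:naq", "kura:nak"),
    ("sebirdeq", "hebirdek"),
    ("sette", "hette"),
    ("toGus", "togus"),
]


def sakha_to_dolgan(word: str) -> str:
    """Apply three assumed rewrite rules to a Sakha form.

    The rules are an illustration inferred from the word list,
    not an established description of Dolgan phonology.
    """
    if word.startswith("s"):      # word-initial s > h
        word = "h" + word[1:]
    if word.endswith("q"):        # word-final q > k
        word = word[:-1] + "k"
    return word.replace("G", "g")  # G > g in any position


def check_pairs(pairs):
    """Return (sakha, dolgan, matches) for every pair."""
    return [(s, d, sakha_to_dolgan(s) == d) for s, d in pairs]


if __name__ == "__main__":
    for sakha, dolgan, ok in check_pairs(PAIRS):
        print(f"{sakha:10} -> {sakha_to_dolgan(sakha):10} attested: {dolgan:10} match: {ok}")
```

The remaining intact items in the two rows (utuy-, muos, bar, bi:r, ikki, bies, alta, uon) are identical in both languages, which fits the view of Dolgan as a recent offshoot of Sakha.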


(2) Central Turkic Languages

Central Turkic is a large hypothetical grouping that includes about 70% of all the present-day Turkic languages, which extend from the upper Yenisei basin in the east all the way across the Great Steppe to the Black Sea in the west. This major supergrouping consists of two main subtaxa: (1) Altai-Sayan (Turkic) and (2) Great-Steppe (Turkic).

Curiously, most of the ethnic groups included in the Central supertaxon have been historically known as either Kyrgyz or Tatar. In some cases, these names were just faulty exonyms that were added afterwards, but in others they seem to be authentic, going back to a very early period of time. At any rate, Kyrgyz and Tatar appear among the oldest clan names used by the Turkic peoples.

Most ethnic groups in the Central Turkic supertaxon have been part of the Russian Empire since the 16th-17th centuries, so naturally most of the Central Turkic languages exhibit pronounced Russian influence particularly in the cultural and technical vocabulary.



Subgroup 2:

The map of the Altai Sayan Mountain System
An approximate distribution
of the Altai-Sayan languages
circa the beginning of the 20th century,
based on maps
from the 1940-60's[12b][12c][12d]

The Altai-Sayan subgroup [the name is introduced herein] includes Altay, Khakas, Tuvan and other closely-related languages. It has been named in this way herein, because it is distributed in the Altai and Sayan mountains (see the map).

The Altai-Sayan subgroup probably corresponds to the descendants of the so-called Yenisei Kyrgyz, a historically important cluster of eastern Turkic tribes that were attested under various names in Chinese chronicles between 200 and 900 AD, but dissolved after the Mongol invasion in the 13th century. The territory of the Yenisei Kyrgyz in Khakassia was mentioned under the name Kirgizskaya Zemlitsa "The Kirgiz (Little) Land" during the clashes with the Russians in the 17th century.

The Yenisei Kyrgyz are said to have destroyed the Uyghur (= Gökturk) Empire in Mongolia and its capital Ordu-Balïq /or-DOO bah-LIK/ in 840 AD (see below), which caused the final dissipation of the Orkhon /or-HON/ Turkic peoples, but led to the rise of the Yenisei Kyrgyz Kaganate (840-1207).

Originally, the Yenisei Kyrgyz seem to have inhabited the Minusinsk Depression in Khakassia [Minusinsk /mee-noo-SINSK/ is a city near Abakan /ah-bah-KAHN/, the capital of Khakassia]. The Minusinsk Depression is a geographically suitable plain with steppes, lakes, and valleys located along the upper Yenisei River between the Kuznetsk Alatau /kooz-NETSK AH-lah-TOU/ and the Western and Eastern Sayan Ridges. Protected by these mountains, the Minusinsk Depression has a relatively mild climate convenient for agriculture, to the extent that even cherry and apricot orchards have been grown there at least since the 19th century [perhaps far longer, because cherry pits have been found in archaeological excavations in nearby Tuva].

By proceeding south, up the Yenisei, and after crossing the Western Sayan Ridge, one can arrive in the interconnected Tuva Depression, where the Tyva Republic is located.

The Anglophone pronunciation of Tuva is /TOO-vah/; however, the name of the country itself was formally changed in the 1990's to Tyva /tuh-VAH/, which is supposed to be closer to the Turkic original, whence the modern-day spelling and pronunciation discrepancy.

By following even further along the uppermost reaches of the Yenisei, one can get into northern Mongolia originally inhabited by two very remote and frequently omitted Tuvan-related ethnicities, the Tsaatan /tsah-TAHN/ and the Soyot /saw-YOT/.

Four horsemen
A Genghis Khan movie
filmed in Tuva and Khakassia (2007)

Shor people
Shor people processing leather (1913)

Whereas Tuvans often still live in classical yurts, many other Khakas and Altay peoples seem to have lived in semi-subterranean log huts, leading a semi-settled lifestyle suitable for fishing, crop cultivation and metalworking. It is in fact these types of permanent dwellings that are typically found in archaeological sites across West Siberia in the layers corresponding to the Bronze and Iron Ages, so it is not really clear which one of the two came first.

The Proto-Altai-Sayan, or Proto-Yenisei-Kyrgyz, settlements seem to be identifiable with the Tashtyk /tash-TIK/ archaeological culture (2nd century BC-5th century AD), famous for its stunning, poignant funerary masks showing rather European features.

Another striking trait found among the local population is the odd ethnological resemblance of Altay and Tuvan shamans to the North American Indians, which may be far from coincidental, judging by the geographical proximity of Yeniseian tribes, which have recently been shown to be linguistically related to Na-Dene (see Dene-Yeniseian superfamily). Genetic studies conducted since 1997 also demonstrate a high concentration of Native American mtDNA lineages in the Tuvan, Soyot, Khakas, Altay, and Buryat populations [e.g. Zakharov (2003)].

As the name of the Yeniseian peoples implies, they inhabited almost the same area as the Altai-Sayan Turkic peoples, along the middle and upper course of the Yenisei north of Tuva, approximately until the 17th-19th centuries, and may possibly have transmitted certain ethnological and genetic characteristics to the Turkic immigrants.

The Altay and Khakas languages and dialects seem to be rather archaic, and contain relatively few non-Turkic loanwords in their basic vocabulary (except for the abundant cultural borrowings from Russian that have been coming in since the 17th century). As a result, because of the smallest number of Arabic and Mongolic loanwords, as well as the purity and archaism of their lexicon, Khakas, Altay, and Kyrgyz can be regarded as among the most typical Turkic languages, preserving the maximum number of late Proto-Turkic features, so perhaps they may provide a rather good idea of what late Proto-Central-Turkic or even late Proto-Turkic-Proper may have actually sounded like when it still existed. Note that Tuvan, on the other hand, contains too many Mongolic borrowings.

The Altay and Khakas population has been historically subdivided into more than a hundred patrilineal clans, known as seoks (sö:k "bone").

The modern generic self-appellation of both the Khakas and Altay peoples is in fact Tadarlar (Tatars), perhaps originally from a Russian exonym, because of the widespread usage of this name in the Russian Empire of the 18th-19th centuries for any Turkic peoples.

Subgroup 2a:
Tuvan-Tofa (Sayan)

The Yenisei-Kyrgyz migrants to the Sayan Mountains

The Tuvan-Tofa subgroup [the name is introduced herein] includes Tuvan (proper), Todzhin, Tofa, Tsaatan and Soyot. The Tuvan-Tofa subgroup represents those ethnic groups that settled in the south of the region along the uppermost reaches of the Yenisei in the Western and Eastern Sayan Mountains.

In other words, from the geographic perspective, the Tuvans and their siblings can be seen as those Proto-Altai-Sayan tribes that migrated along the Yenisei from the Minusinsk Depression first into the Tuvan Depression and then into the nearby regions along the Mongolian border. For this reason, this Tuvan-Tofa subtaxon may also be referred to as the Sayan subgroup.

Glottochronologically, the Tuvan-Tofa subgroup must have separated from Proto-Khakas and Proto-Altay by about 250 AD.[2]

The Tuvan languages and dialects are rather peculiar and exhibit many unusual words, including Mongolic borrowings, so for the most part they cannot be understood by the Turks of Central Asia, or even by their closest neighbors, the Khakas and Altay people.

The self-appellation Tofa or Tyva might in some way be related to the name of the Tuba /too-BAH/ River in the Minusinsk Depression near Abakan, though this suggestion is controversial.

Note that the Tuvan and Tofa(lar) Cyrillic spelling systems may contain voiced symbols, such as <b>, <d>, <g>, which in practice denote the so-called "weak" consonants that are normally pronounced Chinese-style: as unvoiced word-initially and as semi-voiced in the intervocalic position, as opposed to <p>, <t>, and <k>, which always denote aspirated consonants (as in the Mandarin Romanization).


Tuvan put slds qzl qurgag pr udu- mys pa:r g pir i:yi sh trt pesh ald chedi ses tos on

Tuvan is spoken in the Tyva Republic /teh-VAH/ (outdated: Tuva /TOO-vah/) (capital city: Kyzyl /keh-ZEL, kuh-ZUL/, lit. "Red"). The Tyva Republic is suitably located in the Tuvan Depression along the upper Yenisei between the Western Sayan Ridge and the Tannu-Ola Ridge near the Mongolian border. Tuvan has also been historically known under the ambiguous name Uriankhai /oo-run-HI/. Tyva was a de jure independent state between 1920 and 1944, when it was finally fully annexed by the USSR.

Traditional economy: nomadic horse and cattle breeding; sedentary life in towns since the 19th-20th century. Religion: Tibetan Buddhism and still some Tengriism. About 253.000 speakers (2010),[24d] of which at least 60% are bilingual in Russian. Still there seems to be a large number of monolinguals and true native speakers, evidently because of the isolated geographic position of Tuva.


Todzin                   bir i ysh drt peish ltï t'etï, chetï ses tos on
Karagas                   bir ihi is, trt beis, alt t~ed sehes tohos on
Tofa But slts qzl qurGaG Br udu- miis Ba:r G Bir hi ysh trt Beish lti chedi shes thos on

The Karagas people were thought to be extinct in the 19th century, yet the Tofa(lar)s /taw-FAH, taw-fah-LAR/ in the forests of the Eastern Sayan mountains seem to be their direct continuation. Tofa(lar) [the -lar ending is just a Turkic plural suffix] probably separated from Tuvan by migrating along the Greater Yenisei towards its source in the East Sayan mountains. The Tofa(lar)s have recently been studied in detail by Rassadin (1980's-2000's). Reindeer breeding and hunting in the taiga; Tengriistic shamanism and nomadism before the 1930s. About 760 persons, but only 93 formally listed as Tofa speakers (2010),[24d] and just 15 as active speakers (2002).

Moreover, there are c. 1900 Todzhin people (2010).

Tofalars today
and in the beginning of the 20th century

Subgroup 2b:

The Yenisei-Kyrgyz migrants along the Yenisei

The Khakas subgroup includes at least the following languages and major dialects: (0) (Standard Literary) Khakas /hhah-KAHS/, which is basically a rather artificial 20th-century literary koine based on Sagai; (1) Sagai /sah-GUY/ (presently, the most commonly spoken vernacular dialect of Khakas, situated to the east of the Kuznetsk Alatau Mountains), (2) Kach(a) (Russian káchinskiy; actually from the old self-appellation /qa:sh/; now rare, though still active in the beginning of the 20th century), (3) Kyzyl (almost extinct), (4) Koibal, (5) Beltir (both extinct); (6) Mras-Su Shor, (7) Kondom Shor (meaning the Shor people living along the Mras-Su and Kondom Rivers near the Kuznetsk Alatau west of the Sagai area); and finally (8) Middle Chulym /choo-LIM/ (spoken in a couple of villages, in remote northern areas along the middle course of the Chulym River), and possibly (9) Lower Chulym (acc. to a local researcher, the last speaker died in 2010). According to Baskakov's classification (1960-80's), the Khakas subgroup may also include some of the northern Altay dialects.

The ethnonym Khakas is an entirely modern invention created only in 1918; it was patterned on the then-supposed reading of Chinese chronicles [see the discussion in the published correspondence by Yakhontov, Butanayev (1992)].[13] Except for formal occasions, the word Khakas is still out of use in Khakas communities, with the main self-appellation Tadar(lar) being used instead. (The "Tadarlar" ethnonym is also accepted among the Altay people.) The reason why the original generic name for Khakas clans appears to be lost in history is perhaps the long-standing differentiation of the Khakas subgroup into many unconnected dialects and languages.

The Khakas peoples traditionally practiced nomadic herding, agriculture, hunting, and fishing, but were mostly Russified and Westernized in the course of the 20th century.




azax chlts xzl xuruG pr uzu- m:s pa:r ib pir iki s trt pes alt cheti segis toGis on

Khakas /hhuh-KAHS/ is spoken in the Republic of Khakassia /ha-KAHS-iya/ (capital: Abakan /abah-KAHN/), which was annexed by Russia in 1727. As explained above, Khakas is rather a collection of vernacular dialect-languages originally dispersed along the upper Yenisei in the Minusinsk Depression, but presently mostly extinct, except for Sagai, which is spoken in villages along the Abakan River. Textbook Khakas exists mostly as a poorly-known standard that may differ from Sagai.

Formally, there are 72.950 persons who consider themselves "Khakas" but no more than about 42.000 Khakas speakers (2010),[24d] most of whom are proficient in Russian.

  Khakas wedding
A traditional Khakas wedding (c. 1915)
Khakas woman Khakassia

Shor azaq chlts qzl quruq   chat- m:s   em pir iygi, igi sh trt pesh alt chetti segis togus on

The Shor are a small ethnic group closely related to the Khakas and located further west, in the forested areas of the Kuznetsk Alatau. Dialects: Mras-Su and Kondom Shor. Population: 2840 speakers (2010).[24d]

The Shor people created peculiar songs, such as Pörü "The wolf", so skillfully performed by singer Chiltis Tannagasheva. The song seems out of place in a modern studio and evokes instead an entirely different story of prehistoric survival.


azh   qzl     uzi     ib br igi ush durt bish alt chiti sigis doGus on

Fuyu Kyrgyz is an often omitted, oddly located, and presently nearly extinct variant of Khakas in northeastern China. It is now remembered only by the elderly and only to a very small extent. It was originally spoken northwest of Harbin along the Nenjiang River near a town called Fuyu, hence the odd exonym; the self-appellation is in fact Gïrgïs or Xïrgïs. The Fuyu Kyrgyz seem to have been exiled from Khakassia to Dzungaria in 1703-06 and then resettled to China in 1761 after the conquest of Dzungaria by the Qing Empire. Their language apparently belongs to the Khakas subtaxon (cf. namir < Khakas nanmr "rain"; and suG "water"). The Fuyu Kyrgyz were studied by Hu, Zheng-Hua (1982), and recently revisited by Butanayev (2005) from Khakassia, but no detailed description is available (in Chinese only?).

Religion: originally shamanism, then Lamaism. Population: only a few elderly speakers left, acc. to Butanayev (2005)



Chulym azaq, chlts qzl, xuruG pr uzu- m:s pa:r em, uG ts trt pesh alt chetti, segis toGus on

The Chulym River /choo-LIM/, a tributary of the Ob, flows through the taiga a very long way from any areas populated by the Khakas or Altay peoples. As a result, the local Chulym villages seem to be situated at the very edge of the inhabited world: there are hardly any human settlements to the north of that area for a good thousand miles, nothing but forests and marshland. [Note that there also exists another Chulym River, a tributary of Lake Chany, which is not connected to the Chulym of the Ob].

In the 20th century, Chulym was studied by Dulzon (1940-60's) and Biryukovich (1970's). After their formal recognition in 2001 as a separate ethnicity, the Chulym people began to set up their own village festivals and teach some language lessons.

Precontact way of living: fishing, millet and barley cultivation, semi-subterranean dwellings. Religion: shamanism before the 18th century, presently atheist or Orthodox. 355 persons, only 45 speakers (2010)[24d] (cf. 380 speakers in the 1970's).

Chulym people
(1) Pásechnoye Village located at the middle course of the Chulym River (3): almost a one-village country;
(2) Horsemen at the local Chulym festival (2010);

The existence of Melet and Tutgal variants in Middle Chulym, which are spoken in different villages, indicates at least several hundred years of linguistic differentiation within Chulym.

Lower Chulym has been traditionally described as a "dialect of Chulym", despite its many differences, the influence from Tomsk Tatar and its distant location, all of which set it rather apart. Lower Chulym apparently became extinct in 2010. Küärik, a third main dialect of Chulym along the lower course of the Kiya river (a tributary of the Chulym), disappeared at the beginning of the 20th century.[16a]

These facts suggest that Chulym was in fact a small subgroup of closely-related languages.


Subgroup 2c:
Altay (Turkic)

The Yenisei-Kyrgyz migrants to the Altai Mountains

The Altay (Turkic) subgroup is a complex assortment of rather poorly studied dialect-languages with an ambiguous classification, some of which may exhibit proximity to Khakas, while others to the Kyrgyz language of Kyrgyzstan. The peculiarities of the lesser Altay languages are frequently underestimated or completely ignored.

There are presently 65.500 nominal speakers of the Altay languages (2002), though the local dialects are quickly falling out of use. According to Baskakov (1969),[7] who studied some of the Altay dialects in vivo after WWII, the Altay subgroup may have the following taxonomic structure:

The North Altay Turkic subtaxon includes:
(1) Kumandy /koo-MAHN-deh, koo-mahn-DEE/; population: 2890 persons, c. 740 speakers (2010);[24d]
(2) Chalkan /chal-KAHN/ or Kuu /KOO/; population: 1180 persons, all bilingual in Russian; named after the Kuu ("Swan") River;
(3) Tuba /too-BAH/ (rather intermediate between North and South); population: 1965 persons, 230 speakers (2010);

The South Altay Turkic subtaxon includes:
(1) Standard (Literary) Altay, or Altay-kizhi /al-TY kee-ZHEE/ from kizhi "person", or Altay (Proper).
There are 74.230 persons formally listed as "Altayans", and circa 56.000 speakers (2010).[24d] Before 1948, the Altay people were confusingly named "Oyrots" after the subgroup of Mongolic languages due to their interaction with the Dzungarians in the 18th century, even though Radloff in the 1860's had called them just "Altayans".
(2) Teleut /te-leh-OOT/ was used as a standard before 1917; population: 2640 persons, 975 speakers (2010); for a typical example of the Teleut speech, see this clip
(phonologically, it is probably pretty close to what late Proto-Turkic sounded like);
(3) Telengit /teh-len-GIT/ is situated further in the mountains, and is thus less affected by external influence; population: 3710 persons (2010).
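The population and speaker counts quoted in the two lists above can be turned into speaker shares with simple arithmetic. The sketch below uses only the 2010 figures given for Kumandy, Tuba, Altay (Proper) and Teleut; Chalkan and Telengit are left out because no speaker count is quoted for them. The figure of 56.000 Altay speakers is itself approximate ("circa"), so the resulting percentages should be read as rough indications only. The function name is hypothetical.

```python
# (group, persons, speakers) as quoted in the lists above for 2010.
FIGURES = [
    ("Kumandy", 2890, 740),
    ("Tuba", 1965, 230),
    ("Altay (Proper)", 74230, 56000),  # speaker count is "circa" in the text
    ("Teleut", 2640, 975),
]


def speaker_share(persons: int, speakers: int) -> float:
    """Return speakers as a percentage of persons, rounded to one decimal."""
    return round(100 * speakers / persons, 1)


if __name__ == "__main__":
    for group, persons, speakers in FIGURES:
        share = speaker_share(persons, speakers)
        print(f"{group:15} {speakers:>6} of {persons:>6} persons = {share}%")
```

On these figures, roughly three quarters of the Altay (Proper) population are listed as speakers, whereas for the smaller groups the share falls to between about a tenth and a third.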

Altay (Turkic) is sometimes viewed as rather intermediate between Khakas and Kyrgyz languages. However, much of the Altay vocabulary seems to match Khakas, and to a lesser extent, Tuvan, therefore, according to the present study,[1] Altay (Turkic) should be seen as part of the Altai-Sayan subgroup, being closely related to the Khakas subgrouping. Also, note that much of the southern Altai Mountains are located in eastern Kazakhstan, which may explain certain non-Altai-Sayan features in Altay Turkic as a result of secondary interaction with Kazakh [uncertain].

Note the difference between the spellings of Altai Mountains and Altay (Turkic) languages: the names ending in -ai reflect an older historical spelling, whereas -ay is a modern English transliteration.

Also note that the Altai Republic (capital: Gorno-Altaysk, from Russian "Mountain Altaysk") and the Altai Krai /al-TY KRY/ (from Russian Krai "country", administrative center: Barnaul /bar-na-OOL/) are geographically connected but politically different federal subjects of the Russian Federation that should not be conflated. The Altay people mostly live in the Altai Republic, isolated in the mountains, whereas Altai Krai, situated on the plain, is presently almost entirely Russian-speaking.


North Altay (Turkic)
Kumandy ayak; kzl kurgak br uyta-; uykta m:s pu:r, k, uk, uu bir eki, iki ch trt, trt pish alt cheti segis togus,

Kumandy is spoken by fewer than about 1000 speakers living along the Biya river in the Altai mountains. The Kumandy language was described by Baskakov (1972). The suffix -d in Kumandï is a Turkic suffix marking an adjective, therefore the original meaning was apparently "of Cumans, Cumanic".

Just like the other North Altay languages, Kumandy seems to share many common elements with Khakas, Chulym and Shor, cf. (1) *S- > ch- in cheti "seven" as in Khakas cheti; (2) the word-initial n'- in nïmrtka "egg" as in Khakas nmrxa; (3) the word-final -G in sug "water, river" just as in Khakas; (4) the archaic -d-bs, -d-vs ending in verbs in the 1st person, plural, past tense, instead of -d-uk, -d-k as in most western Turkic languages, and other features.

A Kumandy fisherman

South Altay (Turkic)

Standard (South) Altay but, put; d'lds qzl qurgak d'albraq; , bri, uyukta- m:s bu:r, y bir eki ch trt besh, alt d'eti segis togus on

The official literary standard of the Altai Republic is based on the language of the Altay-kizhi people (from kizhi "person").

In phonology, the South Altay subgroup is characterized by the word-initial palatalized, lightly tapped /d'-/ or /j'/ as in /d'ok, j'ok/ "there is not", /d'ol, j'ol/ "way", etc. About 56.000 speakers (2010).


Altai (Altay) people

Listen to the Altay throat singing by Altai Kai in Batïrïs jurtaGan, literally "Bigman-our yurted" — "Once upon a time there lived our warrior (strongman, batïr)".



Subgroup 3
Great-Steppe (Turkic)

Most Turkic languages distributed over the enormous area of the Great Steppe, extending from the Irtysh River all the way to the Black Sea, have been shown[1] to belong to a single major genetic taxon apparently containing the following subdivisions:

(3a) the Kyrgyz-Kazakh subgroup, including Kyrgyz, Kazakh, Karakalpak and hypothetically, the unattested dialect of the Karluk tribes;
(3b) the Chagatai subgroup, including early medieval Chagatai, modern Uzbek, Uyghur and their dialects;
(3c) the Kimak subgroup (or Kimak-Kypchak-Tatar subgroup), which includes multiple languages stemming from the expansion of the Kimak Confederacy and the Golden Horde, such as Kazan Tatar, Bashkir, Sibir Tatar, North Crimean Tatar, Nogai, Kumyk, Karachay-Balkar, etc.

Note that the former two groups, Kyrgyz-Kazakh and Chagatai, are apparently more closely related to each other than to Kimak.

Most Turkologists normally refer to the Great-Steppe Turkic subtaxon as Kypchak /kip-CHAHK/, a term popularized by Baskakov's classification, which was generally accepted in the USSR during the 1950's-1980's. However, because the specific composition of the two concepts may be interpreted in different ways, and because the original Kypchak clans had a more modest historical reality, not necessarily extending to all the tribes of the Great Steppe (also see the discussion below), we decided to introduce a different term.

The Great-Steppe languages must stem from a rather archaic segment of late Proto-Turkic-Proper apparently originally dispersed in the Kulunda Steppe /koo-loon-DAH/ and along the nearby middle and upper course of the Irtysh River /eer-TISH[26]/. This Proto-Turkic segment does not seem to have been involved in the earliest migrations right after the initial Proto-Turkic split, so the Great-Steppe branches began to advance in the western direction only after about 600-700 AD.

Consequently, the languages of the Great Steppe retain many archaic Proto-Turkic features, on the one hand, and are quite newly formed and closely related, on the other, sharing good mutual intelligibility, especially within each subgroup, subjectively up to about 70-80% in real speech, according to reports of proficient speakers.

The Great-Steppe Turkic languages also have fewer innovations and borrowings from other languages, such as Persian and Arabic — a classical source of basic vocabulary loanwords in the west. However, they may include many cultural and technical borrowings from Russian.



Subgroup 3a:


The Karluk and Kyrgyz tribes that migrated to the Tian Shan

The earliest migrations in the Great-Steppe taxon were probably connected with the settlements in the vicinity of the Tian Shan Mountains. The Tian Shan is known as Tanrï da: in Turkish, Tengri taG in Uyghur and Te:nger U:l in Mongolian meaning "Heavenly (or God's) Mountains", which suggests that the Chinese name tian shan "sky (heavenly, blue) mountain(s)" may merely be a reinterpretation of a Turkic or Mongolic original.


The descendants of the Karluk Confederacy
It should be explained that the exact origins and dialectal affiliation of the Karluks are quite obscure. Herein they are viewed as an ethnic group closely related to the Kyrgyz of Kyrgyzstan, which is more of an educated guess than a well-supported hypothesis.

The Karluk /kar-LOOK/ Confederacy (766-840) was a medieval state located in the Zhetysu (rather Jeti-Su /jeh-tee-SOO/) ("the Seven Waters/Rivers"), a historical region with arable land and multiple rivers between the Tian Shan mountains and Lake Balkhash /bahl-KAHSH[26], bal-HUSH/ near present-day Kyrgyzstan.

Originally, the Karluks must have been a clan from the Altai Mountains that had migrated towards the Irtysh River c. 665, finally reaching the Jeti-Su by c. 700 AD [wiki]. After the famous Battle of Talas /tah-LAHS/ in 751, when the Chinese forces were defeated by the Arabs, the Karluks were able to occupy Suyab, the capital of the Western Gökturk Kaganate, and, beginning in 766, gained control over the northern part of the Silk Road and the whole Jeti-Su area. They were partly converted to Islam c. 780.

In 840, the Karluk Kaganate was subdued by a secondary migration wave of the Yenisei Kyrgyz (from the Altai Mountains?)[uncertain]. By 940, the Karluk Kaganate was captured by the Karakhanids.

It seems that after the disappearance of the Karluks, the region was occupied by the Kyrgyz tribes, though it is entirely uncertain when and why the Kyrgyz people first appeared in Kyrgyzstan, with different sources citing various opinions on the matter. At any rate, a Turkic tribe named Kyrgyz, apparently located in the vicinity of the Tian Shan region, was mentioned by Mahmud al-Kashgari at least as early as 1073.


Tian Shan Kyrgyz
ayaq Jldz qzl qurGaq Jalbrak ukta- myz bo:r y bir eki ch trt besh alt Jeti segiz toGuz on

Kyrgyz people

Kyrgyzstan /KIR-giz-STAN/ (capital: Bishkek /bish-KEK/) is a small mountainous country in the Tian Shan near Lake Issyk Kul /EE-sek KOOL[26]/ (lit. "Hot Lake"), originally situated along the northeastern part of the Silk Road.

The legendary history of the Kyrgyz people, including battles against the Khitans and Dzungarians, is described in the Epic of Manas /mah-NAHS/, an extremely long, orally transmitted epic poem (essentially similar in spirit to the Greek Iliad and Odyssey) first mentioned in the 16th century and written down in 1885.

The ample narrative tradition reflected in the Manas makes Kyrgyz an elaborate and full-blooded literary language. It is also famous for the works of Kyrgyz writer Chingiz Aytmatov /chin-GIZ uyt-MAH-tov/ mostly created during the 1960's.

Kyrgyzstan was integrated into Russia in 1876, but eventually became independent in 1991. Youngsters often no longer speak Russian.

The Kyrgyz people and language were known as Kara-Kyrgyz before the 1920's. Religion: formally Muslims, though, as Radloff attested in the 1860's[23b], Islam did not take much root among the Kazakhs, and even less so, among the Kyrgyz tribes of the 19th century, so both languages are relatively free of Arabic borrowings. There are circa 4 million speakers of Kyrgyz.

Listen to the song Aged 18 from the 1960's performed by Zhanetta Bobkova (2009), with a nice voice, poetry and the girl (and the numerals), and to an older and quainter version of the same song by Zeynep Shagieva (1960's?), seen as Kyrgyz music classics, as well as another old dramatic song: Ömür daira "The River of Life" by Zhanysh Kochkorov.


On the origins and pronunciation of the ethnonym Kyrgyz: (Note: any ethnonymic remarks are unavoidably hypothetical.)

The traditional Anglophone spelling and pronunciation of Kirg(h)iz is /keer-GEEZ[26]/, which is based on the Russified variant with an /ee/, but the original Turkic phonology includes the /ï/ back vowel and therefore is rather shorter and harder: Qïrgïz /kr-GEZ, ker-GIZ/.

The outdated ethnonym "Karagas" for Tofa(lar) [or another local tribe?] may also be just another way to pronounce "Kyrgyz"; moreover, note the direct retention of this ethnonym in Fuyu Kyrgyz in China.

The ethnonym Kyrgyz may be listed among the oldest known Turkic ethnonyms, and just as in other similar cases, it probably originates from a name or alias of a patrilineal clan's progenitor [herein]. Many centuries after its creation, the name must have spread to several other neighboring clans or clan confederacies, finally becoming overused and ambiguously applied to many ethnic groups of various descent.

It may also be assumed that the word has the same root as in *kork- "to fear" or as in *kyr- "to break" and may contain a reduplication *kyr-kyr > *kyr-kyz with the first -r retained before the consonant. Moreover, words of the same phonological shape in the Turkic languages of West Siberia seem to allude to terror and force, cf. Tuvan korgysh, Khakas xorGs, Kyrgyz korkush "fear, terror"; Kazakh qurtu "exterminate", qrqu "shearing, cutting"; Altai kr "erase", krksh "shearing", Sakha krgs "fight, destroy each other", etc.

A more popular but less likely and less meaningful etymological version (apparently first mentioned in the History of the Yuan Dynasty) is that the Kyrgyz ethnonym originates from a juxtaposition qïrq + qïz "the forty girls" or "forty + an unknown suffix".


Kazakh ayaq zhldz qzl qrGaq zhapraq yqta- myiz bawr y bir eki sh tort bes alt zhetti segiz toGz on

The Republic of Kazakhstan /KAH-zak-STAN/ (capital: Astana /AHS-tuh-NAH/) is just that big, giant, eye-catching spot on the map of Central Asia. Despite its large size, much of central Kazakhstan's territory is in fact semi-desert continental steppe, with most of the population concentrated in the north along the border with Russia or in the south near the Tian Shan Mountains. Note the former capital and largest city, Almaty /AHL-muh-TEH, Russified: ahl-mah-TAH/ (probably from Kyrgyz Alma To: "Apple Mountain", but the exact etymology is controversial).

Historically, the Kazakh /kuh-ZAHK,[26] kuh-ZAHH/ people seem to be just those Kyrgyz nomads that had begun to spread beyond their original Jeti-Su and the Chu river homeland near the Tian-Shan after the 1460's, and whose language was afterwards strongly affected by the Nogai and Tatar dialects of the dissolved Golden Horde.[1] In the 17th-18th centuries the country was divided into the three zhüz (jüzes) (= large confederacies of Kazakh tribes).

From the 1820's onward, the Russians began to use the Kazakh territory for coal mining and agriculture, and later for nuclear tests and launches from the Baikonur Cosmodrome. Kazakhstan became independent in 1991, emerging as a huge Central Asian power with a rapidly growing economy and a relatively high level of urbanization.

Kazakh and Kyrgyz are mutually intelligible, and the Kazakhs were even named Kazak-Kyrgyz or Kaisak-Kyrgyz or just Kyrgyz in the Russian sources between the 1730's and 1920's (the self-appellation was still Kazakh, though) [e.g. see Melioranskiy (1894).[14]] Cf. an old Kazakh saying, /qazaq qïrGïz bir tuGan, sart shirkindi kim tuGan/ "Kazakh and Kyrgyz are one kin, but who in the world made Sart? (=a Chagatai city dweller, trader, an Uzbek)." There are circa 12 million speakers.

Listen to the Jalgan ay folk song by Asemkhan from the Xinjiang Autonomous Region of China where Kazakh is also spoken — a nice and clear eastern pronunciation and admirable voice.

Kazakh people, Kazakhstan
Modern buildings in Astana (upper row):
(1) The Pyramid of Peace;
(2) The Khan Shatyry Entertainment Center;
(3) The Bayterek in the distance (the tower with the golden ball)

Karakalpak ayaq zhuldz qzl qrGaq zhapraq uyqla- muyiz bawr y bir eki sh trt bes alt zheti segiz toGz on

Karakalpak /kah-RAH-kal-PAHK[26]/ literally means "black calpacks, hats" (= brave warriors). Karakalpak is spoken in the autonomous Republic of Karakalpakstan in Uzbekistan (capital: Nukus /noo-KOOS/), located near the southwestern coast of the Aral Sea, and is often considered nearly (but not quite) a dialect of Kazakh.

After the inflow of the Oxus River, or Amu Darya /ah-MOO DAR-ya[26]/ (from Persian daryá "sea, river", therefore originally /ah-MOO dah-RYAH/), was diverted for irrigation, the Aral Sea shrank and almost disappeared between the 1950's and the 1990's, causing terrible deterioration in the whole region.

Karakalpak exhibits even more Nogai-Tatar influence than Kazakh. As to the status of Karakalpak, Poppe (1965)[18a] noted the following, "Menges has correctly stated that Karakalpak is a dialect of Kazakh but not an independent language as the Soviet scholars believe." Nevertheless, there exist separate dictionaries of the Karakalpak language.


The relationship between Kyrgyz, Kazakh and Nogai
The current lexicostatistical study[2] demonstrates that modern Kyrgyz and Kazakh are surprisingly close (circa 91-92% in Swadesh-215), probably even constituting a single dialect continuum at their geographic extremes. As mentioned above, both ethnic groups were commonly known as Kirgiz until the 1920's.
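For readers unfamiliar with lexicostatistics, a percentage such as the one quoted above is, in its simplest form, the share of meanings on a fixed word list for which two languages use cognate words. The following is only a minimal sketch of that arithmetic and not the procedure of the cited study[2]: the function name, the numeric cognate-class labels and the six-item toy lists are all hypothetical, and it is an assumption here that excluded borrowings are marked as None and skipped.

```python
def shared_percentage(classes_a, classes_b):
    """Return the percentage of comparable meanings whose cognate class matches.

    Each argument maps a meaning (e.g. "star") to a cognate-class label.
    A label of None marks an item left out of the comparison (for example,
    a borrowing); such meanings are skipped, as are meanings missing from
    the second list.
    """
    compared = 0
    shared = 0
    for meaning, cls_a in classes_a.items():
        cls_b = classes_b.get(meaning)
        if cls_a is None or cls_b is None:
            continue  # not comparable: excluded or missing item
        compared += 1
        if cls_a == cls_b:
            shared += 1
    if compared == 0:
        raise ValueError("no comparable meanings")
    return 100.0 * shared / compared


# Hypothetical toy data: class labels are arbitrary integers, not real judgments.
lang_a = {"foot": 1, "star": 1, "red": 1, "dry": 1, "leaf": 1, "sleep": 1}
lang_b = {"foot": 1, "star": 1, "red": 1, "dry": 1, "leaf": 2, "sleep": None}

print(shared_percentage(lang_a, lang_b))  # 4 of 5 comparable meanings match
```

With a full 215-item list, the same count would be taken over all meanings for which both languages have an inherited (non-borrowed) word; deciding which words count as cognate remains the linguist's judgment and is not something the sketch can supply.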

Baskakov's classical classification (1952)[6][7] grouped Kazakh and Nogai together with the other "Kipchak" /keep-CHAHK/ languages, whereas Kyrgyz in that classification was locked away into a special subgroup along with South Altay. Being an author and coauthor of Nogai dictionaries and textbooks during the postwar 1940-50's, Baskakov seemed to view Nogai as particularly close to Kazakh; however, an examination of his classification reveals that he did not differentiate between shared archaisms and innovations. Consequently, there turns out to be little evidence relating modern Nogai, situated in the North Caucasus and a rather typical Kimak language, directly to the Kazakh stem, whereas most shared features between the two languages seem to be archaic retentions present in many other languages of the Great Steppe. This does not mean, however, that Kazakh and Nogai have nothing in common, and certain features, such as the /ch/ > /s/ mutation, indeed seem to be recent innovations (also present in Sibir Tatar and other languages), but herein they are rather attributed to a secondary mutual influence [uncertain].

Kazakh, which occupies the vast steppes of Kazakhstan, must have separated from the Kyrgyz stem in the Jeti-Su region by the 15th century. According to the present study,[1] Kazakh seems to have been affected by a dialect of the Nogai Horde and acquired certain new features which finally differentiated it from the Kyrgyz foundation. This seems quite logical, considering that the period of dispersal of the Nogai Horde during the 2nd half of the 16th century matches the early formative days of Kazakh, and some of the stray Nogai clans, at least in theory, could have become intermixed with the early Kazakhs somewhere near the Ural (Yaik) River.

On the other hand, the Kyrgyz language of Kyrgyzstan, isolated in the Tian Shan mountains, retained more archaisms of the Altay type and probably even acquired new Altay borrowings during the Dzungarian invasion of the Oyrots in the 17th century.[1] There is good phonological correspondence between Kyrgyz and South Altay, including some shared isolexemes, such as Kyrgyz and Standard Altay but "leg", Kyrgyz chong, Standard Altay d'a:n "big", etc. As a result, Kyrgyz speakers may find Altay languages rather intelligible. This leads to the conclusion that Kyrgyz may have been affected by the recent (17th-18th centuries) migration from the Altai Mountains that must have come with the invading Dzungarians [uncertain].

The relatedness among Altay, Kyrgyz, Kazakh, Nogai and Kazan Tatar is a typical example of Turkic languages forming a dialectal continuum with many secondary seams. So, to rephrase the old quote, if one could take a ride in the early 19th century from the Altai Mountains to Kazan City along the Volga, in each town on the way, he would find a dialect only slightly different from the one in the previous town.


The Karluk tribes that crossed the Tian-Shan

The descendants of the Chagatai Khanate

The patchwork of Central Asian languages gets particularly complex at this point.

Sometime during the murky, turbulent days of the Mongol invasion of the 13th-14th century, a certain segment of Proto-Karluk-Kyrgyz-Kazakh speakers at the foothills of the Tian Shan Mountains (with the Karluks being a particularly likely example) must have spread over the Tian Shan mountain ridges into the territory of the Karakhanid Khanate /kuh-RAH-huh-NEED/, which had been destabilized by skirmishes and partly devastated [uncertain]. These Proto-Kyrgyz-Kazakh tribes must have largely displaced the local Karakhanid population and become intermingled with it, thus creating the basis for what soon became known as the medieval literary Chagatai language /chuh-gah-TY/.

As a result, the present-day Kazakh and Kyrgyz are particularly close to Uzbek and Uyghur[1], sharing with them about 83% of lexemes in the 215-word Swadesh list (borrowings excluded).

Whereas spoken Chagatai must have split up into western and eastern dialects by about the 14th-15th century, finally transforming into the modern Uzbek /OOZ-bek[26], ooz-BEK/ and Uyghur /ooy-GOOR/ languages of today, written Chagatai continued to be used as a common medieval Turkic lingua franca in literature and written correspondence until about the 19th century.



Chagatai ayaq, yulduz qzl quruq, yapurGan yapurGaq yapurGaG uyu baGr y bir iki ch trt besh alt yeti sekiz toquz on

Chagatai /chuh-guh-TY/ is essentially Proto-Uzbek-Uyghur, and an indirect continuation of Karakhanid. Originally, it was the language of the Chagatai Khanate (c. 1230-1700) established by the Mongols to replace the Karakhanid dynasty in the Tian Shan and the Tarim Basin; Chagatai Khan, after whom it was named, was the second son of Genghis Khan. At their greatest extent, the Chagatai Khanate domains spread from the Irtysh River in Siberia down to Ghazni in Afghanistan, and from Transoxiana to the Tarim Basin, which obviously contributed to the large-scale distribution and acceptance of the Chagatai language.

The period of classical Chagatai literature starts with the publication of Navai's /NAH-vah-EE/ (1441-1501) poetry. After that, Chagatai had its heyday during the Timurid Empire. As a result, between 1400 and 1920, Chagatai was transformed into a sophisticated Central Asian koine written with the Perso-Arabic alphabet and having many local variations. These Chagatai varieties are usually known as Türki /tyoor-KEE, tur-KEE/.

As much as the Arabic script created difficulties in phonetic interpretation, it provided latitude for dialectal variation and cross-cultural usage. Each Türki dialect user could write and reinterpret other people's writings in his own Turkic dialect, still using the same writing system understandable to anyone. Because of the widespread Persian bilingualism and the knowledge of Arabic, he could also add a generally accepted Persian or Arabic word when he thought a local Turkic expression would not do, which finally resulted in a high percentage of Perso-Arabic borrowings. Therefore Chagatai-Türki can also be seen as a written communication system rather than a real spoken language.

Consequently, one may assume that the emergence of the early Chagatai was very similar to the rise of Middle English from the Scandinavian and Anglo-Saxon linguistic exchange with multiple French and Latin borrowings. Finally, the four different medieval cultures (Karluk/Kyrgyz, Karakhanid/Old-Uyghur, Persian, and Arabic) mixed and blended, creating the variety of today's Uzbek and Uyghur dialects with their distinct local flavor, as well as the strong recent Russian or Chinese influence. Unsurprisingly, Uzbek, which is in fact the modern-day descendant of Chagatai, is still the most widely spoken Turkic language apart from Turkish and Azeri.

Look for Qaro ko'zlar (Urgelai) "(Your) black eyes (My beloved one)" sung by Uzbek singer/actress Ziyoda and styled as Babur's /bah-BOOR/ 16th century's Chagatai poetry — this exquisite and refined music clip may catch your fancy.

Uzbek oyoq yulduz qizil quruq yaproq uxla- shox, uy br ikk uch trt besh lt yett sakkz tkkz n

The Republic of Uzbekistan (capital: Tashkent) is mostly desert territory, with life historically concentrated only in the fertile Fergana Valley and the southern oases of arable land along the Zeravshan River known as Sogdiana, which includes such prominent, large, ancient cities as Khujand (founded by Alexander the Great in 329 BC), Bukhara /Anglophone: boo-KAH-rah,[26] Russophone: boo-ha-RAH, Uzbek: boo-haw-RAW/ (known since 500 BC) and Samarkand (since 700 BC). The Arabic name for the region was "Mawaran-nahr", meaning "beyond the river (Oxus)", hence also Transoxiana in Latin.

The invasion of the Karakhanid Khanate by the Mongols in 1219 led to the establishment of the Chagatai Ulus and the diffusion of the newly formed Chagatai language over the Persian substratum. Because of the interaction with Persian, among the most typical features of Uzbek are the Perso-Arabic borrowings and the loss of vowel harmony.

Timur / Tamerlane /tee-MOOR, TAH-mer-layn[26]/, who was born near Samarkand, was a conqueror of Central Asia who founded the Timurid dynasty (1370-1585) and was famous for his brutality.

An Uzbek Chai-Khana, Samarqand, pilaf, Emir of Bukhara, an Uzbek market
Left to right: (1) Chai-khana (tea house) visitors and (2) the Emir of Bukhara
(both images are early true color photography by Prokudin-Gorski, c. 1911);
(3) downtown Samarqand today;
(4) a pilaf dish;
(5) Uzbeks as excellent market traders (present-day)

Before 1924, the Uzbeks used to be known as "Sarts" (originally, townspeople, or city dwellers as seen by the nomads in the north) and the Uzbek language was even known as Sart tili.[25]

Presently, Uzbek is a robust, significant Central Asian language with several internal dialects and about 25 million speakers (40% non-Russophone).

Curiously, the modern Uzbek Latin alphabet (introduced in 1993) uses only ASCII (=American) characters, for instance <o'> instead of <ö>; it may be used along with the older Cyrillic alphabet.

Look for a modern blissful love song Chegaralar bormu, qaysarliklaringä? "Are there any limits to your stubbornness?" by Bojalar & Ruhshona, made in the 1970's style, humorously recreating life in the Soviet Union. Moreover, watch a clip about an Uzbek family near the Zeravshan Mountains still living in the old ways (in English).

Khwarezm [/haw-REZM/; the odd English spelling comes from Persian, so Khorezm seems more appropriate] is a historical oasis civilization in Central Asia that deserves special mention. Khwarezm was located in the lower course of the Amu Darya (Oxus) River, on the present-day border of Uzbekistan, Turkmenistan, and Karakalpakstan (= an autonomous republic of Uzbekistan).

The rise and demise of Khwarezm have been connected to the instability of the riverbed of the Amu Darya (Oxus), which flows through the Kara Kum /kuh-RAH KOOM/ ("Black Sand") and Kyzyl Kum /kuh-ZIL KOOM/ ("Red Sand") deserts in its upper course. In 1598, the Amu Darya turned north, away from the Caspian Sea, thus leading to the formation of the Aral Sea as it was known until the 1990's, when it dried up again, partly due to another bend of the Amu Darya that turned to Lake Sary-Kamysh ("Yellow Reed"). The dry Amu Darya riverbed is now known as the Uzboy.

The Khwarezmian language of East Iranian stock had been spoken in the area until the 8th-13th century, but was mostly eradicated by the Arab and, finally, the Mongol invasions. At the time, Khwarezm was famous for a number of early scholars. Muhammed Al-Khwarezmi (=from Khwarezm) (780-850) was a famous Arabic-writing mathematician, who introduced the decimal numbers to the Western world and whose name is commemorated in the word "algorithm". Al-Biruni (973-1048) was a polymath, known as the founder of Indology, and a contemporary of Avicenna (980-1037) from Bukhara. Avicenna, too, visited Köhne-Urgench (Turkmen: "Old Urgench" /oor-GENCH/), the then-capital of Khwarezm, established as early as about the 5th century BC.

During the Karakhanid rule in the 12th-13th centuries, the main language in the area was the Khwarezmian dialect of Karakhanid, written in the Arabic script; it supplanted the Iranic substratum, but was itself gradually replaced by Uzbek Chagatai.

After the bloody massacres of the invasions of Genghis Khan and Tamerlane and the drying of the Uzboy, the capital was transferred from Old Urgench to Khiva /hee-VAH/. Khiva was taken by the Russian troops in 1873, which led to the abolition of the slave trade, though Khwarezm still retained some independence until 1924. Presently, Khiva, with its beautiful old town, has been turned into pretty much an open-air museum. A Khwarezmian (Oghuzic) dialect of Uzbek is spoken in the area.

As a sample, look for Här görgende yurek tik-tik urmei-mi? literally "At every glance the-heart, tick-tick, doesn't-beat-does-it?" by Feruza.

The Khwarezm civilization: Khiva and Old Urgench
(1) The Kunya Arka City Wall, Khiva (founded in 1688,
restored in the 19th century);
(2) Al-Khwarezmi monument;
(3) The unfinished Kalta-Minar minaret (1855), Khiva;
(4) A street in Khiva;
(5) Khiva in the 19th century, unknown artist;
(6) The capture of Khiva, a fragment of painting by Vereschagin (the 1870's);
(7) The ruins of Old Urgench in the desert,
where al-Biruni and Avicenna could have met;
the image shows the exceptionally tall 60-m minaret (from the 1320's)
and the Tekesh Mausoleum (from the 13th century)

Uyghur ayaq yultuz qizil quruq yopurmaq uxla- mNgz beGir y bir ikki ch trt bsh alt ytt skkiz toqquz on

Uyghur /ooy-GOOR/ is the eastern descendant of Chagatai spoken in the Xinjiang /sin-JANG/ Uyghur Autonomous Region of China (capital: Urumchi /oo-ROOM-chee[26], oo-room-CHEE/). Essentially, Uyghur is distributed along the edges of the Takla-Makan Desert /tak-LAH mah-KAHN/ (most likely from Uyghur taglï makan "mountain dwelling"). The Silk Road here has always been ethnic running water, and Chagatai was blended into the earlier 9th century's Kara-Khoja (Old Uyghur), as well as into Persian and Chinese adstrata, yet most scholars would agree that modern Uyghur cannot be seen as a direct continuation of Old Uyghur.


Uyghur, Uygur, Uighur
(1) A street in Kashgar /kush-GAR/;
(2) Uyghur women at the mosque

Outside the peculiar use of the Arabic alphabet (mostly dropped in other Turkic countries during the 1920's), the Uyghur language is typically characterized by long vowels and the dropping of the final -r (karGa > ka:Ga "crow"), as in British English; otherwise it is mostly similar to Uzbek.

Before the 1920s, all Chagatai-speaking Muslims in the region were known under different names, such as Kashgar (in the west), Moghols (the ruling class), Sarts (merchants and townspeople), Taranchis (farmers), etc., whereas the ethnonym of "Uyghur" was artificially created (or rather restored) only in 1921. Population: c. 9 million speakers.

Both Uyghur and Uzbek are languages with pronounced dialectal differentiation that is poorly researched. Uyghur, for instance, seems to embrace several closely related dialect-languages, such as Ili /ee-LEE/ in the northeast, Lop (Luobu, Lobnor, Lopnur) in the east, the central dialect (Turfan, Kashgar), the southern Khotan (Hotan) dialect; a special position belongs to Äynu.



Subgroup 3b:

The descendants of the Kimak Confederacy

The Dialects of the Golden Horde

Kimak dialects of the Golden Horde (clickable)

The Kimak subgroup includes at least Kazan Tatar, Mishar Tatar, Crimean Tatar, Astrakhan Tatar, Sibir Tatar, Bashkir, Nogai, Kumyk, Urum, Crimean Karaim, Lithuanian Karaim, Karachai-Balkar, Kypchak/Cuman/Polovtsian (extinct) and possibly other major dialects and languages. There are good reasons to believe that all of them stem from the Kimak /kee-MAHK, kee-MAK/ Confederacy (later Kaganate) that formed along the upper course of the Irtysh River by about 700-800 AD.

Note: The exact position of Baraba and Tomsk Tatars is much less obvious, though.

According to the well-known legend, attested by Gardezi in his work Zayn-al-Akhbar c. 1030, where he actually cites an older book by Ibn Khordadbeh (820-912), the Kimak Confederacy initially consisted of seven original clans, including Kimak (Proper), Tatar, Kypchak, Bayandur, Imi, Lanikaz, and Ajlad. Hence the expression "The snake has seven heads", cited by Mahmud al-Kashgari in 1073.

Kimak or Kimek was also called Yemek or Imek in Arabic sources, but the difference between the two is rather obscure. It is assumed herein that it may most likely have arisen due to an error in copying the letters of Arabic alphabet, though Kumekov,[15] one of the main scholars of the early Kimak history, cites different opinions. The lack of vowel harmony in Kimak also indicates that the word has been distorted after going through Chinese and Arabic renderings: the original pronunciation was perhaps Qumuq (cf. qum "sand") as still reflected in the self-appellation of Kumyks distributed near the Caspian Sea.

The Kimak Kaganate was a great pastoral nomadic Tengriistic confederacy of local clans that existed near the southern edge of the Altai Mountains, especially around Lake Zaysan /zy-SAHN/ and the upper course of the Irtysh River between 743 and 1210 AD,[1] also see [Kumekov (1972)][15] for details.

The Kimak Kaganate had initially been part of the Göktürk-Uyghur Empire, but must have become free after its collapse in 840. The Kimak population was semi-nomadic and relatively urbanized, with over a dozen towns scattered along the upper Irtysh River, such as Imakiya /ee-mah-KEE-ya/, which is probably an Arabic misspelling for the adjective "Kimak (Imak)" (City). These towns were marked on the map prepared by the Arab geographer Al Idrisi (1099-1165). The towns had markets and temples, and were visited by Chinese merchants taking part in the Silk Road trade; their inhabitants used the Orkhon script. This Kimak civilization is now rarely mentioned by historians, although it seems to have been an influential cultural and political formation in Southwest Siberia.

Archaeological evidence and migrational analysis suggest that somewhere after 850 AD, the Kimak tribes began to spread northwest down the Irtysh towards the Tobol River /teh-BAWL/, and finally all the way to the Southern Ural. By the 900's AD, the Kimaks must have reached the Volga River, known as Itil /ee-TEEL/ in local Turkic languages (originally from Bulgaric), where the Kimaks were vividly described by Ibn-Fadlan in 922 as "al-Bashkird".

By 1068, the Kimak clans began to migrate further into the fecund Pontic pastures, raiding the towns of Kievan Rus. Here, they became known as the Polovtsy /PAW-lov-tsee/ or Polovtsians to Kievan Russians, Cumans /koo-MAHNS/ to Byzantine Greeks and Hungarians, and Kifchak < Qypchaq /kep-CHUK, kip-CHAHK/ to Arabs. During the 12th-14th centuries, this westernmost Kypchak dialect was recorded along the Black Sea coast in a medieval textbook known as the Codex Cumanicus.

Because the westernmost Kimak descendants were addressed as "Kifchak" in Arabic sources, the name Kipchak was passed into the 20th century's classifications; however, it seems to be poorly founded in other respects. Despite the fact that Kypchak is a frequent clan name among many Turkic peoples, it looks like the Kypchaks constituted only a relatively small part of the original Kimak confederacy and were attested mostly in the area adjacent to the Kievan Rus. They are briefly mentioned, for instance, in the Secret History of the Mongols (1240),[23a] but only as a vague nickname. Therefore the term "Kypchak" as used for all of the clans and tribes that once inhabited the Great Steppe seems to be an overextrapolation promoted by Baskakov's classification beginning in the 1960's. Nearly nowhere in his late booklet about the Kypchak languages (1987),[15a] which was supposed to cover the whole subject in detail, did Baskakov address the issue of the origin, early development and migration of the original Kypchaks; apparently, to him "Kypchak" was just a suitable name for the Turkic languages of the Soviet Union in general, except for Oghuz, Khakas and a few other strongly differentiated branches. This is the reason why we tried to abandon the term in the present classification by differentiating between the original Kimak Confederacy at the Irtysh and the languages of the Great Steppe in general. The terms Kimak and Kimak-Kypchak-Tatar are used as interchangeable synonyms herein, the latter being just a self-explanatory expansion of the former.

Polovtsian statues
Polovtsian statues near Izyum, Ukraine

In any case the Kimak-Kypchak-Tatar ethnic groups left large geographic traces on the map of Eurasia, so the whole Great Steppe (Ponto-Kazakh steppe) was once designated as Cumania (in Latin), Desht-i-Qipchaq (in Persian), Kipchak steppe or Polovtsian Land (in Russian), etc.

The Kimak-Kipchak-Tatars are also remembered for their stone statues (known as bábas in Russian), a very typical sign of their early culture.

After the Mongol invasion of the 13th century, the descendants of the original Kimak migrants were apparently integrated into the Ulus of Jochi. Jochi was actually the eldest, and therefore the most important son of Genghis Khan, who had participated in the invasion of "the forest peoples" of Siberia c. 1207 and thus inherited the western part of Genghis Khan's empire in 1226 because of his achievements. However, he died just months later, so the name of his empire was purely formal, and the Ulus of Jochi rather became known as the Golden Horde (1240-1502) in European historiography.

The Golden Horde was a predominantly Kimak-Kypchak-Tatar Khanate ruled by a nominally Mongol elite that was formally Islamized only in the 14th century.[25] At the time when being a Mongol signified power, the original Mongolian descent was probably claimed by many local tribes and families, and many local rulers were or claimed to be genetically Mongolic on their paternal lineage. However, the use of the Mongolian language in the Golden Horde was rather limited, so it is reasonable to assume that most local clans were in fact of purely Kimak-Kypchak-Tatar linguistic background. It should be noted, on the other hand, that the Mongolian presence is evidenced in a thin layer of Mongolic borrowings in many Kimak languages.

By 1500, after 250 years of rule by Mongolian dynasties, the Golden Horde Empire broke up into several important "Tatar" khanates, including the Khanate of Kazan /kuh-ZAHN/ (hence Kazan Tatars), the Khanate of Crimea /kry-MEE-ah[26]/ (hence Crimean Tatars), the Khanate of Astrakhan /AHS-trah-kan/ (hence Astrakhan Tatars), the Qasim /kah-SIM, kuh-SIM/ Khanate (hence Mishar /mee-SHAR/ Tatars), and the Uzbek Khanate (hence the modern name of Uzbeks). This diversification process finally contributed to the crystallization of modern Kimak-Kypchak-Tatar languages and dialects. As a result, another suitable term for the majority of Kimak languages could be the languages of the Golden Horde, given that it was the Kimak descendants rather than pure Mongols who actually inhabited the Golden Horde area.

During the reign of Ivan the Terrible (1533-1584), the Russian armies defeated and annexed the Kazan and Astrakhan Khanates and moved eastward beyond the Urals, where they attacked another Tatar state, the Tengriistic Khanate of Sibir /see-BIR/ (1495-1582) (capital Siber, or Qashlyk /kush-LIK/, the latter evidently from qïsh-lïq "the winter camp"), located on the lower part of the Irtysh River (where it meets the Tobol) and ruled by Kuchum Khan /koo-CHOOM/. The task of annexing the Khanate of Sibir was accomplished by Yermak /yer-MAHK/, a Cossack leader, sometimes depicted in Russian historiography as something of a Siberian Columbus. Curiously, irmak means "river" and yermek "to scorn" in Turkish and some other Turkic languages, which suggests that Yermak himself might have been of Turkic origin. This is supported by an interesting local Baraba legend, recorded by Dmitriyeva in the 1950-60's,[16d] which says that Yermak had grazed the cattle for Kuchum Khan before they had a quarrel, and so Yermak finally came back with an army from Ivan the Terrible [also see Sibir Tatar below].

All the Kimak languages exhibit considerable mutual intelligibility among themselves, for instance Kazan Tatar and Bashkir are still strikingly close (95% in Swadesh-215, borrowings excluded).[2] Moreover, being part of the Great-Steppe supertaxon, the Kimak languages are also closely related to Kyrgyz-Kazakh (80% in Swadesh-215, borrowings excluded) and Uzbek-Uyghur (78%).

The typical phonological features shared by Kimak members include:
(1) the partial loss of the original *S- as in Kazan Tatar yoldz, Nogai yuldz, Bashkir yondo "star"; Kazan Tatar yafraq "leaf", yul "road", ylan "snake", yrek "heart", etc., but the partial retention of *S- in /Ji-/ as, for instance, in Kazan Tatar Jir "earth", Jil "wind", often with an allophonic distribution across different dialects;
(2) the presence of the semi-vowel /-w-/, /-u/ after a vowel as in awuz "mouth", tau "mountain";
(3) the /-t-/ > /-l-/ mutation in suffixes and endings, as in Kazan Tatar yoqla-, Nogai uykla-, Bashkir yoqla- "to sleep", as opposed to Kyrgyz ukta-.


Battle with Polovtsians, Tataro-Mongol invasion, Battle with Sibir Khanate Tatars
The battlefield of Igor Svyatoslavich with the Polovtsians (Cumans) in 1185, painting by Viktor Vasnetsov (1880)
The siege of Moscow
by Mongol Khan Tokhtamysh in 1382,
painting by Vasily Smirnov (the 1880's)
The conquest of the Sibir Khanate by Yermak in 1582,
painting by Vasily Surikov (1895)

On the origins of the ethnonym Polovtsian:

The word Polovtsian may be distantly familiar through the theme song Polovtsian Dances (here is an engaging modern rock version). The music originates from the 1890's opera Prince Igor by Alexander Borodin /baw-raw-DEEN/, which was then remade into the Stranger in Paradise (1953). The 19th century's opera created by Borodin had in turn been based on The Tale of Igor's Campaign (of 1185), one of the most famous works of the early East Slavic literature that integrates many Turkic motifs. The etymology of the Old Russian word Pólovtsy should most likely be interpreted as "the field inhabitants, those who come from the steppe" (from Russian and Slavic pól'e "field"), though the traditional interpretation from Vasmer's Etymological Dictionary [referenced to Sobolevsky (1886)] is based, apparently incorrectly, on the Old Russian adjective polóvïy "light yellow", which does not seem to have any meaningful connection to Turkic tribes.


On the origins and development of the ethnonym Tatar:

The name Tatar /TAH-ter,[26] though originally: tah-TAR/ was first firmly attested in 732 AD on the Kül-Tegin monument and then mentioned again by Mahmud al-Kashgari in 1073 AD. Just like Kyrgyz, the name Tatar seems to be among the earliest-attested names of Turkic clans, and perhaps the most widely known throughout the world. It may be understood that originally the word Tatar referred to the name or alias of a patrilineal clan founder; in other words, it worked like a modern male surname [suggested herein].

However, after the expansion of the Tatar clan, the name Tatar was frequently applied to various external parties that had very little or absolutely nothing to do with the genetic or linguistic core of the original clan members.

Certain Tatars living in northeastern Mongolia east of Lake Baikal c. 1200 AD are mentioned in the Secret History of the Mongols[23a] (apparently authored by Genghis Khan himself). Whoever the author was, he says almost nothing about the language of these Mongolian Tatars, except that on a few occasions they seem to be able to make themselves understood in Middle Mongolian. Whether they originally were a stray Turkic clan integrated into the loose Mongolian society, or a different social phenomenon, is hard to say for sure.

After the expansion of the Mongols, the term Tatar seems to have become ubiquitous. The famous report by Plano de Carpini (1245)[23] indiscriminately refers to all of the Mongols of Mongolia as Tatars, which apparently was a common trend throughout Europe and Kievan Rus: the Tatars simply became confused with the army of invaders coming from the east. The coincidental association with the Tartarus of the Ancient Greeks by European historians must have only added fuel to the flame, cf. for instance English tartar meaning "fierce", "brutal", etc.

By the 19th century, Tatar had become a misnomer heavily overused in the Russian Empire's ethnographic tradition. The Russian exonym Tatary /tah-TAH-ree/ or Latin Tartari was ambiguously applied not only to all the Turkic-speaking population of Tsarist Russia, even including Azerbaijanis, who are of Oghuz origin, but even to the Tungusic and Mongolic peoples in East Siberia. The persistent overuse of the word with a vague, ambiguous meaning finally resulted in its ostracization by the beginning of the 20th century. Consequently, it fell out of ethnographic use as an umbrella term and is now largely avoided by Turkologists and the Turkic population alike, except for the direct formal reference to Kazan Tatars, Sibir Tatars, Crimean Tatars, Mishar Tatars, Tomsk Tatars and some of the other, lesser-known ethnicities of Kimak-Kipchak origin.[1]

Presently, the term Tatars as a standalone word usually refers in the first place to the Kazan Tatar people, who are still one of the largest and most influential of the modern Kimak ethnicities. During the Soviet period many of the lesser-known Kimak communities (such as Sibir Tatar, Baraba) were indiscriminately taught Kazan Tatar as a common standard, which might have resulted in the contamination of local Kimak dialects and languages by Kazan Tatar borrowings.


The relationship between Kimak and Oghuz

Even though the Kimak languages are closely related to the Karluk-Kyrgyz-Kazakh subgroup and Chagatai subgroup as part of the Great-Steppe unity, they furthermore seem to share certain features with the Oghuz /aw-GOOZ/ languages, also named Oghuz-Seljuk /sel-JOOK/ herein.

Particularly notable is the persistent use of the innovative *tüGel instead of the more archaic e(r)mes "not (after nouns and adjectives)" in both Kimak and Oghuz subtaxons. Another typical feature is a tendency to use the *y semi-vowel, especially before /a/, /o/, /u/, where more archaic Turkic languages have sibilants or affricates formed from the Proto-Bulgaro-Turkic *S, cf. Kazan Tatar yoldïz "star", yafraq "leaf", yuk "there is not" and Turkish yïldïz, yaprak, yok.

This phenomenon of mutual interaction between the generally only distantly related Oghuz and Kimak languages can possibly be explained[1] as a result of the Oghuz-Kimak linguistic exchange near Lake Zaysan. It can be surmised that the Kimaks had originally been part of the Proto-Great-Steppe clans distributed near the southern edge of the Altai Mountains. Circa 600-700 AD, these early Proto-Kimak clans must have been linguistically and culturally affected by the expansion of early Oghuz confederacies (such as Toquz Oghuz, Üch Oghuz and others) situated somewhere in Dzungaria, in the vicinity of the established Kimak settlements. Given that the Oghuz clans must have participated at the time in the Silk Road migrations of the Göktürk-Uyghur Kaganate, which exerted strong military and cultural control over the whole western part of the Silk Road, potential Oghuz influence in Proto-Kimak seems very plausible [herein].

Moreover, a few centuries later, during the 800-900's, the subsequent linguistic exchange between the Aral Oghuz and the Kimaks near the Southern Ural mountains could have led to a further stabilization of the acquired Oghuz features [uncertain].

However, despite some mutual exchange between Kimak and Oghuz, there is a shared average of only 68% in Swadesh-215 (borrowings excluded),[2] which makes them very far from "mutually intelligible" in practical situations. Therefore learning, say, just Turkish or Azeri is not sufficient to understand Kazan Tatar or Bashkir, and vice versa.
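The shared-percentage figure above can be illustrated with a minimal sketch. It assumes that each language's list maps a meaning to a cognate-class label, and that borrowings are marked as `None` so that they drop out of the comparison. The function name, the labels and the toy data are hypothetical illustrations, not the data or the exact procedure of the study cited here.

```python
from typing import Mapping, Optional

# A word list maps a meaning to a cognate-class label.
# None marks an item to be excluded (e.g. a borrowing) -- an assumption of this sketch.
WordList = Mapping[str, Optional[str]]


def shared_percentage(a: WordList, b: WordList) -> float:
    """Percentage of comparable meanings whose cognate classes coincide."""
    comparable = [
        meaning for meaning in a
        if meaning in b and a[meaning] is not None and b[meaning] is not None
    ]
    if not comparable:
        raise ValueError("no comparable meanings")
    matches = sum(1 for meaning in comparable if a[meaning] == b[meaning])
    return 100.0 * matches / len(comparable)


# Hypothetical cognate-class labels, NOT real lexical data.
lang_a = {"star": "A", "leaf": "A", "water": "A", "dog": "A", "cloud": None}
lang_b = {"star": "A", "leaf": "A", "water": "B", "dog": "A", "cloud": "A"}

print(shared_percentage(lang_a, lang_b))  # 75.0 (3 matches out of 4 comparable items)
```

On this reading, a figure of 68% would correspond to roughly 146 matching items if all 215 meanings were comparable; with borrowings excluded, the actual denominator may be smaller.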



The Kimaks that stayed near the Irtysh River


ayaq   kïzl   yapraq yoqla-   pawïr, y bir trt psh alt ydi,

Presently, the Baraba /bah-rah-BAH [?]/ territory is just a tiny cluster of several villages east of the Irtysh River. Originally, the Baraba people inhabited the area around the large Lake Chany /chah-NEE, chah-NEH/ and the adjacent Baraba Steppe, which was apparently named after the population [uncertain].

The ethnonym Baraba does not mean bar-ba "don't go" or similar, as it is usually explained in folk etymology, but is probably related to the legendary clan progenitor Baram, as mentioned in local legends.[16d][1]

The Baraba people were first attested in Russian records in 1595, and then described in more detail by Messerschmidt and Strahlenberg in 1721,[16] during their famous Siberian expedition which, among other significant discoveries, led to the establishment of the main Ural-Altaic language groups in Strahlenberg's work.[5a]

The Baraba legends mention their relatedness to the Khanate of Sibir (1495-1582)[16d] as well as the neighboring Samoyedic population,[16] which seems quite reasonable, and some specific linguistic features may indeed relate the Tobol-Irtysh Tatars to Baraba. However, the unique grammatical differences (e.g. the bara-tï-n ("you go now") type of the present tense in Baraba, as in Altay) and the lack of certain Kimak characteristics (e.g. the -ar future in Baraba instead of the -achaq future in Kimak and Oghuz-Seljuk)[16d] lead to a hypothesis that the Baraba people might be the remnants of the early Proto-Great-Steppe tribes which had inhabited the Baraba and Kulunda Steppe (between the Ob and Irtysh Rivers) before 500-700 AD and then intermingled with the Kimaks. The Baraba language was also contaminated by Kazan Tatar during the 20th century.

Economy: settled, non-nomadic population originally living in wooden houses; crop cultivation, animal husbandry, hunting, fishing.[16e] Religion: originally shamanism, then Islamized. About 4000 persons are cited,[16f] but few (if any) actual native speakers.


Baraba woman
A Baraba woman
(c. 2005)

The map of Siberian Tatars distribution
Map of the Sibir-Tatar-related population (clickable),
based on an ethnographic atlas (1964)[12b]




The Kimaks that spread into the Great Steppe

The rest of the Kimak clans spread west towards the Ural mountains and finally formed the languages and dialects of the Golden Horde.

Sibir Tatar, Bashkir (essentially "Ural Tatar"), Kazan Tatar, and Mishar Tatar form the four Kimak languages in mainland Russia around the Ural mountains and the middle Volga. A few varieties of Astrakhan Tatar and Nogai, originally distributed south of the Urals and along the lower Volga (presently also near the northern Caucasus) share similarities with Kazakh and Karakalpak. North Crimean Tatar, Krymchak, Crimean Karaim, Lithuanian Karaim, and Urum (originally based on Greek population) form the Kimak-Kypchak continuation around the Crimean peninsula. Kumyk, situated along the western coast of the Caspian Sea, is adjacent to Azerbaijan.

Karachai-Balkar, located in the northern Caucasus, is the most deviant of the Kimak languages and is perhaps unrelated to the original Golden Horde continuum.

ayaq yoltos qïsl qoro yapraq yoqla- möyes pawïr     ike türt pish


The Tobol-Irtysh, or rather just Sibir Tatars, live near the cities of Tyumen /tyoo-MEN/ and Tobolsk, along the confluence of the Tobol and Irtysh rivers, east of the Ural mountains in West Siberia.

The Sibir Tatars are the remnants of the Khanate of Sibir (1468-1607) and the Tyumen Khanate, its predecessor, which first appeared in historical records in 1468, during the decline of the Golden Horde. In 1582, the main Sibir Khanate settlement, known as Sibir, or Sïbïr (or Isker, or Kashlyk [=winter camp]), was taken by Yermak's army sent by Ivan the Terrible, making the then-ruling Kuchum Khan and his people flee to the steppe. The Sibir settlement soon became depopulated, so in 1587 the fortress of Tobolsk was founded instead, about 10 miles away from Sibir, as one of the earliest Russian outposts beyond the Ural mountains.

Throughout the 20th century, Sibir Tatar was considered merely a "dialect" of Kazan Tatar, so apart from a couple of dissertations,[16b][16c] there are no textbooks or detailed publications, even though the phonological, grammatical and lexical differences of Sibir Tatar clearly require separate description. The /ch/ > /ts/ and /sh/ > /s/ mutation is among the immediately notable features, and is reminiscent of the /sh/ > /s/ mutation in Kazakh and Nogai.

  Sibir Tatars
(1) The Russian fortress of Tobolsk
(c. 2010);
(2) The Siber town found
on a European map (1562);
(3-4) At the Isker Festival of Sibir Tatars (2010)

Population: c. 6700 persons (2010)[24d] (probably counted including Baraba and Tomsk Tatars, since many sources do not differentiate them).


On the origins of the toponym Siberia:

The word Siberia as a general name for all of northeastern Eurasia seems to be an 18th-century extrapolation from "the town of Siber" > "Sibir Khanate" > "West Siberia" (which is presently defined as a plain located between the Ural Mountains and the Yenisei River) > all of northeastern Eurasia to the Pacific Ocean.

The word Siberia replaced the older and just as vague designation of (Great) Tartary used during the 17th-18th centuries. The latter was formed from Greek Tartarus, a murky place beneath the earth, so deep that an anvil takes nine days to fall there. Consequently, as mentioned above, until about the middle of the 19th century, Ta(r)tars meant nearly any of the Siberian aborigines, and were initially associated with the demons of Tartarus, especially in connection with the turmoil of the 13th-14th centuries.

Before that, in antiquity and the Middle Ages, the equally vague name of Scythia had been in use, and West Siberia had been associated with the Scythians, described by Herodotus in the 5th century BC.

Kazan Tatar ayaq yoldz qzl kor yafrak yoqla- mgez bawr y ber ike ch drt bish alt Jide, sigez tugz un

The Republic of Tatarstan (capital: Kazan /kah-ZAHN[26]/) is a federal subject of Russia, historically situated near the confluence of the Volga and Kama Rivers, essentially in the same area as Volga Bulgaria (see above).

It is not correct to say that the Tatars displaced the Volga Bulgars as soon as they arrived at the Volga, because according to the report of Ibn-Fadlan, some Kimak tribes were already attested in that area at least as early as 922 AD, showing peaceful coexistence with the Volga Bulgars. But the later events of the Mongol invasion seem to have turned the tables against the Chuvash, when the Mongol armies, possibly with some participation of the local Kimak-Kypchak-Tatar tribes [uncertain], destroyed the towns of Volga Bulgaria between 1232 and 1236, including the siege of the city of Bilar. The devastation of Volga Bulgaria must have caused an intense dispersal of the Chuvash population, which was either driven towards the forestland on the right bank of the Volga or just remained there in a natural refugium.

  Kazan Kremlin, Tatar people

Kazan Kremlin

The Qolsharif Mosque, Kazan
(1) The Kazan Kremlin, today as if 500 years ago;
(4) The Qolsharif Mosque (inaugurated in 2005) (above) is the largest mosque in Russia;
(5) A view of Kazan

The Kazan Khanate (1438-1552) emerged in history only after the dissolution of the Golden Horde. The Kazan Khanate was soon conquered by the troops of Ivan the Terrible in 1552 and became part of Russia. In fact, the famous Saint Basil's Cathedral on Red Square was built to commemorate the capture of Kazan.

The Tatar participation in the Mongol invasion is still remembered in the Russian language and culture, cf. such sayings as "An uninvited guest is worse than a Tatar"; "Mamai/the Tatars went over it", said of raising havoc; "the Tataro-Mongol Yoke", etc. Consequently, the Tatar appellation and language unfortunately seem to have a rather low social status.

Historical autonyms: Bolgar, Kazanl. Religion: Sunni Islam.

Over 4.2 million are formally listed as speakers of Tatar (2010),[24d] but an estimated 70-90% are in fact Russian speakers.

Bashkir ayaq yondo ql koro yaprak yoqla- mg bawr y ber ike s drt bish alt yete hige tuGI un

Bashkir /bash-KIR[26]/ is spoken in the Republic of Bashkortostan (capital: Ufa /oo-FAH[26]/), situated east of Tatarstan in the Southern Ural Mountains. Self-appellation: Bashqort.

Between 1220 and 1234, the Bashkirs were fighting the Mongols, preventing their expansion to the west, but then voluntarily joined Muscovy in 1557.

Kazan Tatar and Bashkir share 95% of matches in Swadesh-215. So essentially, Bashkir is just a sort of Ural variety of Kazan Tatar.

  A Bashkir girl (staged) Bashkirs (staged)
Bashkir horsemen
Bashkir girls and horsemen
A Bashkir woman (real), c. 1910
This photo, made c. 1910, is true color photography by Prokudin-Gorski

Also note some of the following Kazan-Bashkir shared innovations in vowels typical only of these languages: ber < *bir; drt < *trt; un < *on.


The deviant Bashkir phonology exhibits many unusual mutations, such as ch > s, s > h, z > , for instance Bashkir hïu instead of the normally-occurring Turkic su "water" seems very far-fetched. These mutations are sometimes explained by an absorption of an unknown local substratum.
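The two consonant correspondences spelled out above (ch > s, s > h) feed into each other, so they have to be read as simultaneous correspondences rather than as rules applied one after another. The following minimal sketch illustrates the difference. The function names and the test string "chasa" are hypothetical (a made-up string, not a real word), and only the two fully legible correspondences are modelled; vowels and all other sounds are left untouched.

```python
import re

# Only the two correspondences spelled out in the text; everything else is unchanged.
RULES = {"ch": "s", "s": "h"}

# Longest source strings first, so that "ch" is tried before a bare "s".
_PATTERN = re.compile(
    "|".join(sorted((re.escape(src) for src in RULES), key=len, reverse=True))
)


def apply_simultaneously(form: str) -> str:
    """Replace every source sound in one pass; outputs are never re-processed."""
    return _PATTERN.sub(lambda m: RULES[m.group(0)], form)


def apply_sequentially(form: str) -> str:
    """Naive rule-by-rule application: the s produced from ch is wrongly turned into h."""
    for src, dst in RULES.items():
        form = form.replace(src, dst)
    return form


print(apply_simultaneously("chasa"))  # saha
print(apply_sequentially("chasa"))    # haha
print(apply_simultaneously("su"))     # hu
```

Note that the sketch only yields hu from su; the form hïu cited above also differs in its vowel, which this consonant-only mapping does not attempt to model.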

Curiously, the Bashkir population may be at least partly descended from the Proto-Hungarians (or Magyars /mah-JARS/) of Hungaria Magna or perhaps other closely related Ugric tribes, as well as possibly from the Bulgaric tribes. Proto-Hungarians were mentioned as still speaking Hungarian along Ak-Itil, the main river of Bashkortostan, c. 1235 by Friar Julian.[1] They were apparently linguistically assimilated by the local Kimak tribes during the expansion of the Golden Horde. That seems to date the emergence of the Bashkir dialect to after the 13th century.

Judging by the rather unreasonable proximity of the literary Bashkir and Kazan Tatar languages, which must have almost necessarily involved some secondary interaction, Bashkir may have been later affected by the Kazan Tatar immigration to the Ural Mountains, especially given that the Ural Bashkir people had certain historical freedoms and suffered less feudal oppression than the Tatar population along the Volga.[1]

Economy: nomadic animal husbandry until the 18th century. Religion: Islam since the 950s, but mostly non-religious since the Soviet period. Population: 1.15 million speakers (2010), most of them bilingual in Russian. Being isolated in the Ural mountains, Bashkir is probably a little more alive and active than Kazan Tatar.

Listen to Yel kapahï - Kiler keshe, kemder bar "A small gate — Someone's coming, someone's there" by Güzel Shhisultanova with the typical sights of the Southern Urals.

On the origins of the ethnonym Bashkir:

The ethnonym "al-Bashkrt" had appeared very early on, being first mentioned in the Arab sources c. 840 and then clearly attested by Ibn-Fadlan near the Emba River /EHM-bah/ and then the confluence of the Volga and Kama in 922. Therefore, there is some terminological discrepancy: as a language similar to Kazan Tatar, Bashkir seems to be a relatively recent phenomenon, whereas its historical attestation in reference to the Kimak-Kypchak-Tatar tribes of the Urals and the Middle Volga seems to be going further back in time.

Crimean Tatar

ayax, ayaq Jldz qzl quru Japrax, Juqla- myz bavur; Jiger u:y bir eki u:ch; us, drt, rt, trt besh alt yedi sigiz tohuz on

The Crimean Khanate (1441-1783) (capital: Bakhchy-Saray /bahh-CHEE sah-RY/ lit. "The Garden Palace", rightmost image) was a Kypchak post-Golden-Horde state situated in the Crimean Peninsula and the adjacent Pontic Steppe. The Crimean Khanate maintained a massive slave trade with the Ottoman Empire, making raids into the Polish-Lithuanian Commonwealth and Russia.

The Northern Crimean Tatar dialects should not be confused with Crimean Turkish in the south of the peninsula and the Middle Crimean dialect, which is a dialectal seam between the two. After the 1920's there were attempts to build a mutually intelligible "literary language" that could be understood by all the Crimean Tatars; however, the actual dialectal situation in the Crimea is more complicated. And although the pure dialects may still survive in vivo, not enough field work on them has been done.

  Battle of Tatars with Lithuanians
A battle of Crimean Tatars
with Poles-Lithuanians
in the 17th century
a painting by Kossak,
(the 1870's)
Crimean Tatars
Crimean Tatars
(c. the 1820's)[24c]
Bakhchisaray succession home
The succession home
of the Crimean Khans
in Bakhchy-Saray

Crimean Tatars are also famous for being persecuted by Stalin as "Nazi collaborators" and resettled to Uzbekistan, though they mostly returned by the mid-1980's. Present-day population: circa 260.000 persons in the Crimea, 170.000 elsewhere.

Karaim ayax yldz, qzl   yaprax yuxla- mnguz   y bir eki its dyert, alt yedi segiz toGuz on

Crimean Karaites /KAH-ruh-ite[26]/ are a rather odd and presently very small branch of adherents of Karaite Judaism, a teaching based on reading the Torah itself rather than its interpretations.

Karaim actually means in Hebrew "those who read (the scriptures)", so it refers to the name of the population, though the terms Karaite and Karaim are frequently conflated. Self-appellations: Qïrïm qaraylar, Qaray, etc.

The exact origin of Karaim is obscure. A connection with the Khazars was proposed as early as the 19th century but is poorly corroborated. In any case, the Karaites seem to descend from a Jewish sect that could have come from the Ottoman Empire [uncertain] and switched to the Polovtsian dialect spoken in the Crimea after the 13th century. Being socially and religiously detached from the rest of the Turkic communities, the Karaim language must have branched off from the Kimak main stem in the same way as Ladino, Yiddish and other Judaic languages did from Indo-European.

During WWII, the Karaites were saved from extermination after managing to demonstrate their formal dissociation from mainstream Judaism. Karaites have always been literate and many were quite influential despite the small size of their community.

In 1392, a part of the Crimean Karaites were relocated to Lithuania thus forming a different branch of Trakai (Lithuanian) Karaim.

Presently, only c. 600 persons in the Crimea (2002), 257 in Lithuania (1997), and c. 1000 in other countries.

Karaite women
Crimean Karaite women (staged)
Karaites in the 19th century[24c]

ayaq yulduz qzl qaq yapraq uykla- myz   y bir eki ch drt besh alt yetti segiz toGuz on

The Kumyk people /koo-MIK, koo-MEK/, self-appellation: qumuq, occupy the steppeland along the northwestern coast of the Caspian Sea north of Azerbaijan, in Dagestan, which is probably one of the most ethnically complex areas in the world. Neither Kumyk nor Nogai have their own formal autonomies.

The exact origins of Kumyk are unclear, though their geographical position and notable dialectal differentiation suggest they had arrived in the area of the Caspian Sea before the Nogai people, that is before the mid-16th century, which is supported by the foundation of the Tarki Shamkhalate in the 1440's. Direct descent from the Khazars has also been claimed, considering that Tarki Village near Makhachkala (the capital of Dagestan) has often been associated with the legendary city of Samandar, founded by the Khazars but destroyed in 969 AD.

Historical economy: agriculture, fishing, settled living in villages. Religion: Sunni Islam. Printed books since the mid-19th century.

Population: 502.000 persons, 426.000 speakers (2010).[24d]

  Nogai and Kumyk, map
(2) Khalimbek-Aul Village;
(6) An approximate map of the region:
Nogai (light blue),
Kumyk (dark blue)

Nogai ayaq yuldz qzl qaq, kur yaprak uykla- myiz bawr y bir eki sh drt bes alt yeti segiz togiz on

The Nogai (Noghai) /naw-GUY, nuh-GUY/ are presently scattered in the steppeland of the Northern Caucasus in Chechnya, Stavropol Krai, Dagestan and Karachay-Cherkessia. The name Nogai is derived from Middle Mongolian *noqai, the alias of Nogai Khan, a Mongol general, literally meaning "dog" in most Mongolic languages.

The Nogai people are the remnants of the Nogai Horde (c. 1392-1639), a loose nomadic confederacy that was centered in Saray-Juk (or Saraychik "Little Palace") near the delta of the Ural (Yaik) River. The Nogai Horde also covered the Lower Volga and probably some of the Astrakhan Khanate (1466-1556). The end of the Nogai Horde is connected with the poorly documented Russo-Tatar wars during the reign of Ivan the Terrible. When the Russian army took Kazan (1552) and Astrakhan (1556), Devlet Giray Khan of the Crimean Khanate retaliated by destroying Moscow in 1571; however, the local renegade Cossacks destroyed Saray-Juk in 1580, which was the end of the Nogai supremacy along the Yaik and Volga Rivers. Sometime during this turmoil, about 1552-1554, part of the Nogai tribes began migrating towards the steppes near the Northern Caucasus, particularly the area of the Kuban River /kyoo-BAN[26]/, which resulted in the formation of the Lesser Nogai Horde.[15b] In 1683, these Kuban Nogais were attacked by the Dzungarians from Mongolia (= essentially, Kalmyks) and then by the army of General Suvorov in 1782-83. It is plausible to assume that some of them were Russified, becoming part of the Kuban Cossacks in the 18th-19th century, though a good many were exiled first towards the Black Sea and then finally deported to the Ottoman Empire.[25] All the details of this dispersal and exodus are now difficult to reconstruct.

Presently, there are 103.000 persons, 87.000 Nogai speakers (2010)[24d] [see the map above].

Because of its historical proximity to northern Kazakhstan, Nogai seems to share more Kazakh-related features than other Kimak languages.

(1) An artist's reconstruction of Saray-Juk;
(2) The actual Saray-Juk archaeological site;
(3) Nogai men (2012); (4) Nogai girls (1881);
(5) A German map from 1549 with the inscription "Nogai Tartars" placed along the Lower Volga;
the inscription Saraichek can be read
near the bottom, though it should rather be
near the estuary of the Iaick Fl (Yaik River)
on the right;

Watch a very dramatic Dombïra song (=dombra, the name of the musical instrument) with Nogai-Turkish subtitles and some bloody battle scenes added later from the Mongol movie (2007), as well as the same song in another clip featuring its strikingly talented performer Arslanbek Sultanbekov. In a similar fashion, more of his songs come from the very heart of the ancient strife: Menim Nogayïm "My Nogai" and Ne kaldï? "What is left?" (the latter performed by Rasul Beyseev), a poignant story about the 17th-century Dzungarian invasion into the lands of the Nogais and Kazakhs, with the background from "The Nomads", a Kazakh movie.

Karachay ayaq Julduz qzl qurGaq chapraq Juqla- myz bawur y bir eki ch trt besh alt Jeti segiz toGuz on

Karachay-Balkar /KAH-ruh-CHY bahl-KAR/ is spoken in the confusingly named Karachay-Cherkess Republic (capital: Cherkessk /chehr-KESK/) and the Kabardino-Balkar Republic (capital: Nalchik /NAHL-chik/). The two republics were created rather artificially in 1922. The other two ethnic groups, the Cherkess and Kabardins, are of unrelated North Caucasian origin (but distantly related to each other).

Karachay-Balkar has many mutations at several levels, and a few Kabardino-Cherkess borrowings in the basic vocabulary, so it seems to stand apart from the typical languages of the Golden Horde, which usually maintain a certain mutual intelligibility.

It seems that Karachay-Balkar may be an early offshoot of Kimak that must have been present in the Northern Caucasus at least since the Mongol invasion of the 1220's, but perhaps having settled there a few centuries earlier c. 900-1000 AD [uncertain], when the Kimaks/Kypchaks/Cumans/Polovtsians were just moving into the Ponto-Caspian steppe.

  A tower in Kabardino-Balkaria
A modern tower
in Kabardino-Balkaria
A modern photo
Karachays, c. 1910
This photo c. 1910,
(true color photography by Prokudin-Gorskii)

Non-nomadic settled population; Islamized only by the 18th century. In 1943, the Karachay-Balkars were forcibly resettled to Kazakhstan by Stalin, which led to mass starvation, but they returned after 1956-57.

There are two main dialects, which among other features, differ in the pronunciation of *S as follows: (1) the Karachayl + Malqar Taulu (< from tau-lu "mountain-ous") pronounce /J-/, /ch-/, whereas (2) the rest of Malqarl pronounce /dz-, z-/, /ts-/.

Population: 218.000 persons listed as Karachay and 113.000 as Balkar (2010);[24d] 80% are bilingual in Russian.

(3) Southern Turkic Languages

The Southern Turkic Languages is a supertaxon that includes the Turkic languages that have formed south of the mountain system that can be collectively named as the Great Eurasian Barrier, comprising the Altai, Tian-Shan, Pamir, Kopet Dag, Caucasus, and other mountain ranges.

Initially, the Southern Turkic tribes had occupied Mongolia, Dzungaria and the oases of the Takla-Makan Desert (Tarim Basin), but then spread westwards into other adjacent regions.

The Southern Turkic languages seem to include the two main subgroups:
(1) The Oghuz-Orkhon-Karakhanid subgroup, which comprises Orkhon Old Turkic of Mongolia, Old Uyghur of the eastern Tarim Basin, Karakhanid of the western Tarim Basin, as well as any of the medieval or modern Oghuz-Seljuk languages;
and possibly (2) The Yugur-Salar subgroup, which herein is considered separately from the rest of the Turkic languages, its precise genetic position still being a matter of controversy.

Subgroup 4:

The Turks that migrated to West China

The Ganzhou Kingdom descendants

Yugur and Salar are two peculiar Turkic languages located near Tibet, in a historical region known as the Hexi Corridor /heh-SEE/, where the Silk Road enters Chinese territory. It is a thin strip of land squeezed between the Nan-Shan (or Qilian) Mountain Ridge in the south [from Chinese nan shan "south mountains"] and the Alashan Desert in the north, separated by the Great Wall.

The exact linguistic origin of the Yugurs and Salars is difficult to determine, however most of their features either point towards the Orkhon-Karakhanid subgroup or even set Proto-Yugur completely apart from the rest of the Turkic languages, making them a separate major branch of Turkic Proper. In any case, the mutual relatedness between Yugur and Salar is rather evident:[1] both languages share similar verbal paradigms with largely absent personal endings as well as a system of similar innovative verbal tenses, which clearly indicates their common descent, considering such grammatical features are rarely borrowed.

The Turkic languages of the Ganzhou Kingdom are not unique in their odd classificational isolation. Curiously, the local Mongolic languages (Baonan, Dongxiang /dong-see-AN, doon-SAN/, Monguor and Shira [Mong. "yellow"] Yugur (another one!)), usually grouped into a separate Southwestern cluster within Mongolic, share a number of similar typological traits, such as clipped morphology. It can be hypothesized that the Hexi Corridor was a formative area, where several language groups (Turkic, Mongolic, Chinese, Tibetan, Iranian) merged and blended as part of the Silk Road trade interaction, resulting in the emergence of trade pidgins and finally some of the unique local creoles. These creole languages further interacted with each other, as in the case of Yugur (Turkic) and Yugur (Mongolic), the latter apparently resulting from the Mongolicization of the former after Genghis Khan's invasion, when Mongolic languages became ubiquitous. The study of this complex creolization process may be interesting in the context of the history of the English language and the rather obscure linguistic process that led to the rise of Middle English.


(West) Yugur
azaq yulds Gzl quruG lahpzhq (< Mong.) uzu- moNs BaGr y br ush drt bes ahldy yidy, yeti, saGs doGs on,

The Yugur /yoo-GOOR/ people are a small ethnic group who are assumed to have migrated into western China (Sunan Yugur Autonomous County) after c. 850 AD from undefined Uyghur oases, probably to avoid Islamization [uncertain]. There, on the outskirts of China, the Yugurs established the prosperous Ganzhou /gun-JOW, kun-CHOW/ Kingdom (870-1036 AD), with its capital near present-day Zhangye /jung-YEH/ and an economy based on the Silk Road trade.

The exact classification of Yugur is unclear, but it seems to be a "mixed" language based on the ancient Turkic substratum with some Mandarin-Mongolic-Tibetan influence. Yugur is characterized by the loss of verbal conjugation; the presence of the archaic ire copula; multiple loanwords; the Mandarin consonant system (which means that <b>, <g>, <d> are pronounced as semi-voiced, whereas <p>, <t>, <k> as pre- or post-aspirated).

Yugur herdsmen, China A Yugur girl
Yugurs at home (staged)

The self-appellation is Sarg Yogr "Yellow Uyghur". The Oilyg Yugurs are nomadic cattle breeders in the steppes, the Taglyg — in the mountains. Additionally, note the most commonly accepted names in other languages: (West) Yugur in English, sar-jugurskij in Russian, Sar Uygurca in Turkish.

The Yugur people like to wear their traditional red hats. Religion: Tibetan Buddhism, traces of shamanism. Population: circa 4500 speakers (2000).

The Yugur people are not to be confused: (1) with the Mongolic-speaking Shera-Yugurs, or Eastern Yugurs (c. 2800 speakers), who by the way wear a different hat style; or (2) with the Yughu (the Sinicized Yugurs losing their ethnic roots).

Yellow Uighur (?)                   pr

"Yellow Uighur" is not usually mentioned as a separate language, yet some sources, such as Tenishev (1966), cite contradictory data; these inconsistencies could be due to a dialectal split in Yugur or even to the existence of another Yugur language, which would be natural considering the more than 1200-year-long existence of this subgroup. This ambiguous evidence has been preserved here for later consideration.


Salar aya:x yldus qizil kuru, kur yRfax, uxla- moNus, paGr oy pir,


Salar /sah-LAR/ is another language of controversial classification. According to legends, the Salar people are said to have moved into Xunhua /shoon-HWAH/ Salar Autonomous County in western China, approximately the same location as the Yugur people. The migration is said to originate either from Samarqand, Uzbekistan, or the Khorasan Province, occurring c. 1370, which matches the rise of Tamerlane. The migration could have been accomplished by traveling along the Silk Road.

Traditional Turkology usually describes Salar as "Oghuz"; however, there is a conspicuous absence of any typical Oghuz-Seljuk innovations. Moreover, the striking phono-semantic mutations, the grammatical similarity to Yugur (including the loss of conjugation), and the strong Chinese influence (e.g. native numbers no longer in use, phonological adaptations, the sporadic use of the "sh" copula, etc.) also tend to contradict this grouping. By no means should Salar be mindlessly viewed as just "Oghuz"; rather, it seems to be the outcome of a creolized transition from the local Middle Yugur substratum to one of the closely located Turkic languages such as early Chagatai or late Oghuz, additionally with some Chinese and Dongxiang (Mongolic) influence.[1]

Salar people

Religion: Islam. Population: circa 100,000 ethnic Salar people, but the language is now spoken mostly by the elderly.

Listen to this lovely traditional Salar song, Usher ya maa, usher "Look at me, gather around".



Subgroup 5:

The Oghuz-Orkhon-Karakhanid languages must have separated from the rest of the Turkic stem very early on, most likely circa 400 BC,[2] when part of the Proto-Turkic continuum infiltrated beyond the Tian-Shan-Altai-Sayan mountain barrier into Dzungaria, following the upper reaches of the Kara-Irtysh River.

In Dzungaria, Proto-Oghuz-Orkhon-Karakhanid must have soon split up into three main branches: (1) the tribes that stayed near Dzungaria, apparently forming the basis of Proto-Oghuz; (2) the tribes that spread to the east, towards the Gobi Desert, circumventing the Mongolian Altai and forming the Orkhon Old Turkic of the Eastern Göktürk Kaganate; (3) finally, the tribes that spread to the west towards the Tarim Basin, initially forming Kara-Khoja (= Old Uyghur) and Karakhanid, and then contributing to the formation of Khalaj. Hence the subgroup's tripartite name used in this publication.[1]

The founders of the Göktürk Kaganate seem to have been originally known as Türq or Türüq (as reconstructed from the Orkhon Old Turkic script[17]), whereas other early Turkic clans originally had different clan names, such as Kyrgyz, Tatar, and Oghuz, to name just a few among the earliest attested. Just like western surnames such as John-son, Peter-son, etc., the name Tür(ü)q most likely initially referred to the hypothetical patrilineal clan founder, which is supported by the early legends recorded in the Oghuz-namah and a mention by Mahmud al-Kashgari. Consequently, the males of that clan formerly traced their personal ancestry and family histories to the clan's legendary progenitor. When the Türq clan became prominent by 550 AD, the name began to spread with its political influence and power, and seems to have been inherited or adopted by several Turkic peoples in Central Asia, such as the Karakhanids of the Tarim Basin, the Oghuz Turkmen near the Kopet Dag, and the Ottoman Turks in Anatolia, though the exact details of their ethnonymic history may be difficult to reconstruct.


The Turks that moved to Mongolia

The descendants of the Göktürk Kaganate


Old Turkic
adaq yultuz qzl quruG yapurGaq uD- mz baGr eb bir iki,
ch trt besh alt yeti skiz toquz on

Long before the era of the Mongols, there existed a Eurasian empire centered in Mongolia that was nearly as great and as powerful as that of Genghis Khan /JEN-gis, CHEN-gis, not GUEN-gis/. It was known as the Göktürk Kaganate (552-744 AD), and it controlled the western stretch of the Silk Road as far west as the Black Sea. European historians rarely mention this empire, probably because the Göktürks ("Blue or Celestial Turks") never reached western Europe directly; still, their influence on Central Asia and Byzantium was very profound.

The Eastern (Göktürk) Kaganate (capital: Ordu-Balïq /or-DOO bah-LIK/ lit. "Army city [?]", with a population of about 100,000) was centered in the sacred and fertile Orkhon Valley /OR-kon, or-HON/ in Mongolia. Curiously, Genghis Khan's capital Karakorum was later located in the very same place, only 10 miles away from the Ordu-Balïq ruins, probably because, just like the Turkic peoples, the Mongols believed in the divine force emanating from the Orkhon Valley and the mythical Mount Ötüken.

The Western (Göktürk) Kaganate, which existed until 659, was ruled from the Silk Road outpost city Suyab in today's Kyrgyzstan.

Figs: Genghis Khan's warriors, from a Genghis Khan film (2007); an Orkhon script stele; the ruins of Ordu-Balïq; the Orkhon River valley in Mongolia

The Göktürk Empire was overrun first by the Chinese (659-681), and then by the Old Uyghurs (not to be confused with the present-day Uyghurs), who founded the Uyghur Kaganate that existed from 744 to 840. However, these seem to have been changes only in the ruling dynasties and perhaps in religious affiliation (the spread of Manichaeism), not in the language.

Finally, after a period of political decline, Ordu-Balïq and other eastern cities were razed by the Yenisei Kyrgyz in 840. The collapse of this empire probably affected the spread of many Turkic peoples, pushing them further to the west.

The Göktürks and Uyghurs used the Old Turkic (Orkhon-Yenisei) runiform alphabet, attested since the 720s.[17] It was carved on stone obelisks, thus preserving the Old Turkic language for posterity.

The name Orkhon Old Turkic has been introduced herein to describe the language of the Orkhon-Yenisei inscriptions found in Mongolia and connected with the activities of the Eastern Göktürk-Uyghur Kaganate, even though in reality the Orkhon inscriptions are not limited to the basin of the Orkhon River but are scattered all over the steppes and mountains of western Mongolia[4].

Some publications do not discriminate among different variants of Old Turkic, so it must be noted that Orkhon Old Turkic, Old Uyghur, and Karakhanid were altogether different linguistic entities separated by hundreds of miles, several hundred years of time and different writing systems, so their common origin, no matter how obvious, may require additional proof and consideration.


The Turks that moved to the Tarim Basin

Kara-Khanid — Kara-Khoja

aaq yulduz qzl quruG yapurGa:q u- mNz baGr ev, v bi:r ekki

After the downfall of the Göktürk-Uyghur Kaganate in 840 AD or even earlier [uncertain], some of the Turkic tribes migrated towards the oases of the Takla-Makan Desert and Tarim Basin /tah-REEM[26]/, where they created the Kara-Khoja and Kara-Khanid Kaganates.

Kara-Khoja /kuh-RAH haw-JAH/ (Kocho) (capital: Besh-Balik lit. "Five city [?]") was a confederacy of decentralized Buddhist states in the eastern Takla-Makan oases, where Old Uyghur, self-appellation türk uyGur tili, was spoken. It mostly used its own Old Uyghur alphabet with vowels, based on the Sogdian alphabet but rather polyphonic and requiring much effort in phonetic reconstruction.[18b] It was finally displaced by the Arabic script.

The Kara-Khanid Khanate (845-1212 AD) was located further to the west in the Tian Shan Mountains, and used the Karakhanid dialect of Old Turkic. The first capital of the Karakhanid Khanate was established in the city of Balasagun /bah-LAH-sah-GOON/ near Lake Issyk-Kul (present-day Kyrgyzstan), in the same region as the Western Turkic Kaganate with its capital Suyab, which again implies that the western Göktürk and Karakhanid populations must have been closely connected. After some time, the Kara-Khanid capital was moved to Kashgar in the western part of the Takla-Makan.


Karakhanid Architecture
Figs: left to right, examples of the Karakhanid architecture:
(1) A decoration with swastikas;
(2) Burana Tower, ruins of Balasagun;
(3) Aisha Bibi Mausoleum in Taraz (p/d Kazakhstan);
(4) Mausoleum in Uzgen (p/d western Kyrgyzstan);
(5) a Karakhanid Minaret in Bukhara (1127 AD)

The Kara-Khanid Khanate was converted to Islam in 934. After the 13th century, the Karakhanid and Old Uyghur languages were eventually displaced by spoken Chagatai and its descendants, the modern Uzbek and Uyghur dialects.

Special mention must be made of Mahmud al-Kashgari (= "Mohammed of Kashgar") (c. 1029-1102?), a famous Arabic-speaking Turkologist and the son of a city mayor related to the Karakhanid dynasty, who in 1072-74 wrote the Diwan Lughat at-Turk "The Compendium of Turkic dialects", a comprehensive 700-page dictionary of Karakhanid Old Turkic that includes notes on the nearby Turkic dialects, such as Oghuz, and even describes regular phonological correspondences between Oghuz and Karakhanid. The Diwan Lughat at-Turk was a highly professional and illustrative work for its time.


The Turkic tribes that moved further into Iran


Khalaj hada:q yulduz qzl qurruG yat- <*Azeri jigar,
hv bi: kki, :ch, sh t:rt be:sh,
alta, al.ta ye:tti, ytti skkiz

Khalaj /hah-LAHJ/ is a poorly classified Turkic language spoken in western Iran about 100 miles south of Tehran.

Khalaj is famous for several unusual features, such as (1) the presence of the initial h- where other languages have only vowels, (2) the retention of the intervocalic -d- as in hadaq "foot", and (3) the retention of long vowels as in Turkmen, which suggest an early divergence from the rest of the Turkic languages.

Khalaj was first mentioned in a legend recited by Mahmud al-Kashgari, and was then discovered and studied in vivo first by Minorsky (1906) and finally by Doerfer (1968-73), who nearly went to the extent of proclaiming Khalaj to be one of the most basic and earliest-diverged Turkic languages.

However, according to other studies, such as Mudrak (2002-08)[10b] and herein[1], Khalaj should be tentatively classified as a relatively late offshoot of the Karakhanid expansion, which is supported by such features as (1) the presence of the intervocalic -D- (as in aDaq) in Orkhon-Karakhanid; (2) the lack of the profound historical changes in Khalaj that would be glottochronologically consistent with an earlier separation from the main stem; (3) the presence of the prothetic h- in Khotanese, an extinct Iranian language spoken in the oases along the southern edge of the Takla-Makan.


Consequently, as was suggested as early as Minorsky (1906), Khalaj seems to be just the living continuation of a southern dialect of Karakhanid, perhaps as used in towns near Khotan, and its peculiar archaic features can therefore be explained by the early separation of the Oghuz-Orkhon-Karakhanid substem as a whole, with a later Khotanese influence. The later migration of its speakers to the west may be connected with the perturbations of the Mongol invasion period [uncertain] and must have proceeded along the Silk Road, as in other similar cases of long-distance migration (the Salars, the Moghols in the Hindu-Kush, the Xibe (Manchurian) at the Ili River, etc.).

Khalaj has also been strongly affected by Azeri or other local Seljuk languages, as well as by the Iranian adstratum. Economy: agriculture, nomadic sheep breeding. Population: presumably c. 42,000 speakers, mostly bilingual in Farsi.

Khalaj must not be confused with a poorly-known Northwest Iranian language of the same name.

Subgroup 5c:

The Turks that migrated to the Aral-Caspian region

The Oghuz-Seljuk subgroup /aw-GOOZ sel-JOOK/, which includes languages closely related to Turkmen, Azeri and Turkish, has been usually known as just Oghuz. It has been renamed herein with the double name to stress the significance of the Great Seljuk Empire and its linguistic descendants.

The Oghuz-Seljuk languages are characterized at least by the following typical features:
(1) a number of specific voicing patterns, especially in the initial consonants, as in *tört > Oghuz-Seljuk dört; *yetti > Oghuz-Seljuk yedi; *qïzïl > Turkmen Gïzïl;
(2) the m- > b- mutation, as in müNüz > *büNüz > buynuz "horn";
(3) the loss of the final -G, as in *quruG > Guru, kuru, and of the prevocalic -G- in the morphological suffixes -Gan > -an (participle), -Ga > -a (dative);
(4) the tendency to form a contracted -yor-/-yar- present tense, as in Turkish bil-i-yor-um "I know";
(5) the active use of the -mïsh past participle, including the use of i-mish to produce an auditive mood in nouns and adjectives (even though -mïsh is an archaism occasionally found here and there in Turkic languages, e.g. in Sakha, in Uzbek perhaps from Karakhanid/Old Uyghur, and in Cuman/Polovtsian perhaps from Oghuz, it is presently in active use only in the Seljuk languages).
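
The phonological items in the list above can be illustrated with a small sketch. It is only a toy: the rule set and the names RULES and apply_rules are hypothetical, the proto-forms are written with their vowels restored (e.g. *tört), and the patterns are deliberately limited to the examples quoted in the list, so they should not be read as a model of Oghuz-Seljuk historical phonology.

```python
import re

# Toy, ordered rewrite rules: (description, regex pattern, replacement).
# Assumption: they only cover the examples quoted in the list above.
RULES = [
    ("initial t- > d-", r"^t", "d"),       # *tört  > dört
    ("initial q- > G-", r"^q", "G"),       # *qïzïl > Gïzïl (Turkmen)
    ("initial m- > b-", r"^m", "b"),       # müNüz  > *büNüz
    ("medial -tt- > -d-", r"tt", "d"),     # *yetti > yedi
    ("loss of final -G", r"G$", ""),       # *quruG > Guru
]


def apply_rules(proto: str) -> str:
    """Apply every toy rule in order to a proto-form (a leading '*' is ignored)."""
    form = proto.lstrip("*")
    for _description, pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
    return form


if __name__ == "__main__":
    for proto in ["*tört", "*yetti", "*qïzïl", "*quruG", "müNüz"]:
        print(proto, ">", apply_rules(proto))
```

Because the rules are applied in a fixed order, *quruG first receives the voiced initial G- and only then loses its final -G, which reproduces the form Guru cited in item (3).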

Some of these features were actually mentioned as early as 1072 by Mahmud al-Kashgari as part of his brief description of the Oghuz language. This shows that by 1000 AD Karakhanid and Oghuz were already quite different dialects with a notable temporal separation; it is therefore reasonable to surmise that the diversification of Proto-Oghuz-Orkhon-Karakhanid must have occurred by 400-500 AD at the latest, or even earlier (following a glottochronological rule of thumb that a language takes about 700-800 years to form).

Oghuz ayaq               v *bir *iki *ch *drt *besh *alt *Jedi *sekiz *dokuz *on

The Oghuz clan confederacy was first attested circa 600 AD in Mongolia. In the 8th century, the Oghuz tribes waged a war with the Orkhon Göktürks and were subjugated by them, so at the time they were already regarded as a tribal unity clearly different from the Tür(ü)k, Tatar and Qïrgïz.

By 775, the Oghuz tribes were found near Talas in Sogdiana, presumably having arrived there as part of a mass migration to the Western Göktürk Kaganate. Eventually, they seem to have traveled along the Syr-Darya /SIR DAR-ya[26]/, or Yaxartes River, towards its delta in the Aral Sea, where they formed the confederacy of the Transoxanian Oghuz with a capital named Yangi-Kent and a ruler titled yabgu (= prince). There, in the Transoxanian steppeland, they were witnessed by several Arab travelers, including a vivid description by Ibn-Fadlan in 922. Mahmud al-Kashgari (1072) mentioned several Oghuz towns, some of which have been rediscovered by archaeologists; he also explicitly stated that "Turkmen" and "Oghuz" meant essentially the same thing, which means that the modern-day Turkmen people must be the direct descendants of the Transoxanian Oghuz clans. On the other hand, the name Turkmen could apparently be applied initially to any Islamized Turks.

The Oghuz dialect-language of the 11th century is documented in Al-Kashgari's writings mostly as a few words and phrases. By the 12th century, the Transoxanian Oghuz tribes apparently migrated towards the Kopet-Dag Mountains or partly dissipated. According to a poorly supported hypothesis, they could also be connected to the Pecheneg raids into the Kievan Rus, but the origins of the latter are highly controversial.

On the origins of the ethnonym Oghuz:

The ethnonym Oghuz was first attested as Altï Oghuz (The Six Oghuz) in a Yenisei inscription, then as Toquz Oghuz (The Nine Oghuz) and Sekkiz Oghuz (The Eight Oghuz) in another Orkhon inscription from Mongolia, and as the Üch Oghuz (The Three Oghuz) near Kyrgyzstan. The numbers before the name apparently meant just the number of tribal units participating in a military confederacy, which could change depending on the situation[1].

The ethnonym Oghuz most likely goes back to the personal name of a legendary patrilineal clan progenitor, described in oral legends collected in the Oghuz-namah ("The Oghuz Narratives")[1], with the earliest written record by Rashid al-Din dating to the end of the 13th century. Presumably, this name or alias may have originally meant öqüz "bull, ox", implying force and vigor.


Juvwar, Oghuz city
The remnants of Juvara, an Oghuz city discovered by archaeologists near the Aral Sea in 2008

An early Turkmen yurt c. 1911, true color photography by Prokudin-Gorski

ayaG yldð Gðl Gur yapraG uqla- buynuð;
baGr y bir iki ch drt besh alt yedi ßekið dokuð on

Turkmenistan (capital Ashgabad /ush-gah-BAHD/, built from a village in 1918) is in fact a thin strip of arable land situated between the Karakum (Qaraqum) Desert /kah-RAH KOOM[26]/ lit. "Black Sand" and the Kopet-Dag mountain range.

When Russia took control of Turkmenistan in the 1880s, the Transcaspian Railway was built along the northern stretch of the Silk Road. In 1948, Ashgabad was destroyed by an earthquake but was rebuilt anew, and is now a beautiful city in the desert. In the 1950s, the Qaraqum Canal, the largest irrigation system in the world, was established, diverting the waters of the Amu Darya towards Ashgabad.


Figs: a Turkmen bride, c. 2005; the Ashgabad Trade Center; the Turkmen people: man and wife, c. 1905; the Seljuk Monument; a Turkmen girl; the Arch of Independence, Oil & Gas Ministry; a choban (shepherd); a Turkmen village in Afghanistan; the Seljuk Sultan Sanjar Mausoleum, 1157 AD, Merv (today's Mary); Turkmen carpets

One of the most notable phonological features of Turkmen is the pronunciation of <s> and <z> as the interdentals /ß/ and /ð/, like the English th, as well as the retention of long vowels, as in /ot/ "grass" vs. /o:t/ "fire". The latter phenomenon, also found at least in Khalaj, is known as primary long vowels and presumably goes back to Proto-Turkic.

The dialectal diversification in Aral-Caspian Oghuz has resulted in the formation of many variants of Turkmen. Standard Turkmen is based on the Teke dialect. Other major dialects include Yomud (north and west of Turkmenistan), Ersari (along the Amu-Darya), Salyr (along the Iranian border), Saryq (along the Murgab River), Chovdur (Dashoguz area, along the Amu-Darya), and Trukhmen (Stavropol Krai, Russia).

There are circa 7 million Turkmen people, of whom 2 million live in Afghanistan and Iran. Of all the ex-Soviet republics, Turkmenistan seems to have the highest percentage of non-Russophone population (80%) [wiki].



The Turks that migrated into Iran and Anatolia

The Seljuk Empire descendants


The Great Seljuk Empire (1037-1077) was founded by the Seljuk Dynasty that goes back to the legendary founder Seljuk /sel-JOOK/ (c. 931-1038), whose clan had split off from the Oghuz confederacy c. 985 and traveled from the Aral Sea and Kopet-Dag region southward towards Persia. This period marks the divergence of modern Turkish and Azeri (Seljuk languages) from Turkmen (Oghuz languages).

Under Seljuk's grandson Togrul Beg, the Seljuk people migrated into eastern Persia, and by 1055 expanded their control all the way to Baghdad.

Figs: a modern artist's impression of the Battle of Manzikert (1071); a Seljuk (Oghuz) archer; The Entry of Mehmed II into Constantinople (1453), painting by Benjamin Constant (1876)

The advance of the Turkic armies caused the Byzantine emperors to desperately seek protection in Europe, thus contributing to the initiation of the European Crusades that started in 1095. It seems that the first Crusades did not really fight against Muslims as such; rather, they were directed against the Seljuk threat from the East. The fact that Mahmud al-Kashgari composed his famous Turkic dictionary by 1073 is also connected with the significance of the Oghuz language in the newly formed Seljuk Empire.

In 1071, the Seljuk Turks won the decisive Battle of Manzikert, which neutralized Byzantium and led to the foundation of the Turkic Sultanate of Rum (1077-1307) in Anatolia [from Arabic Rum /room/ "Rome", implying the Second (Eastern) Rome, or Byzantium].

The Seljuk language of this and the later period, written in Arabic script, is usually known as Old Anatolian Turkish.

The Turkish (Ottoman) Empire began to rise around 1300 and to flourish with the capture of Constantinople in 1453, the year marking the final collapse of the Byzantine Empire. The Turkish language of the 16th to 20th century, written in the Arabic script, is called Ottoman Turkish.

A rather typical feature of Turkish and Azeri is a particularly high level of long synthetic agglutinating constructions, exacerbated by a one-word orthography; such constructions can also be found in other Turkic languages but probably not to the same extent, e.g. /anla-ya-bil-mish-tir/ "supposedly, (he, she) could really understand (it)" or /doktor-du/ "(he, she) was a doctor". They can also give the impression of nouns and adjectives being conjugated like verbs. [The closest phenomenon in English would be an excessive use of spoken contractions, e.g. "sh'could've bot'em", when several words are stuck together into a single one-word-looking phrase, but in Seljuk languages they have long become synthesized into normal morphology.]

Another notable feature of Turkish and Azeri is a rather irregular loss of b- in *bol-mak > ol-mak "to be" and *bilen > ile "with".

Qashqai     g.zl     yat-       bir ikki ch drt b'sh         on

The Qashqai /kush-KY/ people have traditionally been nomadic pastoralists who lived around Shiraz in southern Iran and who had probably arrived there with the Seljuk invasion. Presently, they mostly dwell in settled households. The Qashqai are renowned for their magnificent pile carpets and other woven wool products.

Population: about 1-1.5 million.

Figs: the Qashqai people:
(1) A Qashqai wedding;
(2) Old ways still prevailing among the nomads;
(3) A Qashqai child

Azeri ayag ulduz gizl Guru,
yarpag yat- buynuz baGr ev bir iki ch drd besh alt yeddi sekkiz doqquz on

The Azerbaijani /AH-zehr-by-JAH-nee[26]/ people (informally abbreviated as Azeri) inhabit the territory southwest of the Caspian Sea along the lower course of the Kura River /koo-RAH/, and further southwest towards the Zagros mountains. They are the descendants of the Oghuz-Seljuk tribes that entered Persia by 1055 but did not migrate into Anatolia. These Seljuk tribes gradually Turkicized the North Caucasian, Northwest Iranic, Persian, and Armenian population, creating a blend of local cultures.

After a series of Russo-Persian wars (1812, 1826-28), Iran lost some of its northern territories to Russia; these territories finally became independent in 1991 as the Republic of Azerbaijan (capital: Baku /bah-KOO[26]/). The north Iranian provinces also bear similar names (East Azerbaijan, West Azerbaijan), which go back to the name of Atropates, a satrap who ruled this region of ancient Persia.

Figs: (1-2) Aida Makhmudova as an Azeri princess (staged, 2005); (3) Baku, the capital of Azerbaijan, facing the Caspian, at night; (4) Urmiyye fruit market in Iran (early 20th century)

Azerbaijani differs to some extent from Turkish (80% in Swadesh-215 with borrowings included), though both languages are still largely mutually intelligible.

Religion: Shi'a Islam. Population: 7.5 million speakers in Azerbaijan + c. 15-20 million in Iran, though many of them now speak Russian or Persian as their 2nd language.

Here is an Azeri song, Dashlï gala "Stone castle".

Turkish ayak yldz kzl kuru yaprak uyu- boynuz kara
ev bir iki ch drt besh alt yedi sekiz dokuz on

The Ottoman Empire (c. 1299-1922) was named after Osman I (1258-1326), who extended the frontiers of Seljuk settlements towards the edge of Byzantium, although Constantinople, its capital, would finally be captured by the Anatolian Turks only in 1453. In the 16th century, the Ottoman Empire reached its maximum extent, covering most of the Middle East, northern Africa, southern Russia and southeastern Europe, but slave trade and a low literacy rate were part of its society for centuries.

The Ottoman Empire entered WWI through the Ottoman-German Alliance in 1914. The occupation of Izmir in 1919 by Greek troops prompted the establishment of the Turkish national movement under the leadership of Mustafa Kemal Atatürk, who is seen as the crucial historic figure and the founder of the Republic of Turkey (capital: Ankara /AHN-karah, AN-karah[26]/). An admirer of the Enlightenment, he sought to transform the anachronistic Ottoman Empire into a modern, democratic, secular nation-state. A Latin alphabet was introduced instead of the Arabic Ottoman script to increase literacy, and the Turkish language reform was initiated to exclude excessive Arabic and Persian borrowings.

Figs: views of Istanbul (including a Turkish girl and a tram in Istanbul), except left below: Izmir

It should be explained, however, that despite the effort of the language reform that succeeded in excluding several thousand non-Turkic words, replacing them with sometimes contrived neologisms, many Arabic, Persian and French borrowings are still a completely normal part of the modern Turkish language.

In Turkish phonology, the velar-uvular /G/ is normally entirely omitted in western dialects, e.g. daG > da: "mountain", daGï > daï "the mountain" (acc. or possessive nom.), etc. The 1st person pronoun *men "I" has evolved into ben, an almost unique feature among Turkic languages. Religion: Islam. Population: circa 70 million Turkish speakers.
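
The omission of /G/ described above can be sketched as a pair of rewrite rules. This is a minimal illustration under stated assumptions: the function name drop_G and the vowel inventory VOWELS are hypothetical, vowel length is written with a colon as in the examples above, and only the two quoted cases (word-final /G/ and /G/ between vowels) are handled.

```python
import re

# Assumption: a simplified vowel inventory in this page's transcription.
VOWELS = "aeiouïöü"


def drop_G(form: str) -> str:
    """Omit /G/ as in western Turkish dialects (toy version of the rule)."""
    # Word-final G after a vowel: G is dropped and the vowel is lengthened.
    form = re.sub(rf"([{VOWELS}])G$", r"\1:", form)
    # G between two vowels: G is simply dropped.
    form = re.sub(rf"([{VOWELS}])G(?=[{VOWELS}])", r"\1", form)
    return form


if __name__ == "__main__":
    for word in ["daG", "daGï"]:
        print(word, ">", drop_G(word))
```

Run on the two forms cited in the paragraph, the sketch yields da: and daï respectively.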

What can express a Turkish soul better than a good old quaint Türkü song, such as those performed by Burcin (=Burchin)[at youtube or elsewhere]: Dane, dane (dialectal) "Your moles are like little seeds — Is there anything sweeter than the beloved one?"; Neredesin sen? "Where are you?", and a more dramatic one, Gönül daGï lit. "Soul mountain".


yldz qzl,
boynuz qara,
ev bir eki u:ch drt besh alt yedi sekiz doquz on

The Turkish migration to the Crimean Khanate during the 15th-18th centuries, when it was nominally subject to Ottoman rule (1478-1774), led to the development of the so-called southern dialect of Crimean Tatar, which was essentially "Crimean Ottoman Turkish". Presently, it has probably dissolved and intermingled with Central and Northern Crimean Tatar.

Gagauz ayaq ylds qzl quru yapraq uyu- buynus baGr ev, yev bir iki ch drt besh alti yedi sekiz dokuz on

Gagauz /gah-gah-OOZ/, often explained as Gök Oghuz > Gökouz in Turkish pronunciation that omits /G/, is the westernmost Turkic language spoken mostly in Gagauzia, a small autonomous territorial unit, formed in 1994 in Moldova, between Romania and Ukraine. Gagauzia includes only 2 towns and 27 villages.

The Gagauz people moved to this region from Bulgaria after the Russo-Turkish war (1806-1812), though their origins in Bulgaria are poorly understood. Presumably, they could have been the followers of the Seljuk Sultan Kaykaus II (1236-1276) from Anatolia or just Bulgarian Christians Turkified during the Ottoman expansion in the 16th century and afterwards.

Even more than Azeri, Gagauz is mutually intelligible with Turkish to a notable extent.

Religion: Orthodox Christianity. Population: c. 250,000 persons.


Gagauz people
The Gagauz villages in Moldova







1. The Internal Classification and Migrations of the Turkic Languages (2009-2012) [herein].

2. The Lexicostatistics and Glottochronology of the Turkic Languages (2009-2012) [herein].

3. Mongolic/Tungusic Language Cluster (2009, 2012) [herein].

4. Yu. V. Normanskaja, Rastitelnyj mir. Derevja i kustarniki. Geograficheskaja lokalizatsija prarodiny tyurkov po dannym floristicheskoj leksiki (The plant world. Trees and shrubs. The geographical localization of the Turkic homeland based on the floristic lexis data.)

5. Hugjiltu, Sound Comparisons between Turkish and Mongolian, Inner Mongolia University // Infosystem Mongolei (1995)
5a. The History and Bibliography of the Uralic, Altaic, and Ural-Altaic Research (2012) [herein]

6. Nikolay Baskakov, K voprosu o klassifikatsii tyurkskikh yazykov (On the matter of the classification of Turkic Languages) // Izvestiya AN SSSR, Otdeleniye yazyka i literatury, vol. 11/1, Moscow (1952)

7. Nikolay Baskakov, Vvedenije v izuchenije tyurkskikh jazykov (An introduction into the study of Turkic languages), Moscow (1969)

8. Sergey Starostin, Altajskaja problema i proiskhozhdenije japonskogo jazyka (The Altaic Problem and the Origins of the Japanese Language); Moscow (1991) [a dissertation that includes detailed 100-word Swadesh lists (in English) for the whole Altaic family]

9. Sravnitelno-istoricheskaja grammatika tyurkskikh jazykov. Leksika. (The Comparative Historical Grammar of the Turkic Languages. Lexis.); editorial board: E. Tenishev et al; Moscow (2002) [many lexical examples and supposed proto-forms reconstructing the life of Proto-Turks]

10. M. Dyachok, Glottokhronologija tyurkskikh jazykov (The Glottochronology of the Turkic Languages), Materials of 2nd Scientific Conference, Novosibirsk (2001)

10a. Anna Dybo, Lingvisticheskije kontakty rannikh tyurkov. Leksicheskij fond. (Linguistic Contacts of the Early Turks: the Lexical Fund), Moscow (2007) [the book includes a lexicostatistical analysis with a couple of dendrograms (in English), and a detailed analysis of early borrowings into Proto-Turkic]

10b. Oleg Mudrak, Klassifikatsija tyurkskikh jazykov i dialektov s pomosch'ju metodov glottokhronologii na osnove voprosov po morfologii i istoricheskoj fonetike (The classification of the Turkic languages and dialects based on the glottochronological methodology with a morphological and phonological questionnaire); Moscow (2009) [there is also a lecture at youtube and a brief online summary.]

12. Mike Edwards, Siberia's Scythians, Masters of Gold // National Geographic (June 2003)

12a. Brigitte Pakendorf, Contact in the Prehistory of the Sakha, Linguistic and Genetic Perspective (2007)

12b. Atlas narodov mira (The Atlas of the Peoples of the World), Moscow (1964)

12c. Khakassko-russkij slovar, composed by N. Baskakov, A. Inkizhekova-Grekul (1953)

12d. Oyrotsko-russkij slovar, composed by N. Baskakov, Toskhakova (1947)

13. A series of articles concerning the origins of the ethnonym "Khakas" (in Russian) by S. Yakhontov, V. Butanayev, S. Klyashtornyij // Ethnograficheskoje obozrenije (1992)

14. Kratkaja grammatika kazak-kirgizskogo jazyka (The brief grammar of the Kazakh-Kirgiz language), composed by P. Melioranskij, Sankt-Peterburg (1894)

15. Kumekov, B.E., Gosudarstvo kimakov IX-XI vv. po arabskim istochnikam (The Kimak State of the 9th-11th century according to the Arab sources), Alma-Ata (1972)

15a. Baskakov, N.A., Sovremennyje kypchakskije yazyki (The modern Kypchak languages), Nukus (1987)

15b. Trepavlov, V.V., Malaja Nogajskaja Orda. Ocherk Istorii (The Lesser Nogai Horde. A historical essay.) // Tyurkologicheskij sbornik 2003-2004: tyurkskije narody v drevnosti i srednevekovye, Moscow (2005)

16. Messerschmidt, D.G., Forschungsreise durch Sibirien (Dnevnik puteshestviya iz Tobolska) (The trip diary from Tobolsk), (1721-1725)

16a. Marzhanna Pomorska, Middle Chulym Noun Formation, Krakow (2004) (in English)

16b. Dialekty zapadnosibirskikh tatar (The dialects of West Siberian Tatars), Akhatov G. Kh.; avtoreferat dissertatsii [a dissertation summary], Moscow (1964)

16c. Govory sibirskikh tatar yuga tymenskoj oblasti (The dialects of the Siberian Tatars from South Tyumen Oblast), Alishina, Kh. Ch.; avtoreferat dissertatsii [a dissertation summary], Kazan (1992)

16d. Dmitriyeva, L.V., Yazyk barabinskikh tatar (materialy i issledovanija) (The language of the Baraba Tatars (materials and studies)), Leningrad (1981)

16e. Myagkov, D. A., Traditsionnoje khozyajstvo barabinskikh tatar vo vtoroj polovine XIX veka – pervoj polovine XX (The traditional economy of the Baraba Tatars from the second half of the 19th to the 1st half of the 20th century), avtoreferat dissertatsiji [a dissertation summary], Omsk (2009)

16f. Abakirov, M.Sh., Etnodemograficheskaya situatsiya u barabinskikh tatar Novosibirskoj oblasti (The ethnic and demographic situation of the Baraba Tatars in Novosibirsk Oblast) (2007)

17. Türik Bitig [a site dedicated to Orkhon-Yenisei inscriptions (translated into English)]

18. Lars Johanson, Eva A. Csato, The Turkic languages, London, New York (1998)

18a. Nicholas Poppe, Introduction to Altaic linguistics, Wiesbaden (1965)

18b. Jazyki mira: Tyurkskije jazyki (The Languages of the World: The Turkic Languages); editorial board: E. Tenishev, E. Potselujevskij, I. Kormushin, A. Kibrik, et al; The Russian Academy of Sciences (1996)

19. Mahmud al-Kashgari, The Compendium of the Turkic Dialects (c. 1073), translated by Robert Dankoff and James Kelly, (1982)

20. Classifications of Turkic Languages by various authors; a classification of Turkic Languages by Baskakov (1969) (in Russian)

21. 200-word Swadesh lists for Turkic languages [see also a more elaborate version of Swadesh-215 in The Lexicostatistics and Glottochronology of the Turkic Languages]

22. Talat Tekin, Türk Dilleri Ailesi (The Turkic Language Family) // Genel Dilbilim Dergisi, Vol. 2, pp. 7-8, Ankara (1979) (in Turkish)

23. Frier Iohn de Plano Carpini, The long and wonderful voyage of Frier Iohn de Plano Carpini (1245-46)

23a. The Secret History of the Mongols, c. 1240, translation by F. W. Cleaves (1982) from the Mongolian original

23b. Aus Sibirien. Lose Blätter aus meinem Tagebuche (From Siberia: Torn pages from my diary), Wilhelm Radloff, Leipzig, 1893

24. Sevda Sulejmanova, Istorija tyurkskikh narodov (The history of the Turkic peoples), Baku (2009)

24a. Sevan Nishanyan, Çağdaş Türkçenin Etimolojik Sözlüğü (The Modern Etymological Dictionary of Turkish) (2002-12)

24b. The Encyclopedia Iranica

24c. Pauli, Fyodor Khristoforovich, Description ethnographique des peuples de la Russie, Saint-Petersburg (1862)

24d. Okonchatelnyje itogi vserossijskoj perepisi naselenija 2010 goda (The final results of the population census of Russia (2010))

25. Brockhaus and Efron Encyclopedic Dictionary, Saint Petersburg (1906)

26. Webster's New World Dictionary of the American Language. Second College Edition, Editor-in-Chief: David Guralnik, Prentice Hall Press (1986)








2009-2014 (c)



