In historical linguistics, the homeland or Urheimat (from German ur- "original" and Heimat "home") of a proto-language is the region in which it was spoken before it split into daughter languages. A proto-language is the reconstructed or historically attested parent language of a group of genetically related languages.
Depending on the age of the language family under consideration, its homeland may be known with near-certainty (in the case of historical or near-historical migrations) or may be very uncertain (in the case of deep prehistory). Besides internal linguistic evidence, the reconstruction of a prehistoric homeland draws on a variety of disciplines, including archaeology and archaeogenetics.
There are several methods to determine the homeland of a given language family. One method is based on the vocabulary that can be reconstructed for the proto-language. This vocabulary – especially terms for flora and fauna – can provide clues to the geographical and ecological environment in which the proto-language was spoken. An estimate of the time depth of the proto-language is necessary in order to account for prehistoric changes in climate and in the distribution of flora and fauna.
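The vocabulary-based reasoning can be sketched as a set intersection: if the proto-language had words for several plants and animals, the homeland should lie where all of those referents occurred at the estimated time depth. The following is a minimal illustration only. The function name, the referent labels and the region names are all hypothetical and do not describe any real language family or species distribution.

```python
def compatible_regions(referent_ranges):
    """Return the regions in which every referent occurred.

    `referent_ranges` maps each referent with a reconstructed name to the
    set of regions where it is assumed to have occurred at the relevant
    time depth (hypothetical input, supplied by the user).
    """
    ranges = [set(regions) for regions in referent_ranges.values()]
    if not ranges:
        return set()
    # A region is compatible only if it lies inside every referent's range.
    return set.intersection(*ranges)


# Hypothetical toy data: three referents and the regions where each occurred.
TOY_RANGES = {
    "tree_x": {"region_west", "region_centre"},
    "fish_y": {"region_north", "region_centre", "region_west"},
    "insect_z": {"region_centre", "region_south", "region_west"},
}

print(sorted(compatible_regions(TOY_RANGES)))
```

In this toy case only region_centre and region_west are compatible with all three terms. The result depends entirely on the assumed ranges, which is why the estimate of time depth matters: a range drawn for the wrong period can exclude the true homeland.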
Another method is based on the linguistic migration theory (first proposed by Edward Sapir), which states that the most likely candidate for the last homeland of a language family is the area of its highest linguistic diversity. This presupposes an established view of the internal subgrouping of the language family. Different assumptions about higher-order subgrouping can thus lead to very divergent proposals for a linguistic homeland (e.g. Isidore Dyen's proposal for New Guinea as the center of dispersal of the Austronesian languages). The linguistic migration theory has its limits, because it works only when linguistic diversity evolves continuously without major disruptions. Its results can be distorted, for example, when this diversity is wiped out by more recent migrations.
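The diversity criterion, and its dependence on subgrouping, can be illustrated with a small sketch. Here diversity is measured simply as the number of primary branches attested in each region, which is one possible simplification and not an established metric. The function names, languages, branches and regions are all hypothetical.

```python
from collections import defaultdict


def branch_diversity(languages):
    """Count the distinct primary branches attested in each region.

    `languages` is an iterable of (language, primary_branch, region) tuples.
    """
    branches_by_region = defaultdict(set)
    for _name, branch, region in languages:
        branches_by_region[region].add(branch)
    return {region: len(b) for region, b in branches_by_region.items()}


def candidate_homelands(languages):
    """Return the region(s) with the most primary branches, sorted by name."""
    diversity = branch_diversity(languages)
    if not diversity:
        return []
    top = max(diversity.values())
    return sorted(region for region, n in diversity.items() if n == top)


def regroup(languages, subgrouping):
    """Apply an alternative language-to-branch assignment."""
    return [(name, subgrouping[name], region) for name, _b, region in languages]


# Hypothetical toy family under one assumed classification.
TOY_FAMILY = [
    ("lang_a", "branch_1", "region_north"),
    ("lang_b", "branch_2", "region_north"),
    ("lang_c", "branch_3", "region_north"),
    ("lang_d", "branch_3", "region_south"),
    ("lang_e", "branch_3", "region_east"),
    ("lang_f", "branch_3", "region_east"),
]

# A competing, equally hypothetical classification of the same languages.
ALT_SUBGROUPING = {
    "lang_a": "branch_x", "lang_b": "branch_x", "lang_c": "branch_x",
    "lang_d": "branch_x", "lang_e": "branch_y", "lang_f": "branch_z",
}

print(candidate_homelands(TOY_FAMILY))
print(candidate_homelands(regroup(TOY_FAMILY, ALT_SUBGROUPING)))
```

Under the first classification the northern region hosts three primary branches and is the candidate. Under the second, the same languages in the same places point to the eastern region. Nothing about the geography has changed, only the assumed tree.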
The concept of a (single, identifiable) "homeland" of a given language family implies a purely genealogical view of the development of languages. This assumption is often reasonable and useful, but it is by no means a logical necessity, as languages are well known to be susceptible to areal change such as substrate or superstrate influence.
Over a sufficient period of time, in the absence of evidence of intermediary steps in the process, it may be impossible to observe linkages between languages that have a shared Urheimat: given enough time, natural language change will obliterate any meaningful linguistic evidence of a common genetic source. This general concern is a manifestation of the larger issue of "time depth" in historical linguistics.
For example, the languages of the New World are believed to be descended from a relatively "rapid" peopling of the Americas (relative to the duration of the Upper Paleolithic) within a few millennia (roughly between 20,000 and 15,000 years ago), but their genetic relationship has become completely obscured over the more than ten millennia which have passed between their separation and their first written record in the early modern period. Similarly, the Australian Aboriginal languages are divided into some 28 families and isolates for which no genetic relationship can be shown.
The Urheimaten reconstructed using the methods of comparative linguistics are typically associated with estimated separation times dating to the Neolithic or later. It is undisputed that fully developed languages were present throughout the Upper Paleolithic, and possibly into the deep Middle Paleolithic (see origin of language, behavioral modernity). These languages would have spread with the early human migrations of the first "peopling of the world", but they are no longer amenable to linguistic reconstruction. The Last Glacial Maximum (LGM) imposed linguistic separation lasting several millennia on many Upper Paleolithic populations in Eurasia, as they were forced to retreat into "refugia" before the advancing ice sheets. After the end of the LGM, Mesolithic populations of the Holocene again became more mobile, and most of the prehistoric spread of the world's major linguistic families seems to reflect the expansion of population cores during the Mesolithic, followed by the Neolithic Revolution.
The Nostratic theory is the best-known attempt to extend the deep prehistory of the main language families of Eurasia (excepting Sino-Tibetan and the languages of Southeast Asia) to the beginning of the Holocene. First proposed in the early 20th century, the Nostratic theory still receives serious consideration, but it is by no means generally accepted. The more recent and more speculative "Borean" hypothesis attempts to unite Nostratic with Dené–Caucasian and Austric in a "mega-phylum" comprising most languages of Eurasia, with a time depth going back to the Last Glacial Maximum.
The debate over a "Proto-Human language", finally, is almost completely detached from linguistic reconstruction and instead concerns questions of phonology and the origin of speech. The time depths involved in the deep prehistory of all the world's extant languages are on the order of at least 100,000 years.
The concept of an Urheimat applies only to populations speaking a proto-language defined by the tree model. Not every language history fits that model.
For example, in places where language families meet, the relationship between a group that speaks a language and the Urheimat for that language is complicated, since "processes of migration, language shift and group absorption are documented by linguists and ethnographers" in groups that are themselves "transient and plastic." Thus, in the contact area in western Ethiopia between languages belonging to the Nilo-Saharan and Afroasiatic families, the Nilo-Saharan-speaking Nyangatom and the Afroasiatic-speaking Daasanach have been observed to be closely related to each other but genetically distinct from neighboring Afroasiatic-speaking populations. This reflects the fact that the Daasanach, like the Nyangatom, originally spoke a Nilo-Saharan language, with the ancestral Daasanach adopting an Afroasiatic language around the 19th century.
Creole languages are hybrids of languages that are sometimes unrelated. Similarities among creoles arise from the creole formation process rather than from genetic descent. For example, a creole language may lack significant inflectional morphology, lack tone on monosyllabic words, or lack semantically opaque word formation, even if these features are found in all of the parent languages from which the creole was formed.
Some languages are language isolates. That is to say, they have no well-accepted language family connection, no nodes in a family tree, and therefore no known Urheimat. An example is the Basque language of northern Spain and southwestern France. Nevertheless, all languages evolve, so an isolate must also have had earlier stages. An unknown Urheimat may still be hypothesized, such as that for a Proto-Basque, and may be supported by archaeological and historical evidence.
Sometimes relatives are found for a language originally believed to be an isolate. An example is the Etruscan language, which, even though only partially understood, is believed to be related to the Rhaetic language and to the Lemnian language. An entire family may also be an isolate. In the case of the non-Austronesian indigenous languages of Papua New Guinea and the indigenous languages of Australia, there is no published linguistic hypothesis supported by any evidence that these languages have links to any other families. Nevertheless, an unknown Urheimat is implied. In this sense the entire Indo-European family is itself an isolate: no further connections are known. This lack of information does not prevent some professional linguists from formulating additional hypothetical nodes (Nostratic) and additional homelands for the speakers.