Monday, December 23, 2013

Ancient human genomes suggest (more than) three ancestral populations for present-day Europeans

This new preprint at bioRxiv is quite the Christmas present for those of us with a passion for European genetics and prehistory. It's the first paper to report on full genomes from Mesolithic and Neolithic Europe.

All of the successfully tested Mesolithic Y-chromosomes, one from Luxembourg and four from Motala, Sweden, belonged to haplogroup I. This probably won't come as a surprise to many people, as this marker was always the main candidate for Europe's indigenous Y-haplogroup. However, three of the results fell into haplogroup I2a1b, and none into I1, which is today the most common Y-haplogroup in most of Scandinavia.

What this suggests is that I1 expanded after the Mesolithic and replaced most of the I2a1b across Northwestern Europe. I'd say these were mostly expansions from North-Central Europe, although recent chatter on the web suggests that two distinct I1 lineages might have arrived in North-Central Europe from Eastern Europe at different times.

All of the Mesolithic mtDNA sequences belonged to haplogroups U2 and U5, which is in line with past results. The single Neolithic sample, from a 7500 year-old Linearbandkeramik (LBK) site in Stuttgart, Germany, belonged to mtDNA haplogroup T2. Again, not very surprising considering what we've seen to date.

The genome-wide results, on the other hand, are not as straightforward. The basic upshot is that Northern Europeans are mostly of indigenous European hunter-gatherer origin, while Southern Europeans are largely derived from Neolithic farmers of mixed European and Near Eastern origin. But the authors identify a minimum of three ancestral populations from their stats (WHG, EEF and ANE), and four meta-populations from the available ancient data (WHG, EEF, ANE and SHG). Here are brief summaries of each of these groups:

West European Hunter-Gatherer (WHG): this ancestral component is based on an 8,000 year-old forager from the Loschbour rock shelter in Luxembourg (one of the individuals mentioned above belonging to I2a1b). The WHG meta-population includes the Loschbour sample and two Mesolithic individuals from the La Brana Cave in Spain. However, today the WHG component peaks among Estonians and Lithuanians, in the East Baltic region, at almost 50%.

Early European Farmer (EEF): apparently this is a hybrid component, the result of mixture between "Basal Eurasians" and a WHG-like population possibly from the Balkans. It's based on the aforementioned LBK farmer from Stuttgart, but today peaks at just over 80% among Sardinians. Apart from the Stuttgart sample, the EEF meta-population includes Oetzi the Iceman and a Neolithic Funnelbeaker farmer from Sweden.

Ancient North Eurasian (ANE): this is the twist in the tale, a component based on a previously reported genome of a 24,000 year-old Upper Paleolithic forager from South Central Siberia, belonging to Y-hg R*, and known as Mal'ta boy or MA-1 (see here). This component was very likely present in Southern Scandinavia since at least the Mesolithic (see the summary of SHG below), but only seems to have reached Western Europe after the Neolithic. At some point it also spread into the Americas. In Europe today it peaks among Estonians at just over 18%, and, intriguingly, reaches a similar level among Scots. However, numbers weren't given for Finns, Russians and Mordovians, who, according to one of the maps, also carry very high ANE, but their results are confounded by more recent Siberian admixture (see the discussion on the European outliers below). The ANE meta-population includes Mal'ta boy as well as a late Upper Paleolithic sample from Central Siberia, dubbed Afontova Gora-2 (AG2).

Scandinavian Hunter-Gatherer (SHG): this is a meta-population made up of Swedish Mesolithic and Neolithic forager samples from Motala and Gotland, respectively. It's a more easterly variant of WHG, with probable ANE admixture.

Below are the two most important figures from the paper: a) the three-way mixture model that is a statistical fit to the data, and b) a plot of the proportions of ancestry from each of the three inferred ancestral populations. As per above, East Baltic populations are the most WHG, which is somewhat curious, because they mostly carry Y-DNA R1a and N1c1.

So if not for the ANE, we'd simply have a two-way mixture model between indigenous European foragers and migrant Near Eastern farmers, at least for most Europeans anyway. Moreover, the seemingly late and sudden arrival of ANE in much of Europe is important, because it's a smoking gun for a major population upheaval across the continent during the Late Neolithic/Early Bronze Age.

Interestingly, archeological data suggest that this was also the period which saw the introduction of new social organization and perhaps Indo-European languages across most of Europe. None of this was lost on the authors of the paper, but it appears they'd rather be cautious pending more ancient genomic data, because they chose not to explicitly mention the Indo-Europeans.

This study raises two questions that are important to address in future research. A first is where the EEF picked up their WHG ancestry. Southeastern Europe is a candidate as it lies along the geographic path from Anatolia into central Europe, and hence it should be a priority to study ancient samples from this region. A second question is when and where ANE ancestors admixed with the ancestors of most present-day Europeans. Based on discontinuity in mtDNA haplogroup frequencies in Central Europe, this may have occurred during the Late Neolithic or early Bronze Age ~5,500-4,000 years ago35. A central aim for future work should be to collect transects of ancient Europeans through time and space to illuminate the history of these transformations.


The absence of Y-haplogroup R1b in our two sample locations is striking given that it is, at present, the major west European lineage. Importantly, however, it has not yet been found in ancient European contexts prior to a Bell Beaker burial from Germany (2,800-2,000BC)12, while the related R1a lineage has a first known occurrence in a Corded Ware burial also from Germany (2,600BC)13. This casts doubt on early suggestions associating these haplogroups with Paleolithic Europeans14, and is more consistent with their Neolithic entry into Europe at least in the case of R1b15, 16. More research is needed to document the time and place of their earliest occurrence in Europe. Interestingly, the Mal’ta boy belonged to haplogroup R* and we tentatively suggest that some haplogroup R bearers may be responsible for the wider dissemination of Ancient North Eurasian ancestry into Europe, as their haplogroup Q relatives may have plausibly done into the Americas17.

No doubt, a lot of people will now be wondering about the main source of the ANE that apparently rushed into Europe at the onset of the metal ages. The Siberian steppe will probably be the favored option for many, since this is where Mal'ta boy and Afontova Gora-2 were dug up. However, I'm pretty sure the source was Eastern Europe.

First of all, as already mentioned, it seems that ANE was present in Sweden during the Mesolithic (Figure S12.7 shows around 19% ANE in the Motala12 sample). Secondly, despite the ANE and WHG being classified as separate ancestral and meta-populations, the differences between them appear to be clinal rather than discrete, which I think can be seen in the PCA and ADMIXTURE results from the study (see here and here). Thus, I'd expect a lot more ANE in Eastern Europe during the Mesolithic than in Scandinavia. Thirdly, it's likely that the ancestors of modern Uralic speakers were in Siberia very early, possibly during the Mesolithic, and they were probably East Eurasians, aka Eastern non-Africans (ENA), which ANE is not.

Indeed, latest linguistics research suggests that the pre-proto-Uralics migrated at some point from Siberia into the southern Urals, in far eastern Europe. The Uralics proper then expanded from the southern Urals, probably during the Bronze Age, both to the east and west, as far as the Baltic (see here). This Uralic expansion is certainly reflected in the Lazaridis et al. data, and it's not the only relatively late migration into Europe that shows up in their stats.

While our three-way mixture model fits the data for most European populations, two sets of populations are poor fits. First, Sicilians, Maltese, and Ashkenazi Jews have EEF estimates beyond the 0-100% interval (SI13) and they cannot be jointly fit with other Europeans (SI12). These populations may have more Near Eastern ancestry than can be explained via EEF admixture (SI13), an inference that is also suggested by the fact that they fall in the gap between European and Near Eastern populations in the PCA of Fig. 1B. Second, we observe that Finns, Mordovians, Russians, Chuvash, and Saami from northeastern Europe do not fit our model (SI12; Extended Data Table 3). To better understand this, for each West Eurasian population in turn we plotted f4(X, Bedouin2; Han, Mbuti) against f4(X, Bedouin2; MA1, Mbuti), using statistics that measure the degree of a European population’s allele sharing with Han Chinese or MA1 (Extended Data Fig. 7). Europeans fall along a line of slope >1 in the plot of these two statistics. However, northeastern Europeans fall away from this line in the direction of Han. This is consistent with Siberian gene flow into some northeastern Europeans after the initial ANE admixture, and may be related to the fact that Y-chromosome haplogroup N 30, 31 is shared between Siberian and northeastern Europeans32, 33 but not with western Europeans. There may in fact be multiple layers of Siberian gene flow into northeastern Europe after the initial ANE gene flow, as our analyses reported in SI 12 show that some Mordovians, Russians and Chuvash have Siberian-related admixture that is significantly more recent than that in Finns (SI12).

The authors are actually referring to the Kargopol Russians from the HGDP in that quote. But from my own analyses with a wide variety of samples from Russia, I know that other Russians show similar levels of Siberian admixture to Belorussians, Ukrainians and Estonians.
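For readers who haven't met f4 statistics before, here's a minimal sketch of the idea behind the two statistics in the quote. It assumes the plain textbook estimator, in which f4(A, B; C, D) is the average over SNPs of (a - b) * (c - d), where the lowercase letters are allele frequencies. The frequencies below are made up for illustration and have nothing to do with the real Lazaridis et al. data, and the function name is my own.

```python
from statistics import mean


def f4(freq_a, freq_b, freq_c, freq_d):
    """Plain f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one per SNP,
    with the SNPs in the same order for all four populations.
    """
    if not (len(freq_a) == len(freq_b) == len(freq_c) == len(freq_d)):
        raise ValueError("all four populations need the same set of SNPs")
    return mean(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    )


# Hypothetical allele frequencies at four SNPs (illustration only).
x = [0.10, 0.50, 0.90, 0.30]         # test population X
bedouin2 = [0.20, 0.40, 0.70, 0.30]  # Near Eastern reference
han = [0.05, 0.80, 0.95, 0.60]       # East Asian reference
ma1 = [0.00, 1.00, 1.00, 0.00]       # single ancient genome, coded 0/1
mbuti = [0.30, 0.20, 0.50, 0.40]     # African outgroup

# The two axes of the kind of plot described in the quote.
f4_han = f4(x, bedouin2, han, mbuti)
f4_ma1 = f4(x, bedouin2, ma1, mbuti)
print(f4_han, f4_ma1)
```

A positive value means that, relative to Bedouin2, population X shares more alleles with the third population than with Mbuti. Plotting the Han version against the MA1 version for many populations gives the sort of scatter in which, according to the quote, northeastern Europeans are pulled off the main line towards Han.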

In any case, this of course means that there are more than three ancestral populations for present-day Europeans, albeit not all of them influenced all Europeans. Also, it's very clear that to learn all the details about the peopling of Europe, these sorts of studies really need to start focusing on the large swath of land that stretches from present-day Poland to the Urals. In other words, Eastern Europe.

I was also going to discuss the genetically inferred pigmentation of the ancient individuals, but, because of the small sample size, there's not much to discuss at this stage. The Loschbour forager possibly had blue eyes (50% chance), but dark hair and skin. On the other hand, the Stuttgart farmer definitely had dark eyes and hair, but relatively light skin. I wonder if this swarthy hunter-gatherer skin complexion has anything to do with the fact that today lots of people from around the Baltic tan really well?


Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans, bioRxiv, Posted December 23, 2013, doi: 10.1101/001552

See also...

Another look at the Lazaridis et al. ancient genomes preprint

The really old Europe is mostly in Eastern Europe

EEF-WHG-ANE test for Europeans

Scratch the North Caucasus

First genome of an Upper Paleolithic human

ADMIXTURE analysis of Allentoft et al. and Haak et al. ancient genomes

Monday, December 16, 2013

Cluster analysis of West Eurasia: 13 clusters from 18 dimensions

I ran a quick Mclust analysis to get a better idea of the substructures in my recently updated dataset of West Eurasian samples. Mclust found that the optimal outcome was produced with 18 dimensions of genetic variation and 13 clusters, the latter of which are superimposed on a two-dimensional MDS plot below. I chose the labels for the clusters myself and flipped the canvas to fit geography.

Here you can see the 13 clusters superimposed on all possible combinations of the 18 dimensions. Clicking on the image will take you to a 10.3MB PDF file.

It's interesting to note the presence of the very tight Jewish cluster, which includes Ashkenazi, Sephardic and Moroccan Jews. The Basques and Sardinians also cluster together, despite being clearly distinct from each other in the first two dimensions. This is fascinating because these two groups have been mentioned a few times now in various studies and presentations as being the best modern proxies for Europe's Neolithic farmers.

The widespread Central and Eastern European cluster mostly includes individuals from populations that aren't easily characterized in these sorts of tests, and that's basically because they're of mixed origin. Indeed, I suspect things would look somewhat different in that part of the plot if I had more sizable samples from Germany, Scandinavia, Poland and nearby areas.

Mclust can produce many more clusters than just 13 from the same data, but as per above, I wanted to see what would happen if it was asked to come up with the optimal solution. For more on this type of analysis check out the articles here, here and here.
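To give a rough idea of what "optimal" means here: the mclust R package ranks candidate models by the Bayesian Information Criterion, computed as 2 * log-likelihood minus the number of parameters times log(n), and prefers the largest value. The sketch below only illustrates that scoring step. The log-likelihoods, parameter counts and sample size are invented for the example, not taken from my run, and the function names are my own.

```python
import math


def mclust_style_bic(log_likelihood, n_parameters, n_observations):
    """BIC in the larger-is-better convention: 2*logL - k*log(n)."""
    return 2.0 * log_likelihood - n_parameters * math.log(n_observations)


def pick_best(candidates, n_observations):
    """Return (best_k, scores) for a dict of k -> (logL, n_parameters)."""
    scores = {
        k: mclust_style_bic(log_lik, n_par, n_observations)
        for k, (log_lik, n_par) in candidates.items()
    }
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical fits for three cluster counts (illustration only).
toy_candidates = {
    12: (-4100.0, 300),
    13: (-3950.0, 325),
    14: (-3930.0, 350),
}
best_k, bic_scores = pick_best(toy_candidates, n_observations=500)
print(best_k, bic_scores)
```

Adding a cluster always improves the raw likelihood a little, but the penalty term grows with every extra parameter, which is why the procedure settles on a finite number of clusters rather than splitting the samples indefinitely.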

Update 17/12/2013: On a related note, here's an Mclust analysis of West, Central and South Asia. The optimal result was obtained with 10 dimensions and 14 clusters. Please note that although some of the clusters have the same names as in the analysis above, they aren't the same clusters.

See also...

Principal component analysis (PCA) of West Eurasia

Multidimensional views of South Asia, West Asia and Eastern Europe

Eurogenes' North Euro clusters - phase 2, final results

Wednesday, December 11, 2013

La Brana 1 had blue eyes

Update 27/01/2014: Mesolithic genome from Spain reveals markers for blue eyes, dark skin and Y-haplogroup C6.


Last year Current Biology put out a paper on the partial genome sequences of two Mesolithic Iberian hunter-gatherers, dubbed La Brana 1 and 2, which showed that they were genetically more similar to modern-day Northern Europeans than Iberians. According to a Spanish news portal, the genome of La Brana 1 has now been fully sequenced, and the more comprehensive new data not only back up the initial findings, but also suggest that this individual had blue eyes:

El mesolítico 'leonés' afín al ciudadano del norte de Europa

As per the link above, the new paper will be published in a few weeks. I suppose this means we'll finally see a Y-chromosome haplogroup result from pre-Neolithic Europe. I'm betting on hg R, considering that this was the marker of the Mal'ta boy from Upper Paleolithic South Siberia (see here). Siberia might seem like a long way from Iberia, but in fact, for thousands of years both regions were connected by the Mammoth Steppe, which was inhabited by highly mobile herds of animals and human hunters who followed them. However, I won't be surprised if it turns out that La Brana 1 belonged to hg I or even Q.

See also...

Ancient DNA from Iberian Mesolithic hunter-gatherers

Thursday, November 21, 2013

First genome of an Upper Paleolithic human

A new paper at Nature reports on the genome of a 24,000 year-old Siberian known as Mal'ta boy or MA-1. Here's the abstract:

The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians1, 2, 3, there is no consensus with regard to which specific Old World populations they are closest to 4, 5, 6, 7, 8. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia9, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers10, 11, 12, and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages5. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians2, 13. Sequencing of another south-central Siberian, Afontova Gora-2 dating to approximately 17,000 years ago14, revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum. 
Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.

Indeed, MA-1 looks like he could be an early ancestor of present-day West Eurasians, including and especially Europeans. Mitochondrial haplogroup U was almost fixed in Upper Paleolithic and Mesolithic Europe, while R1a and R1b are, after all, the most common and widespread Y-chromosome haplogroups in Europe today.

Below is the bar graph from the K=9 ADMIXTURE analysis, which turned out to be the optimal run. Note that the Mal'ta sample appears mostly South Asian (37%), European (34%), and Amerindian (26%), but also with minor Oceanian ancestry (4%). Interestingly, among the Europeans, it's the groups from Northern and Eastern Europe that carry the highest levels of these components. This is probably a reflection, at least in large part, of their elevated indigenous European hunter-gatherer ancestry (for instance, see here).

At K = 9, MA-1 is composed of five genetic components of which the two major ones make up ca. 70% of the total. The most prominent component is shown in green and is otherwise prevalent in South Asia but does also appear in the Caucasus, Near East or even Europe. The other major genetic component (dark blue) in MA-1 is the one dominant in contemporary European populations, especially among northern and northeastern Europeans. The co-presence of the European-blue and South Asian green in MA-1 can be interpreted as admixture of the two in MA-1 or, alternatively, MA-1 could represent a proto-western Eurasian prior to the split of Europeans and South Asians. This analysis cannot differentiate between these two scenarios. Most of the remaining nearly one third of the MA-1 genome is comprised of the two genetic components that make up the Native American gene pool (orange and light pink). Importantly, MA-1 completely lacks the genetic components prevalent in extant East Asians and Siberians (shown in dark and light yellow, respectively). Based on this result, it is likely that the current Siberian genetic landscape, dominated by the genetic components depicted in light and dark yellow (Figure SI 6), was formed by secondary wave(s) of immigrants from East Asia.

Here's a figure showing the levels of shared genetic drift between MA-1 and 147 present-day non-African populations. Among the Europeans it's the Lithuanians, Northwestern Russians and Baltic and Volga Finns who are most similar to the ancient sample. It's also interesting to note the relatively high position on the list of the Kalash from South Central Asia and Lezgins from the North Caucasus. At the bottom are Bedouins and Palestinians, mainly because of their non-trivial Sub-Saharan admixture, followed by Oceanians, East Asians, and South Indians, probably due to deep differentiation between their main ancestral clades and that of MA-1.

I've heard that the same team of scientists is now trying to sequence genomes from Upper Paleolithic sites west of Mal'ta. I wonder how far west? I see that the authors mention the Sungir site from near Moscow a couple of times in the paper, in relation to its similarity to the Mal'ta site. Perhaps they're working on a Sungir genome right now? If so, what's the bet that the Y-DNA turns out to be another basal R?


Raghavan et al., Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, Nature, (2013), Published online 20 November 2013, doi:10.1038/nature12736

Saturday, November 16, 2013

mtDNA haplogroup U5a link between Eastern Europe and Iran

I'm reading a new paper at PLoS ONE on the mitochondrial DNA of Iranians. It's the first study to tackle the topic of Iranian maternal ancestry using complete mtDNA sequences. Here are a couple of quotes that caught my eye:

Between the third and second millennia BCE the Iranian Plateau became exposed to incursions of pastoral nomads from the Central Asian steppes, who brought the Indo-Iranian language of the Indo-European family, which eventually replaced Dravidian languages, perhaps by an elite-dominance model [13,17,20].


The U5a1a’g cluster itself (based on HVS1 sequence data) is concentrated in populations of the Pontic-Caspian steppe, extending from Romania, Ukraine, southern Russia and northwestern Kazakhstan to the Ural Mountains. The highest frequencies of the U5a1a’g were reported in the Volga-Ural region (5.3%), in particular in Bashkirs (4.3%) and Tatars (3.9%) [75], although the frequency varies from ~2.7% in Russians to ~1.5% in populations of the northern Caucasus [64,76–81]. It is worth mentioning that despite the low frequency of U5a1a’g haplotypes in Central Asian populations of Turkmens, Karakalpaks, Kazakhs and Uzbeks (~1.5% according to the data of [82]), some haplotypes were common between Karakalpaks (haplotype marked by mutation at np 16293), Turkmens (by mutation at np 64) and Iranians. So, it seems likely that the sub-cluster U5a1g or its founder has arrived to Iran from Eastern Europe/southern Ural via the Caspian Sea coastal route.

Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M, et al. (2013) Complete Mitochondrial DNA Diversity in Iranians. PLoS ONE 8(11): e80673. doi:10.1371/journal.pone.0080673

Thursday, November 14, 2013

The story of R1b: it's complicated

Ancient DNA is painting a remarkable picture of the period of European prehistory known as the Late Neolithic/Early Bronze Age. It's showing that after the collapse of genetically Near Eastern-like farming populations of middle Neolithic Central Europe - probably as a result of climate fluctuations, disease, famine and increasing violence - the vacuum was filled by genetically much more European-like groups from the eastern and western peripheries of Neolithic Europe.

First came the settlers from the east, belonging to the vast archeological horizon known as the Corded Ware Culture (CWC). About three hundred years later they were joined in Central Europe by migrants from the Atlantic Fringe, belonging to the Bell Beaker Culture (BBC). During the early Bronze Age, the CWC disappeared, and was replaced by the Unetice Culture (UC), which briefly overlapped with the late BBC.

Ancient DNA recovered to date suggests that the Bell Beakers were genetically the archetypal Western Europeans, characterized by Western European-specific mtDNA H subclades and Y-chromosome haplogroup R1b. Interestingly, R1b has also been found among remains of aboriginals from the Canary Islands, just off the coast of northwest Africa. It might be a stretch to attribute this directly to the Bell Beakers, but they were certainly capable sailors, so perhaps not?

On the other hand, the CWC and UC populations appear to have been Eastern Europeans to the core, carrying relatively low levels of mtDNA H, and showing strong mtDNA affinity to Bronze Age Kurgan groups of Kazakhstan and South Siberia.

Here are a couple of figures from recent studies, Brandt et al. and Brotherton et al., respectively, illustrating much of what I just said.

So it seems everything is falling into place, with ancient DNA, archeology, and modern European genetic substructures all showing basically the same phenomenon.

However, for a while now the ever more precise present-day phylogeography of R1b has been hinting that this haplogroup might not have expanded across Europe from the west. That's because the most basal clades of R1b are found in West Asia, and its SNP diversity decreases sharply from east to west in Europe. Below is a schematic of the latest phylogeography of R1b. It was presented at the recently held 9th Annual International Conference on Genetic Genealogy by University of Arizona population geneticist Dr. Michael Hammer.

And here is another map shown by Hammer at the same conference, illustrating the frequencies of various R1b subclades across Europe.

I didn't see the presentation, so I don't know what Hammer actually said. But it appears as if his theory is that R1b spread across Europe from the Balkans during the late Neolithic or later, and then exploded in-situ from certain areas of Central and Western Europe during the metal ages. If true, this scenario obviously doesn't match the presumed west to east expansion of the Bell Beakers.

But here's yet another slide from Hammer's talk, which shows the frequency peaks of the most common European subclades of R1b: U106, L21 and U152. Curiously, these peaks are all located in and around former Bell Beaker territory (second image below, from Wikipedia).

Admittedly, we only have two Y-chromosome results from Bell Beaker remains, both from the same site in Germany dated to around 4500 YBP, but both belong to R1b. Based on that, plus all of the indirect evidence outlined above, it's already very difficult to shake the association between the Bell Beakers and R1b. So I'm thinking there are three possible explanations why the latest R1b phylogeography doesn't support a Bell Beaker-driven expansion of this haplogroup in Europe.

1) The current mainstream theory positing the origin of the Bell Beaker Culture in Portugal is wrong, and the earliest Bell Beakers expanded from East Central Europe, as was once thought.

2) The latest R1b phylogeography is based on limited sampling, and many more individuals need to be tested from former Bell Beaker areas in Iberia and France to catch the basal R1b subclades in these regions.

3) The people who were to become the Bell Beakers in Iberia originally came from the southern Balkans, via maritime routes across the Mediterranean, and then dominated Western and Central Europe via a series of migrations and back migrations. The latest R1b phylogeography is simply not intricate enough to properly describe this complicated process.

The first option basically ignores ancient mtDNA data which shows that the Bell Beakers of Central Europe were of Iberian origin, at least in terms of maternal ancestry. So for now, I'm going with the third option, and looking forward to more ancient DNA results.

A lot can be said about what might have pushed the Balkan proto-Bell Beakers to Western Europe during the late Neolithic, if they actually existed. At the time Bulgaria was being invaded by steppe nomads from just north of the Black Sea, and its agricultural communities were disappearing rapidly. I suppose the ancestors of the Bell Beakers might have been refugees trying to escape these nomads. Then again, perhaps they were the descendants of the nomads who learned to sail after reaching the Mediterranean? I might revisit the issue when I have more data to work with.


Michael Hammer, Origins of R-M269 Diversity in Europe, University of Arizona, FamilyTreeDNA, 9th Annual Conference

Guido Brandt, Wolfgang Haak et al., Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity, Science 11 October 2013: Vol. 342 no. 6155 pp. 257-261 DOI: 10.1126/science.1241844

Brotherton et al., Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans, Nature Communications 4, Article number: 1764, Published 23 April 2013, doi:10.1038/ncomms2656

See also...

Latest speculation about R1b

High mtDNA affinity between Bronze Age Minoans and Western Europeans

Thursday, October 3, 2013

Tracing the Indo-Europeans

As far as I can tell, these videos came online only a few weeks ago. They're from a conference titled "Tracing the Indo-Europeans: Origin and migration", which was held in Copenhagen late last year as part of the Roots of Europe project. I've had a quick look at the selection below, and the impression I get is that the guest speakers would rather eat shards of glass than accept an Anatolian homeland for the Indo-Europeans (i.e. the so-called Anatolian hypothesis). Also, interestingly, in the last video, Kristian Kristiansen discusses the possibility that the origin of the Maritime Bell Beaker culture was in the Aegean region, and that it might have been Proto-Celtic. If the sound is too low, use VLC Media Player and crank up the volume.

Adam Hyllested: Indo-European homeland and dispersals: Contemporary linguistic evidence

Guus Kroonen: The linguistic heritage of the European Neolithic: Non-Indo-European words in Germanic

David Anthony: Early Indo-European migrations, economies, and phylogenies

Kristian Kristiansen: Trade, travels and the transmission of cultural change in the Bronze Age

Morten Allentoft: Using ancient DNA to study human evolution and migration

David Anthony: Wheeled vehicles, horses, and Indo-European origins

Kristian Kristiansen: The Bronze Age expansion of Indo-European languages

See also...

Hundreds of prehistoric North European skeletons to be genotyped for Y-DNA, mtDNA and autosomal DNA

Sunday, September 15, 2013

European-specific mtDNA C from prehistoric Ukraine

Maju points me to a thesis on ancient DNA from Ukraine which has recently become available to the public. I blogged about this paper when it was first announced in 2011, but at that time I could only access the abstract (see here). Not surprisingly, parts of the thesis are now somewhat outdated. For instance, the author suggests that Icelandic and German mtDNA haplogroup C1 lineages might have a recent Amerindian origin. However, we now know that C1 was present in Europe during the Mesolithic (see here). Nevertheless, there's still plenty of interesting reading in this report, like the comments below about the mtDNA C subclades specific to prehistoric and modern Europe.

Interestingly, the HVSI motifs present in the three Kurgan individuals appear to represent unique branches within the haplogroup C network. D1.8, L8, and L15 all branch directly from the ancestral node defined by Ya34, although D1.8 occupies a separate terminal branch from L8 and L15 (Fig. 5). This “L branch” is defined by the mutation at position 16218. L15 is separated from Ya34 by this mutation alone, whereas L8 occupies a terminal node due to its additional HVSI mutations mentioned previously (Table 4; Fig. 5). We have labeled this branch “C4a6,” since it has not been previously observed in other mtDNA studies of modern and ancient humans.


The C5 subgroup (HVSI motif 16223-16288-16298-16327) has a distinct presence in Europe. In fact, it contains a haplogroup C lineage unique to Europe, which possesses a derived mtDNA sequence type with mutations at positions 16223, 16234, 16288, 16298, and 16327. It is geographically restricted to northern Poland (Malyarchuk et al., 2002; Grzybowski et al., 2007) and northeastern Germany (Poetsch et al., 2003; Poetsch et al., 2004). This derived subcluster extends the presence of haplogroup C in Europe from the Carpathian Basin north to the Baltic coast. One individual belonging to the same European-specific lineage (except with two additional mutations) was reported in a study of Romanian Aromuns (Bosch et al., 2005) suggesting this subcluster has a persistent presence within Europe. Other examples of haplogroup C5 in Europe include another individual from Poland lacking the 16234 mutation (Malyarchuk et al., 2002) and one individual from Northern Greece with the HVSI motif 16223-16261-16288-16298 (Irwin et al., 2008). An additional member of C is located in Greece (Bosch et al., 2005) but belongs to an entirely different lineage.

Newton, Jeremy R., Ancient Mitochondrial DNA From Pre-historic Southeastern Europe: The Presence of East Eurasian Haplogroups Provides Evidence of Interactions with South Siberians Across the Central Asian Steppe Belt (2011). Masters Theses. Paper 5.

Thursday, September 5, 2013

A multidimensional view of East Asia

Asia isn't the focus of my project, but I thought many readers would find these PCA plots interesting:

Basically, there are three main poles of genetic variation on these plots: Northeast Siberian (Koryaks and Chukchi), East Asian (Japanese and Korean) and Southeast Asian (Malayan).

Overall, the Northeast Siberians appear to be the most distinct group, and that's because they're more closely related to some Amerindians (like Greenlanders) than even other Siberians. Interestingly, the two Koreans cluster firmly with the Japanese across the first two PCs, but are clearly separated from them in PC 4.

It's also worth noting that the Han Chinese sample from Beijing (from the HapMap project) doesn't look particularly homogenous, with some individuals overlapping with the Japanese and others with the Vietnamese.

See also...

PCA of the world

Tuesday, September 3, 2013

Saturday, August 31, 2013

PCA of the world

Following on from my last blog entry, in which I posted a PCA of West Eurasia, below is a PCA of the world. The outcome obviously looks very different, and that's because here the positions of the samples are determined by genetic clines that dominate the globe, and these are different from those that dominate West Eurasia. To view a much larger and more detailed version of the image below, click on it. Individual IDs are shown in the PDF here.

See also...

A multidimensional view of Europe + West Asia

A multidimensional view of East Asia

Tuesday, August 20, 2013

Principal component analysis (PCA) of West Eurasia

In the past I've done MDS and SPA analyses of West Eurasia, but below is a PCA. Another version with individual IDs is available here.

The first eigenvector is a reflection of the genetic cline that runs from Northern Europe to the Middle East, with Finns being the most Northern European and Saudis and some Bedouin the most Middle Eastern. Mediterranean ancestry defines the second eigenvector, with Sardinians being the most Mediterranean, and the Mari of the Volga-Ural region the least.
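For readers wondering how an eigenvector like this falls out of genotype data, here's a minimal sketch in plain Python. It is not the pipeline used for the plot above: the genotypes are made-up toy values (0, 1 or 2 copies of an allele per SNP), the two "groups" are hypothetical, and the first eigenvector is found by simple power iteration rather than a dedicated PCA package.

```python
import random

def center_columns(rows):
    """Subtract each SNP's mean genotype so every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[g - m for g, m in zip(row, means)] for row in rows]

def first_eigenvector(rows, iterations=200, seed=0):
    """Leading eigenvector of the sample-by-sample matrix X X^T via power iteration."""
    x = center_columns(rows)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    rng = random.Random(seed)
    v = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Toy data (invented): rows are individuals, columns are SNPs.
genotypes = [
    [2, 2, 1, 0, 0, 0],  # hypothetical group A
    [2, 1, 2, 0, 0, 1],
    [2, 2, 2, 0, 1, 0],
    [0, 0, 0, 2, 2, 1],  # hypothetical group B
    [0, 1, 0, 2, 2, 2],
    [0, 0, 1, 1, 2, 2],
]

pc1 = first_eigenvector(genotypes)
for label, score in zip("AAABBB", pc1):
    print(label, round(score, 3))
```

In this toy case the two groups end up with opposite signs on the first eigenvector, which is the same idea as a cline showing up as the main axis of a real plot. The overall sign of an eigenvector is arbitrary, so only the relative positions matter.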

See also...

Cluster analysis of West Eurasia: 13 clusters from 18 dimensions

PCA of the world

A multidimensional view of Europe + West Asia

Thursday, August 8, 2013

Moorjani et al. on recent population mixture in India

Despite some claims to the contrary across the web today, there's really nothing new or controversial about this Moorjani et al. paper, considering all of the non-academic data available online on South Asian genome-wide and Y-chromosome genetic structure. In fact, I think the authors were way too cautious and diplomatic in their assessment of the post-Neolithic population history of the region.

It is also important to emphasize what our study has not shown. Although we have documented evidence for mixture in India between about 1,900 and 4,200 years BP, this does not imply migration from West Eurasia into India during this time. On the contrary, a recent study that searched for West Eurasian groups most closely related to the ANI ancestors of Indians failed to find any evidence for shared ancestry between the ANI and groups in West Eurasia within the past 12,500 years [3] (although it is possible that with further sampling and new methods such relatedness might be detected). An alternative possibility that is also consistent with our data is that the ANI and ASI were both living in or near South Asia for a substantial period prior to their mixture. Such a pattern has been documented elsewhere; for example, ancient DNA studies of northern Europeans have shown that Neolithic farmers originating in Western Asia migrated to Europe about 7,500 years BP but did not mix with local hunter gatherers until thousands of years later to form the present-day populations of northern Europe [15, 16, 44, 45].

Here's my non-diplomatic assessment of the data presented in the paper: South Asia has seen multiple waves of population movements from West and Central Asia since the Neolithic, including the Indo-Aryan invasion during the Bronze Age, which reshaped the genetic structure of the region in a remarkable way. Indeed, the Aryan invasion introduced into South Asia one of the most common Y-chromosome lineages there today: R1a-Z93 or R1a1a1b2*. Obviously, scientists working on the problem of the peopling of South Asia really need to become aware of this marker, and in particular its very close relationship to the Northern and Eastern European-specific R1a-Z283.


Priya Moorjani et al., Genetic Evidence for Recent Population Mixture in India, The American Journal of Human Genetics, 08 August 2013, doi:10.1016/j.ajhg.2013.07.006

See also...

Origins of R1a1a in or near Europe (aka. R1a1a out of India theory looks like a dud)

South Asian R1a in the 1000 Genomes Project

Southwest Eurasians + Northwest Eurasians + Mesolithic survivors = modern Europeans

Monday, June 3, 2013

Recent gene flow from Africa and the Near East into Europe

A new paper at PNAS by Botigué et al. takes a close look at African and Near Eastern admixture in Europe:

Human genetic diversity in southern Europe is higher than in other regions of the continent. This difference has been attributed to postglacial expansions, the demic diffusion of agriculture from the Near East, and gene flow from Africa. Using SNP data from 2,099 individuals in 43 populations, we show that estimates of recent shared ancestry between Europe and Africa are substantially increased when gene flow from North Africans, rather than Sub-Saharan Africans, is considered. The gradient of North African ancestry accounts for previous observations of low levels of sharing with Sub-Saharan Africa and is independent of recent gene flow from the Near East. The source of genetic diversity in southern Europe has important biomedical implications; we find that most disease risk alleles from genome-wide association studies follow expected patterns of divergence between Europe and North Africa, with the principal exception of multiple sclerosis.

The term "recent" is used throughout the paper to describe the IBD results, but as far as I can see there's no mention of any dates. Based on the data in the very thorough Ralph and Coop European IBD study (see here), I'd say that segments of over 1.5 cM represent gene flow from well within the past 5,000 years. If this assumption is correct, then the results certainly make a lot of sense. That's because there were well-documented historical events that could account for the main outcomes in the figure below: a) low level IBD sharing between Sub-Saharan Africa and much of Southern Europe; b) inflated IBD sharing between North Africa and Southwestern Europe; and c) inflated IBD sharing between Southeastern Europe and the Near East.

I probably don't need to discuss in detail what these events might have been. Suffice it to say that the Mediterranean Basin has seen several major empires which facilitated regular population movements between Southern Europe, North Africa and the Near East. This process included the slave trade, which was one of the main economic activities in the region for a couple of thousand years.

It's important to note, however, that fastIBD doesn't specify the direction of gene flow. In other words, shared IBD segments can be the result of our ancestors either receiving or giving admixture, or gene flow from a third party. But as Botigué et al. point out, the North African samples which show the highest IBD sharing with Iberians are also those with the lowest European ancestry proportions in the ADMIXTURE analysis (see below). Therefore, it's unlikely that this shared IBD is of European origin in any significant degree.

Key: Canis - Canary Islands; And - Andalusia; Gal - Galicia; Bas - Basques; Spa - Spain; Por - Portugal; Fra - France; Ita - Italy; Tsi - Tuscany; Gre - Greece; ItaJ - Italian Jews; AshJ - Ashkenazi Jews; Qat - Qatar; NMor - North Morocco; SMor - South Morocco; OccS - Saharawi; Alg - Algeria; Tun - Tunisia; Lib - Libya; Egy - Egypt; Yri - Yoruba from Nigeria; Mkk - Maasai from Kenya.

There's also a PCA in the supplementary PDF which further underlines that most of the IBD sharing between Europe and North Africa, as well as Qatar, is not of European origin, because it creates significant substructures within the European sample.

Unfortunately the Qataris are the only Near Eastern sample used in the study. Then again, if I were to pick a single ethnic group to represent the Near East in an IBD study like this, then Qataris would probably be near the top of the list. That's because they've been affected by population movements from other parts of the Arabian Peninsula and also Persia, but at the same time never experienced significant gene flow from Europe. More information about the genome-wide genetic ancestry of Qataris is available in this recent open-access paper by Omberg et al.

Botigué et al. also make some interesting comments about Jewish genetic ancestry in Europe. The quote below comes from the supplementary PDF.

Another possible hypothesis to explain the increased diversity in southern Europe is that an influx of Jewish ancestry had a heterogeneous effect on genetic diversity in Europe. However, in most European populations here, virtually no Jewish ancestry was detected. On average, 1% of Jewish ancestry is found in Tuscan HapMap population and Italian Swiss, as well as Greeks and Cypriots. This may reflect the higher sharing with Near Eastern populations in the Italian peninsula and southeastern Europe (Fig. 2C) or low levels of gene flow with the early Italian Jewish communities (6). Estimates from the IBD analysis are in agreement with ADMIXTURE estimates that the amount of sharing between these populations is extremely low (SI Appendix, Table S3). Specifically, results of IBD sharing between southwestern Europe and North Africa are two orders of magnitude greater than those found between the same region and Jews, the average WEA for southern Europe and North Africa is 203, while for southwestern Europe and European Jews is 1.3.


LR Botigué*, BM Henn*, S Gravel, BK Maples, CR Gignoux, E Corona, G Atzmon, E Burns, H Ostrer, C Flores, J Bertranpetit, D Comas, CD Bustamante, Gene flow from North Africa contributes to differential human genetic diversity in Southern Europe, PNAS, published online before print June 3, 2013, doi: 10.1073/pnas.1306223110

Saturday, May 18, 2013

Norse dwarves: Bronze Age metallurgists from the Mediterranean?

First of all, for the lack of a better summary of what these dwarves were all about, here are a couple of quotes from Wikipedia. I checked the original sources and they look legit, so this ought to be accurate:

Dvergar or Norse dwarves (Old Norse dvergar, sing. dvergr) are entities in Norse mythology associated with rocks, the earth, deathliness, luck, technology, craft, metal work, wisdom, and greed. They are sometimes identified with Svartálfar ('black elves'), and Dökkálfar ('dark elves'),[1] due to their apparently interchangeable use in early texts such as the Eddas.

While the word "Dvergar" is related etymologically to "dwarves", the early Norse concept of Dvergar is unlike the concept of "dwarves" in other cultures. For instance, Norse dwarves may originally have been envisaged as being of human size.


The Dvergar are often called 'black', especially as the 'black elves' (svartálfar). In Old Norse, this byname 'black' (svartr) refers to hair color or eye color.

The illustration above is of Reginn the Dvergr, again courtesy of Wikipedia. Now here's an abstract from a recent open access paper on maritime contacts between the East Mediterranean and Scandinavia during the Bronze Age. Note the references to rocks, technology, craft, metal work and trade (and thus greed, I suppose).

The Bronze Age of Scandinavia (1750-500 BC) is characterized by the sudden appearance of bronze objects in Scandinavia, the sudden mass appearance of amber in Mycenaean graves, and the beginning of bedrock carvings of huge ships. We take this to indicate that people from the east Mediterranean arrived to Sweden on big ships over the Atlantic, carrying bronze objects from the south, which they traded for amber occurring in SE Sweden in the Ravlunda-Vitemölla–Kivik area. Those visitors left strong cultural imprints as recorded by pictures and objects found in SE Sweden. This seems to indicate that the visits had grown to the establishment of a trading centre. The Bronze Age of Österlen (the SE part of Sweden) is also characterized by a strong Sun cult recorded by stone monuments built to record the annual motions of the Sun, and rock carvings that exhibit strict alignments to the annual motions of the Sun. Ales Stones, dated at about 800 BC, is a remarkable monument in the form of a 67 m long stone-ship. It records the four main solar turning points of the year, the 12 months of the year, each month covering 30 days, except for month 7 which had 35 days (making a full year of 365 days), and the time of the day at 16 points representing 1.5 hour. Ales Stones are built after the same basic geometry as Stonehenge in England.

Interesting stuff. The only thing I'd add is that these contacts between the Mediterranean and Scandinavia most likely stretched back to the Neolithic, when Megalithic cultures dominated Southern and Western Europe. Indeed, the remains from a TRB (Funnelbeaker) Culture burial in western Sweden were recently genotyped for autosomal DNA and they came out surprisingly Mediterranean (see here).

Nils-Axel Mörner, Bob G. Lind, The Bronze Age in SE Sweden Evidence of Long-Distance Travel and Advanced Sun Cult, Journal of Geography and Geology, Vol 5, No 1 (2013), DOI: 10.5539/jgg.v5n1p78

Wednesday, May 15, 2013

High mtDNA affinity between Bronze Age Minoans and Western Europeans

The first ever study on the ancient DNA of Minoans suggests that these enigmatic Bronze Age inhabitants of Crete were very similar in terms of mtDNA to present-day Cretans. Overall the Minoan sample shows the greatest affinity to the modern population of the Lasithi Plateau, in eastern Crete, where it originated. But here's the other really interesting part: as per the spatial maps below, the Minoan mtDNA sequences also show unexpectedly high affinity to those of modern English (a) and Bronze Age Sardinians and Iberians (b). See also Table 1 from the paper, where the top ten "nearest neighbors" to the Minoan sample are ancient and extant Western European populations.

So the results imply genetic links between Bronze Age Crete and Western Europe. Now, Martinez et al. 2007 found that 36.6% of Cretans from the Lasithi Plateau belonged to Y-chromosome haplogroup R1b. They only tested 41 individuals, but that's still an interesting result for Southeastern Europe, where R1b is generally uncommon. Indeed, perhaps the Minoans carried a much higher frequency of R1b, and they (or a related seafaring culture) spread this marker to Western Europe via maritime routes, where it has since become the most important Y-chromosome haplogroup? It's a valid question considering the ancient mtDNA data. The pics of Minoan bull leaping and Spanish bullfighting below are courtesy of Wikipedia (see here).
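Since the Lasithi figure rests on just 41 men, it helps to see how much room for error a sample of that size leaves. The sketch below computes a standard Wilson score interval for a proportion. It assumes that 36.6% of 41 corresponds to 15 R1b carriers (15 / 41 = 0.366), a count inferred here rather than quoted from Martinez et al.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Assumption: 15 of the 41 tested individuals carried R1b (15 / 41 = 36.6%).
r1b_carriers, sample_size = 15, 41
low, high = wilson_interval(r1b_carriers, sample_size)
print(f"R1b frequency: {r1b_carriers / sample_size:.1%}, "
      f"95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions the interval runs from roughly 24% to 52%, so the estimate is still well above what's typical for Southeastern Europe, but the exact frequency is far from pinned down.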

Update 16/05/2013: To add to my comments above about the Minoans, or a related group, being potentially responsible for the introduction of Y-DNA R1b to Western Europe, it's interesting to note that one of the Minoan mtDNA sequences belonged to the rare H13a1a haplogroup.

Both H13a1a and R1b were recently found in late Neolithic Bell Beaker remains from Germany (see here). Moreover, today H13a1a shows a peak in frequency and diversity in the Caucasus, particularly in Dagestan, but also occurs at low frequencies in Italy, Sardinia and Iberia. Interestingly, R1b is found at fairly high frequencies among some ethnic groups in and around Dagestan, like the Lezgins, and it's obviously also common in Italy and Iberia.

So what am I getting at? Well, it looks like a group with loads of R1b from what is now Dagestan or surrounds - perhaps the deep ancestors of Bell Beakers and Minoans - learned to sail, crossed the Mediterranean Sea from east to west, settled a few islands along the way, and eventually their descendants conquered much of Western and Central Europe. This is certainly not the most parsimonious theory of how R1b might have appeared on the scene in Western Europe during the late Neolithic, but it does make sense considering all the data.

But what might have caused this purported population movement from the Caucasus, and is it a coincidence that both R1a and R1b only appear among European ancient DNA from the late Neolithic onwards? It's unlikely that the Minoans and Bell Beakers were part of the Indo-European expansion, but perhaps their ancestors in the Caucasus felt the pressure of this expansion from the steppe to the north, which was at that time most likely dominated by Kurgan groups high in R1a?

Update 18/05/2013: Maju isn't convinced that the gradient maps and "nearest neighbor" analysis show explicit links between the Minoan and post-Neolithic Western European mtDNA gene pools. He calls it a "pseudo-affinity" which should be taken with a pinch of salt (see here). Moreover, he suggests the Minoan mtDNA shows closest links to early European Neolithic mtDNA because of four HVS-1 sequence matches.

But the high affinity between the Minoan and post-Neolithic Western European mtDNA can be seen clearly in two different analyses, so it's real, even if mostly indirect. Therefore, there's no need to take the results with a pinch of salt; they should just be viewed in their proper context. In other words, this affinity is certainly not due to a massive invasion of Western Europe by Minoan women, but the result of the same processes acting on the post-Neolithic Western European and Minoan mtDNA gene pools, which probably included some direct gene flow from the Eastern Mediterranean to Europe during the Bronze Age.


Hughey et al., A European population in Minoan Bronze Age Crete, Nature Communications 4, Article number: 1861, doi:10.1038/ncomms2871, Published 14 May 2013

Martinez et al., Paleolithic Y-haplogroup heritage predominates in a Cretan highland plateau, European Journal of Human Genetics (2007) 15, 485–493. doi:10.1038/sj.ejhg.5201769; published online 31 January 2007

South Asian R1a in the 1000 Genomes Project

After a recent update, the 1000 Genomes project now includes 62 individuals of South Asian origin belonging to Y-DNA haplogroup R1a-M17. Their full Y-chromosome sequences have been analyzed by Semargl and Maximus (aka. YFull project), with some interesting but not unexpected results:

- All individuals belong to R1a-Z93, which appears to totally dominate South Asian R1a-M17.

- A single Punjabi from Lahore, northeastern Pakistan, is ancestral for the Z94 mutation, which is just below Z93. All the other individuals are derived for Z94.

- Six individuals - of Punjabi, Bangladeshi and Gujarati origin - are ancestral for L657 and Z2124, the two main mutations immediately below Z94.

- All individuals of South Indian and Sri Lankan origin are derived for L657 or Z2124.

- Based on this sample, there appears to be no substructure along ethnic or geographic lines within South Asian R1a-M17 derived for L657 and Z2124.

Thus, it seems the SNP diversity of South Asian R1a-M17 is low, and decreases from Pakistan, North India and Bangladesh to South India and Sri Lanka. In comparison, there are only 12 European R1a individuals in the 1000 Genomes sample, and they represent all the major subclades of this haplogroup: R1a-Z283, R1a-Z93 and R1a-L664. Therefore, sampling bias can't be used as an argument for the more diverse result from Europe.

The lack of substructure along ethnic and geographic lines within South Asian R1a-L657 and R1a-Z2124 looks unusual, especially considering the caste system in India, and needs to be verified with more extensive sampling. However, if this outcome holds up, it'll suggest that paternal gene flow across South Asia has not been restricted by the caste system or geography. Then again, it could mean the caste system appeared after R1a-L657 and R1a-Z2124 arrived in South India via massive population movements from the north.

Below are all the results in as much detail as the current R1a SNP tree allows. Key: BEB - Bengali from Bangladesh; GIH - Gujarati from Houston, Texas; ITU - Indian Telugu from the UK; PJL - Punjabi from Lahore, Pakistan; STU - Sri Lankan Tamil from the UK.

Z93+ Z94-
PJL - 1

Z94+ L657- Z2124- Z96-
BEB - 2
PJL - 3
GIH - 1

L657+,Y2+ etc.
1) Y9 (inc. Y7)
GIH - 7
STU - 4
ITU - 4
PJL - 8
BEB - 2

2) Y4+, Y8+, Y28+ (inc. Y6+)
GIH - 6
ITU - 6
PJL - 2
STU - 6
BEB - 5

Z2125+ (Z2124+ Z2122- Z2123-)
PJL - 1

Z2123+ (Z2124+ Z2122-, Z2125-)
PJL - 2
STU - 3
BEB - 1
ITU - 6
GIH - 2
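The list above is easy to tally by machine. Here's a small Python sketch that keys the counts by subclade and population exactly as they're listed, then sums them per subclade and per population and checks the point about South Indian and Sri Lankan samples. The dictionary labels are shorthand chosen for this example, not official subclade names.

```python
from collections import Counter

# Counts transcribed from the list above; keys are informal labels.
counts = {
    "Z93+ Z94-": {"PJL": 1},
    "Z94+ L657- Z2124-": {"BEB": 2, "PJL": 3, "GIH": 1},
    "L657+ Y9": {"GIH": 7, "STU": 4, "ITU": 4, "PJL": 8, "BEB": 2},
    "L657+ Y4/Y8/Y28": {"GIH": 6, "ITU": 6, "PJL": 2, "STU": 6, "BEB": 5},
    "Z2125+": {"PJL": 1},
    "Z2123+": {"PJL": 2, "STU": 3, "BEB": 1, "ITU": 6, "GIH": 2},
}

per_clade = {clade: sum(pops.values()) for clade, pops in counts.items()}

per_population = Counter()
for pops in counts.values():
    per_population.update(pops)

# Groups that are ancestral for L657 and Z2124 (or for Z94 itself).
ancestral_groups = ["Z93+ Z94-", "Z94+ L657- Z2124-"]
southern = {"ITU", "STU"}
southern_in_ancestral = sum(
    counts[clade].get(pop, 0) for clade in ancestral_groups for pop in southern
)

total = sum(per_clade.values())
print(per_clade)
print(dict(per_population))
print("ITU/STU in ancestral groups:", southern_in_ancestral)
print("total listed:", total)
```

The tally agrees with the summary points above: six individuals fall into the Z94+ L657- Z2124- group, and no ITU or STU sample sits outside L657 or Z2124. Note that the counts as listed here sum to 72 rather than the 62 mentioned at the top of the post, so at least one figure may not have survived transcription intact.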

Friday, March 22, 2013

A revised timescale for human evolution + new revelations about Paleolithic European mtDNA

Doubts have been raised about one of Europe's most talked about ancient DNA results, the mtDNA haplogroup H sequence from the Paglicci Cave remains. A preprint at Current Biology suggests the Paleolithic sample was contaminated with modern DNA:

To further evaluate the authenticity of the ancient DNA we calculated the proportion of nucleotide misincorporations arising from DNA damage, a quantity that is known to increase over time after the death of an individual [12] and has been used as an indication of authenticity in previous work [10]. It was suggested that bone samples 100 years and older have a minimum of 20% C to T misincorporations concentrated at the 5′ end of the molecule [13]. Using this criterion, we excluded Paglicci Str. 4b from further analysis as the rate of C to T misincorporation at the 5′ end was only 8.8%, thus making an ancient origin for the DNA in this sample uncertain [14].
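The criterion in the quote boils down to a single threshold check, sketched below in Python. The 20% cutoff and the 8.8% Paglicci value come from the quote; the second sample and its rate are invented purely to show a passing case, and the function name is hypothetical rather than anything from the paper.

```python
# Minimum C-to-T misincorporation rate quoted above for bone 100 years and older.
MIN_C_TO_T_RATE = 0.20

def passes_damage_criterion(c_to_t_rate, threshold=MIN_C_TO_T_RATE):
    """True if the damage rate is high enough to be consistent with old DNA."""
    return c_to_t_rate >= threshold

samples = {
    "Paglicci Str. 4b": 0.088,              # rate reported in the quote
    "hypothetical damaged sample": 0.31,    # invented value, for illustration only
}

for name, rate in samples.items():
    verdict = "retained" if passes_damage_criterion(rate) else "excluded"
    print(f"{name}: {rate:.1%} C-to-T, {verdict}")
```

Failing the check doesn't prove contamination on its own; as the quote says, it makes an ancient origin for the DNA uncertain.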

Another sample thought to be from a Paleolithic European, dubbed "Cro-Magnon 1" and belonging to haplogroup T2b1, was also eliminated from the study after radiocarbon dating revealed it to be of medieval origin. What this means is that all Paleolithic European remains successfully tested to date belong to mtDNA haplogroup U, including six new samples featured in this paper.

It has been argued that hg U5 is the most ancient subhaplogroup of the U lineage, originating among the first early modern humans in Europe [18]. Our results support this hypothesis because we find that the two Dolni Vestonice individuals radiocarbon dated to 31.5 kya carry a type of mtDNA that is as yet uncharacterized, sits close to the root of hg U, and carries two mutations that are specific to hg U5. With our recalibrated molecular clock, we date the age of the U5 branch to approximately 30 kya, thus predating the LGM. Because the majority of late Paleolithic and Mesolithic mtDNAs analyzed to date fall on one of the branches of U5 (see also [15]), our data provide some support for maternal genetic continuity between the pre- and post-ice age European hunter-gatherers from the time of first settlement to the onset of the Neolithic.

Oh yeah, the paper also shows a revised timescale for human evolution based on most of the ancient mtDNA sequences listed above. However, I'd say this timescale will probably be revised a few more times as more aDNA samples become available.

Fu et al., A Revised Timescale for Human Evolution Based on Ancient Mitochondrial Genomes, Current Biology (2013),

Sunday, March 10, 2013

Genetic structure of European Russia

A paper at PLoS One reports on the genetic heterogeneity of Russian populations and the discovery of a "new pole of genetic diversity in Northern Europe":

Several studies examined the fine-scale structure of human genetic variation in Europe. However, the European sets analyzed represent mainly northern, western, central, and southern Europe. Here, we report an analysis of approximately 166,000 single nucleotide polymorphisms in populations from eastern (northeastern) Europe: four Russian populations from European Russia, and three populations from the northernmost Finno-Ugric ethnicities (Veps and two contrast groups of Komi people). These were compared with several reference European samples, including Finns, Estonians, Latvians, Poles, Czechs, Germans, and Italians. The results obtained demonstrated genetic heterogeneity of populations living in the region studied. Russians from the central part of European Russia (Tver, Murom, and Kursk) exhibited similarities with populations from central–eastern Europe, and were distant from Russian sample from the northern Russia (Mezen district, Archangelsk region). Komi samples, especially Izhemski Komi, were significantly different from all other populations studied. These can be considered as a second pole of genetic diversity in northern Europe (in addition to the pole, occupied by Finns), as they had a distinct ancestry component. Russians from Mezen and the Finnic-speaking Veps were positioned between the two poles, but differed from each other in the proportions of Komi and Finnic ancestries. In general, our data provides a more complete genetic map of Europe accounting for the diversity in its most eastern (northeastern) populations.

Figure 1. Geographic locations of the populations analyzed. Key: Komi_Izh – Izhemski Komi, Komi_Pr – Priluzski Komi, Rus_Tv – Russians from Tver, Rus_Ku – Russians from Kursk, Rus_Mu – Russians from Murom, Rus_Me – Russians from Mezen, Finns_He – Finns from Helsinki, Finns_Ku – Finns from Kuusamo, Rus_HGDP – Russians from the Human Genome Diversity Panel.

The distinct genetic character of the Komi and other North Russian populations isn't much of a surprise. These groups come from remote and sparsely populated regions near the Urals and the Arctic, and are thus affected by heavy genetic drift as a result of isolation and endogamy. They also carry higher levels of East Eurasian admixture than other Europeans due to contacts with populations of mostly Siberian origin from east of the Urals.

To explore the potential effect of population demographics on the population structures identified, ROH were compared across populations. ROH may indicate prolonged isolation and a reduced population size [29,35].


Regardless of the variations in the analysis, the highest nROH and cROH values were found in Izhemski Komi and in the Finnish sample from Kuusamo. Intermediate estimates were observed in Priluzski Komi, Veps, Finns from Helsinki, and Mezen Russians. Other populations had lower nROH and cROH values. An analysis of LD decay across genomes showed that Izhemski Komi and Finns from Kuusamo also exhibited elevated LD (Figure S6). Concomitantly, Priluzski Komi, Veps, Mezen Russians, and Finns from Helsinki exhibited only slightly elevated LD and were more comparable to the level observed in other European samples, including the remaining Russian samples.

These pronounced effects of genetic drift show up on the ADMIXTURE bar graph below, where the most drifted samples, like the Finns from the Kuusamo isolate, create their own clusters at the higher K (number of ancestral populations assumed). On the other hand, the inflated East Eurasian ancestry among the Komi and North Russians, as well as Veps and Baltic Finns, is most easily seen at K=2. It's represented by the green component which peaks in the Chinese sample.

Note also the extreme behavior of many of the samples on the PCA plot below and the bloated genetic distances between them and others in the Fst table. Again, that's mostly due to genetic drift.

Khrunin AV, Khokhrin DV, Filippova IN, Esko T, Nelis M, et al. (2013) A Genome-Wide Analysis of Populations from European Russia Reveals a New Pole of Genetic Diversity in Northern Europe. PLoS ONE 8(3): e58552. doi:10.1371/journal.pone.0058552

Saturday, February 16, 2013

Post-Mesolithic population replacements/extinctions in Northeastern Europe

The main theme of this paper by Der Sarkissian et al. is the changing character of the Northeastern European gene pool from the Mesolithic to the present. According to the authors, the genetic history of Northeastern Europe probably goes something like this...

- Northwestern Eurasia, all the way from Iberia to Central Siberia, was home to a relatively homogenous gene pool during the Mesolithic, with high frequencies of mtDNA haplogroups U4, U5 and U2e.

- Populations carrying high levels of East Eurasian mtDNA haplogroups (C, Z and D) migrated to Northeastern Europe during the early metal ages.

- Waves of migrants from Western and Central Europe caused large-scale population replacement/extinctions in Northeastern Europe possibly from the Neolithic onwards.

The paper is obviously open access, just like all PLoS articles, but below are some quotes and figures that caught my eye:

On the basis of modern genetic data, hg U was proposed to have originated in the Near East and spread throughout Eurasia during the initial peopling by anatomically modern humans in the early Upper Palaeolithic (around 45,000 yBP, [5]). It is then plausible that hg U constituted the major part of the Palaeolithic/Mesolithic mtDNA substratum from Southern, Central and North East Europe to Central Siberia. It can also be suggested that the Palaeolithic/Mesolithic mtDNA substratum has been preserved longer in NEE (Northeastern Europe) than in Central and southern parts of Europe, where new lineages arrived with incoming farmers during the Neolithisation from the Near East [16]. This is supported by ancient genomic data obtained from hunter-gatherers of Scandinavia [58] and Spain [57], that shows a genetic affinity between Mesolithic individuals and present-day northern Europeans and supports genetic discontinuity between Mesolithic and Neolithic populations of Europe.

The detection of haplogroup H in the Mesolithic site of aUz (one haplotype) is noteworthy. To date, haplogroup H has either been rare or absent in groups of hunter-gatherers previously described. It has not been found in hunter-gatherer mtDNA datasets of eastern Europe [12] and Scandinavia [13], but has been found in two hunter-gatherers of the Upper Palaeolithic sites of La Pasiega and La Chora in northern Spain [20]. The closest match to the ancient H haplotype in aUzPo belongs to subhaplogroup H2a2 [59], which is more common in eastern Europe [60] with highest frequencies in the Caucasus.


Interestingly, samples from aBOO, which are 4,000 years younger and located further North-West than aUzPo, were characterized by a large proportion and elevated diversity of mtDNA lineages showing a clear ‘Central/East Siberian’ origin (hgs C, D, and Z). Haplogroups C and D are the most common hgs in northern, central and eastern Asia. They are thought to have originated in eastern Asia and expanded through multiple migrations after the Late Glacial Maximum (~20,000 yBP [63]). Notably, haplotypic matches were observed between aBOO and modern-day central Siberian Buryats of the peri-Baikal region, which was proposed to be the origin of ancient migrations that disseminated hgs C and D [63]. Today, the sharp western boundary for the distribution of hgs C, D and Z lies in the VUB (Volga-Ural Basin), where they display intermediate frequencies: C (0.3–11.8%), Z (0.2–0.9%), and D (0.6–12%) [64].


The present-day Saami populations display clear haplotypic differences from all the ancient populations sampled for DNA so far (prehistoric hunter-gatherer populations of North/South/Central/East Europe, aUzPo and aBOO) where none of the hg V and U5b1b1a lineages distinctive of the Saami could be detected. We show here that the mitochondrial ancestors of the Saami could not be identified in the ancient NEE populations of aUzPo or aBOO, despite the latter site being within the area occupied by Saami today. The widespread modern-day distribution of U5b1 and V lineages makes it difficult to identify the origins of the Saami [32].


Saami mtDNA diversity has been influenced by a combination of founder event(s), (multiple) bottlenecks, and reproductive isolation, which are likely due to the challenging conditions of life in the subarctic taiga/tundra [32]. The complex demographic history of Saami renders their population history difficult to reconstruct on the basis of modern genetic data alone. Further temporal population samples will be required, especially along the proposed alternative western migration route into sub-arctic Europe.


The results of our coalescent simulation analyses show that the models that take account of genetic input(s) from CE (Central Europe) are better supported and could explain the genetic discontinuity observed between either aUzPo or aBOO and the modern population of NEE (Figure 5). The mtDNA lineages with a clear Central/Western European signature and currently prevalent in NEE might have reached the western Baltic and southern Scandinavia during the continuing influx of farming populations from Central or lastly southeastern Europe [13], [58], as from 6,000 yBP onwards [71–74]. However, intruding Neolithic farmers never reached Karelia and Fennoscandia [75], so the change in population would have to be a post-Neolithic process or to be due to migrations from other sources.

Der Sarkissian C, Balanovsky O, Brandt G, Khartanovich V, Buzhilova A, et al. (2013) Ancient DNA Reveals Prehistoric Gene-Flow from Siberia in the Complex Human Population History of North East Europe. PLoS Genet 9(2): e1003296. doi:10.1371/journal.pgen.1003296

See also...

New subclade of mtDNA haplogroup C1 from Mesolithic Northeastern Europe

Friday, January 11, 2013

Lots of ancient Y-DNA from China

Jilin University recently published an ancient DNA study on Chinese Y-chromosomes. It features 119 samples from 13 archaeological sites in northern China. That's quite impressive considering how difficult it is to recover Y-chromosome DNA from ancient remains. A paper from 2010 on the Tarim Basin, or Xiaohe, mummies listed seven R1a1a results, while here we have 11, plus a K*. So it looks like at least some of these are from newly tested Tarim Basin samples.

Niuheliang, Hongshan Culture, 5000 YBP, 4 N, 1 C*, 1 O
Halahaigou, Hongshan-Xiaoheyan Culture, 4500 YBP, all N
Dadianzi, Lower Xiajiadian Culture, 3600 YBP, 3 N, 2 O3
Dashanqian, Upper Xiajiadian Culture, 3000 YBP, 1 C, 3 N1c, 1 N, 2 O3-M117, 2 O3-M324
Jinggouzi, 2500 YBP, all C

Xiaohe, Xinjiang, 3500-4000 YBP, 11 R1a1a, 1 K*
Tianshan Beilu, Hami, Xinjiang, 3300-4000 YBP, 5 N, 1 C
Heigouliang, Xinjiang, 2000 YBP, 6 Q1a*, 4 Q1b, 2 Q
Pengyang, Ningxia, 2500 YBP, all Q1a1-M120
Taojiazhai, Qinghai, 1500 YBP, all O3-M324

Miaozigou, Central-South Inner Mongolia, Yangshao Culture, 5500 YBP, all N
Sanguan site, Yu County, Hebei, Lower Xiajiadian Culture, 3400-3800 YBP, all O3
Hengbei site, Jiang County, Shanxi, 2800-3000 YBP, 9 Q1a1, 2 O2a-M95, 1 N, 4 O3a2-P201, 2 O3, 4 O*

See also...

European admixture in ancient East Asians (two-rooted canines carried by early Indo-Europeans to China)