Friday, September 25, 2009

Genetic Ancestry of Indians. A New Paper Is Creating a Ruckus

A new paper in Nature about the genetic history of Indian populations is making waves in press releases. A reader asked me if I could explain the findings in plain English. Since I am not an expert in population genetics, I will instead point to two posts that have dissected the paper. John Hawks writes about the topic here. Gene Expression details it here.

The salient points as I understand them:

1) This is not the pioneering work that some of the Indian press reports suggest it is. The broad findings of this study have antecedents and are not that surprising or shocking. The study does, though, expand on earlier work by using a larger number of genetic data points, and so is significant in its scope.

2) Modern Indian populations are derived from two ancient populations referred to as the Ancestral North Indians (ANI) and the Ancestral South Indians (ASI). Ancestral North Indians were Caucasoids, genetically closer to Eurasians, Europeans and central Asians, while the Ancestral South Indians were distinct from Caucasoids and Mongoloids. The Onge of the Andamans are a good model for the original Ancestral South Indians. Modern Indians are admixtures of these two populations (the Onge are not ancestral but an early branch of the ASI).

3) The findings indicate that there is a larger amount of genetic variation between Indian groups than there is between, say, European groups. This, the authors suggest, is a result of a small number of individuals founding different ethnic groups that then remained endogamous and therefore genetically divergent. This has important medical value, as recessive diseases may correlate with ethnic groups.

4) Despite these inter-group differences, on average Indians from various groups and across castes are more closely related to each other than they are to outgroups like Europeans or East Asians. This points to the deep residence time of people in the subcontinent and continued gene flow across groups. This has led to reporting in the press that there is no north-south divide and that the Aryan-Dravidian divide is a myth. Again, this finding is not new. There is earlier work that suggests similar genetic relationships among Indians.

5) The study also clearly shows that for some genetic lineages there is a gradient in relationship to ANI (west Eurasian) that is a function of geography and caste. For some genes, north Indians (Indo-European speakers) and upper castes are more closely related to ANI than are south Indians and lower castes. Here is a table that summarizes this result. The first few samples in the table are from south India (Dravidian and Tribal) and the lower portion of the table represents north Indians (Indo-European speaking people).

6) The paper says little about when this admixture with west Eurasian genes occurred, but hints it may coincide with the arrival of Indo-European speakers, which has generally been timed to after the collapse of the Harappan city states around 1800 - 1600 B.C. This is, of course, a controversial topic. The amount of admixture with ANI is high in some samples. This may be taken by some as a validation of the Aryan Invasion scenario, in which there was a massive migration and population replacement of indigenous people in northern India by Indo-European tribes. I don't see it that way. The northwestern region has always been a conduit into India. There would have been movements of people from Central and West Asia into this region related to the spread of agriculture (6000 - 8000 B.C.?). City states like the Harappan complex had extensive trading ties (2600 - 2000 B.C.) with the Bactria Margiana Complex in the Turkmenistan - northern Afghanistan - Tajikistan - Uzbekistan area and with the Elamite civilizations in western Iran. The people involved in trading with these city states included those from the Pontic-Caspian Eurasian steppes.

So it is unlikely that any one historical event shaped these genetic relationships. Migrations and population movements of Caucasoid people into India were taking place well before the advent of the Aryans, although it does seem that the arrival of Indo-European speakers left a recognizable genetic imprint on older Indian populations.

7) The Indian press has made a hash of the findings. For example, they have reported only those parts of the study that deal with the kinship among Indians, and have stressed that castes and tribes cannot be differentiated or that there is no divide between the Aryans (roughly north Indians) and Dravidians (south Indians). That is all true for average relatedness. But the study also clearly points out that there are genetic differences between north and south Indians and between upper and lower castes in terms of the degree of relatedness to Eurasians. North Indians and upper castes are more closely related to Eurasians. North Indian upper castes have even more Eurasian ancestry. This part was ignored by the press.

But I can't blame the press entirely. The scientists who gave interviews to the press didn't mention this. They wimped out on reporting this potentially inflammatory and politically incorrect finding. This is just poor and irresponsible science outreach on the part of the scientists. How can you ignore a finding that is staring out at you from the very paper you are talking about? The press may be guilty of not digging in, but it was just reporting what the scientists told it.


  1. I think the press is being self-contradictory.
    At one point they read the Paper to see that the Indian population started from two different groups,
    and at the same time they say,
    the Paper debunks AIT

    I think there is a strong tendency in India to debunk the AIT (which I don't support, in favour of AMT, which I trust for lack of a logical alternative)

  2. The paper leaves the origin of the Ancestral North Indians unanswered. It does hint that ANI and Europeans share a common ancestor, which might reasonably be the Proto-Indo-Europeans. But the Indian scientists who were part of the research team seem to be distancing themselves from this scenario, at least in public.

    They suggest in press releases that ANI is a very ancient Indian population. This, if extended, could be taken to mean that Proto-Indo-European languages arose in India, something which is not supported by linguistic and archaeological data or by genetic reasoning derived from this paper.

  3. In the era of being Politically Correct I think the Indian Scientists are doing their Job. But thanks for the detailed explanation!!

  4. Nice explanation and summary. Unfortunately, a lot of the press and pundits in India are intimidated by Hindutva propagandists, just like a lot of Republicans in the US are intimidated into waffling on evolution because they don't want to offend a relatively small but highly vocal creationist lobby. Still, Indian science is generally sound; politics will continue to be a different story.

  5. As long as politics doesn't intrude on or influence science. Unfortunately we see this occasionally, as with the Ram Sethu controversy and with the Harappa / Vedic Saraswati debate.

  6. Dr. Kher,

    Thanks for de-crypting the article! I'm the reader who asked for this!

    You are right. This whole topic is very touchy and gets mixed with politics all the time.

    Here is another blog which has a lot of politically charged topics other than this topic:

    Just sharing. Not recommending :)

    Thanks again for the post.

  7. Dr. Kher,

    Here are some more links with some more opinions. Just sharing:

    A post touted as "brilliant" on Ram Sethu:
    And about Sarasvati river:

    For some reason I find that some humans cannot be dispassionate while searching for facts. It always gets mixed with emotions.

    Anyways, thanks for your blog. I enjoy reading your posts!

  8. yesarkay - thanks for the links. Yeah, there are lots of these "scholarly" articles floating around :)

  9. In Article 2 you use the word 'derived from'. This is a misinterpretation since the original paper only says 'ancestral to' which leaves open the possibility of other ancestral groups of less genetic significance perhaps. I hope you will see that 'derived from' suggests a more restrictive hypothesis than was intended, and the danger it entails. Where the original paper treads carefully using Geographic terms to denote people from different parts of the world, you have introduced the culturally loaded terms 'Caucasoid' and 'Mongoloid'. This is unforgivable coming from a geographer/scientist. Perhaps you do want to appeal to readers who seek confirmation of their pet racial theories. I see this happening already on other blogs on this subject.

    The first sentence in Article 3 is another patent misinterpretation because you have failed to understand the meaning of 'allele frequency differences' and its significance in the concluding section of the abstract of the original paper. I'll grant that the original might have been a mite clearer had it read 'differences in allele frequencies'. The rest of Article 3 is OK though.

    In Article 4 you obfuscate rather than clarify by using words like 'deep residence time' and 'gene flow'. I think you want to suggest differences in migration rates and patterns, but you seem reluctant to say so directly.

    In Article 5, the table you reproduce adduces nothing, lacking an explanation of the statistic. Without an explanation the numbers are meaningless, especially considering the outliers. Considering that you say that the original paper 'clearly shows a gradient', how can you imply the opposite in your summation by claiming that its authors 'wimped out'? An article in Nature is not anywhere near intended for 'outreach', by which term I assume you imply 'to the layman'. Outreach of this kind is left to bloggers like you, and I'm not sure you've done it commendably and neutrally. The two links you give are hopeless in the sense that they have unraveled the mystery of the original one notch, to the level of a graduate in genetic science.

    Articles 6 and 7 are entirely your commentary and should not have been numbered to suggest that they are salient points of the original paper.

    Apparently, my concept of 'plain English' is different from yours.

  10. Sir,

    Unfortunately, the main problem with the Indian scientists involved in this study is their ignorance of the main conclusion! It is already clear that they have not written a single word in the paper and have just interpreted the result in their own way!

    How ironic!

  11. I know this topic is closed, but for the identity of ANI there is another interpretation that was not thoroughly discussed: ANI is a composite, and it (possibly) identifies not only Indo-European immigrants; the older ANI component is incoming Dravidians (from the Middle East direction). ASI does not equate with Dravidian; it is just that the places where the Dravidian languages survived (in the south) had higher ASI ancestry. One of the northernmost Dravidian populations, the Brahui, has very low levels of ASI comparatively. ANI = later Indo-European speakers and earlier Dravidian speakers, showing up mostly in the paternal ancestries; ASI = populations related to the Andaman islanders, showing up mostly in the maternal ancestries.

  12. Anon - yes, some of the recent genetic analyses of Indian populations hint at the scenario you have outlined.

  13. And what about the possibility of 'Out of India' towards the North and West? If the period is such a long one, apparently 50,000 yrs of movement, then how come only 'coming in' is relevant and 'going out' is deemed totally irrelevant? Even in such a short period as a millennium the Roma/Gypsy have grown to more than a million in number, which is strictly 'Out of India'. Linguistically, which are the oldest surviving Indo-Germanic languages, and where have the earlier ones survived? I am a layman, so my queries come out of curiosity and common sense, not from any established dogma or expert commentary. Kindly clarify.

  14. Rahul - you are right, India has a very ancient population history going back 50,000 years, and so people would have moved out both earlier, in Pleistocene times, and in historical times (Roma), as well as in (Greeks, Mongols, Mughals). The time period of interest in regards to this controversy, though, is the last few thousand years, coincident with and post Harappa. From what I have read, this paper and other new ones find evidence of northeast eurasian genes mixing with older Indian populations in multiple migration episodes interpreted to be in that time frame, suggesting movement of people(s) from Eurasia into India before (proto Dravidians?) and post Harappa (Indo-Aryans).

    I am not a linguist and so am not sure which are the older surviving Indo-Germanic (Indo-European) languages. Anatolian, Celtic and Greek are supposed to be early branches of IE. Tocharian was another early branch, taken into what is now western China by Indo-Europeans. That language is now extinct.

  15. This report is not very clear, except that he says that North Indian populations (higher caste) are related to Eurasians. In this connection, one must know the details. The main haplogroup that is common to the Eurasians and North Indian upper caste groups is the following chain of R1a, R1a1, R1a1a. Of these, R1a and R1a1 have their earliest records in India. R1a appears in the Saharia tribes of MP and Rajasthan between 30,000 to 40,000 years before the present. The next earliest appearance is in Kashmiri Brahmans. R1a1 first appears in Kashmiri Brahmans, other Brahmans, the Saharia tribes and the Paswans of Bihar. Among Kashmiris it appears about 22,000 to 30,000 years ago. R1a1 appears in Eastern Europeans much later about 120000 years ago. Next, R1a1a appears in Eastern Europe and seems to have moved in to North about 7000 to 8000 years ago. I would have concluded that some of the Northern tribes moved out to settle in Eurasia early on. And later some of these have come back. That would be my conclusion from the limited information I have.

  16. From genetic evidence there seem to be two movements. The first movement was out of India, carrying R1a and R1a1 from India to Eastern Europe. The second was the movement of R1a1a coming back in to India about 10,000 years ago. It seems to fit with the textual picture that the Rigveda was written almost completely near the Saraswati. It is very late in the Rigveda that the Sindhu comes in to the picture. And in fact in later Mandalas three sapta Sindhus are mentioned. All the way from Ganga, Yamuna, Saraswati to Vitasta (current Jhelum), then rivers in northern Asia, and last coming back to Afghanistan rivers and Sindhu.