The Genome India Project (GIP) is the first stage of India's ambitious programme to map the genetic diversity of its people. This stage is complete, and its data is ready for use. The project has sequenced the whole genomes of about 10,000 healthy, unrelated Indians from 83 population groups. The preliminary findings were published in the journal Nature Genetics on April 8, 2025. After two populations were excluded, the published findings are based on the genetic information of 9,772 individuals: 4,696 male and 5,076 female participants.
The project was initiated in January 2020, with financial assistance from the Department of Biotechnology, to study 10,000 human genomes. It is a collaborative effort of 20 institutions. Genome sequencing was carried out by the Centre for Brain Research at IISc Bengaluru under the leadership of its founder director, Prof Vijayalakshmi Ravindranath. The other prominent collaborators are the Centre for Cellular and Molecular Biology in Hyderabad, the Institute of Genomics & Integrative Biology in Delhi, the National Institute of Biomedical Genomics in Kolkata, and the Gujarat Biotechnology Research Centre in Gandhinagar.1 Blood samples and associated phenotype data, such as weight, height, hip circumference, waist circumference and blood pressure, were collected from 20,000 individuals representing 83 population groups (30 tribal and 53 non-tribal) spread across India. Of these, DNA samples from 10,074 individuals were subjected to whole-genome sequencing; as noted earlier, two populations were later excluded.
Sample collection followed a carefully planned strategy. A median of 159 samples from each non-tribal group and 75 samples from each tribal group were collected from the 83 population groups, which were spread over more than 100 distinct geographical locations. These sample sizes were chosen to allow estimation of the relatively rare mutations that are important for understanding complex diseases. To ensure accurate estimation of mutation frequencies across groups, the samples were gathered from unrelated individuals. In addition, three to six parent-child pairs were included in each population group to uncover de novo mutations, which arise randomly in a child and are not seen in the parents. Further, genomes were sequenced from tribal groups across India, namely the Tibeto-Burman tribe, Indo-European tribe, Dravidian tribe and Austro-Asiatic tribe, and from a continentally admixed outgroup. Genomes of three non-tribal groups, the Tibeto-Burman non-tribe, Indo-European non-tribe and Dravidian non-tribe, were also sequenced. Because language is an established proxy for genetic diversity in the Indian population, sampling was also designed to approximately represent the four major language families: Indo-European, Dravidian, Austro-Asiatic and Tibeto-Burman. However, the four ancient populations living in the Andamans, dating back about 65,000 years, and two relatively modern populations from about 55,000 years ago were not included.2
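A rough calculation shows why the number of samples per group matters for rare mutations. The sketch below is an illustration only, not the project's own method. It assumes each of the n unrelated individuals contributes two independently sampled chromosome copies, and asks how likely a variant with population frequency f is to appear at least once. The function name `detection_probability` and the 1% example frequency are assumptions made for this example.

```python
def detection_probability(allele_freq: float, n_individuals: int) -> float:
    """Probability of seeing a variant at least once among 2n sampled chromosomes.

    Assumes unrelated, diploid individuals, so the 2n chromosome copies
    are treated as independent draws (an idealisation).
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    n_chromosomes = 2 * n_individuals
    # Chance that every sampled copy lacks the variant, then take the complement.
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


if __name__ == "__main__":
    # Median sample sizes quoted in the article; the 1% frequency is hypothetical.
    for label, n in [("non-tribal group", 159), ("tribal group", 75)]:
        p = detection_probability(0.01, n)
        print(f"{label} (n={n}): {p:.2f}")
```

Under these assumptions, a larger group gives a noticeably better chance of catching a variant carried by about 1 in 100 chromosomes. Relatives share chromosome copies, which would break the independence assumption; this is one reason frequency estimates rely on unrelated individuals.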
The identification of 130 million genetic variations is likely to spur studies that aim to determine the possible roles of population-specific genetic mutations in various diseases. Understanding these variations can support the development of precision medicine, with treatments and interventions tailored to Indian genetic profiles. Data on disease-associated variants will enable the development of affordable genomics-based diagnostic tools, facilitating early detection, prevention and management of diseases in India.4
The main benefit of the genome database will be in the area of health. ‘The full genome of an individual that is prepared after obtaining their blood sample means getting the exact order in which four nucleotide molecules in the human DNA are arranged in an approximately three-billion-long sequence. These four nucleotide molecules - adenine, thymine, cytosine, and guanine, or simply A, T, C and G - along with a phosphate molecule and a sugar molecule, form the long double-helix DNA strands that are essentially the genetic blueprint of the individual.’5 In this regard, it is noteworthy that people within a closed and isolated population group are likely to have fewer variations in their nucleotide sequences. By contrast, a heterogeneous population will show greater genetic diversity. Most genetic variations do not result in any noticeable difference in the individual. Only a small fraction, 1-2%, are critical, because their placement in the sequence affects appearance, traits or health.6
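The idea of a genome as an ordered string of A, T, C and G, with individuals differing at a small number of positions, can be shown with a toy comparison. The sketch below is a simplification under stated assumptions: the two sequences are hypothetical, already aligned and of equal length, and only single-letter substitutions are considered. Real variant identification works on sequencing reads aligned to a reference genome and is far more involved. The function name `find_substitutions` is made up for this example.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 0-based. Both sequences must be aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    valid = set("ATCG")
    if not (set(reference) <= valid and set(sample) <= valid):
        raise ValueError("sequences may contain only A, T, C and G")
    return [
        (i, ref_base, smp_base)
        for i, (ref_base, smp_base) in enumerate(zip(reference, sample))
        if ref_base != smp_base
    ]


if __name__ == "__main__":
    # Hypothetical 12-letter fragments; a real genome has about three billion letters.
    reference = "ATCGGATTACAG"
    sample = "ATCGTATTACGG"
    for pos, ref_base, smp_base in find_substitutions(reference, sample):
        print(f"position {pos}: {ref_base} -> {smp_base}")
```

In this toy case the two fragments differ at two positions out of twelve. In a real comparison most such differences would have no noticeable effect, in line with the point above that only a small fraction of variations matter for appearance, traits or health.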
Through this project, scientists collect and store germline sequences, that is, the nucleotide sequence a person was born with. A person's genetic sequence changes over time, as every cycle of cell division introduces a few more variations, called mutations. The unique part of the germline sequence could offer, among other things, clues about an individual's predisposition towards certain diseases. It can indicate not only why a particular person might have developed a certain disorder, but also why some lines of treatment might not be very effective in their case. This could pave the way for personalized medicine, in which a patient is given not a general treatment but the solution best suited to them. Sometimes a population group as a whole might be predisposed to certain diseases because its members share the same pattern in the consequential part of the sequence. This may help explain the widespread prevalence of diabetes in the Indian population, which is likely linked to the population's genetic makeup. Such information may prove useful in developing population-specific drugs.
By comparing the genomes of a large number of people, across several generations and from different ethnic, geographical and linguistic groups, scientists can also gather evidence of how populations moved from one place to another, socialized and intermingled. Such information improves the scientific understanding of history and historical events, particularly in a large and diverse country like India. It will help greatly in resolving difficult questions about the migration of people, or, so to say, who we are and where we came from.7
The ongoing Genome India Project of the Government of India is indeed an ambitious undertaking. It is expected to unravel several mysteries concerning human health, personalized medicine, diseases that affect whole population groups and, most importantly, the migration of people into and out of the country. The perplexing question of Aryan migration is expected to be resolved in the near future, once the project is completed. Similarly, many answers may be found regarding individual and group health. Personalized medicines can then be developed, which will go a long way towards eradicating diseases.
References: