1000 Genomes Project publishes first results

By Tim Dean
Thursday, 28 October, 2010

What makes you you, and not somebody else? A groundbreaking international study seeks to uncover the genetic basis of individual difference, as well as find genetic links to disease, by sequencing and comparing the genomes of thousands of individuals from around the world.

The first pilot phase of the 1000 Genomes Project has now ended, with the results published this week in the journal Nature (doi:10.1038/nature09534).

The goal of the 1000 Genomes Project is to find most genetic variants that have frequencies of at least 1 per cent in the populations studied.

The pilot phase consisted of three projects. The first sequenced, at low coverage, the whole genomes of 179 individuals from populations in West Africa, Europe, China and Japan.
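To give a rough sense of why a sample of this size suits the 1 per cent target, the sketch below estimates the chance that a variant at a given frequency appears at least once among the chromosomes sampled. It is an illustration only, not a calculation from the study. The function name is hypothetical, and it assumes the 179 individuals are independent draws from a single population, each carrying two copies of the genome. It also ignores the separate question of whether a variant that is present would actually be detected at low coverage.

```python
def prob_variant_sampled(allele_freq: float, n_individuals: int, ploidy: int = 2) -> float:
    """Chance that at least one sampled chromosome carries the variant.

    Assumes independent draws from one population (a simplification;
    the pilot sampled several populations).
    """
    n_chromosomes = ploidy * n_individuals
    # Probability that every sampled chromosome lacks the variant
    prob_absent = (1.0 - allele_freq) ** n_chromosomes
    return 1.0 - prob_absent


# A variant at 1 per cent frequency, 179 diploid individuals (358 chromosomes)
p_common = prob_variant_sampled(0.01, 179)
# A rarer variant at 0.1 per cent frequency, same sample
p_rare = prob_variant_sampled(0.001, 179)

print(f"1% variant sampled at least once:   {p_common:.3f}")
print(f"0.1% variant sampled at least once: {p_rare:.3f}")
```

Under these assumptions a 1 per cent variant is very likely to be present somewhere in the sample, while a variant ten times rarer is missed more often than not.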

The second project collected high-coverage sequences of two families, each consisting of a mother, father and child.

The third project sequenced only the protein-coding regions of the genomes of 697 individuals.

The three projects represent three different approaches to genome sequencing, all of which have their pros and cons.

Writing in a Nature News and Views article, Rasmus Nielsen, from the Departments of Integrative Biology and of Statistics at the University of California, Berkeley, says whole-genome sequencing can potentially yield tremendous amounts of valuable information, but the process is time- and resource-intensive.

A compromise is to sequence the whole genome at low coverage and compare it with a reference genome, using an inference technique called imputation to pick out any rare variants that emerge.

However, this technique is not as reliable as high-coverage sequencing.

The other approach, sequencing only the protein-coding regions of the genome, also dramatically reduces the sequencing load, and most of the 'interesting' variations are expected to occur in protein-coding regions anyway.

Genomic insights

The pilot results reveal around 15 million single nucleotide polymorphisms (SNPs), many of which were already known, though many others were previously unknown.

The results also include around one million short insertions or deletions and 20,000 structural variants, most of which were previously undescribed.

On average, each person carries around 250 to 300 loss-of-function variants, as well as 50 to 100 variants that are associated with disease.

The study has so far proven fairly comprehensive, with over 95 per cent of known variants, such as SNPs, included in this data set.

Another article appearing in Nature suggests that at least 2700 human genomes will have been sequenced by the end of October, with that number ballooning to over 30,000 by the end of 2011.
