Feature: Dark matter of the genome
This feature appeared in the July/August 2011 issue of Australian Life Scientist.
It’s been called the ‘dark matter’ of the genome: we know something is there, but we just can’t see it yet. It’s also known as the ‘missing heritability’ problem, because one look at a set of monozygotic (identical) twins shows that our genomes exert a tremendous influence on even complex traits, such as height, but when it comes to fossicking through the genome to find out precisely what genetic variants influence these traits, we often come up short.
The conundrum stems from the fact that twin studies have found height is about as heritable as any trait gets, with a heritability of around 0.8, meaning that around 80 per cent of the variability in height within a population can be accounted for by variability in genes. This means that something is being passed from parents to offspring that is influencing their height. But what?
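As a rough illustration of how a twin study can arrive at a figure like 0.8, the sketch below uses Falconer's classic approximation, which doubles the difference between the trait correlation in identical (MZ) twin pairs and in non-identical (DZ) twin pairs. The article does not say which estimator the twin studies used, and the correlations passed in here are hypothetical values chosen only to reproduce a heritability near 0.8; the function name is also an assumption.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: trait correlation between identical (monozygotic) twin pairs
    r_dz: trait correlation between non-identical (dizygotic) twin pairs
    """
    for name, r in (("r_mz", r_mz), ("r_dz", r_dz)):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{name} must be a correlation in [-1, 1]")
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin-pair correlations for height (illustrative, not from the article).
h2 = falconer_heritability(r_mz=0.90, r_dz=0.50)
print(f"Estimated heritability: {h2:.2f}")
```

The logic is that identical twins share essentially all of their genetic variants and non-identical twins share about half on average, so the extra similarity of identical twins reflects roughly half of the genetic contribution.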
And it’s not just continuous traits like height or cholesterol level that show clear signs of heritability: many diseases also appear to be heritable to some degree. Uncovering the details of the genetic factors that lead to susceptibility to certain diseases would prove a great boon, not only in terms of more sophisticated diagnosis but also in telling us something about what causes the disease and how it might be treated.
“The fact is most of us are going to die from complex diseases,” says Professor Nicholas Martin, from the Queensland Institute for Medical Research (QIMR). “And most of these diseases have genetic components.”
Martin has been one of the world’s leading figures in heritability studies, particularly with monozygotic and dizygotic (non-identical) twins at the Genetic Epidemiology Laboratory at QIMR. In recent years he’s turned his attention to the latest technology for exploring heritability, genome-wide association studies (GWAS), in an attempt to delve deeper into the genetic underpinnings of complex traits and diseases.
He gave the Sutherland Lecture at the Genetics in the Sun conference in early August, and spoke about the trials and tribulations of GWAS. ALS caught up with him just prior to the conference.
“People have been working on the genetic components of complex diseases for a couple of generations and not getting terribly far,” says Martin. “But there has been a tremendous explosion of knowledge since the advent of genome-wide association studies in 2005, and these have produced stunning new insights into the aetiology of many diseases.”
The first success for GWAS was with age-related macular degeneration (AMD), and subsequent, larger studies have revealed that just a handful of genes can account for up to 50 per cent of the heritability of this disease. Furthermore, the genes that appear to be related offer an insight into the cause of the disease and suggest strategies for how to cure it.
This initial success appeared to reinforce the ‘common disease, common variant’ hypothesis which suggested that common diseases were attributable in part to alleles present in more than one to five per cent of the population. However, the large effect sizes found for genes influencing risk of AMD have not been found for many other diseases.
Heritability hide and seek
Studies in the late 2000s looked at complex traits such as height and, instead of finding a few common variants with a large effect size, they found many common variants that each had a small effect size, and not nearly enough of them to explain the bulk of the heritability.
Up until 2009, 40 variants had been discovered that influenced height, but they only accounted for five per cent of heritability. By 2010 this had risen to 180 variants explaining about 12 per cent of heritability.
This, and several other examples with serum lipids, body mass index, and waist-hip ratio suggested that the larger the sample size, the more variants would be found affecting the trait. The failure of GWASes to maintain a steady stream of common variants of large effect underpinning common traits and diseases led to something of a crisis of confidence in the technology.
But not everyone gave up hope. Martin and his colleagues suspected that it wasn’t a problem with the technology, as such, but with the expectation that effect sizes would be large. The reason that common variants with large effect size weren’t emerging from the GWAS studies was simply because they didn’t exist.
Instead, it took many variants each with a very small effect to account for the heritability of complex traits. Martin’s colleagues at QIMR then sought to find evidence in support of this alternate notion in their own studies on height.
“People got hung up on this idea of missing heritability,” says Martin. “But our group has shown this to be a furphy. It’s not missing, it’s just our sample sizes were not yet big enough to detect it.”
Peter Visscher, Jian Yang and their colleagues’ height study, which appeared in Nature Genetics in 2010, employed almost 4000 subjects and showed that the ~300,000 single nucleotide polymorphisms (SNPs) on a GWAS chip together are associated with around 45 per cent of variance in height. However, most of these SNPs have such a small effect size that they cannot be identified individually; only their joint effects can be detected.
Typically GWAS studies have taken each SNP individually and tested it for association with the trait in question. However, if that SNP accounts for only a very small percentage of the variability, it becomes difficult to discriminate it from noise in the numbers. Likewise, if a variant of large effect is rare it will not be captured by most GWAS studies with typical sample sizes of a hundred or so individuals.
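To see why a SNP of small effect is so hard to separate from noise, here is a minimal back-of-the-envelope sketch. It rests on two assumptions that are not in the article: the conventional genome-wide significance level of p < 5e-8, and the standard approximation that a single-SNP test statistic grows in proportion to n * q / (1 - q), where n is the sample size and q is the fraction of trait variance the SNP explains. The function names and the example values of q are hypothetical.

```python
import math
from statistics import NormalDist

# Assumed conventional genome-wide significance level (not stated in the article).
GENOME_WIDE_ALPHA = 5e-8


def significance_threshold(alpha: float = GENOME_WIDE_ALPHA) -> float:
    """Chi-square (1 degree of freedom) value a single-SNP test must exceed."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided z cut-off
    return z * z


def samples_needed(variance_explained: float,
                   alpha: float = GENOME_WIDE_ALPHA) -> int:
    """Approximate sample size at which detection becomes roughly a coin toss.

    Solves n * q / (1 - q) = threshold for n, where q is the fraction of
    trait variance explained by one SNP. This is an approximation only.
    """
    q = variance_explained
    if not 0.0 < q < 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return math.ceil(significance_threshold(alpha) * (1.0 - q) / q)


# Illustrative per-SNP variance fractions: 1%, 0.1% and 0.01% of trait variance.
for q in (0.01, 0.001, 0.0001):
    print(f"SNP explaining {q:.2%} of variance: about {samples_needed(q):,} people")
```

Under these assumptions the required sample size rises roughly tenfold each time the effect shrinks tenfold, which is consistent with the pattern described earlier: larger studies keep turning up more variants, each with a smaller effect.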
By contrast, the height paper sought to look at the totality of association with all SNPs and show that there must be many variants of small effect, even though they could not be individually identified.
“What is now clear is that, with some exceptions, the genes influencing the traits have small average effects in the population,” says Martin. “No one SNP accounts for much disease risk, but cumulatively they do account for the heritability.”
So what about the remaining unexplained heritability? Martin and his colleagues suggest it’s due to incomplete linkage disequilibrium with SNPs that have been genotyped to date, and also because rarer SNPs that might tag variants of larger effect do not appear on the commercial GWAS chips.
This means the actual causal variants are still unidentified, and those SNPs that are probed in GWASes are not necessarily associated with the causal variants. In time, though, these causal variants will likely be revealed and probed on the next generation of GWAS chips, and with them the missing heritability will emerge.
Fossicking in the genome
While some quite loud voices have cast aspersions on GWAS and their usefulness in uncovering the genetic components to complex traits and diseases, Martin is not so sceptical. For one, he suggests the turn against GWAS is due in part to what he calls “breathlessness”.
“It’s one of the things that most drives me mad in genetics and molecular biology,” he says. “It’s promulgated by Nature, Nature Genetics and Science, where they always have to be rushing on to the next big thing before the full utility of current technology has been explored. Nowhere more is this true than over GWAS.
“We had a year of spectacular results, then they say it’s old-hat and want to move on. But there’s still a lot of life in GWAS technology yet. I think they’ll be extremely useful for at least another five years, until sequencing drops to a comparable price.”
The key, says Martin, is doing GWASes on a much larger scale. Now that we know common variants with large effects are going to be rare, we need to look for a vast number of common variants with small effect sizes. And to do that, we need greater sample sizes. Where this is being done, the newly discovered variants are giving fascinating insights to disease aetiology.
This is where GWASes will have a strong advantage over genome sequencing, at least in the short term. Sure, genome sequencing can perform a similar comparative role to GWAS studies, but the expense is still prohibitive.
“Sequencing will be great, and it will give the answers, but only on the same number of individuals as GWAS. But where GWAS cost a few hundred bucks per individual, sequencing still costs thousands,” says Martin. “I’m still a great advocate for continuing the GWAS revolution while waiting for the price of sequencing to drop until it’s affordable to do very large numbers.”
One of the interesting revelations to come out of GWAS studies is that many of the SNPs that appear to have an effect on phenotypic variability are in non-coding regions of the genome. “This is partly because we know the coding regions only make up one to two per cent of the genome. Most of it is intergenic and introns.”
This suggests that it’s not just variation in genes and the proteins they code for that influence traits and diseases, but the complex RNA signalling networks that coordinate the actions of our genome.
This supports the position advocated by Professor John Mattick at the Institute for Molecular Bioscience at the University of Queensland, a long-time proponent of the role non-coding RNAs play in the evolution and development of complex organisms like us.
For Martin, this reinforces that GWAS studies are still capable of yielding valuable insights. As such, he is intent on continuing to use the GWAS as a tool to explore heritability, complex traits and diseases, particularly in twins, which have been the focus of his research efforts for the majority of his career. His hope is that the heritability that was once missing was only lost. And, in time, tools like the GWAS will help us find it.