All genomes are dysfunctional: broken genes in healthy individuals « Genomes Unzipped
So here’s the thing: the greater the predicted functional impact of a sequence variant, the more likely it is to be a false positive.
The reason for this will be pretty clear to the Bayesians in the audience (large-effect variants have a very low prior), but can take a while to fully appreciate for those without a natural statistical intuition. This effect occurs because variants with large effects on function are more likely to be harmful, so in general they are weeded out of the population by natural selection. In other words, the genome is highly depleted for variants with large functional effects.
Error, on the other hand, is more or less an equal opportunity annoyance – false positives, due either to DNA sequencing problems or issues with interpretation (e.g. thinking a region is protein-coding, when in fact it isn’t), appear without much regard for their effects on gene function. (…)
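For anyone who prefers to see the arithmetic, here is a minimal sketch of that Bayesian argument in Python. The numbers are hypothetical placeholders chosen only to illustrate the shape of the effect, not estimates from any real dataset: the per-site error rate is held constant across variant classes (the "equal opportunity" assumption), while the prior probability of a real variant shrinks as the predicted functional impact grows. The function name `posterior_true` and the class labels are likewise made up for this illustration.

```python
def posterior_true(prior, sensitivity, false_positive_rate):
    """P(variant is real | variant was called), via Bayes' rule."""
    true_calls = prior * sensitivity                    # real variants that get called
    false_calls = (1.0 - prior) * false_positive_rate   # non-variant sites called in error
    return true_calls / (true_calls + false_calls)


# Hypothetical values: the error process does not care about functional impact.
SENSITIVITY = 0.99
FALSE_POSITIVE_RATE = 1e-5

# Hypothetical per-site priors: selection depletes large-effect variants,
# so the prior drops as predicted impact rises.
PRIORS = {
    "low impact": 1e-3,
    "moderate impact": 1e-4,
    "high impact (e.g. gene-breaking)": 1e-5,
}

if __name__ == "__main__":
    for label, prior in PRIORS.items():
        p = posterior_true(prior, SENSITIVITY, FALSE_POSITIVE_RATE)
        print(f"{label:35s} prior={prior:.0e}  P(real | called)={p:.3f}")
```

With these made-up inputs, a called low-impact variant is real about 99% of the time, while a called high-impact variant is real only about half the time, even though the error rate never changed. Only the prior did.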
Still, while impressive, these are quantitative improvements, and the lesson stays the same: if you’re a PhD student working on large-scale sequencing data, and you find a fascinating mutation in your disease patient, be sure you validate the absolute hell out of that thing before you start drafting your paper to Science. The more fascinating it looks, the more you should disbelieve it – that’s as true in human genomics as it is in any other field of science.