By: David Shifrin, PhD, Science Writer, Filament Life Science Communications

We talk a lot about the value of big data in genetic testing and the importance of large-scale studies for things like genome-wide association. We don’t, however, talk as much about the problems that arise from huge datasets. Aside from computational challenges, the biggest problem is simply quality control.

Large-scale genomics research studies are first hindered by the immense amount of variability between individuals, families, ethnic groups, environmental conditions, etc. Genetics would be easy if we didn’t have to dig through the noise and variation. Beyond the inherent biological inconsistencies, though, are the technological ones.

First, collecting thousands of samples takes time. Even with standardized procedures, it is difficult (likely impossible) to maintain perfect consistency in sample collection and storage over the course of months or years. The time of processing after collection will vary, with some samples run almost immediately, while others are stored for extended periods.

Donor/participant behavior differs, as well. Spitting in a cup is simple, but how each person handles the sample will not be identical, even with standardized protocols and equipment. Scale that potential for individual variability across many thousands of participants, and the risk of problems is clear.

Earlier this year in a Genetics paper, Natalie Forneris and colleagues described a number of other places where quality could suffer. (It’s worth pointing out that this article primarily describes non-clinical samples, though many of the lessons are still valuable.) They note operator and equipment errors, mistakes in sample labeling, and problems with genetic mapping.

In Europe, the European Molecular Genetics Quality Network (EMQN) functions expressly to, well, do exactly what the name suggests. EMQN provides recommendations for quality control in genetic testing labs, and also provides accreditation to labs. Of course, in the States, CLIA serves a similar function.

Even when all appropriate best practices are implemented, the inherent complexity mentioned above can still cause problems. To look at this issue, a group from Kaiser Permanente Northern California and UCSF ran a quality control study on a massive cohort of more than 100,000 samples (resulting in more than 70 billion genotypes). To be more accurate, genotyping of those samples was the primary aim, but one which necessitated a significant quality control effort.

Time was a big issue for the team, though in this case the problem was a compressed schedule rather than a long one. The genotyping was carried out over the course of 14 months. With samples being run at such a high rate, the team needed a “near-real-time” QC process. Any problems needed to be resolved quickly because of the “sustained high throughput” of ~1,600 samples per week. Variability was minimized through standardization, with robots doing as much work as possible and DNA being extracted at only one lab.

Quality assessment started with the raw samples, with about 2% being thrown out because the participants didn’t provide enough saliva or there was contamination. From there, the quality of extracted DNA was assessed using typical metrics (concentration, A260/280 ratios). Then, to deal with any variability over time, the group used a simple but elegant control mechanism. The platform used for genotyping was an Affymetrix microarray, with 96 wells per plate. One of these wells was used for a confirmed sample, providing a solid reference for the remaining de novo samples.
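To make the extracted-DNA check concrete, here is a minimal sketch of a filter on concentration and A260/280 ratio. The cutoffs below are placeholders chosen for illustration, not the values used in the study; the only outside fact assumed is the common rule of thumb that reasonably pure DNA has an A260/280 ratio near 1.8. The `DnaExtract` type and `passes_dna_qc` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnaExtract:
    sample_id: str
    concentration_ng_per_ul: float
    a260: float  # absorbance at 260 nm
    a280: float  # absorbance at 280 nm


# Hypothetical cutoffs for illustration only; the study's thresholds are not given here.
MIN_CONCENTRATION_NG_PER_UL = 20.0
A260_280_RANGE = (1.7, 2.0)


def passes_dna_qc(extract: DnaExtract) -> bool:
    """Return True if the extract meets both the concentration and purity cutoffs."""
    if extract.a280 <= 0:
        return False  # the ratio is undefined, so treat the reading as a failure
    ratio = extract.a260 / extract.a280
    low, high = A260_280_RANGE
    return (
        extract.concentration_ng_per_ul >= MIN_CONCENTRATION_NG_PER_UL
        and low <= ratio <= high
    )


extracts = [
    DnaExtract("S1", 55.0, 0.90, 0.50),  # ratio 1.8, adequate yield
    DnaExtract("S2", 5.0, 0.90, 0.50),   # good ratio, too little DNA
    DnaExtract("S3", 60.0, 0.60, 0.50),  # ratio 1.2, likely contaminated
]
passing_ids = [e.sample_id for e in extracts if passes_dna_qc(e)]
print(passing_ids)
```

In a near-real-time setting, a check like this would run as each batch comes off the extraction robot, so that failures are flagged before a plate is assembled.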

Reproducibility is another issue. Here, the authors measured genotype discordance. This is essentially the proportion of genotypes (each a pair of alleles, e.g. allele 1 and allele 2) that differ by one or both alleles between duplicate runs of the same sample.
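As a rough illustration of the discordance idea, the sketch below compares two runs of the same sample marker by marker. It is a simplified reading of the definition above, not the authors’ code: the function name, the use of `None` for a no-call, and the choice to skip markers missing in either run are all assumptions.

```python
from typing import Optional, Sequence, Tuple

# A genotype is an unordered pair of alleles; None marks a no-call (assumption).
Genotype = Optional[Tuple[str, str]]


def genotype_discordance(run_a: Sequence[Genotype], run_b: Sequence[Genotype]) -> float:
    """Fraction of markers called in both runs whose genotypes differ."""
    if len(run_a) != len(run_b):
        raise ValueError("duplicate runs must cover the same markers")
    compared = 0
    discordant = 0
    for a, b in zip(run_a, run_b):
        if a is None or b is None:
            continue  # skip markers that were not called in both runs
        compared += 1
        # Sorting makes ("A", "G") and ("G", "A") count as the same genotype.
        if sorted(a) != sorted(b):
            discordant += 1
    if compared == 0:
        raise ValueError("no markers were called in both runs")
    return discordant / compared


first_run = [("A", "A"), ("A", "G"), ("C", "T"), None, ("G", "G")]
second_run = [("A", "A"), ("G", "A"), ("C", "C"), ("T", "T"), ("G", "G")]
rate = genotype_discordance(first_run, second_run)
print(rate)  # 0.25: one of the four markers called in both runs differs
```

Treating allele order as irrelevant matters here, since an A/G call and a G/A call describe the same heterozygous genotype.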

The team also used a unique method to mitigate variability through what they called “package design.” In this system, “sudden changes in genotype concordance” were linked to known events in the lab. Along with other factors (such as the lot number of kits), groups of samples were bundled together “to achieve homogeneity of conditions.” Subsequent analysis was then done independently on each package. In fact, using packages instead of individual plates provided better reproducibility between duplicate samples. This method alone provided a substantial improvement in the quality of results.
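One way to picture the package idea is as a grouping rule over plates in run order: start a new package whenever conditions change. The sketch below is a hypothetical simplification, not the study’s procedure. It uses only two signals (a reagent lot change and a dated lab event), and the `Plate` fields and `build_packages` function are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Sequence


@dataclass(frozen=True)
class Plate:
    plate_id: str
    run_date: date
    reagent_lot: str  # hypothetical field standing in for "lot number of kits"


def build_packages(plates: Sequence[Plate], lab_events: Sequence[date]) -> List[List[Plate]]:
    """Group chronologically ordered plates into runs of homogeneous conditions.

    A new package starts when the reagent lot changes or when a known lab
    event falls after the previous plate and on or before the current one.
    """
    ordered = sorted(plates, key=lambda p: p.run_date)
    packages: List[List[Plate]] = []
    for plate in ordered:
        if packages:
            previous = packages[-1][-1]
            lot_changed = plate.reagent_lot != previous.reagent_lot
            event_between = any(
                previous.run_date < event <= plate.run_date for event in lab_events
            )
            if not (lot_changed or event_between):
                packages[-1].append(plate)
                continue
        packages.append([plate])
    return packages


plates = [
    Plate("P1", date(2015, 1, 5), "LOT-1"),
    Plate("P2", date(2015, 1, 6), "LOT-1"),
    Plate("P3", date(2015, 1, 9), "LOT-2"),
    Plate("P4", date(2015, 1, 12), "LOT-2"),
    Plate("P5", date(2015, 1, 20), "LOT-2"),
]
events = [date(2015, 1, 15)]  # e.g. an instrument service visit (made-up example)
package_ids = [[p.plate_id for p in pkg] for pkg in build_packages(plates, events)]
print(package_ids)  # [['P1', 'P2'], ['P3', 'P4'], ['P5']]
```

Downstream analysis would then be run once per package rather than once per plate, which is the part the authors credit with better reproducibility between duplicates.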

Ultimately, the group noted that approximately 93% of samples were successfully analyzed using “standard” quality control mechanisms. Variability came into play in a number of places. Interestingly, there was a difference in the concentration of DNA extracted from samples during different months, although it was not clear whether this was a true causative relationship. Not surprisingly, length of storage had a slight but still notable effect on the final concentration.

The key to this particular study was its use of both immediate and long-term analyses to maintain the highest possible level of control. As noted above, the package-based system, clustering samples run under similar conditions, helped minimize variability over time. Additionally, running validated samples as baseline controls with new assay plates improved reproducibility. Once samples were analyzed, bad data was filtered out, and trends in poorly performing probes on the microarray could be identified. Thus, not only did results from this particular set of 100,000 samples improve, but so did the kits used to carry out genotyping, leading to better quality for future users.

Both the strategy and tactics used in this study will be valuable for future large-scale genomics projects. The specific methods used for genotype calling and discovering bad probes or reproducibility problems should be easily implemented. Beyond that, the two-tier system of immediate and long-term quality control can provide significant safeguards for reaching high levels of success in analyzing patient samples.

Wed. September 30, 2015