News & Announcements
July 16, 2012
Argonne's MG-RAST achieves new record in metagenomic analysis
Media Contacts: Gail Pieper at email@example.com
MG-RAST, the metagenome analysis server developed and operated by researchers at Argonne National Laboratory and the University of Chicago, has achieved a new record—analysis of more than 50,000 metagenomic data sets.
Metagenomics refers to the sequencing of DNA from many organisms present in environmental samples. With metagenomics, researchers can gain new insights into microbial populations without requiring each organism to be cultured in a laboratory. Removing the culturing step has greatly broadened the applicability of sequence-based biology.
“Analysis of metagenomes helps scientists understand the structure and functional behavior of microbial communities and compare them with one another,” said Folker Meyer, a computational biologist at Argonne who leads MG-RAST’s multidisciplinary development team. “But such comparisons are computationally demanding, particularly as we can produce more data from microbial systems in a cost-effective manner,” he emphasized.
Metagenomics is a key example of the shift occurring in bioinformatics as a whole, where the primary challenge is moving from data collection and generation to large-scale analysis of this data.
To meet the computational demands of these growing data sets, Meyer and his team redesigned the computational architecture of MG-RAST. Their goal was to improve MG-RAST throughput by an order of magnitude per year, to keep pace with data set growth.
“In order to scale the system this quickly, we needed to develop more efficient approaches for metagenome analysis, redesign our runtime infrastructure to support use of distributed computational resources, and aggressively adopt IaaS (infrastructure as a service) cloud computing resources,” said Narayan Desai, who co-leads the effort. “We effectively needed to pull out all the stops.”
This work has resulted in a nearly 700-fold improvement in MG-RAST throughput and has enabled the system to analyze more than 50,000 data sets in the past 16 months.
“MG-RAST now houses the largest quantity of consistently analyzed metagenomes in the world, letting researchers perform comparisons across many data sets and hunt for patterns that help explain novel aspects of microbiology,” said Meyer. “With this new capability, researchers can for the first time study many microbial communities influencing major biogeochemical cycles or human health, without being hampered by method inconsistencies or biases.”
The MG-RAST server has been developed and operated with support from the Alfred P. Sloan Foundation, the U.S. Department of Energy, and the National Institutes of Health.