Announcing the Winners of Mosaic Microbiome Community Challenge: Strains #1

The application of next-generation sequencing in the study of microbial communities has fueled the rapid growth of interest in microbiome research. However, difficulties with the accuracy of computational analyses of these complex datasets have limited the translation of microbiome science into novel biotherapeutic products. In order to unleash the potential that metagenomics holds for human health, computational methods to identify unique microbial strains must be improved.

The Mosaic Community Challenge: Strains #1, sponsored by Janssen Research & Development, LLC, through the Janssen Human Microbiome Institute, aims to benchmark and improve the performance of computational tools in analyzing these data, in order to provide better quality profiling of microbiome samples at high resolution. The challenge gave participants the opportunity to validate their bioinformatics tools in real time on a neutral, unbiased platform and to see how they performed against other industry tools.

Participants in the challenge worked with four datasets: a metagenomics dataset generated from real mouse fecal samples (of known bacterial composition) and three simulated datasets of varying complexity. In addition to the challenge datasets, a distinct training dataset, which included the truth files, was provided so that participants could train and improve their methods. Participants could then run their analysis either by creating their own app on the Mosaic Platform or by downloading the dataset and running their method on their own systems. Over the four-month course of the challenge, participants could use a “Testing Ground” to get immediate feedback on their work with the training datasets before submitting their final challenge entries.

Challenge Winners & Their Methods

We would like to congratulate the winners as well as thank all who participated for helping to take microbiome science to the next level.


CosmosID, a bioinformatics and NGS service laboratory, scored highest in the Profiling part of the challenge. The CosmosID analysis pipeline achieved the highest cumulative F1-score, a measure that combines precision and recall. According to Nur Hasan, Chief Science Officer at CosmosID, the strength of their approach lies in the manually curated database, whose structure follows the phylogenetic hierarchy of all represented microorganisms and thereby enables reliable microbial identification at all taxonomic levels, down to the strain level.
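For readers less familiar with the metric, here is a minimal Python sketch of the standard F1 definition (the harmonic mean of precision and recall) applied to a strain profile. The function names and the set-based comparison are illustrative assumptions. The post does not describe how the challenge accumulated scores across datasets or taxonomic levels, so this is not the challenge's actual scoring code.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F1: harmonic mean of precision and recall (0.0 when undefined)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted  # fraction of reported strains that are real
    recall = true_positives / actual        # fraction of real strains that were reported
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def profile_f1(predicted_strains: set, truth_strains: set) -> float:
    """Hypothetical helper: score a predicted strain list against a truth file."""
    tp = len(predicted_strains & truth_strains)
    fp = len(predicted_strains - truth_strains)
    fn = len(truth_strains - predicted_strains)
    return f1_score(tp, fp, fn)
```

A profile that reports many spurious strains loses precision, and one that misses true strains loses recall; F1 is high only when both are high.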

CosmosID’s submission scored highest in the analysis of the Biological Sample (80%), roughly 64% higher in relative terms than the second-place submission (48.9%). Interestingly, however, submissions based on the popular MetaPhlAn tool performed better across the simulated datasets. The observation that tool performance varies with the source of the sequencing data highlights the importance of benchmarking tools on both biological and simulated datasets.

Figure 1. Precision/Recall Curve for the winning submission for each of the challenge datasets (to view this chart visit the submissions page on Mosaic).

To interactively compare the Profiling submissions and view Precision Recall Curves, visit the Strains #1 Profiling comparison page. 


Rayan Chikhi, PhD, Computer Scientist at the French National Center for Scientific Research (CNRS) and CRIStAL research center, and an advisor at Clarity Genomics, scored highest in the Assembly part of the challenge by using the Minia assembler to assemble the metagenomic data provided for the challenge. The assembly portion was judged on the total number of aligned bases divided by the reference genome size (Genome Fraction). The winning submission scored well across all other metrics reported in the leaderboard, namely Misassemblies and Mismatches.
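As a rough illustration of the Genome Fraction metric described above, the following Python sketch computes the share of a reference genome covered by aligned contigs. It assumes that alignments are given as half-open `(start, end)` intervals on the reference and that a base covered by several contigs is counted once; both are assumptions made for this example, and the leaderboard's actual evaluation tooling may differ.

```python
from typing import Iterable, Tuple


def genome_fraction(alignments: Iterable[Tuple[int, int]], reference_length: int) -> float:
    """Aligned reference bases divided by reference size.

    Assumption: intervals are half-open [start, end) reference coordinates,
    and overlapping alignments are merged so each base is counted once.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(alignments):
        if end <= start:
            continue  # skip empty or malformed intervals
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / reference_length
```

For example, alignments `(0, 50)`, `(40, 100)` and `(150, 200)` on a 200-base reference cover 150 distinct bases, giving a genome fraction of 0.75.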

Figure 2. Genome fraction scores across 13 biological sample reference strains 

Honorable mentions go to two other participants. Peter McCaffrey came a close second with his DeepBiome submission, and his submitted assemblies were longer than those of the winning submission. Additionally, the submissions from Sergey Nurk (metaSPAdes assembler) consistently had the largest contigs.

To make your own comparisons between the submissions and dive deeper into the rich comparison data available, visit the Strains #1 Assembly comparison page.

Learn about the winners’ methods during our webinar confirmed for Tuesday, June 26th at 10am PT (1pm ET).

Want More Ways to Participate in the Mosaic Microbiome Community?

Learn more and get involved at

Visit Us at Microbiome Drug Development Summit!  

DNAnexus will present “Translation of Microbiome Research into Clinical Applications” this Friday, June 22nd at 12pm at the Microbiome Drug Development Summit in Boston. Join our talk, and stop by our exhibition table to learn more about DNAnexus microbiome capabilities and the Mosaic Community Platform & Challenges. Email us to schedule a meeting in advance.

Translation of Microbiome Research into Clinical Applications

  • Crowdsourcing the advancement of microbiome research with the Mosaic Community platform and challenges
  • Considerations for incorporating microbiome data into clinical trials
  • Complying with GLP, 21 CFR Part 11, and more


  Omar Serang, Chief Cloud Officer, DNAnexus
  Michalis Hadjithomas, PhD, Microbiome Lead, DNAnexus

Advancing Microbiome Research through Community Engagement

The application of next-generation sequencing has transformed the study of microbial communities by providing a snapshot of these complex systems. Although the application of these methods in basic research has led to breakthrough discoveries, the translation of microbiome research to the clinic has been delayed by the limited strain-level precision and accuracy of microbiome profiling methods.

At DNAnexus, we are excited by the opportunity to apply our cloud genomics expertise to power two crowdsourcing efforts that aspire to provide unbiased benchmarking of microbial strain detection methods and to accelerate translational microbiome research in the context of human health and food safety. Mosaic, a DNAnexus community platform focusing on microbiome research, is launching the Clinical Strain Detection Challenge, sponsored by Janssen Research & Development. The second challenge, the CFSAN Pathogen Detection Challenge, is hosted by the Center for Food Safety and Applied Nutrition (CFSAN) at the FDA.

Mosaic Community Challenge: Clinical Strain Detection

The Mosaic Community Challenge: Clinical Strain Detection, hosted by Janssen Research & Development, LLC, aims to speed the translation of microbiome science into novel products by tracking the presence of certain known strains in a sample. It is critical to accurately determine the type and quantity of microbes in a sample at the strain-level in order to bring safe and effective products to market, and to precisely monitor their status within the human body. Insights from the Challenge will provide an objective comparison of the performance of different tools. Participants can submit multiple entries and see immediate results of their performance throughout the Challenge using the Mosaic Platform. Learn more about this and other Mosaic challenges at

CFSAN Pathogen Detection Challenge

The Center for Food Safety and Applied Nutrition (CFSAN) at the FDA has pioneered the use of whole genome sequencing (WGS) for outbreak detection through GenomeTrakr. Although this tool has already greatly improved outbreak detection and traceback, current WGS approaches rely on culturing a pathogen before sequencing. Metagenomics, defined as the study of genetic material collected directly from environmental samples, is the next evolution of the GenomeTrakr foodborne pathogen initiative because metagenomics is culture-independent.

As the food safety community moves to metagenomic sequencing, bioinformatics algorithms must be developed to detect pathogens amongst a mix of organisms sequenced directly from a sample. Thus, the precisionFDA team has designed a challenge as the first step towards this goal. In this challenge, participants will be asked to develop and use bioinformatics pipelines to identify the types and distribution of Salmonella strains in each of several metagenomics samples. This type of technology will expedite determining the source of foodborne illness. For more information and to get involved, please visit


Launching Mosaic Community Challenge: Strains #1

DNAnexus is excited to announce the launch of Strains #1, the first in a series of Mosaic Community Challenges. The series will be hosted through Mosaic and sponsored by the Janssen Human Microbiome Institute (JHMI), part of Janssen Research & Development, LLC, to foster global collaboration on improving methods and standards in microbiome science. The Strains #1 challenge will officially run from December 1, 2017 through February 28, 2018. You can learn more about Strains #1 by watching an introduction to the challenge from a recent webinar. We encourage you to join the Mosaic microbiome community via your DNAnexus account or register for a Mosaic account.

The first Mosaic Community Challenge, Strains #1, aims to improve the performance of computational tools in analyzing microbiome next-generation sequencing data, providing better quality profiling of microbiome samples at high resolution. Three parts of metagenomic analysis will be evaluated:

  1. Profiling: The ability of tools to correctly identify organisms in a microbiome.
  2. Assembly: The ability to reconstruct contiguous sequences (contigs or scaffolds) from metagenomic reads.
  3. Binning: The ability to correctly group contigs based on their taxonomic origin.

Why Strains?

Profiling a microbiome sample is the fundamental step in any microbiome analysis. Current state-of-the-art methods based on next-generation sequencing perform well in correctly identifying organisms down to the genus and species levels. However, the performance of current methods at the strain level is poor.

Because individual strains within the same species can differ in function and metabolic response, strain-level identification and quantification are key to the development and tracking of microbiome-based health solutions. It is critical to be able to accurately determine the type and quantity of microbes in a sample at the strain level in order to bring safe and effective products to market, and to accurately monitor their status within the human body.

How is the Strains #1 challenge different from previous efforts?

DNAnexus used its expertise in cloud-based genome informatics to build the framework for the Strains #1 challenge. Mosaic leverages the DNAnexus Platform to bring a collaborative experience to the community of microbiome researchers. Some of the unique features of the Strains #1 challenge are:

  • Participation is free: creating tools, running analyses, and storing data in the workspace provisioned for each participant in this challenge is free!
  • Participation is responsive: users get immediate feedback on the performance of their submissions.
  • Participation is communal: participants can perform comparisons against the submissions of the community.
  • Tools are easily created and shared: users can create tools using Mosaic’s App creation interface and share them with the community.
  • We’re taking a forward-looking approach: the challenge aims to accelerate innovation and enable the translation of microbiome research to the clinic.
  • Real microbiome samples are studied: real microbiomes that contain known bacterial strains were created and sequenced specifically for the Strains #1 challenge.
  • Anonymity is optional: to encourage maximum participation by both academic innovators and private sector experts, we enable anonymous submissions. Furthermore, participants can download the datasets for analysis on their own computing resources.

Join the Mosaic Community Challenge: Strains #1

What are you waiting for? Start impacting the future of microbiome research: join the Strains #1 challenge today!