The Next Next-Gen Sequencing
Validated Nanopore Datasets Made Available for Metagenomic Pipeline Benchmarking
Optimization of next-generation sequencing (NGS) has advanced genomics significantly over the last ten years, enabling cost-effective metagenomics and de novo genome assembly. Current sequencing technology is based on the accumulation of short reads (100–300 bp), which must be accurately reassembled into the original genome. Although powerful, short-read sequencing faces several challenges when assembling genomes. For example, genomic sequence repeats and large structural variations such as translocations and duplications are more difficult to detect and may decrease accuracy [1].
The Long-Read Revolution
The recent development of advanced long-read sequencing techniques (Nanopore and PacBio sequencing) allows for longer reads that range from less than 10 kbp up to 2 Mbp. Longer reads make de novo genome assembly easier and much less ambiguous because sequencing reads can overlap over longer stretches [2]. This helps overcome the challenges facing short-read sequencing and genome assembly. However, this new technology is not without challenges of its own, including the need for optimization and benchmarking with well-characterized controls and standards.
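The repeat problem can be illustrated with a toy example. The sketch below is not an assembler: it assumes error-free reads and exact string matching, and the genome, repeat, and read coordinates are made up for illustration. It only shows that a read lying entirely inside a repeat has more than one possible placement, while a read long enough to reach unique flanking sequence has one.

```python
def occurrences(genome: str, read: str) -> int:
    """Count positions (overlaps allowed) where read matches genome exactly."""
    last_start = len(genome) - len(read)
    return sum(1 for i in range(last_start + 1) if genome.startswith(read, i))


# Hypothetical toy genome: one 10 bp repeat present in two copies,
# separated and flanked by unique sequence.
repeat = "ACGTACGTAC"
genome = "TTGCA" + repeat + "GGATC" + repeat + "CCTAG"

# An 8 bp "short read" taken from inside the first repeat copy.
short_read = genome[6:14]

# A 20 bp "long read" covering the first repeat copy plus unique flanks.
long_read = genome[2:22]

print("short read placements:", occurrences(genome, short_read))
print("long read placements:", occurrences(genome, long_read))
```

Real reads contain errors and real aligners use approximate matching, so this only conveys the intuition behind the ambiguity, not how an assembler resolves it.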
Using Standards to Resolve Challenges
To address these challenges, Nicholls et al. recently sequenced mock microbial community standards in an effort to benchmark nanopore analysis [3]. The authors used the well-defined ZymoBIOMICS Microbial Community Standards, which are composed of ten microbial organisms in even and log-distributed abundances. High-molecular-weight microbial DNA was extracted from the standards using a modified protocol of the ZymoBIOMICS DNA Miniprep kit and was quantified via Qubit. The DNA was then size-selected using AMPure beads (Beckman Coulter), and libraries were prepared using the Oxford Nanopore LSK-109 kit. Libraries were generated from both standards and then sequenced using GridION and PromethION flow cells from Oxford Nanopore Technologies. To verify the long-read sequences, libraries were also prepared and sequenced on Illumina’s HiSeq 1500 platform.
A total of 324 Gbp of sequencing data was generated from four sequencing runs, and both the GridION and PromethION detected all ten microbial species in the even- and log-distributed microbial standards. Furthermore, the GridION was able to detect S. aureus, the species with the lowest abundance in the log microbial standard. Remarkably, the average consensus accuracy for GridION and PromethION was greater than 99% for both standards, whereas Illumina’s mean <30 Phred score ranged between just 75% and 95%. The successful Nanopore sequencing and genome assembly help validate long-read sequencing as a promising option for metagenomic analysis.
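For readers less familiar with Phred scores, the standard definition relates a quality score Q to a per-base error probability P by Q = -10 · log10(P). The short sketch below converts between the two. The function names are illustrative and are not taken from the study or from any particular library.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Per-base error probability implied by Phred quality score q."""
    return 10 ** (-q / 10)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred score equivalent to a given accuracy, with 0 <= accuracy < 1."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10 * math.log10(1 - accuracy)


# Print the accuracy implied by a few commonly quoted quality scores.
for q in (10, 20, 30):
    accuracy = 1 - phred_to_error_probability(q)
    print(f"Q{q} -> accuracy {accuracy:.2%}")
```

Under this definition, 99% accuracy corresponds to roughly Q20 and 99.9% to Q30.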
To aid researchers developing new long-read bioinformatics tools for metagenomics and to validate current pipelines, the authors of the study have generously made the complete datasets available to the public. These assembled genomes of well-characterized and widely available standards are a valuable resource for the burgeoning field of long-read metagenomics, which stands to revolutionize the accessibility of metagenomic analysis. As experts in metagenomics and long-read sequencing, the authors are well aware of the bias pitfalls that plagued the early years of microbiome analysis. To ensure the best tools are developed for future users, they have used the ZymoBIOMICS Microbial Community Standards to rigorously validate their workflow from DNA extraction through bioinformatics.
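One way such a dataset might be used for benchmarking is to compare a pipeline's taxonomic profile against the known composition of the standard. The sketch below is a minimal illustration under stated assumptions: the taxon names, expected fractions, and read counts are invented placeholders, not the actual composition of the ZymoBIOMICS standards or results from the study, and the metric shown (Bray-Curtis dissimilarity) is one common choice rather than the authors' method.

```python
def relative_abundance(counts: dict) -> dict:
    """Convert per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def benchmark(expected: dict, observed_counts: dict) -> dict:
    """Compare an observed profile with the expected mock-community profile."""
    observed = relative_abundance(observed_counts)
    taxa = set(expected) | set(observed)
    # Expected taxa that the pipeline did not report at all.
    missed = sorted(t for t in expected if observed.get(t, 0.0) == 0.0)
    # Reported taxa that are not part of the mock community.
    unexpected = sorted(t for t in observed
                        if t not in expected and observed[t] > 0)
    # For two profiles that each sum to 1, Bray-Curtis dissimilarity
    # equals half the sum of absolute differences.
    l1 = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)
    return {"missed": missed, "unexpected": unexpected, "bray_curtis": l1 / 2}


# Hypothetical placeholder values for illustration only.
expected_profile = {"taxon_A": 0.5, "taxon_B": 0.3, "taxon_C": 0.2}
pipeline_counts = {"taxon_A": 600, "taxon_B": 300, "taxon_D": 100}

print(benchmark(expected_profile, pipeline_counts))
```

A dissimilarity of 0 would mean the pipeline reproduced the expected profile exactly, and the lists of missed and unexpected taxa point to false negatives and false positives respectively.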
Read the original article (reference 3 below).
1. Oxford Nanopore Technologies. 2017. Nanopore sequencing: The advantage of long reads for genome assembly [White paper]. Retrieved from www.nanoporetech.com.
2. Shin, J., Lee, S., Go, M. J., Lee, S. Y., Kim, S. C., Lee, C. H., & Cho, B. K. 2016. Analysis of the mouse gut microbiome using full-length 16S rRNA amplicon sequencing. Scientific Reports 6: 29681.
3. Nicholls, S. M., Quick, J. C., Tang, S., & Loman, N. J. 2018. Ultra-deep, long-read nanopore sequencing of mock microbial community standards. bioRxiv 487033.