Benchmarking and improvement of the BaseClear 16S rRNA gene profiling pipeline.

16S rRNA gene-based microbial profiling remains an important tool for determining the composition of microbial communities in a wide range of samples, including faeces, soil and water, as well as samples derived from food products (e.g. milk, yogurt and beer). Although the technology itself is reproducible, DNA extraction, primer specificity, PCR protocol, sequencing and data analysis can all affect the resulting microbial profiles. As a result, the profiles may not accurately represent the true bacterial composition of the original sample.

Here, DNA extraction and 16S rRNA gene profiling were evaluated by characterizing a mock community consisting of 8 bacterial strains. DNA was extracted in triplicate from the microbe suspension (Microbe) using mechanical lysis (bead beating) followed by column-based DNA purification. In addition, pooled genomic DNA extracted from pure cultures of the 8 bacteria (DNA) was used for 16S rRNA gene profiling, as an evaluation independent of DNA extraction. Because the ratio in which the bacteria were mixed is known, the theoretical outcome of the bacterial profiling is also known (Figure 1).

Profiles from triplicate analyses of the DNA and Microbe samples were highly correlated (r > 0.95), supporting a high level of repeatability of 16S rRNA gene profiling. Pearson correlations between the theoretical composition and the profiles from the DNA and Microbe samples were lower (r > 0.75-0.93; Figure 1A), but the bacterial taxa detected in the Microbe and DNA pools by 16S rRNA gene profiling matched those of the theoretical composition. This indicates that the complete pipeline, from DNA extraction to data analysis, generates profiles that are very similar to the expected, 'true' microbial composition.
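As an illustration of how such a comparison can be computed, the following minimal sketch calculates the Pearson product-moment correlation between two relative-abundance profiles. The taxon names and abundance values are hypothetical placeholders, not data from this study, and the function names are our own for this example rather than part of the BaseClear pipeline.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def profile_correlation(expected, observed):
    """Correlate two profiles given as {taxon: relative abundance} dicts.

    Taxa missing from one profile are counted as zero abundance there.
    """
    taxa = sorted(set(expected) | set(observed))
    x = [expected.get(t, 0.0) for t in taxa]
    y = [observed.get(t, 0.0) for t in taxa]
    return pearson(x, y)


# Hypothetical profiles (fractions summing to 1); values are made up.
theoretical = {"genus_A": 0.40, "genus_B": 0.30, "genus_C": 0.20, "genus_D": 0.10}
observed = {"genus_A": 0.35, "genus_B": 0.30, "genus_C": 0.20,
            "genus_D": 0.10, "unclassified": 0.05}

r = profile_correlation(theoretical, observed)
print(f"Pearson r = {r:.3f}")
```

Taking the union of taxa before correlating means that groups present in only one profile (for example 'unclassified' reads) lower the correlation instead of being silently ignored.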

The 16S rRNA gene bioinformatics pipeline employs the latest algorithms with strict parameters to exclude error-prone reads from further analysis. Therefore, this methodology can be used to gain general insights into the composition of complex microbial communities.

FIGURE 1: THEORETICAL AND DETECTED RELATIVE CONTRIBUTIONS OF DETECTED BACTERIAL GENERA IN TRIPLICATE SAMPLES OF THE COMBINATION OF DNA EXTRACTED FROM PURE CULTURES (DNA) AND COMBINATION OF BACTERIAL CELLS (MICROBE) OF EIGHT BACTERIAL STRAINS. TAXONOMIC GROUPS THAT CONTRIBUTE AT LEAST 1% TO ONE OF THE PROFILES ARE INDICATED IN THE COLOR KEY. PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS (R) BETWEEN PROFILES ARE SHOWN ABOVE THE BARS.

Although the obtained microbial profiles were similar to the theoretical (expected) microbial composition, we observed a number of sequences that could not be confidently classified down to the genus level. These sequences are grouped as 'unclassified' at the family level. The 'unclassified' groups likely consist of sequences with errors that hamper accurate taxonomic classification. To improve our taxonomic profiles, we applied more stringent steps using the latest bioinformatics tools, which are better suited to this task.
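The grouping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: each read is represented as a hypothetical (family, genus, genus confidence) tuple, and the 0.8 confidence cut-off is an arbitrary value chosen for the example, not a parameter reported for the BaseClear pipeline.

```python
from collections import Counter


def label_read(family, genus, genus_confidence, min_confidence=0.8):
    """Return the genus if it is assigned confidently enough,
    otherwise fall back to an 'unclassified <family>' group."""
    if genus is not None and genus_confidence >= min_confidence:
        return genus
    return f"unclassified {family}"


def relative_profile(reads, min_confidence=0.8):
    """Turn per-read classifications into {label: fraction of reads}."""
    counts = Counter(
        label_read(family, genus, conf, min_confidence)
        for family, genus, conf in reads
    )
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical classifier output: (family, genus, genus confidence).
reads = [
    ("Family_X", "genus_A", 0.99),
    ("Family_X", "genus_A", 0.95),
    ("Family_X", "genus_A", 0.55),  # low confidence: grouped at family level
    ("Family_Y", "genus_B", 0.90),
]

profile = relative_profile(reads)
print(profile)
```

With this representation, reads carrying sequencing errors tend to receive low genus-level confidence and therefore end up in the family-level 'unclassified' group, which is why removing error-prone reads earlier in the analysis can shrink that group.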

Re-analysis of the 16S rRNA gene data using the improved pipeline showed higher Pearson correlations (Figure 2). Most notably, the number of sequence reads that could not be assigned to a specific genus with high confidence (unclassified reads) was significantly lower. We also detected a number of taxonomic groups that were not present in the original mock communities. These taxonomic groups had a low relative contribution, with an abundance of no more than 0.5%.
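A check of this kind can be sketched in a few lines: list the taxa in an observed profile that are not part of the mock community and flag any that exceed a chosen abundance limit. The profile values and taxon names below are hypothetical; only the 0.5% limit is taken from the text above.

```python
def summarize_unexpected(observed, expected_taxa, max_fraction=0.005):
    """Split unexpected taxa into all hits and those above max_fraction.

    observed      -- {taxon: relative abundance as a fraction of reads}
    expected_taxa -- set of taxa known to be in the mock community
    max_fraction  -- abundance limit (0.005 corresponds to 0.5%)
    """
    unexpected = {t: f for t, f in observed.items() if t not in expected_taxa}
    above_limit = {t: f for t, f in unexpected.items() if f > max_fraction}
    return unexpected, above_limit


# Hypothetical observed profile; values are made up for illustration.
observed_profile = {
    "genus_A": 0.600,
    "genus_B": 0.393,
    "genus_X": 0.004,  # not in the mock community
    "genus_Y": 0.003,  # not in the mock community
}
mock_taxa = {"genus_A", "genus_B"}

unexpected, above_limit = summarize_unexpected(observed_profile, mock_taxa)
print("unexpected taxa:", unexpected)
print("above 0.5%:", above_limit)
```

An empty second result means every unexpected group stays at or below the limit, which matches the situation described for the improved pipeline.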


FIGURE 2: THEORETICAL AND DETECTED RELATIVE CONTRIBUTIONS OF DETECTED BACTERIAL GENERA USING THE IMPROVED 16S RRNA ANALYSIS PIPELINE, IN TRIPLICATE SAMPLES OF THE COMBINATION OF DNA EXTRACTED FROM PURE CULTURES (DNA) AND COMBINATION OF BACTERIAL CELLS (MICROBE) OF EIGHT BACTERIAL STRAINS. TAXONOMIC GROUPS THAT CONTRIBUTE AT LEAST 1% TO ONE OF THE PROFILES ARE INDICATED IN THE COLOR KEY. PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS (R) BETWEEN PROFILES ARE SHOWN ABOVE THE BARS.

CONCLUSION 

In conclusion, the BaseClear 16S rRNA gene profiling methodology is highly repeatable and yields profiles comparable to the theoretical composition of mock communities. In addition to DNA extraction methodology, primer specificity, PCR protocol and sequencing, the bioinformatics analysis of the 16S rRNA gene data has an impact on the outcome of microbial profiling. The 16S rRNA gene bioinformatics pipeline employs the latest algorithms with strict parameters to exclude error-prone reads from further analysis. Therefore, this methodology can be used to gain general insights into the composition of complex microbial communities.

Tom van den Bogert - Product manager Metagenomics 
