Turbocharge your metagenomics projects with metabolomics

You are an investigator and have just arrived at the scene of a burglary. There’s a broken window, and expensive jewellery is missing. But look carefully and there are other clues: subtle footprints marking the route the burglars took, the pattern of glass shattering that shows how the window was broken, a record of the temperature of the room that pinpoints when the secure room was breached. By looking carefully at how the objects and their environment have changed together, you can solve the mystery of the missing jewellery.

Likewise, when we integrate metagenomics, metatranscriptomics and metabolomics, we get a more complete picture than the sum of the individual parts. These omics technologies look at what organisms are present and their relative abundance, the genes that are being expressed, and the types of metabolites produced by the community. How can we combine these technologies to unravel the secrets of the microbiome?

Metagenomics: who is there? And what are their potential functions?

The aim of metagenomics is to describe the members and the functional potential of a microbial community. After DNA is extracted from samples, it undergoes Next-Generation Sequencing (NGS) to determine the DNA sequences present. The state-of-the-art shotgun metagenomics sequencing method provides insights into the types of microbes present and also the biological functions encoded in their genomes. After a data cleaning step that prepares the dataset for further analysis, sequence reads are classified, and downstream analyses extract further information about both the microbial community and its functional profile. Metagenomic profiling can be performed by comparing sample reads with external sequence data, such as publicly available reference genomes. This method speeds up computation and allows the profiling of low-abundance microbes that are challenging to assemble. Metagenome assembly and the creation of abundance tables can also be carried out if required, for example when samples contain diverse microbes that lack representative genomes. Thus, the sample type and the research question determine the optimal method.
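As a rough illustration of how classified reads become a community profile, the sketch below turns per-read taxon labels into relative abundances. The taxon labels and read counts are made-up example data, and the function name is hypothetical; real profilers also correct for factors such as genome length, which this sketch ignores.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Convert per-read taxon labels into fractions that sum to 1."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Made-up classifier output: one taxon label per classified read.
reads = ["Bacteroides"] * 6 + ["Faecalibacterium"] * 3 + ["Escherichia"] * 1
profile = relative_abundance(reads)

# Print taxa from most to least abundant.
for taxon, fraction in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}\t{fraction:.2f}")
```

Relative abundances like these are compositional: an increase in one taxon lowers the fractions of all others, which is worth keeping in mind when comparing samples.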

Downstream analyses of the metagenomic data convert the information into a form that can be easily interpreted with respect to the research question. During this step, additional information that aids in understanding the data and the functional capabilities of the community can be integrated. Functional profiling performed here quantifies genetic and metabolic pathways, so they can be linked to environmental or health-related phenotypes. Microbiome richness and diversity can be calculated using different indices. The differential abundance of taxa that are important for the research question is determined. Data visualization tools such as heatmaps, graphical summaries and hierarchical clustering plots can be applied here.
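To make the richness and diversity indices concrete, here is a minimal sketch of three commonly used measures: observed richness, the Shannon index and the Gini-Simpson index. The article does not say which indices are used in practice, so this selection is an assumption, and the counts are made-up example data.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over observed taxa."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def gini_simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p_i^2)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Made-up read counts per taxon for two samples.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, richness(sample),
          round(shannon_index(sample), 3),
          round(gini_simpson_index(sample), 3))
```

Both samples have the same richness, but the even sample scores higher on the Shannon and Gini-Simpson indices, which is why richness and diversity are usually reported together.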

Metatranscriptomics: what are they actually doing?

In all complex microbial communities, multiple organisms are able to perform the same biochemical transformations. While it’s useful to know which microbes are present, the activity of the microbes can be more important for some applications. The transcriptome is highly dynamic, and metatranscriptomics depicts the active functional profile of the community at the moment that the sample was taken. Because metatranscriptomics shows which genes are actually being expressed in the sample, it is more sensitive than metagenomics in depicting the effect of differing environmental conditions.

After RNA extraction and rRNA depletion have taken place, the sequence library is prepared and sequencing is performed. From the metatranscriptome-derived sequence reads, non-coding RNA and human reads are removed and alignment-based functional profiling takes place. Generally, either alignment-based or assembly-based methods are used to determine microbial functions and their expression levels in metatranscriptomics projects. BaseClear uses an industry-leading hybrid method that overcomes the disadvantages of both. The assembly-based method avoids leaving NGS reads uncharacterised and allows the detection of novel functions, while the alignment-based method reduces runtime, avoids assembly errors and can recover genes with low expression levels. At the end of the pipeline, the results of the two methods are combined to produce functional abundance tables.
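How the two sets of results are combined is not described here, so the following is only a sketch under a stated assumption: each read is counted in exactly one branch (for example, reads that fail to align are passed on to assembly), which makes it safe to add the per-function counts together. The function identifiers, counts and helper names are hypothetical.

```python
def combine_tables(alignment_counts, assembly_counts):
    """Merge two function -> read count tables by summing per function.

    Assumes no read is counted in both tables (see the note above).
    """
    combined = dict(alignment_counts)
    for function, count in assembly_counts.items():
        combined[function] = combined.get(function, 0) + count
    return combined


def counts_per_million(table):
    """Scale raw counts so that samples of different depth are comparable."""
    total = sum(table.values())
    if total == 0:
        return {function: 0.0 for function in table}
    return {function: count * 1_000_000 / total
            for function, count in table.items()}


# Made-up example tables; "novel_1" is only seen by the assembly branch.
alignment_counts = {"func_A": 700, "func_B": 200}
assembly_counts = {"func_B": 50, "novel_1": 50}

combined = combine_tables(alignment_counts, assembly_counts)
normalised = counts_per_million(combined)
print(combined)
print(normalised)
```

If the same reads could contribute to both branches, summing would double-count them, and a different merging rule would be needed.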

Metabolomics: what are they making?

Metabolomics identifies and quantifies the small molecules produced by a microbial community in a sample, thus providing a direct indication of the health of the community or its state of dysbiosis. Biosamples such as blood, urine and faeces are readily used as metabolomics samples, but tissues, organs and cell culture samples can also be used. As the metabolome is strongly affected by the environment in which the community is found, it depicts how the microbial community interacts with its environment. By measuring processes such as quorum sensing, metabolomics provides an indication of the communication mechanisms through which the community reacts to the environment. This opens up new avenues of research into infectious disease control, or into ecological sustainability in the case of environmental samples.

The generation of metabolomics data relies on a combination of liquid and gas chromatography, together with detection methods such as mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. Analyses can be targeted towards specific panels of metabolites of interest, or untargeted. The spectra produced correspond to metabolites, allowing their identification and quantification using curated databases. The automated analysis and generation of metabolomic profiles lends itself to high-throughput analytical methods. The technique has also given rise to efforts that support its integration with other data sources, such as database development and standardization initiatives. Thus, metabolomics allows us to see in great detail what metabolic reactions are taking place and to identify the end products of the pathways identified in functional profiling of metagenome data. After a specific pathway has been identified in the metagenome data, evidence that the pathway is active in the sample can be obtained by detecting its metabolic end product. The strength of metabolomics is that it can lead to mechanistic explanations of the interactions between microbiome, environment and host.
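The idea of confirming a pathway by detecting its end product can be sketched as a simple lookup. The pathway names, end products, intensities and detection threshold below are illustrative placeholders, not values from a real database or study.

```python
def pathway_support(encoded_pathways, metabolite_intensities, min_intensity=0.0):
    """Label each encoded pathway by whether its end product was detected.

    encoded_pathways: pathway name -> name of its end product
    metabolite_intensities: metabolite name -> measured intensity
    min_intensity: hypothetical detection threshold (exclusive)
    """
    report = {}
    for pathway, end_product in encoded_pathways.items():
        intensity = metabolite_intensities.get(end_product, 0.0)
        if intensity > min_intensity:
            report[pathway] = "end product detected"
        else:
            report[pathway] = "end product not detected"
    return report


# Illustrative inputs: pathways found in the metagenome and
# metabolites measured in the same sample.
encoded_pathways = {
    "butyrate synthesis": "butyrate",
    "propionate synthesis": "propionate",
}
metabolite_intensities = {"butyrate": 1520.0, "acetate": 980.0}

for pathway, status in pathway_support(encoded_pathways,
                                       metabolite_intensities).items():
    print(f"{pathway}: {status}")
```

A missing end product does not show that a pathway is inactive: the metabolite may have been consumed by other microbes, absorbed by the host, or fall below the detection limit of the method.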

It’s the combination that works best

Which organisms are present, what is their functional potential and what are they actually doing? The integration of metagenomics, metatranscriptomics and metabolomics data is critical to providing a complete picture, in which the effect of microbial genes or their interaction with the host can be identified by their expression level, and linked to the metabolites produced. Due to the large datasets and complex nature of the interactions present, a focus on the structure of microbial communities and the effects of particular taxa has dominated current research directions. However, as science continuously develops, integrative analyses will become more sophisticated in identifying interactions between the microbial genome, transcriptome, metabolites and the host metabolism. Network representations can be used to map interactions within a microbiome when following an integrative approach.
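One simple way to build such a network is to correlate taxon abundances with metabolite levels across samples and keep only the strong associations as edges. The sketch below uses Spearman rank correlation with an arbitrary threshold of 0.8; the choice of method, the threshold and all data are assumptions for illustration, and a real analysis would also need significance testing and correction for multiple comparisons.

```python
import math


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def build_edges(taxa, metabolites, threshold=0.8):
    """Return (taxon, metabolite, rho) for pairs with |rho| >= threshold."""
    edges = []
    for taxon, abundances in taxa.items():
        for metabolite, levels in metabolites.items():
            rho = spearman(abundances, levels)
            if abs(rho) >= threshold:
                edges.append((taxon, metabolite, round(rho, 2)))
    return edges


# Made-up measurements across five samples.
taxa = {"Taxon_A": [1, 2, 3, 4, 5], "Taxon_B": [5, 4, 3, 2, 1]}
metabolites = {"Metabolite_X": [10, 20, 30, 45, 50],
               "Metabolite_Y": [3, 1, 2, 5, 4]}

edges = build_edges(taxa, metabolites)
for taxon, metabolite, rho in edges:
    print(f"{taxon} -- {metabolite} (rho = {rho})")
```

Edges found this way are associations only. They are useful for generating hypotheses, but they do not show which taxon produces or consumes a given metabolite.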

There remain considerable challenges in integrating metagenomic and metabolomic data. First of all, it’s important to realise that both genomic and metabolomic analyses commonly contain information about both the host and the microbiome that has been sampled. In addition, while it can be beneficial to analyse the microbiome and metabolites from one sample type, such as faeces, the metabolites produced by the microbiome can also be found in urine, blood and tissues. Metabolomics measures the metabolites present in the sample, not those that have been absorbed by the host. In the case of the gut microbiome, a more complete picture can be obtained when samples from more than one source are included in the metabolomics analysis. Considering these factors during the study design phase will improve the value of the results.

Advances in high-throughput sequencing, analytics and bioinformatics have opened up the possibility to combine disparate large datasets to uncover hidden secrets of the microbial world and how it interacts with its host. An integrative approach increases the value of the datasets, particularly for hypothesis generation and the establishment of putative causal links between the microbiome and host phenotype. Just like in our jewellery heist, the combination of information helps us to best characterise what has taken place.



Aguiar-Pulido V, Huang W, Suarez-Ulloa V, Cickovski T, Mathee K, Narasimhan G. Metagenomics, Metatranscriptomics, and Metabolomics Approaches for Microbiome Analysis. Evol Bioinform Online. 2016;12(Suppl 1):5-16. Published 2016 May 12. https://doi.org/10.4137/EBO.S36436

Vlachavas EI, Bohn J, Ückert F, Nürnberg S. A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci. 2021 Mar 10;22(6):2822. https://doi.org/10.3390/ijms22062822
