What’s that smell? Faecal DNA sequencing, blood analysis, and Mendelian randomization support causal relationships between the gut microbiome and metabolites in the blood
Within our gut lives the microbiome—a bustling metropolis of thousands of species of bacteria, fungi, parasites, and viruses. Numbering trillions in total, these microbiota coexist with their human hosts, usually peacefully aiding our digestive and immune systems and preventing more harmful microorganisms from overgrowing, among other services.
Dysbiosis in the gut, an imbalance among its species, has been implicated in numerous health conditions, including obesity, anxiety, depression, and diabetes, as well as in how people respond to cancer treatments, foods, and drugs. The microbiome has become a topic of huge interest for consumers and businesses as well as the science and health sectors.
The potential to understand how the microbiome affects human health and disease has sparked enormous strides in metagenomics, the study of the collective genetic material of all species, usually bacteria, within a community. While research and technology have revealed correlations between metagenome composition and some health conditions, causality has been both controversial and difficult to substantiate.
Concluding that correlations are causal can be fruitless at best and dangerous at worst. If a gut microbiome species is a bystander or simply predisposes one to a condition, attempts to alter the condition by changing the species’ abundance will have little to no effect. If a species increases its abundance to compensate for a condition, changing its population could harm the host.
Erroneous conclusions about causality can jeopardize the credibility of the field and undermine efforts to translate valid discoveries into products and services.
Since hundreds of microbial species populate the gut, and there are as many phenotypic traits of interest, establishing causality is a complex process.
In the laboratory, physical effects caused by ‘dysbiotic’ gut microbiota have traditionally been investigated by transplanting suspected dysbiotic faecal microbiota of humans into germ-free animals. Phenotypes are then compared to those of animals receiving microbiota from healthy individuals.
These kinds of experiments would be impractical and unethical in humans. Furthermore, we lack a baseline for a healthy microbiome; each microbiome is unique, shaped by inheritance, by exposure to the mother’s microbiome in the birth canal, and later by diet, environment, and lifestyle habits.
Recently, statistical strategies combined with experimental data have proven viable and scalable.
In a study published in January in Nature Genetics, researchers used one such statistical method, bidirectional Mendelian randomization analysis, to examine causal relationships between the whole metagenome, anthropometric traits, and blood metabolites among 3,432 Chinese individuals.
Mendelian randomization (MR) is a statistical method that takes advantage of Mendel’s laws: alleles, and thereby genetic variants, segregate randomly and independently of environmental factors during meiosis. If a genetic variant alters a ‘modifiable exposure’, e.g. the concentration of a metabolite or the abundance of a species of bacterium, and that exposure causally influences a physical outcome, then the genetic variant should also be associated with that outcome.
If so, the effect of the variant on the outcome, scaled by its effect on the exposure, can be used to estimate the extent to which the exposure modifies the outcome.
Effectively, individuals are considered to have had lifelong exposure to a variable if they carry a genetic variant known to correlate with that variable. MR thus provides evidence for causality that complements randomized clinical trials. Such trials require exposing control and experimental cohorts under tightly controlled conditions for one-off measurements. They suit some contexts well, but few microbiome studies.
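To make the logic of MR concrete, here is a minimal sketch on simulated data, not the study's data or pipeline. It assumes a single host variant, a made-up bacterial abundance as the exposure, a made-up triglyceride level as the outcome, and a hidden confounder such as diet. All names and effect sizes are hypothetical, and the estimator shown is the simple Wald ratio, which may differ from the models the authors used.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=50_000, true_effect=-0.3, seed=1):
    """Simulate a variant, an exposure and an outcome that share a hidden confounder.

    Every number here is an illustrative assumption, not a value from the study.
    """
    rng = random.Random(seed)
    variant, abundance, triglycerides = [], [], []
    for _ in range(n):
        # Allele count (0, 1 or 2), drawn independently of the confounder,
        # mimicking random segregation at meiosis.
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        u = rng.gauss(0, 1)  # unmeasured confounder, e.g. diet
        x = 1.0 * g + u + rng.gauss(0, 1)                # exposure
        y = true_effect * x + 2.0 * u + rng.gauss(0, 1)  # outcome
        variant.append(g)
        abundance.append(x)
        triglycerides.append(y)
    return variant, abundance, triglycerides


def observational_slope(x, y):
    """Naive regression slope of outcome on exposure (biased by the confounder)."""
    return covariance(x, y) / covariance(x, x)


def wald_ratio(g, x, y):
    """MR estimate: variant-outcome association divided by variant-exposure association."""
    return covariance(g, y) / covariance(g, x)


g, x, y = simulate()
print("naive regression slope:", round(observational_slope(x, y), 2))
print("Wald ratio estimate:   ", round(wald_ratio(g, x, y), 2))
```

In this toy setup the confounder pushes exposure and outcome in the same direction, so the naive slope comes out positive even though the simulated causal effect is negative. Because the variant is independent of the confounder, the Wald ratio lands close to the simulated effect of -0.3.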
Out of hundreds of gigabytes of sequencing data from thousands of individuals, the team identified 58 causal effects between the gut microbiome and blood metabolites, of which 43 were replicated. How?
To carry out MR analysis, two types of correlations were established first: 1) correlation between genetic variants and microbial features, and 2) correlation between genetic variants and anthropometric and metabolic features.
Such data were gathered in two cohorts: a discovery cohort and a replication cohort. For the discovery cohort of 2002 individuals, whole genomes were sequenced to high depth (42x), anthropometric data were collected, and the levels of 103 blood metabolites were measured. More than 10 million variants were identified.
Anthropometric traits included height, weight, waistline, hipline, age, and gender. Because the different types of metabolites (amino acids, hormones, water-soluble vitamins, fat-soluble vitamins, and blood trace elements) have distinct chemical structures and solubilities in different fluids, each type had its own analysis procedure for ultra-high-pressure liquid chromatography and mass spectrometry.
Whole metagenomes from 1539 of these individuals were sequenced from their stool samples, and the abundances of phyla, classes, orders, families, genera, and species were determined from the gene abundances and classified into 500 unique features.
The team identified associations between human genetic variants and the 500 unique microbial features using strategies for metagenome-genome-wide association studies (M-GWAS) and the data from whole-genome and whole-metagenome sequencing. While the analysis first identified over 20,000 associations, applying the most stringent statistical filter narrowed the pool down to 28 associations with microbial features, involving 27 human genomic loci. Five of these associations involved gut bacteria, while the others involved signatures of metabolic pathways.
The team also identified associations between human genetic variants and each of the 112 metabolic features, following strategies for genome-wide association studies (GWAS) and similar statistical criteria. They identified 39 associations with metabolites involving 28 genomic loci.
These corroborated previously known relationships, such as genes associated with bilirubin, a yellowish pigment made when red blood cells break down. Elevated levels can indicate liver dysfunction.
To distinguish between statistically significant relationships and random false positives, the team gathered the same types of data at lower resolution from a replication cohort of 1430 individuals, 1006 of whose stool samples were used for metagenome sequencing.
Observational correlations were calculated between the 500 microbial features and 112 metabolic traits (9 anthropometric features and 103 blood metabolites) using multivariable linear regression. These calculations yielded 457 significant associations from the discovery cohort after filtering out associations that did not pass statistical criteria for ruling out false discoveries.
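Testing 500 microbial features against 112 metabolic traits means 56,000 comparisons, so some correction for multiple testing is essential. The article does not name the procedure the authors used. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, a common way to control the false discovery rate. The function name, the `alpha` default, and the example p-values are assumptions made for this example.

```python
N_TESTS = 500 * 112  # microbial features x metabolic traits = 56,000 comparisons


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which p-values pass the BH procedure.

    Sort the p-values, find the largest rank k with p_(k) <= alpha * k / m,
    then keep every test ranked k or better.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank  # remember the largest rank that passes
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            keep[idx] = True
    return keep


# Hypothetical p-values, purely for illustration.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(example)), "of", len(example), "associations kept")
```

With ten tests and `alpha=0.05`, only the two smallest p-values in this example fall under their rank-scaled thresholds, which shows how quickly nominally significant results can drop out once the number of comparisons is taken into account.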
To go beyond correlation and seek evidence for causality, the team performed bidirectional Mendelian randomization analysis on the 457 observational correlations and found 58 causal effects. Of these, 17 were effects on blood metabolites caused by microbial features, and 41 were effects on the microbial features caused by the blood metabolites. Four such relationships were bidirectional, while the rest were unidirectional.
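The "bidirectional" part can be pictured as running the same estimate twice with different instruments: once using variants tied to the microbial feature, and once using variants tied to the metabolite. The sketch below works from summary statistics and uses a Wald ratio with a first-order standard error. Every number is invented for illustration, and the 1.96 cut-off is a simplifying assumption that is far looser than the corrections a real study would apply.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio and z-score from variant-exposure and variant-outcome effects.

    Uses the first-order approximation se = se_outcome / |beta_exposure|.
    """
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, estimate / std_error


def direction(z_forward, z_reverse, z_cutoff=1.96):
    """Label a microbe-metabolite pair by which direction(s) show an effect."""
    forward = abs(z_forward) > z_cutoff
    reverse = abs(z_reverse) > z_cutoff
    if forward and reverse:
        return "bidirectional"
    if forward:
        return "microbe -> metabolite"
    if reverse:
        return "metabolite -> microbe"
    return "no evidence"


# Hypothetical summary statistics (not taken from the study).
# Forward: a variant associated with the microbial feature.
fwd_est, fwd_z = wald_ratio(beta_exposure=0.40, beta_outcome=-0.12, se_outcome=0.03)
# Reverse: a different variant, associated with the metabolite.
rev_est, rev_z = wald_ratio(beta_exposure=0.50, beta_outcome=0.01, se_outcome=0.04)

print(direction(fwd_z, rev_z))
```

In this invented example only the forward direction clears the cut-off, so the pair would be reported as a unidirectional effect of the microbial feature on the metabolite.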
The team was able to replicate 43 of these relationships in the independent replication cohort.
As summarized on GenomeWeb, the most statistically significant relationships included the effects of Oscillibacter and Alistipes, both gram-negative bacteria, on blood triglyceride concentration. An increased abundance of either genus caused a measurable decline in triglycerides in the blood.
Elevated levels of triglycerides can indicate heart conditions, obesity, or metabolic disorders.
Examples of significant effects found in the opposite direction include increased glutamic acid, an amino acid and neurotransmitter, leading to a decrease in Oxalobacter, and several metabolites, including alanine, glutamate, and selenium, influencing the abundance of species from the Proteobacteria phylum.
The publication and extensive supplementary materials elaborate on the myriad effects and their directions, significance, and implications. The authors also made advances in linking these associations with diseases by applying MR summary statistics from the Biobank Japan database.
Establishing causality is no walk in the park. Doing so requires studies with extensive scale and depth, and technical and statistical quality control.
The team designed the study with large cohorts, collecting samples for genetic, metabolic, and metagenomic analysis from thousands of individuals. Furthermore, they applied the most stringent statistical criteria to eliminate false positives and scrutinized the outcomes of their analysis with six different models.
This study will likely enable breakthroughs in methodology, and ultimately, more clarity in understanding how the gut microbiome affects human health.