Unravelling the complex causal effects of substance use behaviours on common diseases

In this study, we unravelled the complex causal relationship between substance use behaviours and common diseases by employing a combination of Mendelian Randomization, genetic correlation, and dosage-dependent analyses.

Inferring the causal effects of modifiable risk factors on complex diseases is a daunting task. Presently, one of the most common methods for causal inference is Mendelian Randomization (MR), which uses genetic variants associated with an exposure as Instrumental Variables (IVs) to estimate its causal effect on a disease outcome. However, applying such methods to substance use behaviours is challenging because these traits are susceptible to confounding effects that can compromise the validity of the IVs.
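
To make the idea of using variants as IVs concrete, here is a minimal sketch of a simple inverse-variance-weighted (IVW) MR estimate computed from GWAS summary statistics. It is a textbook estimator shown for illustration only, not the GSMR method discussed later; the function names and the three-variant input are hypothetical, and the standard error of each ratio uses a first-order approximation that ignores uncertainty in the exposure effects.

```python
import math

def wald_ratio(b_exp, b_out, se_out):
    """Per-variant causal estimate b_out / b_exp with a first-order standard error."""
    return b_out / b_exp, se_out / abs(b_exp)

def ivw_estimate(b_exp, b_out, se_out):
    """Fixed-effect inverse-variance-weighted average of the per-variant ratios."""
    num = 0.0
    den = 0.0
    for bx, by, sy in zip(b_exp, b_out, se_out):
        ratio, se_ratio = wald_ratio(bx, by, sy)
        weight = 1.0 / se_ratio ** 2
        num += weight * ratio
        den += weight
    return num / den, math.sqrt(1.0 / den)

# Hypothetical summary statistics for three variants (not real data).
b_exp = [0.10, 0.20, 0.15]    # variant effects on the exposure
b_out = [0.05, 0.10, 0.075]   # variant effects on the outcome
se_out = [0.01, 0.01, 0.01]   # standard errors of the outcome effects

b_xy, se_xy = ivw_estimate(b_exp, b_out, se_out)
print(f"b_xy = {b_xy:.3f}, s.e. = {se_xy:.3f}")
```

In this toy input every variant has the same outcome-to-exposure ratio, so the pooled estimate equals that ratio. An invalid IV would show up as a variant whose ratio departs from the rest.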

What prompted us to initiate this project?

The genesis of this project dates back to 2018, when we encountered a puzzling result while conducting genetic analyses using UK Biobank data. The genetic correlation (rg) between alcohol consumption and cardiovascular disease, estimated by bivariate LD score regression from GWAS summary data, was significantly negative, whereas the causal effects (bxy) estimated by several MR methods were significantly positive. No such discrepancy was observed for smoking-related traits, leading us to hypothesize that biases unique to the alcohol consumption data were affecting the analyses.

Our investigation revealed that misreports and longitudinal changes introduced biases in the genetic associations for alcohol consumption, resulting in the use of invalid IVs in the subsequent MR analyses. These biases could be corrected using medical records and online follow-up questionnaires, and the results of this investigation were published in Xue et al., 2021 [1]. Meanwhile, a contentious debate was ongoing about the alleged protective effects of alcohol consumption [2,3,4]. Following the publication of [1], we decided to develop an MR method that is robust against such biases. In this study, building upon the GSMR [5] method developed by our group, we developed a global heterogeneity test, termed the ‘global HEIDI test’, which iteratively detects invalid IVs and excludes them from the subsequent causal effect estimation. The latest version of the GSMR2 software is available at https://github.com/jianyanglab/gsmr2.
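
The general idea of iteratively detecting and excluding invalid IVs can be sketched as follows. This is a deliberately simplified stand-in, not the global HEIDI test and not the GSMR2 implementation: it re-estimates an IVW effect, standardizes each variant's residual, and drops the single worst variant until none exceeds a threshold. The function names, the threshold of 3.0, and the six-variant input are all assumptions made for illustration.

```python
def ivw(b_exp, b_out, se_out, keep):
    """IVW causal estimate using only the variants whose indices are in `keep`."""
    num = sum(b_exp[i] * b_out[i] / se_out[i] ** 2 for i in keep)
    den = sum(b_exp[i] ** 2 / se_out[i] ** 2 for i in keep)
    return num / den

def filter_outliers(b_exp, b_out, se_out, z_threshold=3.0):
    """Iteratively drop the variant that deviates most from the pooled estimate.

    Simplification: the residual is scaled by the outcome standard error only,
    ignoring uncertainty in the pooled estimate and in the exposure effects.
    """
    keep = list(range(len(b_exp)))
    while len(keep) > 2:
        b_xy = ivw(b_exp, b_out, se_out, keep)
        z = {i: (b_out[i] - b_xy * b_exp[i]) / se_out[i] for i in keep}
        worst = max(keep, key=lambda i: abs(z[i]))
        if abs(z[worst]) <= z_threshold:
            break  # no remaining variant looks heterogeneous
        keep.remove(worst)
    return keep, ivw(b_exp, b_out, se_out, keep)

# Hypothetical data: five variants consistent with b_xy = 0.5 and one
# variant (index 5) with an extra, pleiotropic effect on the outcome.
b_exp = [0.10, 0.12, 0.15, 0.20, 0.18, 0.10]
b_out = [0.050, 0.060, 0.075, 0.100, 0.090, 0.150]
se_out = [0.01] * 6

kept, b_xy = filter_outliers(b_exp, b_out, se_out)
print("variants kept:", kept)
print(f"b_xy after filtering = {b_xy:.3f}")
```

Removing one variant per iteration and then re-estimating matters because a strong outlier distorts the pooled estimate, which in turn can make valid variants look deviant.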

What were our findings?

Prior to the MR analyses of real data, we conducted a series of simulations to assess the performance of 11 commonly used MR methods in the presence of horizontal pleiotropy. The results demonstrated that almost all methods maintained a well-controlled false-positive rate under the null model when the pleiotropy was small to moderate. However, in the presence of strong directional pleiotropy (that is, shared genetic variants exert consistent effects on both the exposure and the outcome), no MR method could simultaneously maintain a low false-positive rate under the null model and high statistical power under a causal model. The extent to which an MR method is robust to directional pleiotropy depends on the specific method used. This underscores the importance of comparing different MR methods in real data analysis before making definitive inferences about causality.
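
The effect of directional pleiotropy on the false-positive rate can be illustrated with a small simulation. The sketch below is not the simulation design used in the study: it works at the summary-statistic level, uses a plain IVW estimator rather than any of the 11 methods compared, and all parameter values (number of variants, effect sizes, pleiotropy of 0.02) are hypothetical. The true causal effect is set to zero, so every significant result is a false positive.

```python
import math
import random

def ivw_z(b_exp, b_out, se_out):
    """Z statistic of the fixed-effect IVW estimate (common outcome s.e.)."""
    num = sum(bx * by / se_out ** 2 for bx, by in zip(b_exp, b_out))
    den = sum(bx ** 2 / se_out ** 2 for bx in b_exp)
    b_xy = num / den
    return b_xy * math.sqrt(den)  # estimate divided by its s.e., 1 / sqrt(den)

def false_positive_rate(pleio_mean, n_rep=200, n_snps=50, se_out=0.01, seed=1):
    """Share of null-model replicates with |z| > 1.96.

    `pleio_mean` is a direct effect of every variant on the outcome that
    bypasses the exposure (directional pleiotropy when non-zero).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        b_exp = [rng.uniform(0.05, 0.15) for _ in range(n_snps)]
        # Null model: the outcome effect contains pleiotropy and noise only.
        b_out = [pleio_mean + rng.gauss(0.0, se_out) for _ in b_exp]
        if abs(ivw_z(b_exp, b_out, se_out)) > 1.96:
            hits += 1
    return hits / n_rep

fpr_none = false_positive_rate(pleio_mean=0.0)
fpr_directional = false_positive_rate(pleio_mean=0.02)
print(f"false-positive rate, no pleiotropy:          {fpr_none:.2f}")
print(f"false-positive rate, directional pleiotropy: {fpr_directional:.2f}")
```

Without pleiotropy the rate should sit near the nominal 5%, whereas a consistent direct effect on the outcome is absorbed into the causal estimate and drives the rate towards 100% for this naive estimator.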

Tobacco smoking showed widespread risk effects, whereas alcohol consumption showed risk effects specific to cardiovascular/metabolic diseases. We further derived two additional phenotypes, Moderate Alcohol Consumption (MAC) and Heavy Alcohol Consumption (HAC), and re-ran the MR analysis. MAC did not show a significant protective effect with any method, while HAC still showed significant risk effects on dyslipidemia and hypertensive disease.

For Coffee Intake (CI) and Tea Intake (TI), we first stratified the intake levels and conducted a dosage-dependent regression analysis. Many diseases displayed a J-shaped relationship with intake, with the turning point near 5 to 6 cups per day. We then derived four new traits, Heavy/Moderate Coffee Intake (HCI/MCI) and Heavy/Moderate Tea Intake (HTI/MTI), by contrasting people with a daily intake of ≥ 5 cups (heavy) or < 5 cups (moderate) against those with zero intake. Surprisingly, while all significant genetic correlation (rg) estimates between CI/TI and common diseases were positive, all significant rg estimates between MCI/MTI and common diseases were negative [Supp Fig. 13]. For instance, CI showed a positive rg estimate (0.12, s.e. = 0.02) with cardiovascular disease (similar to HCI), while MCI showed the opposite (rg = -0.22, s.e. = 0.03). This suggests the potential existence of strong pleiotropic effects and/or confounders in these phenotypes.
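
The derivation of the heavy and moderate intake traits can be sketched as a simple recoding of cups per day into two case-control contrasts. The snippet reflects one reading of the description above: the 5-cup threshold comes from the text, while the function name, the 0/1/None coding, and the choice to exclude individuals who belong to neither group of a contrast are assumptions.

```python
HEAVY_THRESHOLD = 5  # cups per day, taken from the description above

def derive_contrasts(cups_per_day):
    """Recode daily intake into moderate and heavy contrasts against zero intake.

    Returns a dict with 1 (case), 0 (control) or None (excluded from that
    contrast) for each derived trait.
    """
    if cups_per_day < 0:
        raise ValueError("intake cannot be negative")
    if cups_per_day == 0:
        # Non-consumers are the shared control group for both traits.
        return {"moderate": 0, "heavy": 0}
    if cups_per_day >= HEAVY_THRESHOLD:
        return {"moderate": None, "heavy": 1}
    return {"moderate": 1, "heavy": None}

# Hypothetical intake values for five individuals (not real data).
for cups in [0, 1, 4, 5, 8]:
    print(cups, derive_contrasts(cups))
```

Running a GWAS on each contrast separately, rather than on intake as one quantitative trait, is what allows a variant's effect to differ in sign between the moderate and heavy ranges.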

We also discovered a specific case in which the T allele of rs4410790 (the top IV for HTI) had a negative effect on Heavy Tea Intake (HTI) but a positive effect on Moderate Tea Intake (MTI). This dosage-dependent genetic effect was not observed for alcohol consumption or coffee intake. rs4410790 is significantly associated with AHR expression in blood [6], and AHR (aryl hydrocarbon receptor) is known for its role in caffeine degradation. This is the first instance in which we have seen an effect allele change its direction of effect when GWAS were performed in different phenotype intervals, and it explains the dosage-dependent effect of tea intake observed in the genetic correlation and MR analyses.

In summary, our investigation revealed that the causal effects of substance use behaviour traits are complex and require careful interpretation when making causal inferences using the MR framework.

What are the primary take-home messages?

  • Verify the validity of IVs before conducting an MR analysis.
  • Perform dosage-dependent and sensitivity analyses if a non-linear relationship between traits is suspected.
  • Experiment with different MR methods to enhance the robustness of causal inference.
  • No evidence supports a protective effect of moderate drinking.


1. Xue, A. et al. Genome-wide analyses of behavioural traits are subject to bias by misreports and longitudinal changes. Nature Communications 12, 20211 (2021).

2. Wood, A.M., Kaptoge, S., Butterworth, A.S., Willeit, P., Warnakula, S. et al. Risk thresholds for alcohol consumption: combined analysis of individual-participant data for 599 912 current drinkers in 83 prospective studies. Lancet 391, 1513-1523 (2018).

3. Burton, R. & Sheron, N. No level of alcohol consumption improves health. Lancet 392, 987-988 (2018).

4. Millwood, I., Walters, R., Mei, X., Guo, Y., Yang, L. et al. Conventional and genetic evidence on alcohol and vascular disease aetiology: prospective study of 500,000 Chinese adults. Lancet (2019).

5. Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nature Communications 9, 1-12 (2018).

6. Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nature Genetics 53, 1300-1310 (2021).
