Development of a machine learning model related to explore the association between heavy metal exposure and alveolar bone loss among US adults utilizing SHAP: a study based on NHANES 2015-2018
Published in Biomedical Research
Background
Alveolar bone loss (ABL) is common in modern society, and heavy metal exposure is usually considered a risk factor for it. Some studies using multiple logistic regression and Bayesian kernel machine regression have reported a positive trend between urinary heavy metals and periodontitis. However, these statistical models are limited by overfitting from the kernel function, long computation times, the need to define a prior distribution, and the lack of a ranking of the individual heavy metals. The optimal model for this topic remains controversial. This study aimed to: (1) develop an algorithm for exploring the association between heavy metal exposure and ABL; (2) filter the actual causal variables and investigate how heavy metals are associated with ABL; and (3) identify the potential risk factors for ABL.
Methods
Data were collected from the National Health and Nutrition Examination Survey (NHANES) between 2015 and 2018 to develop a machine learning (ML) model. Feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression with 10-fold cross-validation. The selected data were balanced using the Synthetic Minority Oversampling Technique (SMOTE) and divided into a training set and a testing set at a 3:1 ratio. Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), Decision Tree (DT), and XGBoost were used to construct the ML models. Accuracy, Area Under the Receiver Operating Characteristic Curve (AUC), precision, recall, and F1 score were used to select the optimal model for further analysis. The contribution of each variable to the ML model was explained using the Shapley Additive Explanations (SHAP) method.
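The balancing and splitting steps can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names `smote_sketch` and `split_3_to_1`, the toy data, and the choice of `k` are all assumptions made for illustration, and a real analysis would more likely rely on an established SMOTE implementation. The idea shown is the core of SMOTE: each synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each new point is a random interpolation between a minority sample and
    one of its k nearest minority neighbours. Needs at least 2 samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def split_3_to_1(rows, seed=0):
    """Shuffle rows and split them 3:1 into training and testing sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


# Toy, made-up data: 12 majority points and 4 minority points in 2 dimensions.
majority = [[0.1 * i, 1.0] for i in range(12)]
minority = [[5.0, 5.0], [5.5, 5.2], [6.0, 4.8], [5.2, 5.6]]

# Oversample the minority class until both classes have the same size.
synthetic = smote_sketch(minority, n_new=len(majority) - len(minority), k=3)
balanced = [(x, 0) for x in majority] + [(x, 1) for x in minority + synthetic]

train, test = split_3_to_1(balanced)
print(len(train), len(test))  # 18 6
```

Because every synthetic point is a convex combination of two real minority samples, it always stays within the range spanned by the observed minority data.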
Results
RF showed the best performance in exploring the association between heavy metal exposure and ABL, with an AUC of 0.88, accuracy of 0.78, precision of 0.76, recall of 0.83, and F1 score of 0.79. Age was the most important factor in the ML model (mean |SHAP value| = 0.09), and Cd was the primary contributor among the heavy metals. Sex contributed little to the ML model.
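The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that and also shows how a mean |SHAP value| importance is typically aggregated from per-sample attributions. The helper names and the attribution matrix are hypothetical; the toy attribution values are invented for illustration and are not taken from the study.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def mean_abs_shap(shap_rows):
    """Average the absolute per-sample attributions for each feature."""
    n_samples = len(shap_rows)
    n_features = len(shap_rows[0])
    return [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(n_features)
    ]


# Precision and recall as reported for the RF model.
print(round(f1_score(0.76, 0.83), 2))  # 0.79

# Made-up attributions: rows are samples, columns are two placeholder features.
feature_names = ["feature_a", "feature_b"]
toy_shap = [[0.2, -0.1], [-0.4, 0.1], [0.3, 0.1]]

importances = mean_abs_shap(toy_shap)
ranking = sorted(zip(feature_names, importances), key=lambda t: t[1], reverse=True)
print([name for name, _ in ranking])  # ['feature_a', 'feature_b']
```

Taking the absolute value before averaging matters: positive and negative attributions would otherwise cancel, and a feature that pushes predictions strongly in both directions would look unimportant.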
Conclusion
In this study, RF showed superior performance compared with the other five algorithms. Among the 12 heavy metals, Cd was the most important factor in the ML model, and the associations of Co and Pb with ABL were weaker than that of Cd. Among all the independent variables, age was the most important factor for this model. As for the poverty income ratio (PIR), low-income participants showed an association with ABL. Mexican American and Non-Hispanic White participants showed a lower association with ABL than Non-Hispanic Black participants and those of other races. Sex showed only a weak association with ABL. In the future, more advanced algorithms should be developed to validate these results, and the related parameters can be tuned to improve the accuracy of the model.
BMC Public Health
An open access, peer-reviewed journal that considers articles on the epidemiology of disease and the understanding of all aspects of public health.