
Multimethod, multidataset analysis reveals paradoxical relationships between sociodemographic factors, Hispanic ethnicity and diabetes.

Citation
Knight, G. M., et al. “Multimethod, Multidataset Analysis Reveals Paradoxical Relationships Between Sociodemographic Factors, Hispanic Ethnicity and Diabetes.” BMJ Open Diabetes Research & Care.
Center Stanford University
Author Gabriel M Knight, Gabriela Spencer-Bonilla, David M Maahs, Manuel R Blum, Areli Valencia, Bongeka Z Zuma, Priya Prahalad, Ashish Sarraju, Fatima Rodriguez, David Scheinker
Keywords diabetes mellitus, ethnic groups, informatics, risk factors, Type 2
Abstract

INTRODUCTION: Population-level and individual-level analyses have strengths and limitations as do 'blackbox' machine learning (ML) and traditional, interpretable models. Diabetes mellitus (DM) is a leading cause of morbidity and mortality with complex sociodemographic dynamics that have not been analyzed in a way that leverages population-level and individual-level data as well as traditional epidemiological and ML models. We analyzed complementary individual-level and county-level datasets with both regression and ML methods to study the association between sociodemographic factors and DM.

RESEARCH DESIGN AND METHODS: County-level DM prevalence, demographics, and socioeconomic status (SES) factors were extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data. Analogous individual-level data were extracted from 2007 to 2016 National Health and Nutrition Examination Survey studies and corrected for oversampling with survey weights. We used multivariate linear (logistic) regression and ML regression (classification) models for county (individual) data. Regression and ML models were compared using measures of explained variation (area under the receiver operating characteristic curve (AUC) and R²).
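
The correction for oversampling mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `weighted_prevalence`, the toy sample, and the weights are all hypothetical, and a real NHANES analysis would use the survey's published weights together with its design variables. The sketch only shows why a weighted prevalence can differ from an unweighted one when some groups are oversampled.

```python
def weighted_prevalence(has_dm, weights):
    """Return the survey-weighted share of respondents flagged as having DM.

    has_dm  : list of bools, one per respondent (hypothetical data)
    weights : list of positive survey weights, one per respondent
    """
    if len(has_dm) != len(weights):
        raise ValueError("has_dm and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sum the weights of respondents with DM and divide by the total weight.
    return sum(w for d, w in zip(has_dm, weights) if d) / total


# Hypothetical toy sample: respondents with DM carry small weights because
# their group was oversampled; the others represent more of the population.
has_dm = [True, True, False, False]
weights = [1.0, 1.0, 3.0, 5.0]

unweighted = sum(has_dm) / len(has_dm)           # simple share of the sample
weighted = weighted_prevalence(has_dm, weights)  # share of the weighted population
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this toy sample the weighted figure is lower than the unweighted one, which is the same direction as the difference the abstract reports, although the numbers here are invented for illustration only.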

RESULTS: Among the 3138 counties assessed, the mean DM prevalence was 11.4% (range: 3.0%-21.1%). Among the 12 824 individuals assessed, 1688 met DM criteria (13.2% unweighted; 10.2% weighted). Age, gender, race/ethnicity, income, and education were associated with DM at the county and individual levels. Higher county Hispanic ethnic density was negatively associated with county DM prevalence, while Hispanic ethnicity was positively associated with individual DM. ML outperformed regression in both datasets (mean R² of 0.679 vs 0.610, respectively (p<0.001) for county-level data; mean AUC of 0.737 vs 0.727 (p<0.0427) for individual-level data).
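
As a quick arithmetic check, the unweighted individual-level prevalence follows directly from the two counts reported above. The variable names below are illustrative. The weighted figure cannot be reproduced this way, because it depends on the survey weights, which are not part of this record.

```python
# Counts taken from the RESULTS paragraph.
n_individuals = 12824
n_with_dm = 1688

# Unweighted prevalence as a percentage, rounded to one decimal place.
unweighted_pct = round(100 * n_with_dm / n_individuals, 1)
print(f"unweighted DM prevalence: {unweighted_pct}%")
```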

CONCLUSIONS: Hispanic individuals are at higher risk of DM, while counties with larger Hispanic populations have lower DM prevalence. Analyses of population-level and individual-level data with multiple methods may afford more confidence in results and identify areas for further study.

Year of Publication
2020
Journal
BMJ open diabetes research & care
Volume
8
Issue
2
Date Published
12/2020
ISSN Number
2052-4897
DOI
10.1136/bmjdrc-2020-001725
Alternate Journal
BMJ Open Diabetes Res Care
PMID
33229378
PMCID
PMC7684662