TY - JOUR
T1 - Risk factors and geographic disparities in premature cardiovascular mortality in US counties
T2 - a machine learning approach
AU - Dong, Weichuan
AU - Motairek, Issam
AU - Nasir, Khurram
AU - Chen, Zhuo
AU - Kim, Uriel
AU - Khalifa, Yassin
AU - Freedman, Darcy
AU - Griggs, Stephanie
AU - Rajagopalan, Sanjay
AU - Al-Kindi, Sadeer G.
N1 - Funding Information:
This study was partly funded by the National Institutes of Health (award numbers P50MD017351, R35ES031702, and R01ES019616).
Funding Information:
Dr. Dong is supported by contracts from Cleveland Clinic Foundation, including a subcontract from Celgene Corporation. Dr. Dong has no other competing interests to report. Dr. Motairek, Dr. Nasir, Mr. Chen, Dr. Kim, Dr. Khalifa, Dr. Freedman, Dr. Griggs, Dr. Rajagopalan, and Dr. Al-Kindi have no competing interests.
Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
AB - Disparities in premature cardiovascular mortality (PCVM) have been associated with socioeconomic, behavioral, and environmental risk factors. Understanding the “phenotypes”, or combinations of characteristics associated with the highest risk of PCVM, and the geographic distributions of these phenotypes is critical to targeting PCVM interventions. This study applied the classification and regression tree (CART) to identify county phenotypes of PCVM and geographic information systems to examine the distributions of identified phenotypes. Random forest analysis was applied to evaluate the relative importance of risk factors associated with PCVM. The CART analysis identified seven county phenotypes of PCVM, where high-risk phenotypes were characterized by having greater percentages of people with lower income, higher physical inactivity, and higher food insecurity. These high-risk phenotypes were mostly concentrated in the Black Belt of the American South and the Appalachian region. The random forest analysis identified additional important risk factors associated with PCVM, including broadband access, smoking, receipt of Supplemental Nutrition Assistance Program benefits, and educational attainment. Our study demonstrates the use of machine learning approaches in characterizing community-level phenotypes of PCVM. Interventions to reduce PCVM should be tailored according to these phenotypes in corresponding geographic areas.
UR - http://www.scopus.com/inward/record.url?scp=85148679220&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148679220&partnerID=8YFLogxK
U2 - 10.1038/s41598-023-30188-9
DO - 10.1038/s41598-023-30188-9
M3 - Article
C2 - 36808141
AN - SCOPUS:85148679220
VL - 13
JO - Scientific Reports
JF - Scientific Reports
SN - 2045-2322
IS - 1
M1 - 2978
ER -