TY - JOUR
T1 - Geographic Variation and Risk Factor Association of Early Versus Late Onset Colorectal Cancer
AU - Dong, Weichuan
AU - Kim, Uriel
AU - Rose, Johnie
AU - Hoehn, Richard S.
AU - Kucmanic, Matthew
AU - Eom, Kirsten
AU - Li, Shu
AU - Berger, Nathan A.
AU - Koroukian, Siran M.
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2023/2
Y1 - 2023/2
N2 - The proportion of patients diagnosed with colorectal cancer (CRC) at age < 50 (early-onset CRC, or EOCRC) has steadily increased over the past three decades relative to the proportion of patients diagnosed at age ≥ 50 (late-onset CRC, or LOCRC), despite the reduction in CRC incidence overall. An important gap in the literature is whether EOCRC shares the same community-level risk factors as LOCRC. Thus, we sought to (1) identify disparities in the incidence rates of EOCRC and LOCRC using geospatial analysis and (2) compare the importance of community-level risk factors (racial/ethnic, health status, behavioral, clinical care, physical environmental, and socioeconomic status risk factors) in the prediction of EOCRC and LOCRC incidence rates using a random forest machine learning approach. The incidence data came from the Surveillance, Epidemiology, and End Results program (years 2000–2019). The geospatial analysis revealed large geographic variations in EOCRC and LOCRC incidence rates. For example, some regions had relatively low LOCRC and high EOCRC rates (e.g., Georgia and eastern Texas) while others had relatively high LOCRC and low EOCRC rates (e.g., Iowa and New Jersey). The random forest analysis revealed that the importance of community-level risk factors most predictive of EOCRC versus LOCRC incidence rates differed meaningfully. For example, diabetes prevalence was the most important risk factor in predicting EOCRC incidence rate, but it was a less important risk factor of LOCRC incidence rate; physical inactivity was the most important risk factor in predicting LOCRC incidence rate, but it was the fourth most important predictor for EOCRC incidence rate. Thus, our community-level analysis demonstrates the geographic variation in EOCRC burden and the distinctive set of risk factors most predictive of EOCRC.
KW - colorectal cancer
KW - early-onset
KW - geographic information system
KW - machine learning
KW - random forest
KW - regionalization
KW - risk factor
UR - https://www.scopus.com/pages/publications/85149129259
UR - https://www.scopus.com/inward/citedby.url?scp=85149129259&partnerID=8YFLogxK
U2 - 10.3390/cancers15041006
DO - 10.3390/cancers15041006
M3 - Article
AN - SCOPUS:85149129259
SN - 2072-6694
VL - 15
JO - Cancers
JF - Cancers
IS - 4
M1 - 1006
ER -