TY - JOUR
T1 - Cross-validating models of continuous data from simulation and experiment by using linear regression and artificial neural networks
AU - Zakeri, Zohreh
AU - Mansfield, Neil
AU - Sunderland, Caroline
AU - Omurtag, Ahmet
N1 - Publisher Copyright: © 2020
PY - 2020/1
Y1 - 2020/1
N2 - We are increasingly surrounded by sensors gathering massive amounts of data, and patterns in continuous variables are often discovered by using artificial neural networks (ANN), while linear regression (LR) is useful for detecting linear relationships. LR also provides preliminary estimates of potentially complex associations, and serves as a benchmark for the performance of ANNs. We show that while cross-validation (CV) is indispensable for ensuring the robustness of the discovered patterns, it systematically leads, when combined with LR, to specific artefacts that underestimate the extent of the associations between predictor and target variables. We explain how this previously unnoticed type of artefact arises specifically from the combination of CV with LR and does not affect non-linear methods such as ANN. We also demonstrate through simulations that ANNs were able to discover a wide range of complex associations missed by LR. The results were confirmed by the analysis of physiological, behavioural and subjective data collected from N = 31 human subjects performing laparoscopy training experiments.
AB - We are increasingly surrounded by sensors gathering massive amounts of data, and patterns in continuous variables are often discovered by using artificial neural networks (ANN), while linear regression (LR) is useful for detecting linear relationships. LR also provides preliminary estimates of potentially complex associations, and serves as a benchmark for the performance of ANNs. We show that while cross-validation (CV) is indispensable for ensuring the robustness of the discovered patterns, it systematically leads, when combined with LR, to specific artefacts that underestimate the extent of the associations between predictor and target variables. We explain how this previously unnoticed type of artefact arises specifically from the combination of CV with LR and does not affect non-linear methods such as ANN. We also demonstrate through simulations that ANNs were able to discover a wide range of complex associations missed by LR. The results were confirmed by the analysis of physiological, behavioural and subjective data collected from N = 31 human subjects performing laparoscopy training experiments.
KW - Artificial neural networks
KW - Cross-validation
KW - Linear regression
UR - http://www.scopus.com/inward/record.url?scp=85093655196&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093655196&partnerID=8YFLogxK
U2 - 10.1016/j.imu.2020.100457
DO - 10.1016/j.imu.2020.100457
M3 - Article
AN - SCOPUS:85093655196
SN - 2352-9148
VL - 21
JO - Informatics in Medicine Unlocked
JF - Informatics in Medicine Unlocked
M1 - 100457
ER -