Document Type
Honors Project On-Campus Access Only
Abstract
Because the training error tends to underestimate the true test error, a reliable test error estimator is needed to evaluate and select predictive learning models. Our research builds on previous results comparing the single bootstrap and k-fold cross-validation, extending the comparison to a wider range of parameters governing the causal structure of the data, the learning models, and the test error estimators. Using data simulated from a causal graph, we compared cross-validation and bootstrap estimates with the true test error for LASSO and random forest models across varied parameter settings. We found that the bootstrap underestimates the test error for both models, while k-fold cross-validation underestimates the test error for LASSO and performs well for random forests.
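The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the project's code, and several pieces are assumptions made only for illustration: a one-variable least-squares line stands in for LASSO and random forests, the data come from a hypothetical linear model rather than the project's causal graph, and the bootstrap estimator shown (fit on each resample, score on the original sample) is one simple variant that may differ from the "single bootstrap" studied in the project. All function names are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0  # guard against a degenerate sample
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    a, b = model
    return mean((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def kfold_cv_error(xs, ys, k, rng):
    """k-fold CV: each point is predicted by a model fit without its fold."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return mean(sq_errors)


def bootstrap_error(xs, ys, n_boot, rng):
    """Assumed variant: fit on each resample, score on the original sample."""
    n = len(xs)
    errors = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        model = fit_line([xs[i] for i in sample], [ys[i] for i in sample])
        errors.append(mse(model, xs, ys))
    return mean(errors)


def simulate(n, rng):
    """Hypothetical data-generating process: y = 1 + 2x + Gaussian noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(0)
xs, ys = simulate(50, rng)                # small training sample
test_xs, test_ys = simulate(20000, rng)   # large sample approximating the true test error
model = fit_line(xs, ys)

results = {
    "train": mse(model, xs, ys),
    "cv": kfold_cv_error(xs, ys, 5, rng),
    "bootstrap": bootstrap_error(xs, ys, 200, rng),
    "test": mse(model, test_xs, test_ys),
}
for name, value in results.items():
    print(f"{name:>9}: {value:.3f}")
```

In this variant each bootstrap model is scored on data that overlaps heavily with its own training resample, which is one reason such estimates can be optimistic. The numbers printed by this toy example say nothing about the project's findings for LASSO or random forests.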
Recommended Citation
Giang, Chau H., "A Comparison of Bootstrap and K-Fold Cross-Validation as Test Error Estimators" (2021). Mathematics, Statistics, and Computer Science Honors Projects. 52.
https://digitalcommons.macalester.edu/mathcs_honors/52
© Copyright is owned by author of this document