2025-12-31
Prediction of depression relapse using machine learning with administrative data: Balancing complexity and simplicity
Publication
Quality and Reliability Engineering International, Volume 2025
Depression is a mental disorder with a high lifetime prevalence and one of the leading causes of disability worldwide. As many patients experience another depressive episode after being treated, predictive monitoring for the risk of relapse is essential for healthcare professionals to be able to follow up on patients and intervene early. However, automatically monitoring these large groups requires additional considerations going beyond predictive performance, such as data availability and interpretability. In the present paper, we study the suitability of using readily available administrative data for this prediction task. We contrast a logistic regression model containing only a small number of predictors on demographics, medication, and estimated depression severity with regularized regression and XGBoost models incorporating a large number of predictors describing individual treatment and social information. Our results demonstrate that the inclusion of more detailed input does not result in a significant improvement in performance when compared to simpler regression models. For similar types of data, we therefore recommend focusing primarily on a small interpretable model.
| Additional Metadata | |
|---|---|
| doi.org/10.1002/qre.70139 | |
| Quality and Reliability Engineering International | |
| creativecommons.org/licenses/by-nc-nd/4.0/ | |
| von Stackelberg, P., Goedhart, R., Huberts, L. C. E., Lokkerbol, J., & Birbil, I. (2025). Prediction of depression relapse using machine learning with administrative data: Balancing complexity and simplicity. Quality and Reliability Engineering International, 2025. doi:10.1002/qre.70139 | |