Symbolic regression and feature construction with GP-GOMEA applied to radiotherapy dose reconstruction of childhood cancer survivors
The recently introduced Gene-pool Optimal Mixing Evolutionary Algorithm for Genetic Programming (GP-GOMEA) has been shown to find much smaller solutions of equally high quality compared to other state-of-the-art GP approaches. This matters because small solutions are easier for humans to interpret. In this paper, we propose an adaptation of GP-GOMEA for real-world symbolic regression that aims to find small yet accurate mathematical expressions, and we apply it to a problem of clinical interest. For radiotherapy dose reconstruction, a model is sought that captures anatomical similarity between patients. This problem is particularly interesting because, while the features are patient-specific, the variable to regress is a distance and is therefore defined over pairs of patients. We show that, on benchmark problems as well as on the application, GP-GOMEA outperforms variants of standard GP. To find even more accurate models, we further consider an evolutionary meta-learning approach, in which GP-GOMEA is used to construct small yet effective features for a different machine learning algorithm. Experimental results show that this approach significantly improves the performance of linear regression, support vector machines, and random forests, while providing meaningful and interpretable features.
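The pairwise setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the helper names (`build_pairs`, `constructed_feature`, `fit_line`) are hypothetical, and the constructed feature is written by hand rather than evolved by GP-GOMEA. It only shows how per-patient features become one row per patient pair, and how a single small constructed feature can help a linear model.

```python
import random
from itertools import combinations

def build_pairs(patients, distance):
    """Turn per-patient feature vectors into one row per unordered patient pair."""
    rows, targets = [], []
    for a, b in combinations(patients, 2):
        rows.append(a + b)              # concatenated features of both patients
        targets.append(distance(a, b))  # the target is defined on the pair
    return rows, targets

def constructed_feature(row, n_features):
    """Hand-written stand-in for an evolved expression: |x0 of A - x0 of B|."""
    return abs(row[0] - row[n_features])

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(xs, ys, slope, intercept):
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (assumption): 6 patients, 2 features each.
rng = random.Random(0)
n_features = 2
patients = [[rng.uniform(0.0, 10.0) for _ in range(n_features)] for _ in range(6)]

# Synthetic pairwise target (assumption): depends only on the first feature.
def toy_distance(a, b):
    return 2.0 * abs(a[0] - b[0]) + 0.5

rows, targets = build_pairs(patients, toy_distance)

# Linear regression on a raw feature versus on the constructed feature.
raw = [row[0] for row in rows]
built = [constructed_feature(row, n_features) for row in rows]

raw_slope, raw_intercept = fit_line(raw, targets)
slope, intercept = fit_line(built, targets)

mse_raw = mean_squared_error(raw, targets, raw_slope, raw_intercept)
mse_constructed = mean_squared_error(built, targets, slope, intercept)
print(len(rows), mse_raw, mse_constructed)
```

In this toy case the constructed feature makes the relationship exactly linear, so the linear model recovers it, whereas the raw feature alone cannot. In the approach the abstract describes, such features would be searched for by GP-GOMEA rather than specified by hand.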
|Keywords|Dose reconstruction, Feature construction, Genetic programming, GOMEA, Machine learning, Radiotherapy|
|Project|3D dose reconstruction for children with long-term follow-up: Toward improved decision making in radiation treatment for children with cancer|
|Conference|Genetic and Evolutionary Computation Conference|
Virgolin, M., Alderliesten, T., Bel, A., Witteveen, C., & Bosman, P. A. N. (2018). Symbolic regression and feature construction with GP-GOMEA applied to radiotherapy dose reconstruction of childhood cancer survivors. In GECCO 2018 - Proceedings of the 2018 Genetic and Evolutionary Computation Conference (pp. 1395–1402). doi:10.1145/3205455.3205604