Symbolic regression searches for analytic expressions that accurately describe studied phenomena. The main promise of this approach is that it may return an interpretable model that provides insight to users while maintaining high accuracy. The current standard for benchmarking these algorithms is SRBench, which evaluates methods on hundreds of datasets that mix real-world and simulated processes spanning multiple domains. At present, SRBench's ability to evaluate interpretability is limited to measuring the size of expressions on real-world data and the exactness of model forms on synthetic data. In practice, model size is only one of many factors that subject experts use to determine how interpretable a model truly is. Furthermore, SRBench does not characterize algorithm performance on specific, challenging sub-tasks of regression, such as feature selection and evasion of local minima. In this work, we propose and evaluate an approach to benchmarking SR algorithms that addresses these limitations of SRBench by 1) incorporating expert evaluations of interpretability on a domain-specific task, and 2) evaluating algorithms over distinct properties of data science tasks. We evaluate 12 modern symbolic regression algorithms on these benchmarks, present an in-depth analysis of the results, discuss current challenges of symbolic regression algorithms, and highlight possible improvements for the benchmark itself.
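The kind of search the abstract describes can be illustrated with a short sketch. The following minimal example uses the third-party gplearn library (not part of SRBench or this paper); the target function, hyperparameters, and parsimony penalty are illustrative assumptions, not a method from the work itself.

```python
# Minimal sketch of symbolic regression: evolve an analytic expression
# that fits data sampled from a known ground truth (illustrative only).
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
# Hypothetical ground truth: y = x0^2 - 0.5*x1, plus a little noise.
y = X[:, 0] ** 2 - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

est = SymbolicRegressor(
    population_size=500,
    generations=20,
    function_set=("add", "sub", "mul"),
    parsimony_coefficient=0.01,  # penalizes large expressions
    random_state=0,
)
est.fit(X, y)
print(est._program)  # best expression found, e.g. sub(mul(X0, X0), mul(0.5, X1))
```

The parsimony penalty here is exactly the kind of size-based proxy for interpretability that the abstract argues is only one of many factors subject experts actually weigh.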

doi.org/10.1109/TEVC.2024.3423681
IEEE Transactions on Evolutionary Computation

de França, F. O., Virgolin, M., Kommenda, M., Majumder, M. S., Cranmer, M., Espada, G., … La Cava, W. G. (2024). SRBench++: Principled benchmarking of symbolic regression with domain-expert interpretation. IEEE Transactions on Evolutionary Computation. doi:10.1109/TEVC.2024.3423681