2024-08-01
Exploring the search space of neural network combinations obtained with efficient model stitching
Publication
Combining machine learning models into an ensemble can improve performance and make predictions more consistent. Each neural network in an ensemble commonly performs its own feature extraction. These features are often highly similar, so many of the calculations may be redundant. Unifying these calculations (i.e., reusing some of them) would reduce computational cost. However, splicing two trained networks is non-trivial because architectures and feature representations typically differ, which causes performance to break down. To overcome this issue, we propose to employ stitching, which introduces new layers at crossover points. Essentially, a new network is constructed that contains both basis networks, and new links between them are created by introducing and training stitches. New networks can then be created by choosing which stitching layers to use and which to skip, thereby selecting a subnetwork. As with a supernetwork, assessing the performance of a selected subnetwork is efficient, because it only needs to be evaluated on data. We show experimentally that the proposed approach finds networks that represent novel trade-offs between performance and computational cost compared to classical ensembles, with some new networks even dominating the original networks.
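To illustrate what it means for one network to dominate another in the performance/cost trade-off, here is a minimal sketch in Python. It is not the authors' code: the candidate names and the (cost, error) values are made up for illustration, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# (name, relative cost, error); lower is better for both objectives.
# All names and numbers below are hypothetical, chosen only for illustration.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    _, cost_a, err_a = a
    _, cost_b, err_b = b
    no_worse = cost_a <= cost_b and err_a <= err_b
    strictly_better = cost_a < cost_b or err_a < err_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates: List[Candidate] = [
    ("basis_a", 1.0, 0.30),       # first original network
    ("basis_b", 1.2, 0.28),       # second original network
    ("ensemble_a_b", 2.2, 0.25),  # classical ensemble: pays for both networks
    ("stitched_1", 0.9, 0.29),    # hypothetical stitched subnetwork
    ("stitched_2", 1.6, 0.26),    # hypothetical stitched subnetwork
]

front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
print(front_names)
```

In this toy data, `stitched_1` is both cheaper and more accurate than `basis_a`, so `basis_a` drops off the front, while `stitched_2` sits between `basis_b` and the ensemble as an intermediate trade-off.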
Additional Metadata | |
---|---|
A. Guijt, D. Thierens, T. Alderliesten, P. Bosman | |
Ortec Logistics Holding BV , Elekta, Stockholm, Sweden | |
doi.org/10.1145/3638530.3664131 | |
Distributed and Automated Evolutionary Deep Architecture Learning with Unprecedented Scalability | |
GECCO '24 Companion: Genetic and Evolutionary Computation Conference Companion | |
Organisation | Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands |
Guijt, A., Thierens, D., Alderliesten, T., & Bosman, P. (2024). Exploring the search space of neural network combinations obtained with efficient model stitching. In GECCO 2024 - Proceedings of the 2024 Genetic and Evolutionary Computation Conference (pp. 1914–1923). doi:10.1145/3638530.3664131 |