Machine learning models can be made more performant and their predictions more consistent by combining them into an ensemble. Each neural network in an ensemble commonly performs its own feature extraction, and these features are often highly similar across networks, leading to potentially many redundant calculations. Unifying these calculations, i.e., reusing some of them across networks, would therefore be desirable to reduce computational cost. However, splicing two trained networks is non-trivial because their architectures and feature representations typically differ, leading to a breakdown in performance. To overcome this issue, we propose to employ stitching, which introduces new layers at crossover points. Essentially, a new network is constructed that comprises the two basis networks, in which new links between the basis networks are created through the introduction and training of stitches. New networks can then be obtained by choosing which stitching layers to (not) use, thereby selecting a subnetwork. As with a supernetwork, assessing the performance of a selected subnetwork is efficient, as only its evaluation on data is required. We experimentally show that our proposed approach enables finding networks that represent novel trade-offs between performance and computational cost compared to classical ensembles, with some new networks even dominating the original networks.
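To make the stitching mechanism concrete, below is a minimal PyTorch sketch: a trainable layer is inserted between a frozen prefix of one pretrained network and a frozen suffix of another, and only that layer is trained. The names StitchingLayer and train_stitch, the 1x1-convolution mapping, and the optimizer settings are illustrative assumptions for this sketch, not the paper's exact implementation.

import torch
import torch.nn as nn

class StitchingLayer(nn.Module):
    # Trainable 1x1 convolution mapping activations from a cut point in
    # network A to the channel layout expected at a cut point in network B.
    # (Assumed form of a stitch; the paper may use a different mapping.)
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.map = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.map(x)

def train_stitch(front_a: nn.Module, back_b: nn.Module,
                 stitch: StitchingLayer, loader, epochs: int = 1) -> None:
    # Freeze both basis networks; only the stitch receives gradient updates.
    for p in list(front_a.parameters()) + list(back_b.parameters()):
        p.requires_grad_(False)
    opt = torch.optim.Adam(stitch.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():        # the prefix of network A is fixed
                h = front_a(x)
            logits = back_b(stitch(h))   # gradients flow back to the stitch
            loss = loss_fn(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()

# Hypothetical usage with two tiny CNN fragments and dummy data:
front_a = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
back_b = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                       nn.Linear(32, 10))
stitch = StitchingLayer(in_channels=16, out_channels=32)
loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))]
train_stitch(front_a, back_b, stitch, loader)

Once the stitches are trained, selecting a subnetwork amounts to choosing which stitch (if any) to route through, so evaluating a candidate combination only requires a forward pass over held-out data.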

Arthur Guijt, Dirk Thierens, Tanja Alderliesten, Peter A. N. Bosman
Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands; Ortec Logistics Holding BV; Elekta, Stockholm, Sweden
GECCO '24 Companion: Genetic and Evolutionary Computation Conference Companion
Workshop: Distributed and Automated Evolutionary Deep Architecture Learning with Unprecedented Scalability
DOI: doi.org/10.1145/3638530.3664131

Guijt, A., Thierens, D., Alderliesten, T., & Bosman, P. (2024). Exploring the search space of neural network combinations obtained with efficient model stitching. In Proceedings of the 2024 Genetic and Evolutionary Computation Conference Companion (GECCO '24 Companion), pp. 1914–1923. doi:10.1145/3638530.3664131