Shrink-Perturb improves architecture mixing during Population Based Training for Neural Architecture Search

Chebykin, Aleksandr; Dushatskiy, Arkadiy; Alderliesten, Tanja; Bosman, Peter

doi:10.3233/FAIA230294

A. Chebykin (Aleksandr), A. Dushatskiy (Arkadiy), T. Alderliesten (Tanja) and P.A.N. Bosman (Peter)

2023-09-28

Presented at the 26th European Conference on Artificial Intelligence, ECAI 2023 (September 2023), Krakow, Poland

In this work, we show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS). For hyperparameter optimization, reusing the partially trained weights allows for efficient search, as was previously demonstrated by the Population Based Training (PBT) algorithm. We propose PBT-NAS, an adaptation of PBT to NAS where architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones and inheriting the weights using the shrink-perturb technique. After PBT-NAS terminates, the created networks can be directly used without retraining. PBT-NAS is highly parallelizable and effective: on challenging tasks (image generation and reinforcement learning) PBT-NAS achieves superior performance compared to baselines (random search and mutation-based PBT).
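The replacement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that every architecture has the same layer positions, that a higher score is better, that shrink-perturb scales the inherited weights and adds scaled, freshly sampled Gaussian weights, and that the coefficients 0.5 and 0.1 are acceptable placeholders. The names `shrink_perturb`, `mix`, and `replace_worst` are hypothetical.

```python
import random


def shrink_perturb(weights, fresh_weights, shrink=0.5, perturb=0.1):
    """Scale inherited weights and add scaled, freshly initialized weights.

    The default coefficients are illustrative assumptions, not values from the paper.
    """
    return [shrink * w + perturb * f for w, f in zip(weights, fresh_weights)]


def mix(parent_a, parent_b, rng):
    """Build a child by copying each layer from a randomly chosen parent.

    A network is modelled as {position: (operation_name, weights)}; both
    parents are assumed to define the same positions (a simplification).
    """
    child = {}
    for pos in parent_a:
        donor = parent_a if rng.random() < 0.5 else parent_b
        op, weights = donor[pos]
        child[pos] = (op, list(weights))
    return child


def replace_worst(population, scores, rng, fraction=0.25, shrink=0.5, perturb=0.1):
    """Replace the worst networks with shrink-perturbed mixes of the best ones."""
    order = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    n_replace = max(1, int(len(population) * fraction))
    best, worst = order[:n_replace], order[-n_replace:]
    for idx in worst:
        parent_a = population[rng.choice(best)]
        parent_b = population[rng.choice(best)]
        child = mix(parent_a, parent_b, rng)
        for pos, (op, weights) in list(child.items()):
            # Fresh weights stand in for a new random initialization (assumed Gaussian).
            fresh = [rng.gauss(0.0, 1.0) for _ in weights]
            child[pos] = (op, shrink_perturb(weights, fresh, shrink, perturb))
        population[idx] = child
    return population
```

In a full PBT-style loop, a step like this would alternate with further training of every network in the population, so inherited weights keep being reused rather than discarded.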

Additional Metadata
Persistent URL	doi.org/10.3233/FAIA230294
Series	Frontiers in Artificial Intelligence and Applications (FAIA)
Project	Distributed and Automated Evolutionary Deep Architecture Learning with Unprecedented Scalability, Optimization for and with Machine Learning
Conference	26th European Conference on Artificial Intelligence, ECAI 2023
Grant	This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/18373 - Distributed and Automated Evolutionary Deep Architecture Learning with Unprecedented Scalability (DAEDALUS). This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/OCENW.2019.015 - Optimization for and with Machine Learning
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Chebykin, A., Dushatskiy, A., Alderliesten, T., & Bosman, P. (2023). Shrink-Perturb improves architecture mixing during Population Based Training for Neural Architecture Search. In Proceedings of the European Conference on Artificial Intelligence (pp. 381–388). doi:10.3233/FAIA230294

See Also
techReport Shrink-Perturb improves architecture mixing during Population Based Training for Neural Architecture Search A. Chebykin (Aleksandr), A. Dushatskiy (Arkadiy), T. Alderliesten (Tanja) and P.A.N. Bosman (Peter)
