StyleGAN2 can generate highly realistic, high-quality human faces after training on the FFHQ dataset. Instead of relying on one of the commonly used metrics for evaluating a face generator (e.g., FID, IS, and P&R), this paper takes a more human-oriented approach that offers a different view of StyleGAN2's performance. The generator within StyleGAN2 tries to learn the distribution of the input dataset. However, this does not necessarily mean that higher-level human concepts are preserved. We examine whether general human attributes, such as age and gender, are transferred to the output dataset, and whether StyleGAN2 generates genuinely new persons according to facial recognition methods. For practical applications, it is crucial that a face generator produces people who are not clones of the identities in the training data. Although our approach can be applied to other face generators, we focus only on StyleGAN2. First, multiple models are used to predict general human attributes. The predictions show that the generated images have the same attribute distributions as the input dataset. However, if truncation is applied to limit the latent variable space, the attribute distributions shift towards the attributes corresponding to the latent variable used in truncation. Second, by clustering with face recognition models, we demonstrate that the generated images do not correspond to any existing person in the input dataset. Thus, StyleGAN2 is able to generate new persons with human characteristics similar to those of the input dataset.
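
The effect of truncation mentioned above can be illustrated with a minimal sketch. It assumes the usual form of the truncation trick, in which each latent vector is interpolated towards a fixed center with a factor psi; the names `truncate`, `w_center`, and `psi`, as well as the random stand-in latents, are illustrative assumptions and are not taken from the paper or from the StyleGAN2 code base.

```python
import numpy as np

def truncate(w, w_center, psi):
    """Pull latent vectors towards w_center.

    psi = 1 leaves the latents unchanged; psi = 0 collapses them onto w_center.
    (Hypothetical helper for illustration only.)
    """
    return w_center + psi * (w - w_center)

# Random stand-ins for latent vectors; a real generator would supply these.
rng = np.random.default_rng(0)
w = rng.normal(size=(1000, 512))
w_center = w.mean(axis=0)  # assumed truncation center: the average latent

for psi in (1.0, 0.7, 0.3):
    w_t = truncate(w, w_center, psi)
    # Average distance of the truncated latents from the center.
    spread = np.linalg.norm(w_t - w_center, axis=1).mean()
    print(f"psi={psi}: mean distance to center = {spread:.2f}")
```

As psi decreases, the sampled latents concentrate around the center, which is consistent with the observation that attribute distributions move towards the attributes of the latent variable used in truncation.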

Machine Learning with Applications

Pries, J., Bhulai, S., & van der Mei, R. (2022). Evaluating a face generator from a human perspective. Machine Learning with Applications, 10, 100412:1–100412:12. doi:10.1016/j.mlwa.2022.100412