BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation
NeurIPS 2021
Mingcong Liu    Qiang Li    Zekui Qin    Guoxin Zhang    Pengfei Wan    Wen Zheng   
Y-tech, Kuaishou Technology
[Paper]
[Code]
[Bibtex]


Abstract

Generative Adversarial Networks (GANs) have made a dramatic leap in high-fidelity image synthesis and stylized face generation. Recently, a layer-swapping mechanism has been developed to improve the stylization performance. However, this method cannot fit arbitrary styles in a single model and requires hundreds of style-consistent training images for each style. To address these issues, we propose BlendGAN for arbitrary stylized face generation by leveraging a flexible blending strategy and a generic artistic dataset. Specifically, we first train a self-supervised style encoder on the generic artistic dataset to extract the representations of arbitrary styles. In addition, a weighted blending module (WBM) is proposed to blend face and style representations implicitly and control the arbitrary stylization effect. By doing so, BlendGAN can gracefully fit arbitrary styles in a unified model while avoiding case-by-case preparation of style-consistent training images. To support this, we also present AAHQ, a novel large-scale artistic face dataset. Extensive experiments demonstrate that BlendGAN outperforms state-of-the-art methods in terms of visual quality and style diversity for both latent-guided and reference-guided stylized face synthesis.


Overview


The style encoder Estyle extracts the style latent code zs from a reference style image, while the face latent code zf is randomly sampled from a standard Gaussian distribution. Two MLPs map the face and style latent codes into their respective W spaces; the resulting codes are then combined by the weighted blending module (WBM) and fed into the generator G to synthesize natural and stylized face images. Our method uses three discriminators: the face discriminator Dface distinguishes between real and fake natural-face images, the style discriminator Dstyle distinguishes between real and fake stylized-face images, and the style latent discriminator Dstyle_latent predicts whether the stylized-face image is consistent with the style latent code zs.
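The blending step above can be illustrated with a minimal PyTorch sketch. This is not the released BlendGAN code: the per-layer learnable blending weights, the layer count, and the stand-in MLPs below are all illustrative assumptions, and the StyleGAN-style generator is omitted entirely.

```python
import torch
import torch.nn as nn

class WeightedBlendingModule(nn.Module):
    """Hypothetical sketch of the WBM: mixes the face and style W-space
    codes layer by layer with learnable per-layer blending weights."""
    def __init__(self, num_layers: int):
        super().__init__()
        # one learnable blending logit per generator layer (assumption);
        # sigmoid(0) = 0.5, so blending starts as an even mix
        self.logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, w_face: torch.Tensor, w_style: torch.Tensor) -> torch.Tensor:
        # w_face, w_style: (batch, num_layers, w_dim)
        alpha = torch.sigmoid(self.logits).view(1, -1, 1)
        return alpha * w_face + (1.0 - alpha) * w_style

batch, num_layers, w_dim = 2, 18, 512  # 18 layers as in a 1024px StyleGAN2 (assumption)
z_f = torch.randn(batch, w_dim)        # face code sampled from N(0, I)
z_s = torch.randn(batch, w_dim)        # stands in for the style encoder output Estyle(x)
mlp_face = nn.Linear(w_dim, w_dim)     # stand-in for the face mapping network
mlp_style = nn.Linear(w_dim, w_dim)    # stand-in for the style mapping network

# broadcast each W code across all generator layers, then blend
w_f = mlp_face(z_f).unsqueeze(1).expand(-1, num_layers, -1)
w_s = mlp_style(z_s).unsqueeze(1).expand(-1, num_layers, -1)
w_blend = WeightedBlendingModule(num_layers)(w_f, w_s)  # fed to generator G
print(w_blend.shape)  # (batch, num_layers, w_dim), one code per generator layer
```

Because the logits are learnable, training can push early (coarse) layers toward the face code and later (fine) layers toward the style code, which is the intuition behind layer-swapping that the WBM makes continuous and differentiable.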

Paper

M. Liu, Q. Li, Z. Qin, G. Zhang,
P. Wan, W. Zheng.

BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation.
NeurIPS, 2021.

[Paper] | [Bibtex]


Acknowledgements

We sincerely thank all the reviewers for their comments. We also thank Zhenyu Guo for help in preparing the comparison to StarGANv2.