Image generation

A neural network that generates anime pictures (faces).

Description

Machine learning can be used to create autopilots, accurate weather forecasts, and more. Or you can do something really useful, like drawing anime-chan.

It's worth noting that the generated faces are very diverse: hair and eye colour, head orientation, and other small details change, as does the overall style. The images can look like a cartoon frame, computer graphics, 90s and 00s anime, or even watercolour or oil paintings.

StyleGAN was introduced in 2018. It keeps the progressively grown GAN architecture of ProGAN but draws inspiration from the style transfer literature. StyleGAN modifies the generator, which builds an image by repeatedly upsampling it: 8px → 16px → 32px → 64px → 128px, and so on. At each resolution, a style vector is applied via AdaIN together with per-pixel random noise. This tells the generator how to stylize the image at that scale: hair, skin texture, and so on. Because styles and noise are injected at every stage of generation, StyleGAN can control coarse and fine attributes separately, rather than relying on a single random input.

Comparison of ProGAN (a) and StyleGAN (b) architectures
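
To make the style-injection idea concrete, below is a minimal sketch of one such synthesis step in PyTorch. This is not the official implementation: the class names, layer shapes, and activation are our assumptions, but the structure (upsample, convolve, add per-pixel noise, then apply AdaIN driven by a style vector) follows the mechanism described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaIN(nn.Module):
    """Adaptive Instance Normalization: normalize each feature map,
    then re-scale and shift it using a per-layer style vector w."""
    def __init__(self, style_dim, channels):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)
        self.to_scale_shift = nn.Linear(style_dim, channels * 2)

    def forward(self, x, w):
        scale, shift = self.to_scale_shift(w).chunk(2, dim=1)
        x = self.norm(x)
        return x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]

class SynthesisBlock(nn.Module):
    """One resolution step of a StyleGAN-like generator:
    upsample -> conv -> add noise -> AdaIN, driven by the style vector w."""
    def __init__(self, in_ch, out_ch, style_dim=512):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.noise_weight = nn.Parameter(torch.zeros(1, out_ch, 1, 1))
        self.adain = AdaIN(style_dim, out_ch)

    def forward(self, x, w):
        x = F.interpolate(x, scale_factor=2, mode="nearest")   # e.g. 8px -> 16px
        x = self.conv(x)
        noise = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        x = x + self.noise_weight * noise            # stochastic per-pixel detail
        return F.leaky_relu(self.adain(x, w), 0.2)   # style controls this resolution
```

Stacking such blocks from a small starting tensor up to the target size gives the progressive structure shown in the figure.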


StyleGAN also introduces a number of additional improvements: for example, it comes with a new face dataset, FFHQ, of 1024px images that are more varied and of higher quality than the data ProGAN was trained on. In addition, the network switches to a more stable, better-behaved loss and makes unusually heavy use of fully connected layers to process the random input (at least 8 layers of 512 neurons, where most GANs use 1 or 2). What is even more striking is that StyleGAN omits techniques considered critical for training other GANs: relativistic losses, alternative noise distributions, advanced regularization, and so on.
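
As a rough sketch of what that fully connected "mapping network" looks like (a plain stack of linear layers in PyTorch; the leaky-ReLU activation and exact layout are assumptions, not the official code):

```python
import torch.nn as nn

def mapping_network(latent_dim=512, num_layers=8):
    """Maps the random latent vector z to the intermediate style vector w
    through 8 fully connected layers of 512 units (a simplified sketch)."""
    layers = []
    for _ in range(num_layers):
        layers += [nn.Linear(latent_dim, latent_dim), nn.LeakyReLU(0.2)]
    return nn.Sequential(*layers)
```

The resulting style vector w is what the AdaIN operations above consume at every resolution.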

Apart from these features, the architecture is quite ordinary, so if you've worked with any GAN, you can safely work with StyleGAN. The training process is the same, the hyperparameters are standard, and the code is largely the same as ProGAN's.

One of the most useful things you can do with a trained StyleGAN model is to use it as a "launching pad" for faster training of a new network on a smaller amount of data. For example, our model can be fine-tuned on a subset of anime characters: redheads, men, or one particular character (a sketch of such a fine-tuning setup is shown below). This usually requires about 500-5,000 new images, though sometimes as few as 50 are enough.
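
A transfer-learning setup like this usually amounts to loading the pretrained weights and simply continuing the usual training loop on the new, smaller dataset. The sketch below shows the general shape of such a fine-tuning run; the checkpoint layout, file name, and learning rate are hypothetical placeholders, not our actual pipeline.

```python
import torch

def finetune(generator, discriminator, small_dataset_loader,
             checkpoint_path="stylegan_anime_pretrained.pt", lr=1e-4):
    """Resume training from a pretrained StyleGAN on a small, specialized
    dataset (e.g. a single character). Checkpoint keys are assumptions."""
    state = torch.load(checkpoint_path)
    generator.load_state_dict(state["generator"])
    discriminator.load_state_dict(state["discriminator"])

    # A lower learning rate than in full training helps keep the pretrained
    # features intact while the network adapts to the new subset.
    g_opt = torch.optim.Adam(generator.parameters(), lr=lr, betas=(0.0, 0.99))
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=(0.0, 0.99))

    for real_images in small_dataset_loader:
        # ...standard GAN update step, identical to training from scratch...
        pass
```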

If you are interested in this neural network and think it could help you solve your business or technical problems, please send us an email: info@ai4b.org