
Chinese Student Creates AI That Can Turn Photos Into Anime

Last updated on April 10, 2023

Recently, a Chinese student developed an AI that transforms photos into anime images. It uses a Generative Adversarial Network (GAN) and deep-learning techniques to produce unique anime artwork from ordinary photographs of people.

This software offers an alternative to Snapchat and Instagram filters, which often produce subpar results. Furthermore, it is accessible even to non-professionals.

What is CycleGAN?

The student, Yanghua Jin of Fudan University, used a Generative Adversarial Network (GAN) and deep-learning methods to develop software that turns ordinary photographs into anime-style artwork. Often called Jin Yanghua's Anime Girl Generator, the program allows users to customize details such as hair color or eye color.

The student has also made the program open-source on GitHub, enabling people to create their own anime girls using attributes taken from anime images, such as hair color, eye color, hair length (long or short), and whether the mouth is open. Another fascinating aspect of the underlying approach is its capacity to generate various painterly styles, such as Van Gogh, Cezanne, Monet, and Ukiyo-e, a feat that sets it apart from other style-transfer models. Furthermore, it can transform objects from one class to another, like zebras to horses or apples to oranges, and it performs season transfer, changing images from winter to summer and vice versa.

Each image is calculated and combined

CycleGAN is a generative adversarial network (GAN) that learns image-to-image translation without needing paired data. This property allows it to translate images across a wide variety of textures and styles. The paper "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks" outlines how the model is trained. First, a mini-batch of random images is sampled from each domain (imgsA and imgsB).

Next, the generator GAB translates the domain-A images into domain B, while GBA translates the domain-B images into domain A; the discriminators DA and DB then score the results. Finally, the discriminator and generator losses are calculated for each image and combined. CycleGAN can perform various tasks, like object transformation, style transfer, and photo enhancement. For instance, the paper illustrates how the model can enhance the depth of field in close-up photographs of flowers. However, it has some limitations, such as its inability to perform geometric transformations. This may make it less suitable for applications such as translating medical images to CT data or adding whimsical clouds to a sky to make it appear painted by Van Gogh.
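The training step above can be sketched in a few lines of numpy. The networks here (G_ab, G_ba, d_a, d_b) are toy stand-ins with made-up arithmetic bodies, not the real convolutional networks, and the loss bookkeeping is simplified to a single score per batch:

```python
import numpy as np

rng = np.random.default_rng(0)

def G_ab(x):  # toy generator A -> B (placeholder for a conv net)
    return x * 0.9

def G_ba(x):  # toy generator B -> A
    return x / 0.9

def d_a(x):   # toy discriminator for domain A: one score in (0, 1)
    return 1.0 / (1.0 + np.exp(-x.mean()))

def d_b(x):   # toy discriminator for domain B
    return 1.0 / (1.0 + np.exp(-x.mean()))

def bce(p, target):  # binary cross-entropy on a single score
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

# Mini-batches of "images" sampled from each domain
imgs_a = rng.normal(size=(4, 8, 8))
imgs_b = rng.normal(size=(4, 8, 8))

fake_b = G_ab(imgs_a)  # translate A -> B
fake_a = G_ba(imgs_b)  # translate B -> A

# Discriminator losses: real images should score 1, fakes 0
loss_d_a = bce(d_a(imgs_a), 1.0) + bce(d_a(fake_a), 0.0)
loss_d_b = bce(d_b(imgs_b), 1.0) + bce(d_b(fake_b), 0.0)

# Generator adversarial losses: fakes should fool the discriminators
loss_g = bce(d_b(fake_b), 1.0) + bce(d_a(fake_a), 1.0)

# Combined loss for this training iteration
total = loss_d_a + loss_d_b + loss_g
```

In a real implementation each of these losses would drive a separate gradient update for its network; here they are only computed and summed to show how the pieces fit together.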

How does CycleGAN work?

CycleGAN is an extension of the Generative Adversarial Network (GAN) architecture. In a GAN, the generator takes as input a vector sampled from a simple distribution, then gradually increases its spatial dimensions to form an image. CycleGAN takes this concept further by translating images from the domain of one generator to that of a second generator. It does so by combining cycle consistency with an adversarial loss.

To achieve this, an input photo is compared with its reconstructed counterpart and the difference is computed. This difference then serves to update the generator model during each training iteration. Each generator is also paired with a discriminator model that estimates how likely a generated photo is to have come from the target image collection. The two models are trained with the usual adversarial loss and can synthesize images in the style of any given set of photos.

To train CycleGAN effectively, it is essential to understand how the loss functions work. The discriminator attempts to correctly label generated samples as fake, while the generator strives to make them pass as real. The adversarial loss incentivizes the generator to produce valid samples from the target domain: by playing this minimax game with the discriminator, the generator learns to match its outputs to the distribution of natural images within that domain.
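The minimax game described above can be illustrated directly on discriminator scores. The score values below are hypothetical stand-ins for network outputs, used only to show which quantity each player optimizes:

```python
import numpy as np

# Hypothetical discriminator outputs (probabilities of "real")
d_real = np.array([0.9, 0.8, 0.95])   # D(x) on real samples
d_fake = np.array([0.1, 0.2, 0.05])   # D(G(z)) on generated samples

# The discriminator maximizes E[log D(x)] + E[log(1 - D(G(z)))]
v_discriminator = np.mean(np.log(d_real)) + np.mean(np.log(1 - d_fake))

# The generator minimizes E[log(1 - D(G(z)))]
# (in practice it often maximizes E[log D(G(z))] instead, for stronger gradients)
v_generator = np.mean(np.log(1 - d_fake))
```

A well-trained generator pushes d_fake toward d_real, shrinking the discriminator's advantage in this game.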

It permits translating images without paired examples

However, this can be challenging, since a randomly generated sample may look nothing like a realistic image. That is why CycleGAN uses an adversarial loss to push generated samples toward realistic outputs. CycleGAN is one of the most widely used GAN architectures, though it primarily serves to learn transformations between images with different styles. For instance, it can turn photos of horses into zebras, summertime images into winter ones, and vice versa.

CycleGAN works on the principle that an image output by one generator, when used as input to the other, should reproduce the original image. This cycle-consistency extension provides powerful capabilities, as it permits translating images without paired source-target examples, something many standard GANs struggle to accomplish.
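The round-trip principle can be sketched as a cycle-consistency loss. The toy generators below are placeholders whose round trip happens to cancel almost exactly; real generators only approximate this, which is precisely what the loss penalizes:

```python
import numpy as np

rng = np.random.default_rng(1)

def G_ab(x):  # toy stand-in generator A -> B
    return x + 0.1

def G_ba(x):  # toy stand-in generator B -> A
    return x - 0.1

real_a = rng.normal(size=(2, 4, 4))
reconstructed_a = G_ba(G_ab(real_a))  # A -> B -> A round trip

# Cycle-consistency loss: mean absolute (L1) difference between the
# original image and its round-trip reconstruction
cycle_loss = np.mean(np.abs(reconstructed_a - real_a))
```

The symmetric B -> A -> B round trip contributes an analogous term, and both are added to the adversarial losses during training.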

Why is CycleGAN able to create anime pictures?

Yanghua Jin, an undergraduate student at Fudan University, has developed an AI that can transform a photo into an anime-style image. Utilizing Generative Adversarial Network (GAN) technology and deep learning, the software produces strikingly realistic and believable pictures. The system's generator creates a picture, using attributes like eye size and hair length, that a human eye would recognize as anime. As it works, the software learns from its mistakes and improves over time. A discriminator is also built into the system, using algorithms similar to those found in an ordinary image-recognition model. Combined, these two networks form a symbiotic relationship, with each improving the other dynamically.

This tool can produce pictures that appear to have been drawn by an experienced artist. For instance, its AI capabilities enable it to render smoother transitions between colors and hairstyles than its predecessor, CouncilGAN, and its drawings approach professional manga quality. Most impressive of all, the AI can do this across a wide variety of subjects.

What are the limitations of CycleGAN?

CycleGAN is an extension of the GAN architecture used for image synthesis. The architecture consists of two models: a generator that creates plausible images within a domain, and a discriminator that determines whether an image is real or fake.

One major drawback of the standard GAN approach to translation is that it typically requires paired training data. Preparing this data can be costly and time-consuming, and in many cases it cannot be produced at all for a given application due to technical limitations. Image translation poses a significant challenge, particularly when images lack any meaningful correspondence to one another. For instance, translating a summer landscape into a winter landscape would require substantial effort and expense to create paired training data.

Each mapping closely replicates its original input

The good news is that the CycleGAN framework solves this problem without needing paired training data, as it learns from unpaired examples. Furthermore, its results are markedly better than those of traditional GANs applied without paired training data.

CycleGAN combines the adversarial and cycle-consistency losses into a single objective, weighted by a hyperparameter (called lambda in the paper) that balances distribution matching against cycle consistency. This helps the GAN learn a mapping from domain X to domain Y and vice versa, and it guarantees that each mapping closely reconstructs its original input. CycleGAN has demonstrated great success in image transformation but has some drawbacks. For example, reconstructed outputs can be noisier and less accurate than the original input images.
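The combined objective can be sketched numerically. The individual loss values below are hypothetical; the weight of 10 for the cycle term follows the value used in the CycleGAN paper's experiments:

```python
# Weight balancing adversarial matching vs. cycle consistency
lam = 10.0

adv_loss_ab = 0.45   # hypothetical adversarial loss for G: A -> B
adv_loss_ba = 0.52   # hypothetical adversarial loss for G: B -> A
cycle_loss = 0.03    # hypothetical L1 cycle-reconstruction loss

# Full objective: both adversarial terms plus the weighted cycle term
total_loss = adv_loss_ab + adv_loss_ba + lam * cycle_loss
```

A larger lam forces reconstructions to stay faithful to the input at the cost of weaker style matching, and vice versa.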

Furthermore, the model struggles with large texture and color variations. Nonetheless, it has been shown to perform admirably across a broad range of applications that do not require paired training data, indicating its potential effectiveness. The remarkable CycleGAN technique demonstrates how the GAN architecture can be tailored to tasks such as style transfer and object transfiguration, and it opens up exciting avenues for future research.