StarGAN is a versatile example of how Generative Adversarial Networks (Goodfellow et al.) can be used to learn cross-domain relations and perform image-to-image translation with only a single generator and a single discriminator.

## How does it do that?
Let us define a few terms before going further.

**attribute** - a particular feature inherent in an image. Example: hair color, age, gender.

**attribute value** - a value of an **attribute**. Example: if the chosen attribute is hair color, its values can be blonde, black, white, or grey.

**domain** - a set of images sharing the same attribute value. Example: images of women form one domain; images of men form another.

For our experiments, we use the CelebA dataset ([Liu et al.](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)). It contains more than 200K images, each labelled with 40 binary attributes.
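
To make this concrete, a domain label can be encoded as a binary vector over a chosen subset of attributes. Below is a minimal sketch of that encoding; the attribute subset and helper name are my own illustration, not this repository's code.

```python
# Sketch: turning CelebA's {-1, 1} attribute annotations (list_attr_celeba.txt)
# into {0, 1} label vectors. The chosen attribute subset is an assumption.
import numpy as np

SELECTED_ATTRS = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Male", "Young"]

def make_label(attr_values):
    """Map CelebA's {-1, 1} annotations to a {0, 1} label vector c."""
    return np.array([(v + 1) // 2 for v in attr_values], dtype=np.float32)

# An image annotated Black_Hair=1, Blond_Hair=-1, Brown_Hair=-1, Male=1,
# Young=-1 becomes:
c = make_label([1, -1, -1, 1, -1])
print(c)  # [1. 0. 0. 1. 0.]
```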
The existing models were quite inefficient: to learn the mappings among all **K** domains, <sup>K</sup>P<sub>2</sub> generators were required, one for each ordered pair of domains (for K = 5 domains, that is already 20 generators). Moreover, each generator could not make full use of the data, since it only ever learned from 2 of the **K** domains at a time.

StarGAN solves this problem by introducing a single generator that learns the mappings among all domains. The generator takes two inputs: the **image** and the **target domain label**.

<p style="text-align: center;"><b>G(x, c) → y </b></p>

<i>where y is the generated image, x is the original image, and c is the target domain label.</i>

Here we use an auxiliary classifier as our discriminator; it outputs both the real/fake decision **D<sub>src</sub>** and the domain labels of the input image **D<sub>cls</sub>**.
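
To make the two inputs concrete, here is a heavily simplified PyTorch sketch of this conditioning scheme: the label c is replicated spatially and concatenated to the image channels, and the discriminator shares a feature body with two output heads. Layer sizes and depths are illustrative assumptions, not this repository's exact architecture.

```python
# Minimal sketch of StarGAN-style conditioning; real models are much deeper.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """G(x, c) -> y: the target label c is replicated spatially and
    concatenated to the image channels."""
    def __init__(self, label_dim, conv_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + label_dim, conv_dim, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(conv_dim, 3, kernel_size=7, padding=3),
            nn.Tanh(),
        )

    def forward(self, x, c):
        # c: (N, label_dim) -> (N, label_dim, H, W), then concat with x.
        c = c.view(c.size(0), c.size(1), 1, 1).expand(-1, -1, x.size(2), x.size(3))
        return self.net(torch.cat([x, c], dim=1))

class TinyDiscriminator(nn.Module):
    """Auxiliary-classifier discriminator with two heads:
    D_src (real/fake map) and D_cls (domain labels)."""
    def __init__(self, label_dim, image_size=128, conv_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, conv_dim, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.01),
        )
        self.src = nn.Conv2d(conv_dim, 1, kernel_size=3, padding=1)             # D_src
        self.cls = nn.Conv2d(conv_dim, label_dim, kernel_size=image_size // 2)  # D_cls

    def forward(self, x):
        h = self.features(x)
        return self.src(h), self.cls(h).view(x.size(0), -1)

x = torch.randn(2, 3, 128, 128)          # a batch of images
c = torch.tensor([[1., 0., 0., 1., 0.],  # target labels, e.g. [Black_Hair, ..., Male, ...]
                  [0., 0., 1., 1., 0.]])
y = TinyGenerator(label_dim=5)(x, c)     # translated images, shape (2, 3, 128, 128)
src, cls = TinyDiscriminator(label_dim=5)(y)
```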
Training has been elaborated in the following figures.
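
In case the figures do not render, the sketch below shows the alternating update schedule they describe: the discriminator takes several steps per generator step (the StarGAN paper uses n_critic = 5). The loss callables here are placeholders for the terms spelled out in the Losses section below; this is an assumption-laden outline, not the repository's training code.

```python
# Sketch of one StarGAN training iteration under the WGAN-style schedule.
import torch

def train_iteration(G, D, g_opt, d_opt, d_loss_fn, g_loss_fn,
                    x, c_org, c_trg, n_critic=5):
    # Several discriminator updates per generator update.
    for _ in range(n_critic):
        with torch.no_grad():
            y = G(x, c_trg)              # fake images; no gradient through G here
        d_loss = d_loss_fn(D, x, y, c_org)
        d_opt.zero_grad()
        d_loss.backward()
        d_opt.step()

    # One generator update: fool D_src, satisfy D_cls, reconstruct x.
    g_loss = g_loss_fn(G, D, x, c_org, c_trg)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```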

## Results

I selected a random image from the dataset.

[Black Hair, Male]

A single epoch took about 9 hours on a Tesla K80 GPU, so I trained for roughly 1500 of the 12000 iterations that make up one epoch.

This was the translation to [Brown_Hair, Male]:

The generator seems to have picked up the spatial features. Since full training was not completed, we cannot infer much more than that the generator was indeed learning features.

## Losses

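
For reference, here is a minimal sketch of the three generator-side loss terms the StarGAN paper optimises: adversarial, domain classification, and cycle reconstruction. The λ weights are the paper's defaults (λ_cls = 1, λ_rec = 10), and the G/D interfaces follow the toy modules above; this is not the repository's exact loss code.

```python
# Sketch of the generator objective: L_adv + λ_cls·L_cls + λ_rec·L_rec.
import torch.nn.functional as F

def generator_loss(G, D, x, c_org, c_trg, lambda_cls=1.0, lambda_rec=10.0):
    y = G(x, c_trg)                     # translate x to the target domain
    src, cls = D(y)
    loss_adv = -src.mean()              # fool the real/fake head D_src
    loss_cls = F.binary_cross_entropy_with_logits(cls, c_trg)  # match D_cls
    loss_rec = (x - G(y, c_org)).abs().mean()  # L1 cycle reconstruction
    return loss_adv + lambda_cls * loss_cls + lambda_rec * loss_rec
```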
Training was continued for 3000 iterations, but the machine crashed, erasing the progress that had been made.