Training StyleGAN2 and Applying Latent Space Discovery Techniques on a Novel Gemstone Images Dataset

Harry X. Li, Marco Pretell

example generated gemstone images

We trained StyleGAN2 on a novel dataset of gemstone images and applied a latent space discovery technique to control output features. All the images and videos on this page were generated by an AI model.


Can You Tell the Difference?

Play a game to see if you can tell the difference between real and generated images!

Sample Generated Images

These are sample output images from models trained on a diamonds dataset and on an angled gemstones dataset. Each cell shows one image generated from a single input noise vector.

Diamonds, 27k Training Iterations

example generated diamond images

Gemstones, 36k Training Iterations

example generated gemstone images

Controlling Output Features via Latent Space Eigenvectors

We used a closed-form factorization technique to identify eigenvectors in the latent space that control output features. In the videos below, each row (y axis) corresponds to one eigenvector being manipulated. The first row has the largest eigenvalue, and each subsequent row has a smaller eigenvalue. Each column (x axis) shows images generated from the same input noise vector. Time in the video corresponds to the manipulation intensity along the respective eigenvector: highly negative at the start, zero in the middle, and highly positive at the end. Because there is no manipulation in the middle of the video, all the output images in a single column are identical at that point.
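The factorization and the intensity sweep described here can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the code used for this page: the weight matrix is a random stand-in for the generator's latent-to-style weights, the helper names `factorize_weights` and `sweep_direction` are hypothetical, and the latent size of 512 and the intensity range of -10 to 10 are assumed values.

```python
import numpy as np


def factorize_weights(weight, num_directions):
    """Return the top eigenvalues and eigenvectors of weight.T @ weight.

    weight has shape (out_features, latent_dim). Results are sorted so that
    the first direction has the largest eigenvalue.
    """
    gram = weight.T @ weight
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    order = np.argsort(eigenvalues)[::-1][:num_directions]
    # Eigenvectors are columns of the result; transpose so each row is a direction.
    return eigenvalues[order], eigenvectors[:, order].T


def sweep_direction(latents, direction, intensities):
    """Shift every latent along one direction at every intensity.

    Returns an array of shape (num_intensities, num_samples, latent_dim).
    """
    return latents[None, :, :] + intensities[:, None, None] * direction[None, None, :]


rng = np.random.default_rng(0)
latent_dim = 512  # assumed latent size

# Hypothetical stand-in for the weights that map latent codes to styles.
weight = rng.standard_normal((1024, latent_dim))
eigenvalues, directions = factorize_weights(weight, num_directions=10)

# 15 latent codes (columns of the video) swept from negative to positive intensity.
latents = rng.standard_normal((15, latent_dim))
intensities = np.linspace(-10.0, 10.0, 21)  # assumed range; the middle step is 0
frames = sweep_direction(latents, directions[0], intensities)
```

Each entry of `frames` along the first axis would be passed through the generator to render one video frame for one row. At the middle step the intensity is zero, so the latents are unchanged, which is why every row shows the same image per column at that moment.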

In most cases, we found that an individual eigenvector controlled multiple output features at once, which makes its effect tricky to explain. If you think you've discovered a good explanation for any of the eigenvectors, let us know by opening an issue in our repository!

Diamonds, 27k Training Iterations, 10 indices, 15 samples, 10 degrees

The 1st eigenvector (row) seems to decide whether the output diamond has a compact (round) or an elongated (oval) shape. For example, when the manipulation is highly negative, the output shapes tend to be compact Rounds, Princesses (squares), and Hearts. When the manipulation is highly positive, the shapes are more elongated Ovals, Emeralds (rectangles), and Pears.

Interestingly, the 2nd eigenvector (row) seems to control centering and padding. In our dataset, some images needed to be centered and gray-padded into square dimensions. With negative manipulation, all the output images show these centering and padding artifacts. With positive manipulation, none of the images are centered or padded.
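For readers unfamiliar with this kind of preprocessing, centering and gray-padding an image into a square could look like the sketch below. It is an illustration only: the helper name `center_pad_to_square` and the gray value of 128 are assumptions, and the actual preprocessing used for the dataset may differ.

```python
import numpy as np


def center_pad_to_square(image, fill=128):
    """Place an (H, W, C) image at the center of a square gray canvas.

    The canvas side equals the longer image side; fill is the assumed gray level.
    """
    height, width = image.shape[:2]
    side = max(height, width)
    canvas = np.full((side, side) + image.shape[2:], fill, dtype=image.dtype)
    top = (side - height) // 2
    left = (side - width) // 2
    canvas[top:top + height, left:left + width] = image
    return canvas


# A 2x4 black image becomes a 4x4 image with gray bars above and below.
example = np.zeros((2, 4, 3), dtype=np.uint8)
padded = center_pad_to_square(example)
```

Gray bars like these appear in every padded training image, which may explain why the model learned them as a distinct, controllable feature.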

Gemstones, 36k Training Iterations, 8 indices, 10 samples, 10 degrees

The 2nd eigenvector (row) seems to decide whether the output gemstone is green and viewed from the front (negative manipulation) or pink and viewed from the side (positive manipulation).

The 3rd eigenvector (row) seems to decide whether the output gemstone is yellow (negative manipulation) or blue (positive manipulation).