In this section, we present the debiasing results of our proposed method, which demonstrate its effectiveness.

Debiasing a single attribute

We first show the debiasing results on single attributes such as gender and eyeglasses. We visualize both the distribution shift and qualitative results.

Figure 1. Gender debiasing results for the prompt: “face photo of a person”. The image in the red frame shows a change in perceived gender.

From Figure 1, we can clearly observe that the gender distribution of images sampled from the stable diffusion model changes from 60% : 40% to 50% : 50%. As for the qualitative results, the perceived gender of the image in the red frame has changed from man to woman.
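The reported percentages come from labeling each sampled image with a perceived-gender attribute and counting the labels. As a minimal sketch, the following computes such a ratio from a list of predicted labels; the labels here are illustrative placeholders, not our measured data, and the upstream classifier (e.g. a zero-shot CLIP classifier) is assumed:

```python
from collections import Counter

def attribute_ratio(labels):
    """Return the fraction of each attribute label among sampled images."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical labels predicted by an off-the-shelf attribute classifier
# on 10 sampled images (illustrative only).
labels = ["man", "man", "woman", "man", "woman",
          "woman", "man", "woman", "man", "man"]
print(attribute_ratio(labels))  # {'man': 0.6, 'woman': 0.4}
```

The same counting applies unchanged to any binary or multi-class attribute (eyeglasses, subclass combinations, etc.).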

Figure 2. Gender distribution control via the training data and the corresponding predicted gender distribution. The red and orange dotted lines indicate, respectively, an exact match between the training and predicted ratios and the original predicted ratio of stable diffusion. The solid blue squares mark the obtained (training ratio, predicted ratio) pairs.

To better demonstrate the controllability of our method, we further conduct distribution-control experiments on the training dataset to see whether the distribution of generated images changes correspondingly. From Figure 2, we observe that when we set the gender distribution of the training data to different ratios, the corresponding predicted gender ratio is roughly the same. This indicates that the distribution of sampled images can roughly match that of the training dataset along the debiased attribute.
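The agreement shown in Figure 2 can be quantified as the gap between each training ratio and the ratio measured on generated images (distance from the red diagonal). A minimal sketch, using illustrative numbers rather than the measured values from Figure 2:

```python
def ratio_match_error(training_ratios, predicted_ratios):
    """Mean absolute gap between training-set ratios and the ratios
    measured on generated images (0 = exact match, the red diagonal)."""
    assert len(training_ratios) == len(predicted_ratios)
    gaps = [abs(t - p) for t, p in zip(training_ratios, predicted_ratios)]
    return sum(gaps) / len(gaps)

# Illustrative (training ratio, predicted ratio) pairs, not measured data.
training = [0.2, 0.4, 0.5, 0.6, 0.8]
predicted = [0.22, 0.41, 0.50, 0.58, 0.77]
print(round(ratio_match_error(training, predicted), 3))  # 0.016
```

A small mean gap across the tested ratios is what “roughly matching” means quantitatively here.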

Figure 3. Eyeglasses debiasing results for the prompt: “face photo of a person”.

We show our debiasing results in Figure 3, where we can observe the change in distribution before and after debiasing. Though the result is still not perfectly balanced (50% each), it largely eases the imbalance. In the qualitative results, many more people are wearing eyeglasses.

Debiasing multiple attributes

As mentioned in the Method section, our first method does not work well for debiasing multiple attributes simultaneously. Therefore, we propose a second method for debiasing multiple attributes and display the results below:

Figure 4. Eyeglasses and gender debiasing distribution results for the prompt: “face photo of a person”.
Figure 5. The generated images after debiasing, using the 4 corresponding debiased text embeddings.

From Figure 4, we can clearly observe that after debiasing, the distribution of generated images across the 4 subclasses is roughly uniform. From Figure 5, we observe that by using the 4 debiased text embeddings to sample images from stable diffusion, we obtain nearly accurate results for each subclass. To better understand the principle of our method, we leverage t-SNE to visualize the image and text features.
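One simple way to realize a uniform distribution over the 4 subclasses is to assign each generation request one of the per-subclass debiased embeddings in round-robin order. The sketch below illustrates this scheduling step only; the embedding names are hypothetical placeholders, and the actual image generation with stable diffusion is outside the sketch:

```python
import itertools

def balanced_embedding_schedule(embedding_names, n_samples):
    """Cycle through the per-subclass debiased text embeddings so each
    subclass (e.g. man/woman x glasses/no-glasses) gets an equal share
    of the sampling budget."""
    cycle = itertools.cycle(embedding_names)
    return [next(cycle) for _ in range(n_samples)]

# Hypothetical identifiers; in practice each would map to the debiased
# text embedding fed to the diffusion model for that subclass.
names = ["man_glasses", "man_no_glasses", "woman_glasses", "woman_no_glasses"]
schedule = balanced_embedding_schedule(names, 8)
print(schedule.count("man_glasses"))  # 2
```

When `n_samples` is a multiple of the number of subclasses, every subclass receives exactly the same count.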

Figure 6. t-SNE feature visualization. Left: image features of the training dataset, image features of the sampled dataset, and text embedding features before and after debiasing. Right: text embedding features before and after debiasing. Circles: training data. Crosses: sampled data. Triangles: text embeddings.

We visualize the t-SNE features of both the image and text embeddings in Figure 6. From the left part, we can observe that the sampled data lie very close to the training data of the same subclass, which indicates the effectiveness of our method. From the right part, the debiased text embeddings for each subclass are “pushed away” from the original embedding (“face photo of a person”) and move toward the corresponding training data.
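A joint visualization like Figure 6 requires embedding all three feature groups into one shared 2-D space, so they must be stacked before running t-SNE and split apart afterward. A minimal sketch with `scikit-learn`; the random feature matrices and their shapes are placeholders standing in for the real image and text features:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Placeholder features; in practice these would be the image features of
# the training and sampled data plus the text embeddings (same dim).
train_feats = rng.normal(size=(40, 64))
sample_feats = rng.normal(size=(40, 64))
text_feats = rng.normal(size=(5, 64))   # original + 4 debiased embeddings

# t-SNE must see all groups at once to place them in a shared 2-D space.
all_feats = np.vstack([train_feats, sample_feats, text_feats])
coords = TSNE(n_components=2, perplexity=10,
              random_state=0).fit_transform(all_feats)

# Split the 2-D coordinates back into the three groups for plotting
# (circles, crosses, triangles as in Figure 6).
train_2d = coords[:40]
sample_2d = coords[40:80]
text_2d = coords[80:]
print(text_2d.shape)  # (5, 2)
```

Running t-SNE separately per group would give incomparable coordinate systems, which is why the stacking step matters.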

In conclusion, the results above demonstrate the effectiveness of our method in debiasing both single and multiple attributes.