Open Access
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
Author(s)
Lukas Struppek,
Dominik Hintersdorf,
Felix Friedrich,
Manuel Brack,
Patrick Schramowski,
Kristian Kersting
Publication year: 2024
Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in a textual description, common models reflect cultural stereotypes and biases in their generated images. We analyze this behavior both qualitatively and quantitatively, and identify a model's text encoder as the root cause of the phenomenon. Additionally, malicious users or service providers may try to intentionally bias the image generation to create racist stereotypes by replacing Latin characters with similarly-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
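The homoglyph manipulation described above can be illustrated with a minimal sketch (not code from the paper): swapping a single Latin character in a prompt for a visually near-identical Cyrillic one yields a string that renders alike but is a different Unicode sequence, so a text encoder tokenizes it differently. The prompt text here is a hypothetical example.

```python
import unicodedata

latin_prompt = "a photo of a city"

# Cyrillic small letter o (U+043E) is a homoglyph of Latin "o" (U+006F).
cyrillic_o = "\u043e"
attacked_prompt = latin_prompt.replace("o", cyrillic_o, 1)

# The two prompts look the same when rendered, yet compare unequal.
print(latin_prompt == attacked_prompt)  # False

# The swapped character belongs to a different Unicode script.
for ch in attacked_prompt:
    if ord(ch) > 0x7F:
        print(hex(ord(ch)), unicodedata.name(ch))
```

Because the encoder sees distinct codepoints, the manipulated prompt can steer generation toward cultural associations of the inserted script without the change being visible to a human reader.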