AIModels.fyi

Getting a consistent character across image generation runs

Maintaining visual coherence of the same character across different generated images

aimodels-fyi
Nov 21, 2023
The Chosen One: Generating Consistent Characters from Text Descriptions Using AI

Recent breakthroughs in AI-generated imagery via text-to-image models like DALL-E and Stable Diffusion have unlocked new creative possibilities. However, a persistent challenge is maintaining visual coherence of the same character across different generated images.

Researchers from Google, Hebrew University of Jerusalem, Tel Aviv University, and Reichman University proposed an automated solution to enable consistent character generation in a paper titled "The Chosen One: Consistent Characters in Text-to-Image Diffusion Models."

Let's see what they uncovered and how we can use it in our own work!

Standard vs. new approach
"Figure 2. Identity consistency. Given the prompt “a Plasticine of a cute baby cat with big eyes”, a standard text-to-image diffusion model produces different cats (all corresponding to the input text), whereas our method produces the same cat."

The Context

Text-to-image models like DALL-E and Stable Diffusion are deep neural networks trained on vast datasets of captioned images to generate realistic images from text. At generation time, the model is conditioned on an embedding of the prompt so that the resulting image aligns with the text description.

However, these models struggle to maintain consistency across multiple images of the same character, even when given the same or similar text prompts, because generation is stochastic and each run draws new random noise. For instance, asking the model to generate "a white cat" several times may produce cats with differing fur patterns, colors, poses, etc. in each image.

This inability to preserve identity coherence poses challenges for applications like:

  • Illustrating stories or textbooks with recurring characters

  • Building unique brand personalities and mascots

  • Designing video game assets and virtual characters

  • Creating advertising campaigns with memorable spokespersons

Without consistency, characters change appearance unpredictably across images. Creators often resort to manual techniques like depicting the character in multiple poses first or carefully hand-picking the best results.
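
To make "consistency" a little more concrete, one option is to embed each generated image with a feature extractor and check how similar the embeddings are to each other. The sketch below is only an illustration under stated assumptions: the embedding vectors are made-up numbers rather than real model outputs, the helper names are hypothetical, and mean pairwise cosine similarity is just one plausible score, not necessarily the measure used in the paper.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over every pair of embeddings.

    Values near 1.0 suggest the images depict a similar-looking subject;
    lower values suggest the character drifts between images.
    """
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical 3-dimensional embeddings (illustrative only, not real data).
# In practice these would come from an image feature extractor.
drifting_character = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
stable_character = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]

print("drifting:", mean_pairwise_similarity(drifting_character))
print("stable:  ", mean_pairwise_similarity(stable_character))
```

A score like this could help when hand-picking results: of several batches generated for the same character, the batch with the higher mean similarity is the more visually coherent one, at least according to whichever feature extractor produced the embeddings.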

The Approach
