ML//diffusion model//Imagen

Google's text-to-image model family: Imagen (2022), Imagen 2 (2023), Imagen 3 (2024).

Architecture (original Imagen, 2022): frozen T5-XXL text encoder → 64×64 base diffusion model → two cascaded super-resolution diffusion models (64 → 256 → 1024 pixels per side)
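
A rough sketch of how the cascade hands data from stage to stage, assuming the layout described above. Every function name here (`encode_text`, `base_stage`, `super_resolve`, `generate`) is hypothetical, and each stage is a toy stand-in: random numbers replace the T5-XXL embeddings and the base model's samples, and nearest-neighbour upsampling replaces the super-resolution diffusion models. Only the data flow and the resolutions are meant to mirror the real pipeline.

```python
import random


def encode_text(prompt, dim=8):
    """Stand-in for the frozen text encoder: one pseudo-embedding per token."""
    rng = random.Random(prompt)  # seeded by the prompt so output is repeatable
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in prompt.split()]


def base_stage(text_emb, res=64):
    """Stand-in for the text-conditioned base model: a res x res grayscale image."""
    rng = random.Random(len(text_emb))  # toy "conditioning" on the text
    return [[rng.random() for _ in range(res)] for _ in range(res)]


def super_resolve(img, text_emb, out_res):
    """Stand-in for a super-resolution stage (nearest-neighbour, not diffusion).

    text_emb is accepted but unused here; it marks where a real stage
    would take its text conditioning (an assumption of this sketch).
    """
    in_res = len(img)
    if out_res % in_res != 0:
        raise ValueError("out_res must be a multiple of the input resolution")
    factor = out_res // in_res
    return [
        [img[y // factor][x // factor] for x in range(out_res)]
        for y in range(out_res)
    ]


def generate(prompt, resolutions=(64, 256, 1024)):
    """Run the toy cascade and record the resolution after each stage."""
    text_emb = encode_text(prompt)
    img = base_stage(text_emb, resolutions[0])
    trace = [len(img)]
    for res in resolutions[1:]:
        img = super_resolve(img, text_emb, res)
        trace.append(len(img))
    return img, trace


if __name__ == "__main__":
    image, trace = generate("a corgi riding a bicycle")
    print(trace)  # resolution after each stage
```

The point of the cascade is that the base model only has to get composition right at low resolution, and each later stage adds detail on top of the previous stage's output.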

Key finding: scaling the text encoder improves sample fidelity and image-text alignment more than scaling the image diffusion model. Language understanding drives image quality.

Less open than Stable Diffusion (model weights were not released) but competitive in quality. Google's counterpart to OpenAI's DALL-E.