Google’s Whisk AI generator will ‘remix’ the pictures you plug in

A photo of a green bear from Whisk. — *An AI-generated image I made in Whisk* *using Google’s suggested images as prompts.* | Image: Google via Whisk

Google has announced a new AI tool called Whisk that lets you generate images using other images as prompts instead of requiring a long text prompt.

With Whisk, you can offer images to suggest what you’d like as the subject, the scene, and the style of your AI-generated image, and you can prompt Whisk with multiple images for each of those three things. (If you want, you can fill in text prompts, too.) If you don’t have images on hand, you can click a dice icon to have Google fill in some images for the prompts (though those images also appear to be AI-generated). You can also enter some text into a text box at the end of the process if you want to add extra detail about the image you’re looking for, but it’s not required.

Whisk will then generate images and a text prompt for each image. You can favorite or download the image if you’re happy with the results, or you can refine an image by entering more text into the text box or clicking the image and editing the text prompt.

A screenshot of Google’s Whisk tool. — A screenshot of Whisk. I clicked the dice to generate a subject, scene, and style. I swapped out the auto-generated scene by entering a text prompt. Whisk created the first two images, which I iterated on by asking Whisk to add some steam around the subject (because it’s a fire being in water), resulting in the next two images.

In the few minutes I’ve used the tool while writing this story, it’s been entertaining to tinker with. Images take a few seconds to generate, which is annoying, and while the images have been a little strange, everything I’ve generated has been fun to iterate on.

Google says Whisk uses the “latest” iteration of its Imagen 3 image generation model, which it announced today. Google also introduced Veo 2, the next version of its video generation model, which the company says has an understanding of “the unique language of cinematography” and hallucinates things like extra fingers “less frequently” than other models (one of those other models is probably OpenAI’s Sora). Veo 2 is coming first to Google’s VideoFX, which you can get on the Google Labs waitlist for, and it will be expanded to YouTube Shorts “other products” sometime next year.

An AI-generated image I made in Whisk using Google’s suggested images as prompts. | Image: Google via Whisk

Google has announced a new AI tool called Whisk that lets you generate images using other images as prompts instead of requiring a long text prompt.
With Whisk, you can offer images to suggest what you’d like as the subject, the scene, and the style of your AI-generated image, and you can prompt Whisk with multiple images for each of those three things. (If you want, you can fill in text prompts, too.) If you don’t have images on hand, you can click a dice icon to have Google fill in some images for the prompts (though those images also appear to be AI-generated). You can also enter some text into a text box at the end of the process if you want to add extra detail about the image you’re looking for, but it’s not required.
Whisk will then generate images and a text prompt for each image. You can favorite or download the image if you’re happy with the results, or you can refine an image by entering more text into the text box or clicking the image and editing the text prompt.

Screenshot by Jay Peters / The Verge
A screenshot of Whisk. I clicked the dice to generate a subject, scene, and style. I swapped out the auto-generated scene by entering a text prompt. Whisk created the first two images, which I iterated on by asking Whisk to add some steam around the subject (because it’s a fire being in water), resulting in the next two images.

In a blog post, Google stresses that Whisk is designed to be for “rapid visual exploration, not pixel-perfect edits.” The company also says that Whisk may “miss the mark,” which is why it lets you edit the underlying prompts.
In the few minutes I’ve used the tool while writing this story, it’s been entertaining to tinker with. Images take a few seconds to generate, which is annoying, and while the images have been a little strange, everything I’ve generated has been fun to iterate on.
Google says Whisk uses the “latest” iteration of its Imagen 3 image generation model, which it announced today. Google also introduced Veo 2, the next version of its video generation model, which the company says has an understanding of “the unique language of cinematography” and hallucinates things like extra fingers “less frequently” than other models (one of those other models is probably OpenAI’s Sora). Veo 2 is coming first to Google’s VideoFX, which you can get on the Google Labs waitlist for, and it will be expanded to YouTube Shorts “other products” sometime next year. The Verge – Tech Posts