Google has launched Whisk, its latest AI experiment, which brings a new approach to image generation to users of its Labs testing programme. In a blog post published on December 16, Google explained that Whisk differs from other tools by letting users prompt the system with images instead of lengthy text descriptions. The approach is aimed at people who find it hard to put their ideas into words.
Whisk uses Google's Gemini and Imagen 3 models to generate new images from three types of user input: a subject, a scene, and a style. The AI extracts key characteristics from the uploaded images and combines them into a new rendition, such as a "Whimsical Walrus" illustration or a "pink donut enamel pin." Google cautioned, however, that Whisk does not replicate the input images exactly; instead, it captures their "essence." As a result, attributes such as height, hairstyle, and skin tone may differ in the output, details the company acknowledges matter to users. The platform will also feature a "review and edit" function that lets users adjust details after an image has been generated.
Whisk is currently available for testing to users in the U.S., who can sign up to try image-based prompting.
Google has also announced updates to its existing AI models, Imagen 3 and Veo 2. The new version of Imagen 3, with a global rollout on December 16, is designed to deliver "brighter" and "better composed" images. Reports indicate that it also follows user descriptions more closely and renders finer detail in its outputs. The update follows a re-introduction of Imagen 3 to users after its quieter launch earlier in the year.
The updated Veo 2 video generator is also set to widen its scope, letting users create high-quality videos with greater detail and at higher resolutions, up to 4K. Users can now request specific settings, such as an "18mm lens," and the AI interprets the instruction to produce wide-angle shots. The improvements also reduce the frequency of "hallucinated," or inaccurate, details in generated videos.
The rollout of the updated Veo 2 began on the same day, with Google gradually expanding the number of participants in the testing phase. The company also hinted at integrating Veo 2 with YouTube Shorts and other products in the coming year.
Whisk and the updates to Imagen 3 and Veo 2 reflect Google's commitment to expanding AI capabilities in image and video generation, and they position the company within the broader trend towards automation in business. As the technology evolves, businesses may see significant changes in their creative processes, particularly in how visual and multimedia content is produced.
Source: Noah Wire Services