Google is updating its Bard AI chatbot to step up its competition with rival OpenAI’s ChatGPT. The Sundar Pichai-led internet giant today announced it is expanding Bard to now include image generation capabilities, powered by its own Imagen 2 AI model, as well as a more capable version of Gemini Pro. The move gives more people access to Bard’s AI smarts, including a new free tool to create AI images.
“These updates make Bard an even more helpful and globally accessible AI collaborator for everything from big, creative projects to smaller, everyday tasks,” Jack Krawczyk, product lead for Bard, noted in a blog post.
Gemini Pro and Image Generation
Over a month ago, Google announced Gemini in three sizes: Nano for mobile devices, Pro for more intermediate use cases, and Ultra. Third-party comparisons found that Gemini Pro, the most powerful large language model (LLM) currently available from Google, actually lags behind even OpenAI’s older GPT-3.5 Turbo. That is a worrying sign for Google as it seeks to show the world it has the juice to take on the new insurgents in the generative AI race.
Google did release a fine-tuned version of Gemini Pro on Bard last month, but only in English. Today’s announcement includes the availability of Gemini Pro in over 40 languages, across more than 230 countries and territories. This expands the reach of Gemini Pro’s advanced understanding, summarizing, reasoning, and coding capabilities, as well as Bard’s double-check feature, which validates a response by searching across the web.
The latest update for Bard also introduces AI image generation capabilities. This is made possible by the Imagen 2 model, which can produce high-quality, photorealistic outputs from text inputs, similar to OpenAI’s DALL-E 3 image generator available in the ChatGPT Plus subscription tiers. Users can simply type in a description, and Bard will generate custom visuals to bring the idea to life. However, aspect ratio changes and prompts in languages other than English are not currently supported.
“Just type in a description — like ‘create an image of a dog riding a surfboard’ — and Bard will generate custom, wide-ranging visuals to help bring your idea to life,” Krawczyk noted.
Image generation on Bard produces outputs in about 30-40 seconds with good consistency, but in some cases it fails to generate an image at all, even when the prompt does not involve famous individuals, who are filtered out. With that filter, Google aims to avoid the kind of scandalous deepfakes that have occurred before.
In terms of legal and ethical concerns, Google Bard allows users to report any issues related to data protection, copyright, or other laws for all generated media. The company also limits the production of violent, offensive, or sexually explicit content and uses digitally identifiable watermarks to differentiate between AI-generated visuals and those created by human artists.
Other AI Experimental Projects
Google also announced the experimental tool ImageFX, which is powered by Imagen 2 for image generation. Available in AI Test Kitchen, Google’s app for experimental AI projects, ImageFX aims to spur creative ideas by providing users with adjacent dimensions and suggestions to iterate on their prompt. Similar features are found in competing tools like Ideogram.
The AI Test Kitchen also includes other interesting experimental projects from Google:
- MusicFX: Creates tunes up to 70 seconds in length with text prompts and expressive chips
- TextFX: A generative AI experiment for lyricists, wordsmiths, and other creative artists