Google Expands Bard AI Chatbot with Image Generation Capabilities

03/22/2024

Google is updating its Bard AI chatbot to step up its competition with rival OpenAI’s ChatGPT. The Sundar Pichai-led internet giant today announced it is expanding Bard to now include image generation capabilities, powered by its own Imagen 2 AI model, as well as a more capable version of Gemini Pro. The move gives more people access to Bard’s AI smarts, including a new free tool to create AI images.

“These updates make Bard an even more helpful and globally accessible AI collaborator for everything from big, creative projects to smaller, everyday tasks,” Jack Krawczyk, product lead for Bard, noted in a blog post.

Gemini Pro and ImageFX

Over a month ago, Google announced Gemini in three sizes: Nano for mobile devices, Pro for more intermediate use cases, and Ultra. Third-party comparisons between Gemini Pro, the most powerful large language model (LLM) currently available from Google, and other models found that it actually lags behind even OpenAI’s older GPT-3.5 Turbo, a worrying sign for Google as it seeks to show the world it has the juice to take on the new insurgents in the generative AI race.

Google did release a fine-tuned version of Gemini Pro on Bard last month, but only in English. Today’s announcement includes the availability of Gemini Pro in over 40 languages, across more than 230 countries and territories. This expands the reach of Gemini Pro’s advanced understanding, summarizing, reasoning, and coding capabilities, as well as Bard’s double-check feature, which validates a response by searching across the web.

The latest update for Bard also introduces AI image generation capabilities. This is made possible by the Imagen 2 model, which can produce high-quality, photorealistic outputs from text inputs, similar to the DALL-E 3 image generator available in OpenAI’s ChatGPT Plus subscription tiers. Users can simply type in a description, and Bard will generate custom visuals to bring the idea to life. However, aspect ratio changes and prompts in languages other than English are not currently supported.

“Just type in a description — like ‘create an image of a dog riding a surfboard’ — and Bard will generate custom, wide-ranging visuals to help bring your idea to life,” Krawczyk noted.

While image generation on Bard produces outputs in about 30-40 seconds with good consistency, it sometimes fails to generate an image at all, even when the prompt does not involve famous individuals. Prompts naming such individuals are filtered out, as Google aims to avoid the kind of scandalous deepfakes that have occurred before.

In terms of legal and ethical concerns, Google Bard allows users to report any issues related to data protection, copyright, or other laws for all generated media. The company also limits the production of violent, offensive, or sexually explicit content and uses digitally identifiable watermarks to differentiate between AI-generated visuals and those created by human artists.

Other AI Experimental Projects

Google also announced the experimental tool ImageFX, which is powered by Imagen 2 for image generation. Available in AI Test Kitchen, Google’s app for experimental AI projects, ImageFX aims to spur creative ideas by offering users adjacent dimensions and suggestions for iterating on their prompts. Similar features are found in competing tools such as Ideogram.

The AI Test Kitchen also includes other interesting experimental projects from Google:

MusicFX: Creates tunes up to 70 seconds in length with text prompts and expressive chips
TextFX: A generative AI experiment for lyricists, wordsmiths, and other creative artists
