Reka Introduces Yasa-1: A Multimodal AI Assistant That Takes AI Capabilities to the Next Level

10/07/2023

Reka, the AI startup founded by researchers from DeepMind, Google, and Meta, has unveiled Yasa-1, a multimodal AI assistant that understands images, short videos, and audio snippets in addition to text. The assistant, currently in private preview, can be customized on private datasets, letting enterprises build new experiences for a wide range of use cases.

Enhanced Capabilities and Language Support

Yasa-1 supports 20 languages and goes beyond basic text-based answers: it can ground its answers in context from the internet, process long-context documents, and execute code. The assistant competes directly with OpenAI’s ChatGPT, which recently received its own multimodal upgrade. “I’m proud of what the team has achieved, going from an empty canvas to an actual full-fledged product in under 6 months,” commented Yi Tay, the chief scientist and co-founder of Reka.

Reka has invested significant effort in developing Yasa-1, from pretraining the base models to optimizing the training and serving infrastructure. Because the assistant is relatively new, Reka acknowledges that it has limitations, which the company plans to address in the coming months. Yasa-1 is available via APIs and as Docker containers for on-premise or VPC deployment, giving enterprises flexibility in how they deploy it.

Unparalleled Understanding of Multimodal Inputs

The key strength of Yasa-1 is its ability to understand images, audio, and short video clips as well as text. Using a unified model trained by Reka, the assistant can combine text prompts with multimedia files to give more specific answers. Example tasks include generating a social media post from an image of a product or detecting a particular sound and its source. Yasa-1 can also analyze videos, identify the topics discussed, and predict future actions, which makes it useful for video analytics.

“For multimodal tasks, Yasa excels at providing high-level descriptions of images, videos, or audio content,” said Reka in a blog post. “However, without further customization, its ability to discern intricate details in multimodal media is limited. For the current version, we recommend audio or video clips be no longer than one minute for the best experience.”

Reka also notes that, like other language and multimodal models, Yasa-1 can sometimes generate incorrect or exaggerated information, and it advises against relying solely on the assistant for critical advice.

Additional Features and Future Developments

Beyond multimodality, Yasa-1 supports 20 languages, processes long-context documents, and can execute code, a feature limited to on-premise deployments. Code execution lets users perform arithmetic operations, analyze spreadsheets, or create data visualizations. Users can also choose to integrate real-time information from commercial search engines into Yasa-1’s answers.

Reka plans to make Yasa-1 available to more enterprises in the coming weeks while continuing to improve the assistant’s capabilities and address its limitations. Reka’s ultimate goal is to build a future where superintelligent AI collaborates with humans to tackle major challenges and make a positive impact across domains.

“We are proud to have one of the best models in its compute class, but we are only getting started,” stated Reka. “Yasa is a generative agent with multimodal capabilities. It is a first step towards our long-term mission to build a future where superintelligent AI is a force for good, working alongside humans to solve our major challenges.”

Although Reka’s team includes researchers with backgrounds at companies like Meta and Google, the company is still relatively new to the AI field. Just three months after emerging from stealth with $58 million in funding, Reka finds itself competing against industry giants such as Microsoft-backed OpenAI and Amazon-backed Anthropic. Other notable competitors include Inflection AI, which has raised nearly $1.5 billion, and Adept, which has raised $415 million.
