Meta’s Llama 2 Long AI Model Outperforms Competition in Long Prompts

10/05/2023

Meta Platforms recently unveiled several new AI features for its popular consumer-facing services, Facebook, Instagram, and WhatsApp, at the Meta Connect conference. However, the most notable development from the company came quietly through a computer science paper published by Meta researchers on arXiv.org.

The paper introduces Llama 2 Long, an enhanced AI model derived from Meta’s open source Llama 2. This version underwent continual pretraining on longer training sequences, using a dataset in which long texts are upsampled. According to the researchers, these changes allowed the long-context model to surpass leading competitors, such as OpenAI’s GPT-3.5 Turbo and Anthropic’s Claude 2, at generating responses to long user prompts.

Improving Performance Through Enriched Dataset and Model Architecture

The Meta researchers extended the original Llama 2 training with an additional 400 billion tokens drawn from longer text sources. They kept the Llama 2 architecture unchanged for Llama 2 Long except for one necessary adjustment to the positional encoding, the Rotary Positional Embedding (RoPE). RoPE encodes a token’s position by rotating its query and key vectors by an angle proportional to that position, so attention scores depend on the relative distance between tokens and no separate table of position embeddings has to be learned or stored.
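
To make the rotation idea concrete, here is a minimal sketch of RoPE in plain Python. It is an illustration under stated assumptions, not Meta’s implementation: the function names (rope_rotate, attention_score) are hypothetical, the vectors are tiny, and the base of 10,000 is the commonly used default rather than a figure from this article. It shows that rotating a query and a key by their positions yields a score that depends only on how far apart the two tokens are.

```python
import math

def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of dimensions by a position-dependent angle.

    Hypothetical helper for illustration; `vec` must have even length.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, dim, 2):
        # Each pair has its own frequency; later pairs rotate more slowly.
        theta = base ** (-i / dim)
        angle = position * theta
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by `angle`.
        out.append(x * math.cos(angle) - y * math.sin(angle))
        out.append(x * math.sin(angle) + y * math.cos(angle))
    return out

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def attention_score(query, key, q_pos, k_pos, base=10000.0):
    """Unnormalized attention score after applying RoPE to both vectors."""
    return dot(rope_rotate(query, q_pos, base), rope_rotate(key, k_pos, base))

if __name__ == "__main__":
    q = [1.0, 0.5, -0.3, 2.0]
    k = [0.2, -1.0, 0.7, 0.1]
    # Same relative offset (3 tokens apart) at two different absolute positions.
    print(attention_score(q, k, 5, 2))
    print(attention_score(q, k, 105, 102))
```

The two printed scores agree up to floating-point error, which is the property that lets the model reason about relative rather than absolute positions.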

By decreasing the rotation angle of the RoPE encoding, the researchers reduced how quickly the positional signal fades with distance, which lets Llama 2 Long take more distant tokens in its context into account. They also used reinforcement learning from human feedback (RLHF) and synthetic data generated by Llama 2 Chat to further improve the model’s performance in coding, math, language understanding, common sense reasoning, and question answering.
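
The effect of a smaller rotation angle can be sketched numerically. In RoPE the angle per token is set by a base hyperparameter, and a larger base means smaller angles. The snippet below is a rough illustration, not the paper’s code: the helper names are hypothetical, and the base values of 10,000 and 500,000 and the dimension of 128 are assumptions chosen for the example, since the article gives no figures. The mean cosine is used only as a simple proxy for how much of the positional signal between two identical vectors survives at a given distance.

```python
import math

def rope_frequencies(dim, base):
    """Per-pair rotation frequencies, one for each pair of dimensions."""
    return [base ** (-2 * j / dim) for j in range(dim // 2)]

def rotation_angles(distance, dim, base):
    """Angle each pair is rotated by for two tokens `distance` apart."""
    return [distance * f for f in rope_frequencies(dim, base)]

def mean_cosine(distance, dim, base):
    """Average cos(angle) over all pairs.

    For identical query and key vectors with equal energy in every pair,
    this equals their normalized RoPE attention score at that distance.
    """
    angles = rotation_angles(distance, dim, base)
    return sum(math.cos(a) for a in angles) / len(angles)

if __name__ == "__main__":
    dim = 128                 # assumed head dimension, for illustration
    small_base = 10_000.0     # assumed original base
    large_base = 500_000.0    # assumed larger base (smaller rotation angles)
    for distance in (0, 1_000, 8_000, 32_000):
        print(
            distance,
            round(mean_cosine(distance, dim, small_base), 3),
            round(mean_cosine(distance, dim, large_base), 3),
        )
```

Running the loop prints the proxy score for each base side by side, which makes it easy to compare how the two settings behave as the distance between tokens grows.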

Implications for the Open-Source AI Community

The release of Llama 2 Long and its exceptional performance have drawn significant attention and enthusiasm from the open-source AI community on platforms like Reddit, Twitter, and Hacker News. The result is seen as validation of Meta’s commitment to an “open source” approach in generative AI and as evidence that open-source models can compete with closed-source alternatives offered by well-funded startups.
