Stability AI Releases Smaller, More Powerful Language Model

Size plays a significant role in the capability of large language models (LLMs). Stability AI, known for its Stable Diffusion text-to-image generative AI technology, has unveiled one of its smallest models yet: Stable LM 2 1.6B. This text-generation LLM is the company's second model release of 2024, following the launch of Stable Code 3B earlier this week.

The Compact and Powerful Stable LM 2 Model

The Stable LM 2 model from Stability AI aims to lower barriers to entry and let more developers participate in the generative AI ecosystem. It is trained on multilingual data in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. By leveraging recent advances in language modeling algorithms, Stability AI aims to strike an optimal balance between speed and performance.

“In general, larger models trained on similar data with a similar training recipe tend to do better than smaller ones,” says Carlos Riquelme, Head of the Language Team at Stability AI. “However, over time, as new models implement better algorithms and are trained on higher quality data, we sometimes witness recent smaller models outperforming older larger ones.”

Stability AI claims that Stable LM 2 1.6B outperforms other small language models on various benchmarks, including TinyLlama 1.1B, Falcon 1B, and even Microsoft’s Phi-2, which at 2.7 billion parameters is considerably larger. Notably, the 1.6B model also surpasses Stability AI’s own earlier Stable LM 3B model.

“Stable LM 2 1.6B performs better than some larger models that were trained a few months ago,” explains Riquelme. “If you think about computers, televisions, or microchips, we could roughly see a similar trend – they got smaller, thinner, and better over time.”

Although the smaller size of Stable LM 2 1.6B comes with trade-offs, such as potentially higher hallucination rates and a greater risk of toxic language, Stability AI believes in the value of smaller, more capable LLM options.

Building on the Model’s Capabilities

Stability AI has focused on developing smaller, more efficient LLMs over the past few months. In December 2023 it released the StableLM Zephyr 3B model, which offered improved performance at a smaller size than the initial StableLM iteration from April 2023.

“During training, the model gets sequentially updated and its performance improves,” says Riquelme. “The very first model knows nothing, while the last one has consumed and hopefully learned most aspects of the data.”

Stability AI is making the new model available in both pre-trained and fine-tuned versions, and is also giving researchers access to a “half-cooked” model checkpoint captured before the final stage of pre-training. This allows individual developers to innovate on, transform, and build upon the current model.

“Our goal is to provide more tools and artifacts for developers to specialize the model to other tasks or datasets they may want to use,” states Riquelme. “We believe in people’s ability to leverage new tools and models in awesome and surprising ways.”
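
For developers who want to try the released checkpoints, here is a minimal sketch of loading and sampling from the model with the Hugging Face transformers library. The model id "stabilityai/stablelm-2-1_6b" is an assumption based on Stability AI's naming conventions; check the model hub for the exact identifier, available fine-tuned variants, and license terms.

```python
# Minimal sketch: load a Stable LM 2-style checkpoint and generate text.
# The model id below is an assumption; verify it on the Hugging Face hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-1_6b"  # assumed id; fine-tuned variants ship separately

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # at 1.6B parameters, the model fits on a single consumer GPU
    trust_remote_code=True,      # some Stability AI repos ship custom model code
)

prompt = "The weather today is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```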
