Stability AI, known for its Stable Diffusion text-to-image generative AI models, has expanded its offerings with the release of StableLM Zephyr 3B. The new model is a 3 billion parameter large language model (LLM) designed specifically for chat use cases, including text generation, summarization, and content personalization.
The key highlight of StableLM Zephyr 3B is its smaller size compared to the 7-billion-parameter StableLM models. The reduced size allows deployment on a wider range of hardware with a lower resource footprint while still delivering rapid responses. The model has been optimized for Q&A and instruction-following tasks.
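For readers who want to try the model locally, the sketch below shows one way to run a chat prompt with the Hugging Face `transformers` library. The model ID, generation settings, and chat-template usage are illustrative assumptions, not an official quickstart from Stability AI.

```python
# Minimal sketch: running a chat prompt against StableLM Zephyr 3B with transformers.
# Model ID and generation parameters are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-zephyr-3b"  # assumed Hugging Face Hub checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Build a chat-formatted prompt from a single user message.
messages = [{"role": "user", "content": "Summarize the benefits of small language models."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

# Generate a reply and strip the prompt tokens before decoding.
output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

On modest hardware, loading in a lower-precision format (as with `torch.bfloat16` above) is one way to keep the memory footprint small, which is the deployment scenario the 3B model is aimed at.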
“StableLM was trained for longer on better quality data than prior models, for example with twice the number of tokens of LLaMA v2 7b which it matches on base performance despite being 40% of the size,” said Emad Mostaque, CEO of Stability AI.
StableLM Zephyr 3B is an extension of the pre-existing StableLM 3B-4e1t model and draws inspiration from the Zephyr 7B model developed by HuggingFace. Its use of the Direct Preference Optimization (DPO) training approach, which is typically applied to larger models, sets StableLM Zephyr apart. Stability AI leveraged DPO with the UltraFeedback dataset from the OpenBMB research group to train the model effectively.
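To make the DPO step concrete, the following is a minimal sketch of preference tuning with the open-source TRL library. The dataset variant, hyperparameters, and trainer arguments are assumptions for illustration; they are not Stability AI's actual training recipe.

```python
# Minimal sketch of Direct Preference Optimization (DPO) fine-tuning with TRL.
# Dataset choice and hyperparameters are illustrative, not Stability AI's recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "stabilityai/stablelm-3b-4e1t"  # base model named in the article
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# A preference dataset with "prompt", "chosen", and "rejected" columns,
# e.g. a binarized variant of UltraFeedback (assumed here).
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

config = DPOConfig(
    output_dir="stablelm-dpo",
    beta=0.1,                      # strength of the KL penalty toward the reference model
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,    # older TRL versions use tokenizer= instead
)
trainer.train()
```

The appeal of DPO here is that it optimizes the model directly on human preference pairs (chosen vs. rejected responses) without training a separate reward model, which keeps the alignment step tractable even for a 3B-parameter model.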
Stability AI’s continued innovation is evident in the release of StableLM Zephyr 3B, which adds to its growing list of model offerings. Previous releases include StableCode for application code development, Stable Audio for text-to-audio generation, and Stable Video Diffusion for video generation.
“We believe that small, open, performant models tuned to users’ own data will outperform larger general models,” Mostaque emphasized. “With the future full release of our new StableLM models, we look forward to democratizing generative language models further.”