The Allen Institute for AI (AI2), a non-profit research institute founded in 2014 by the late Microsoft co-founder Paul Allen, today announced the release of OLMo, which it calls a “truly open-source, state-of-the-art large language model” and an “alternative to current models that are restrictive and closed,” one it says will drive a “critical shift” in AI development.
While other models have included model code and model weights, OLMo also provides the training code, training data and associated toolkits, as well as evaluation toolkits. In addition, OLMo was released under an Open Source Initiative (OSI)-approved license, with AI2 saying that “all code, weights, and intermediate checkpoints are released under the Apache 2.0 License.”
The news comes at a moment when open source/open science AI, which has been playing catch-up to closed, proprietary LLMs like OpenAI’s GPT-4 and Anthropic’s Claude, is making significant headway. For example, yesterday the CEO of Paris-based open source AI startup Mistral confirmed the ‘leak’ of a new open-source AI model nearing GPT-4 performance. On Monday, Meta released a new and improved version of its code generation model, Code Llama 70B, as many eagerly await the third iteration of its Llama LLM.
Controversies Surrounding Open Source AI
However, open source AI continues to come under fire from some researchers, regulators and policymakers. A recent, widely shared opinion piece in IEEE Spectrum, for instance, is titled “Open-Source AI is Uniquely Dangerous.”
The OLMo framework’s “completely open” AI development tools, available to the public, include full pretraining data, training code, model weights, inference code, training metrics, and training logs. It also includes the evaluation suite used in development: more than 500 checkpoints per model, “from every 1000 steps during the training process,” plus evaluation code “under the umbrella of the Catwalk project.”
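Because the weights and inference code are openly released, the model can be loaded with standard open source tooling. Below is a minimal sketch using Hugging Face’s transformers library; the model ID (“allenai/OLMo-7B”) and the trust_remote_code flag reflect the launch-time Hugging Face setup and are assumptions not detailed in AI2’s announcement.

```python
# Minimal sketch: loading OLMo and sampling text via Hugging Face transformers.
# Assumes launch-time setup (`pip install ai2-olmo transformers`) and the
# "allenai/OLMo-7B" model ID; neither detail comes from AI2's announcement.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate a short continuation from a prompt.
inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```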
The researchers at AI2 said they will continue to iterate on OLMo with different model sizes, modalities, datasets, and capabilities.
Advancements in Open Source AI
“Many language models today are published with limited transparency,” said Hanna Hajishirzi, OLMo project lead, a senior director of NLP research at AI2, and a University of Washington professor, in a press release. “Without having access to training data, researchers cannot scientifically understand how a model is working. It’s the equivalent of drug discovery without clinical trials or studying the solar system without a telescope.”
“With our new framework, researchers will finally be able to study the science of LLMs, which is critical to building the next generation of safe and trustworthy AI.”
Nathan Lambert, a machine learning scientist at AI2, wrote on LinkedIn that “OLMo will represent a new type of LLM enabling new approaches to ML research and deployment because on a key axis of openness, OLMo represents something entirely different. OLMo is built for scientists to be able to develop research directions at every point in the development process and execute on them, which was previously not available due to incomplete information and tools.”
Jonathan Frankle, chief scientist at MosaicML and Databricks, called AI2’s OLMo release “a giant leap for open science.” Hugging Face’s CTO posted on X that the model/framework is “pushing the envelope of open source AI.” Meta chief AI scientist Yann LeCun contributed a quote to AI2’s press release: “Open foundation models have been critical in driving a burst of innovation and development around generative AI,” he said. “The vibrant community that comes from open source is the fastest and most effective way to build the future of AI.”
Correction: An earlier version of this article referred incorrectly to the AI2 LLM as the first truly open-source model. This is not the case. There have been prior open-source models. We regret the error.