California-based Nucleus AI Emerges from Stealth with Launch of Large Language Model

California-based startup Nucleus AI, founded by talent from Amazon and Samsung Research, has recently come out of stealth mode and unveiled its first product: a 22-billion-parameter large language model (LLM), now available under both an open-source MIT license and a commercial license. Positioned between the 13B and 34B model segments, this versatile model can be fine-tuned for a variety of generation tasks and products. Nucleus claims that its model outperforms others of similar size, and the company has long-term plans to leverage AI to revolutionize agriculture.

From Stealth to Innovation

Gnandeep Moturi, CEO of Nucleus AI, provided insights into the company’s trajectory during an interview with VentureBeat. While the 22-billion-parameter model is the current focus, Moturi revealed that the company will soon release state-of-the-art RetNet models. According to Moturi, these upcoming models will offer significant benefits in cost and inference speed. The company’s ambitious long-term goal is to transform the agriculture industry using AI.

Nucleus began training the 22B model around three and a half months ago, after securing compute resources from an early investor. To build a robust knowledge base for the model, the company drew on existing research and the open-source community. It pre-trained the LLM with a context length of 2,048 tokens, eventually covering a trillion tokens of data spanning the web, Wikipedia, Stack Exchange, arXiv, and code, providing a solid foundation of knowledge.

Expanding the Models

Nucleus plans to release additional versions of the 22B model, trained on 350 billion and 700 billion tokens. It will also introduce two RetNet models, with 3 billion and 11 billion parameters, pre-trained on a larger context length of 4,096 tokens. These smaller models combine the strengths of RNN and transformer neural network architectures, yielding remarkable gains in speed and cost. In initial experiments, they were found to be 15 times faster and required only a fraction of the GPU memory of comparable transformer models.

“So far, there’s only been research to prove that this could work. No one has actually built a model and released it to the public,” said Moturi.
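The research Moturi refers to is the retentive network (RetNet) architecture, whose retention mechanism can be computed in two mathematically equivalent ways: a parallel form for training (like a transformer) and a recurrent form for inference (like an RNN), which keeps only a small fixed-size state per head instead of a growing key-value cache. The following is a minimal illustrative sketch of that equivalence, with made-up dimensions and a single head; it is not Nucleus's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4          # toy sequence length and head dimension
gamma = 0.9          # per-head exponential decay factor

Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))

# Parallel form (training): a decayed causal mask D replaces softmax attention.
D = np.array([[gamma ** (n - m) if n >= m else 0.0 for m in range(T)]
              for n in range(T)])
out_parallel = (Q @ K.T * D) @ V

# Recurrent form (inference): one d x d state per head, O(1) work per token,
# so memory does not grow with sequence length the way a KV cache does.
S = np.zeros((d, d))
out_recurrent = np.zeros((T, d))
for n in range(T):
    S = gamma * S + np.outer(K[n], V[n])   # decay old state, add new token
    out_recurrent[n] = Q[n] @ S

# Both forms produce identical outputs.
assert np.allclose(out_parallel, out_recurrent)
```

The constant-size recurrent state is what drives the inference-speed and GPU-memory advantages described above: generation cost per token stays flat regardless of how long the context grows.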

While the models will be available for enterprise applications, Nucleus envisions a broader application for its AI research. Rather than creating traditional chatbots like competitors such as OpenAI, Anthropic, and Cohere, Nucleus intends to develop an intelligent operating system for agriculture. Their goal is to optimize supply and demand while mitigating uncertainties for farmers, similar to how Uber optimizes the taxi industry.

“We have a marketplace-type of idea where demand and supply will be hyper-optimized for farmers in such a way that Uber does for taxi drivers,” explained Moturi.

By addressing challenges such as climate change, lack of knowledge, and supply optimization, Nucleus aims to empower farmers and revolutionize the agricultural landscape. Language models will serve as the foundation of their marketplace, with contributions from the open-source community playing a crucial role in its development. Further details about the farming-centric operating system and the RetNet models will be announced later this month.
