Cloudflare, a globally distributed platform founded in 2009, is tackling the challenge of deploying AI models quickly for fast inference with a series of AI platform updates. As demand for AI model deployment grows, platforms increasingly need to keep pace with it.
Workers AI Service
Cloudflare’s new Workers AI service offers serverless AI inference delivered from its global network, letting organizations deploy AI models easily and safely without provisioning or managing their own inference infrastructure.
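To give a concrete picture, the sketch below shows roughly what calling Workers AI from a Worker looks like in TypeScript. The binding name and model identifier are assumptions drawn from Cloudflare’s public examples, not details confirmed in this article.

```typescript
// A hedged sketch of a Cloudflare Worker calling Workers AI via its AI binding.
// The binding name ("AI") and the model identifier are assumptions; both are
// configured per account (e.g. in wrangler.toml and the Workers AI model catalog).

export interface Env {
  AI: { run(model: string, inputs: Record<string, unknown>): Promise<unknown> };
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    // Serverless inference: the model runs on Cloudflare's network,
    // with no servers or GPUs for the developer to provision.
    const result = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      prompt: "Explain serverless AI inference in one sentence.",
    });
    return Response.json(result);
  },
};
```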
AI Gateway
Cloudflare’s new AI Gateway provides governance and observability for AI deployments, giving organizations the means to manage, monitor, and control their AI applications.
Cloudflare is also expanding its AI services through partnerships. Hugging Face is partnering with Cloudflare to enable easy deployment of models onto the Workers AI platform, and Cloudflare is collaborating with Microsoft, using Microsoft’s ONNX Runtime to power AI inference.
“One of Cloudflare’s ‘secret sauces’ is that we run a massive global network and one of the things we’re really good at is moving data, code and traffic, so the right thing is in the right place,” says John Graham-Cumming, CTO of Cloudflare.
Cloudflare’s Workers product line, which enables application code to run at the edge of its network, has steadily added services over the years. With the Workers AI launch, Cloudflare is deploying GPUs and AI-optimized CPUs across its distributed network to meet the demands of AI workloads.
The global scalability of Workers AI opens up a wide range of deployment possibilities, with image recognition and predictive analytics among the potential use cases. Through the Hugging Face partnership, users can deploy models onto Cloudflare’s network without writing any code.
To manage and control AI deployments, Cloudflare has introduced the Cloudflare AI Gateway. This gateway sits in front of AI applications and provides tools for managing, monitoring, and controlling the usage of those applications.
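As an illustration of how that front-door position works in practice, the sketch below routes an OpenAI-style chat request through an AI Gateway endpoint instead of calling the provider directly. The account ID, gateway name, and model are placeholders, and the exact URL shape is an assumption based on Cloudflare’s documentation rather than something stated here.

```typescript
// A hedged sketch: sending an OpenAI-style request through Cloudflare's AI Gateway
// so the gateway can observe and control traffic before it reaches the provider.
// ACCOUNT_ID, GATEWAY_NAME, and the URL format are placeholders/assumptions.

const ACCOUNT_ID = "<your-account-id>";
const GATEWAY_NAME = "<your-gateway>";

async function chatViaGateway(prompt: string, apiKey: string): Promise<unknown> {
  const url = `https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY_NAME}/openai/chat/completions`;

  const response = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
    }),
  });

  return response.json();
}
```

Because only the base URL changes, existing application code keeps working while the gateway gains a vantage point from which to observe and govern the traffic.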
Cloudflare’s Vectorize database stores vectorized data and embeddings across the company’s distributed network, keeping that data close to where Workers AI performs inference.
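To make the pairing concrete, the following sketch generates an embedding with Workers AI, stores it in a Vectorize index, and queries for nearest neighbours. The binding names, the embedding model, and the response shape are assumptions for illustration, not details from the announcement.

```typescript
// A hedged sketch combining Workers AI embeddings with a Vectorize index.
// Binding names ("AI", "VECTOR_INDEX"), the embedding model, and the response
// shape are assumptions; the point is that vectors are stored and queried on
// the same network where inference runs.

export interface Env {
  AI: { run(model: string, inputs: Record<string, unknown>): Promise<any> };
  VECTOR_INDEX: {
    insert(vectors: { id: string; values: number[] }[]): Promise<unknown>;
    query(vector: number[], options: { topK: number }): Promise<unknown>;
  };
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    const text = "Cloudflare launches Workers AI";

    // Embed the text with a Workers AI embedding model (model name assumed).
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [text] });
    const vector: number[] = embedding.data[0];

    // Store the vector, then look up its nearest neighbours in the index.
    await env.VECTOR_INDEX.insert([{ id: "doc-1", values: vector }]);
    const matches = await env.VECTOR_INDEX.query(vector, { topK: 3 });

    return Response.json(matches);
  },
};
```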
“We’re not there today with 300 cities, but you know, we’re going to be rolling out hardware all over the world for this,” explains Graham-Cumming. “That has been a logistic effort to get that right and we know we’re going to be in a lot of places very, very soon.”