
Red Hat updates enterprise AI platform

Red Hat, a hybrid cloud technology company, has released Red Hat AI 3, the latest version of its enterprise AI platform. The new release integrates Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI), and Red Hat OpenShift AI to streamline large-scale AI inference.

The platform is designed to help organisations transition workloads from proof-of-concept to production and enhance collaboration on AI-enabled applications.

“As enterprises scale AI from experimentation to production, they face a new wave of complexity, cost and control challenges,” says Joe Fernandes, VP and GM of the AI Business Unit at Red Hat. “With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimises these hurdles.

“By bringing new capabilities like distributed inference with llm-d and a foundation for agentic AI, we are enabling IT teams to more confidently operationalise next-generation AI, on their own terms, across any infrastructure.”

Approximately 95% of organisations see no measurable financial return on an estimated $40-billion in enterprise AI spending, according to The GenAI Divide: State of AI in Business, a report from the Massachusetts Institute of Technology’s NANDA project.

Red Hat AI 3 is designed to address these challenges by providing a unified platform for managing AI workloads across hybrid, multi-vendor environments. It enables organisations to scale and distribute workloads efficiently while improving collaboration on advanced AI applications, including agent-based systems. Built on open standards, the platform supports a range of models and hardware accelerators across datacentre, public cloud, and edge environments.

Enterprise AI inference

As organisations move AI initiatives into production, Red Hat says the emphasis shifts from training and tuning models to inference: the “doing” phase of enterprise AI. According to the company, Red Hat AI 3 emphasises scalable and cost-effective inference, building on the vLLM and llm-d community projects and Red Hat’s model optimisation capabilities to deliver production-grade serving of large language models (LLMs).
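
As a rough illustration of the inference layer involved, the sketch below uses the open-source vLLM Python API, which Red Hat AI 3 builds on, to run single-node batch inference. The model name is a placeholder, and production deployments would use the platform’s managed serving rather than a script like this.

```python
# Minimal sketch of single-node inference with the open-source vLLM
# Python API (the engine that llm-d distributes across Kubernetes nodes).
# The model ID is a placeholder; any Hugging Face-style model ID works.
from vllm import LLM, SamplingParams

prompts = [
    "Summarise the benefits of distributed LLM inference.",
    "What is a Mixture-of-Experts model?",
]

# Sampling settings control generation; vLLM batches prompts efficiently.
sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

llm = LLM(model="ibm-granite/granite-3.0-8b-instruct")  # placeholder model
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt)
    print(output.outputs[0].text)
```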

Red Hat has announced the general availability of llm-d in OpenShift AI 3.0, a framework designed to enhance how large language models operate within Kubernetes environments. The system supports intelligent distributed inference by leveraging Kubernetes orchestration and vLLM performance, alongside open-source components such as the Kubernetes Gateway API Inference Extension, NVIDIA’s NIXL low-latency data transfer library, and the DeepEP Mixture of Experts communication library.

Red Hat says this combination lets organisations run large-scale LLM inference predictably and cost-effectively on existing Kubernetes infrastructure.

llm-d extends vLLM from a single-node, high-performance inference engine into a distributed, scalable serving system that integrates closely with Kubernetes. It is designed to deliver predictable performance, support measurable ROI, and improve infrastructure planning. These enhancements aim to address the demands of managing variable LLM workloads and deploying large models such as Mixture-of-Experts (MoE) architectures.
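
To make the serving model concrete: vLLM, and by extension an llm-d deployment, exposes an OpenAI-compatible HTTP API, so client code needs nothing llm-d-specific. In the sketch below, the gateway URL and model name are hypothetical placeholders.

```python
# Hypothetical client call against an OpenAI-compatible endpoint, as
# exposed by vLLM and by llm-d-managed deployments on Kubernetes.
# The URL and model name are placeholders for illustration only.
import requests

BASE_URL = "http://llm-d-gateway.example.internal/v1"  # hypothetical

response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "granite-3.0-8b-instruct",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Explain distributed LLM inference."}
        ],
        "max_tokens": 200,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```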

Unified platform

Red Hat AI 3 provides a unified and adaptable platform designed to support the development of production-ready generative AI solutions. It aims to enable collaboration and streamline workflows across teams by offering a single environment in which platform and AI engineers can implement their AI strategies.

Red Hat says the release also adds new capabilities aimed at improving productivity and efficiency, supporting teams as they scale from proof of concept to production.

Next-generation AI agents

AI agents are expected to significantly influence application development, with their autonomous and complex workflows creating increased demand for scalable inference systems. The Red Hat OpenShift AI 3.0 release expands support for agentic AI through enhanced inference capabilities and new features focused on agent management.

Aiming to simplify the creation and deployment of AI agents, Red Hat has added a unified API layer built on the Llama Stack framework, aligning development with industry standards such as OpenAI-compatible LLM interface protocols. The company has also adopted the Model Context Protocol (MCP), an emerging standard designed to improve interoperability by facilitating seamless interaction between AI models and external tools.
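
To give a sense of what MCP interoperability looks like in practice, the sketch below uses the open-source MCP Python SDK to connect a client to a tool server and invoke a tool. The server command and tool name are hypothetical placeholders; this illustrates the protocol itself, not Red Hat’s specific integration.

```python
# Sketch using the open-source MCP Python SDK (package: mcp).
# The server command and tool name are hypothetical placeholders;
# this shows the protocol flow, not Red Hat's implementation.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch a local MCP tool server over stdio (hypothetical command).
    params = StdioServerParameters(command="example-mcp-server", args=[])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover available tools
            print([tool.name for tool in tools.tools])
            result = await session.call_tool(   # invoke one (hypothetical)
                "search_docs", {"query": "inference scaling"}
            )
            print(result.content)

asyncio.run(main())
```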

Red Hat AI 3 includes a modular and extensible toolkit for model customisation, developed from InstructLab functionality. The toolkit provides specialised Python libraries that enable flexibility and control, supported by open-source components such as Docling for data processing, which converts unstructured documents into AI-readable formats. It offers a framework for synthetic data generation, an LLM fine-tuning hub, and an integrated evaluation hub for monitoring and validating model performance. This aims to help engineers achieve more accurate and contextually relevant outcomes using proprietary data.
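
As a brief example of the Docling step, converting a PDF into an AI-readable format with the open-source library looks roughly like this; the input path is a placeholder, and the surrounding toolkit APIs are not shown.

```python
# Rough sketch of document conversion with the open-source Docling
# library mentioned above; the input path is a placeholder.
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("reports/quarterly.pdf")  # placeholder path

# Export to Markdown, a common AI-readable intermediate format.
markdown = result.document.export_to_markdown()
print(markdown[:500])
```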

* To learn more about Red Hat AI 3, visit the Red Hat website.
