# Enterprise Reference Architectures

## Build AI Factories That Scale

Turn your data center into a high-performance AI factory with NVIDIA Enterprise Reference Architectures.

[Get Started](https://marketplace.nvidia.com/en-us/enterprise/ai-factory/#)

[Read Whitepaper](https://resources.nvidia.com/en-us-certified-systems/collection-9563de54)   |   [Explore NVIDIA-Certified Systems](https://www.nvidia.com/en-us/data-center/products/certified-systems.md)

Overview

## The Building Blocks for AI Success

NVIDIA Enterprise Reference Architectures (Enterprise RAs) enable organizations to design, deploy, and scale high-performance [AI factories](https://www.nvidia.com/en-us/solutions/ai-factories.md) using validated, repeatable infrastructure. These designs combine certified compute, high-speed east-west and north-south networking, observability tools, and software to ensure scalable performance, from four-node clusters to enterprise-scale environments.

### Palantir Teams With NVIDIA to Deliver Sovereign AI Operating System Reference Architecture

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure.

[Read the Press Release](https://investors.palantir.com/news-details/2026/Palantir-and-NVIDIA-Team-to-Deliver-Sovereign-AI-Operating-System-Reference-Architecture/)

### Proven Design and Validated Performance

Learn how Enterprise RAs, built on real-world deployments and battle-tested configurations, simplify planning and maximize ROI for scalable AI infrastructure.

[Read the Whitepaper](https://docs.nvidia.com/enterprise-reference-architectures/white-paper/latest/index.html)

Enterprise Reference Architectures

## Your Guide to the Complete Family

Enterprise RAs include a comprehensive suite of guides for setting up clusters in the data center.

### Infrastructure

NVIDIA [Enterprise Reference Architectures](https://docs.nvidia.com/enterprise-reference-architectures/index.html#nvidiatab-hardware) start with validated hardware configurations, including CPU-GPU-networking node patterns, cabling diagrams, and infrastructure details.

### Network Logic

The Networking Configuration and Logical Architecture Guide for Enterprise RAs provides instructions spanning node management and provisioning, VLAN design, and network simulation on [NVIDIA Air](https://www.nvidia.com/en-us/networking/ethernet-switching/air.md).
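
As a concrete illustration of the kind of VLAN planning such a guide covers, the sketch below carves a management supernet into per-function subnets using Python's standard `ipaddress` module. The VLAN IDs, network functions, and address ranges are hypothetical examples, not values from the Enterprise RA documents.

```python
import ipaddress

# Hypothetical per-cluster network functions; real Enterprise RA
# deployments define their own VLAN plan.
VLAN_FUNCTIONS = ["oob-management", "provisioning", "storage", "north-south"]

def plan_vlans(supernet: str, functions=VLAN_FUNCTIONS, new_prefix=24):
    """Assign one subnet (and an illustrative VLAN ID) per network function."""
    subnets = ipaddress.ip_network(supernet).subnets(new_prefix=new_prefix)
    plan = {}
    for vlan_id, (function, subnet) in enumerate(zip(functions, subnets), start=100):
        plan[function] = {"vlan": vlan_id, "subnet": str(subnet)}
    return plan

if __name__ == "__main__":
    for function, cfg in plan_vlans("10.0.0.0/16").items():
        print(f"VLAN {cfg['vlan']:>4}  {cfg['subnet']:<15}  {function}")
```

Tools like NVIDIA Air let you validate a plan like this against a simulated switch fabric before any hardware is cabled.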

### Software

Our [software reference stack](https://docs.nvidia.com/enterprise-reference-architectures/index.html#nvidiatab-software) for Enterprise RAs outlines the software for managing, provisioning, and sizing infrastructure clusters. Current releases focus on open-source Kubernetes, with [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise.md) and [NVIDIA Run:ai](http://run.ai) software.
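
To make the Kubernetes layer concrete, here is a minimal sketch of a pod spec that requests GPUs through the standard `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin. The pod name, namespace defaults, and container image are illustrative assumptions, not part of the reference stack.

```python
import json

def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal pod spec that schedules onto GPU-equipped nodes."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,  # illustrative image reference
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
            "restartPolicy": "Never",
        },
    }

if __name__ == "__main__":
    # Print the manifest as JSON (a valid input to `kubectl apply -f -`).
    print(json.dumps(gpu_pod_manifest("inference-demo", "example/image:latest", 1), indent=2))
```

Schedulers such as NVIDIA Run:ai build on this same resource model to add quota, fractional-GPU, and fairness policies on top of plain Kubernetes.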

### Observability

The [Observability Guide for NVIDIA Enterprise Reference Architectures](https://docs.nvidia.com/enterprise-reference-architectures/index.html#nvidiatab-observability) utilizes open-source tools, such as Prometheus and Grafana, to monitor GPU and networking performance across the entire cluster. Dashboards provide real-time metrics for system health and workload efficiency.
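
As a small example of what those dashboards consume, the sketch below parses Prometheus exposition-format text of the kind exported per GPU (the `DCGM_FI_DEV_GPU_UTIL` metric comes from NVIDIA's DCGM exporter) and flags underutilized GPUs. The sample payload and label set are fabricated for illustration; real scrapes carry more labels and metrics.

```python
import re

# Fabricated sample of a Prometheus-format scrape; real exporter output
# includes many more metrics and labels.
SAMPLE_SCRAPE = """\
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-01"} 97
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-01"} 12
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-02"} 88
"""

METRIC_RE = re.compile(
    r'DCGM_FI_DEV_GPU_UTIL\{gpu="(?P<gpu>\d+)",Hostname="(?P<host>[^"]+)"\} (?P<value>\d+)'
)

def underutilized_gpus(scrape_text: str, threshold: float = 50.0):
    """Return (host, gpu, utilization) for GPUs below the threshold."""
    return [
        (m["host"], int(m["gpu"]), float(m["value"]))
        for m in METRIC_RE.finditer(scrape_text)
        if float(m["value"]) < threshold
    ]

if __name__ == "__main__":
    for host, gpu, util in underutilized_gpus(SAMPLE_SCRAPE):
        print(f"{host} GPU {gpu}: {util:.0f}% utilization")
```

In practice Prometheus stores these series and Grafana queries them; the same threshold logic would typically live in a PromQL alert rule rather than ad hoc parsing.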

### Deployment

The [*Deployment Guide for NVIDIA Enterprise Reference Architectures*](https://docs.nvidia.com/enterprise-reference-architectures/index.html#nvidiatab-deployment) is a collection of infrastructure best practices that our team has learned from bringing up, deploying, testing, and validating the in-house clusters on which we’ve built our program.

### Storage

The [NVIDIA-Certified Storage](https://www.nvidia.com/en-us/data-center/products/certified-storage.md) Program is a complementary effort by select partners who have created storage guides designed to integrate into Enterprise RAs. [Learn more](https://blogs.nvidia.com/blog/nvidia-certified-enterprise-storage/) about this unique program.

Use Cases

## Designed for Every Use Case

Accelerate agentic AI, physical AI, high-performance computing (HPC), and AI simulation workloads with proven NVIDIA Enterprise Reference Architectures and [NVIDIA-Certified Systems](https://www.nvidia.com/en-us/data-center/products/certified-systems.md) from global partners. The primary infrastructure cluster configurations for deploying enterprise AI factories are outlined below.

1. NVIDIA RTX PRO AI Factory
2. NVIDIA HGX AI Factory
3. NVIDIA NVL72 AI Factory

### NVIDIA RTX PRO AI Factory

The NVIDIA RTX PRO™ AI Factory configuration is designed for a broad spectrum of enterprise workloads, including generative and agentic AI, data analytics, visual computing, and engineering simulation. Deployments are optimized around 16- and 32-node design points, providing an ideal balance of performance, scalability, and deployment efficiency. Designed for universal workload acceleration across enterprise AI, simulation, and visual computing, [NVIDIA RTX PRO Servers](https://www.nvidia.com/en-us/data-center/products/rtx-pro-server.md) are optimized for PCIe environments, making them ideal for space-, power-, and cooling-constrained data centers. Purpose-built for modern AI workloads, they deliver efficient performance for agentic AI and large language model (LLM) inference.

[See Cluster Configuration Specs](#universal-enterprise-acceleration)

### NVIDIA HGX AI Factory

The high-performance NVIDIA HGX™ AI Factory configuration is purpose-built for multi-node AI training and inference at scale, leveraging [NVIDIA HGX](https://www.nvidia.com/en-us/data-center/hgx.md) systems. Available in 32-, 64-, and 128-node design points and supported by [NVIDIA Spectrum-X™](https://www.nvidia.com/en-us/networking/spectrumx.md) networking, the architecture features a flexible, rail-optimized design that enables efficient integration across diverse rack layouts while delivering high-throughput, low-latency performance. It provides breakthrough performance for AI power users running the most demanding workloads, enables large-scale model training and fine-tuning, and dramatically accelerates inference. With next-generation precision and ultra-fast interconnects, the solution achieves up to 15x higher token throughput.

[See Cluster Configuration Specs](#ai-optimized-performance)

### NVIDIA NVL72 AI Factory

The NVIDIA NVL72 AI Factory configuration is designed to train and deploy trillion-parameter models, delivering exascale computing power within a single rack. Built for massive model throughput, multi-user inference, and real-time inference at scale, it enables the next generation of AI-driven innovation. Deployment design points center on four- and eight-rack configurations. Built on a flexible, rail-optimized network, the architecture adapts to diverse rack layouts and system designs while delivering high-bandwidth, low-latency performance. The platform delivers exceptional AI factory output with industry-leading energy efficiency and is powered by fifth-generation NVIDIA NVLink™, FP4 Tensor Cores, and advanced thermal innovations.

[See Cluster Configuration Specs](#exascale-performance)

Benefits

## The Strategic Value of Enterprise RAs

Unlock scalable, high-performance AI infrastructure with proven, partner-ready configurations.

### Peak Performance for AI Workloads

Meet the intensive demands of AI inference, fine-tuning, and training with architectures that ensure full GPU utilization and performance consistency across multi-node clusters.

### Flexible Scaling, Simplified Operations

Easily expand your infrastructure and ensure scalable, streamlined deployment for up to 128 nodes. Build the foundation for full-stack solutions with the [NVIDIA Enterprise AI Factory validated design](https://www.nvidia.com/en-us/solutions/ai-factories/validated-design.md), which leverages our software ecosystem.

### Reduce Complexity and TCO

Simplify deployment and reduce complexity and total cost of ownership (TCO) with efficient designs that accelerate time to value.

### Supportability

Follow specific, standardized design patterns to achieve consistent operation from one installation to the next, reduce the need for frequent support, and enable faster resolution times.

Partners

## Partnered for Performance

We’re proud to collaborate with leading partners as they bring Enterprise Reference Architectures and [AI factory solutions](https://www.nvidia.com/en-us/solutions/ai-factories.md) to market. Designs from these partners have passed our Design Review Board, earning our endorsement in one or more of the following categories: infrastructure, networking logic, and software.

[Get Started](https://marketplace.nvidia.com/en-us/enterprise/ai-factory/)

## Palantir Sovereign AI OS Reference Architecture With NVIDIA

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure with our global system partners. This sovereign AI architecture is critical for customers with latency-sensitive workflows, data sovereignty requirements, and high geographic distribution. The architecture provides enterprises with total control over their data, AI models, and applications.

[Learn More](http://www.palantir.com/sovereignaios)

Resources

## Learn More About Enterprise RAs

### NVIDIA RTX PRO AI Factory Reference Architecture

The NVIDIA RTX PRO AI Factory configuration supports a broad range of enterprise workloads, including agentic AI inference, physical and industrial AI, visual computing, and high-performance computing for data analytics and simulation. This document details the hardware components underpinning this scalable and modular architecture.

[Read Whitepaper](https://docs.nvidia.com/enterprise-reference-architectures/rtx-pro-ai-factory/latest/index.html)

### NVIDIA HGX AI Factory Reference Architecture

The NVIDIA HGX AI Factory configuration is focused on high-performance AI inference, model training, and fine-tuning. This document outlines the hardware components of a scalable, modular architecture, including cluster guidance and network fabric topologies used to interconnect the cluster.

[Read Whitepaper](https://docs.nvidia.com/enterprise-reference-architectures/hgx-ai-factory/latest/index.html)

### Unlock Massive Token Throughput with NVIDIA Run:ai

Joint benchmarking with Nebius shows that fractional GPU deployments using NVIDIA Run:ai on NVIDIA Enterprise Reference Architectures significantly improve throughput and utilization for production LLM workloads.

[Read Blog](https://developer.nvidia.com/blog/unlock-massive-token-throughput-with-gpu-fractioning-in-nvidia-runai/)

### NVIDIA Enterprise Reference Architecture Overview

This whitepaper introduces NVIDIA Enterprise Reference Architectures, which provide proven guidance for designing and building AI factories for enterprise-class deployments ranging from 32 to 1,024 GPUs. These architectures help simplify AI infrastructure deployment, reduce operational complexity, and accelerate time to value.

[Read Whitepaper](https://docs.nvidia.com/enterprise-reference-architectures/white-paper/latest/index.html)

### North–South Networks: The Key to Faster Enterprise AI Workloads

NVIDIA Enterprise Reference Architectures guide organizations in deploying AI factories that utilize both north-south and east-west networks, providing design recipes for scalable, secure, and high-performing AI infrastructure.

[Read Blog](https://developer.nvidia.com/blog/north-south-networks-the-key-to-faster-enterprise-ai-workloads/)

### Deploying NVIDIA H200 NVL at Scale With a New Enterprise Reference Architecture

NVIDIA H200 NVL accelerates AI deployment with enhanced memory, high-speed NVLink, and an optimized Enterprise RA configuration.

[Read Blog](https://resources.nvidia.com/en-us-certified-systems/deploying-nvidia-h200)

### NVIDIA’s AI Factory Drives Enterprise Innovation at Scale

NVIDIA built a unified AI factory to scale generative AI and agentic workflows across the enterprise, ensuring security, performance, and consistency. The platform supports hundreds of AI agents that accelerate innovation, streamline software and hardware engineering, and optimize supply chain operations—reducing planning times by over 95 percent and achieving decades’ worth of engineering work in just one year.

[Explore Key Results](https://www.nvidia.com/en-us/case-studies/ai-factory-drives-enterprise-innovation-at-scale.md)

### NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Cost for Agentic AI

Built to accelerate the next generation of agentic AI, NVIDIA Blackwell Ultra delivers breakthrough inference performance with dramatically lower cost. Cloud providers such as Microsoft, CoreWeave, and Oracle Cloud Infrastructure are deploying NVIDIA GB300 NVL72 systems at scale for low-latency and long-context use cases, such as agentic coding and coding assistants.

This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-out; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility—as well as development with community frameworks SGLang, vLLM, and more.

[Explore Key Results](https://blogs.nvidia.com/blog/data-blackwell-ultra-performance-lower-cost-agentic-ai/?nvid=nv-int-bnr-552734)

## Next Steps

### Ready to Get Started?

Learn more about NVIDIA Enterprise AI Factory.

[Get Started](https://marketplace.nvidia.com/en-us/enterprise/ai-factory/#)

### Take a Deeper Dive Into NVIDIA Enterprise Reference Architectures

Explore how NVIDIA Enterprise Reference Architectures provide scalable, prescriptive blueprints for deploying high-performance AI infrastructure.

[Read Whitepaper](https://resources.nvidia.com/en-us-certified-systems/collection-9563de54)

## Cluster Configuration 2-8-5-200 Specs

|  |  |
| --- | --- |
| CPUs (Eligible) | 2x 64c Intel Xeon or 2x 64c AMD EPYC |
| GPUs | 8x NVIDIA RTX PRO™ 6000 Blackwell Server Edition |
| Networking (East-West) | 4x NVIDIA® BlueField®-3 B3140H (1x 400 Gb) |
| Networking (North-South) | 1x BlueField-3 B3220 (2x 200 Gb) |
| Host Memory (Min) | 1,024 GB DDR5 ECC (1x DIMM per slot) |
| Host Boot Drive (Min) | 1x 1 TB NVMe |
| Host Storage (Min) | 2x 4 TB NVMe |

## Cluster Configuration 2-8-9-400 Specs

|  |  |
| --- | --- |
| CPUs (Eligible) | 2x 64c Intel Xeon or 2x 64c AMD EPYC |
| GPUs | 8x NVIDIA Blackwell Ultra GPU |
| Networking (East-West) | 8x NVIDIA® BlueField®-3 B3140H (1x 400 Gb) |
| Networking (North-South) | 1x BlueField-3 B3220 (2x 200 Gb) |
| Host Memory (Min) | 1,536 GB DDR5 ECC (1x DIMM per slot) |
| Host Boot Drive (Min) | 1x 1 TB NVMe |
| Host Storage (Min) | 2x 4 TB NVMe |

## Cluster Configuration 2-4-6-400 Specs

|  |  |
| --- | --- |
| CPUs | 2x 72c NVIDIA Grace™ (36 per rack) |
| GPUs | 4x NVIDIA Blackwell GPUs (72 per rack) |
| Networking (East-West) | 4x NVIDIA® ConnectX®-7 (1x 400 Gb) |
| Networking (North-South) | 2x NVIDIA BlueField®-3 B3240 (4x 200 Gb) |
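
The "2-8-5-200"-style configuration names appear to encode the per-node composition shown in the spec tables above: CPUs, GPUs, total NICs, and (matching the tables) east-west bandwidth per GPU in Gb/s. That reading of the fields is an inference from the tables, not an official definition. A small decoder under that assumption:

```python
from dataclasses import dataclass

@dataclass
class NodeConfig:
    cpus: int             # CPUs per node
    gpus: int             # GPUs per node
    nics: int             # network adapters per node (east-west + north-south)
    ew_gbps_per_gpu: int  # east-west bandwidth per GPU, Gb/s (assumed meaning)

def parse_config_name(name: str) -> NodeConfig:
    """Decode a dash-separated cluster configuration name like '2-8-9-400'."""
    cpus, gpus, nics, bw = (int(part) for part in name.split("-"))
    return NodeConfig(cpus, gpus, nics, bw)

if __name__ == "__main__":
    for name in ("2-8-5-200", "2-8-9-400", "2-4-6-400"):
        print(name, "->", parse_config_name(name))
```

For example, "2-8-9-400" matches the HGX table: 2 CPUs, 8 GPUs, 9 NICs (8 east-west plus 1 north-south), and 400 Gb of east-west bandwidth per GPU.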

### Cisco

Cisco is the worldwide technology leader that is revolutionizing the way organizations connect and protect in the AI era. For more than 40 years, Cisco has securely connected the world. With its industry-leading AI-powered solutions and services, Cisco enables its customers, partners, and communities to unlock innovation, enhance productivity, and strengthen digital resilience. With purpose at its core, Cisco remains committed to creating a more connected and inclusive future for all.

NVIDIA Design Review Board-endorsed solutions:

* [Cisco Nexus Hyperfabric AI Enterprise Reference Architecture](https://www.cisco.com/c/en/us/products/collateral/data-center-networking/nexus-hyperfabric/hyperfabric-ai-era-ds.html)
* [AI Infrastructure With Cisco Nexus 9000 Switches](https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/nexus-9000-ai-era-ds.html)
* [Cisco AI POD Infrastructure for NVIDIA HGX™ Solution Brief](https://www.cisco.com/c/en/us/solutions/collateral/artificial-intelligence/infrastructure/ai-pods/enterprises-so.html)
* [Cisco AI Pod Infrastructure for NVIDIA HGX™ Reference Guide](https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/nexus-9000-ai-era-ds.html)

[Discover More](https://www.cisco.com/site/us/en/solutions/artificial-intelligence/index.html)

### Dell Technologies

Dell Technologies helps organizations and individuals build their digital future and transform how they work, live, and play. The company provides customers with the industry’s broadest and most innovative technology and services portfolio for the AI era.

NVIDIA Design Review Board-endorsed solutions:

* [Dell AI Factory With NVIDIA: PowerEdge XE9680 With NVIDIA HGX™ B300](https://www.delltechnologies.com/asset/en-us/solutions/infrastructure-solutions/briefs-summaries/nvidia-2-8-9-400-configuration-era-endorsed-for-the-dell-ai-factory-with-nvidia-brief.pdf)
* [Dell AI Factory With NVIDIA: NVIDIA RTX PRO™ Servers](https://www.delltechnologies.com/asset/en-us/solutions/infrastructure-solutions/briefs-summaries/nvidia-2-8-5-200-era-configuration-endorsed-for-the-dell-ai-factory-with-nvidia-brief.pdf)

[Discover More](https://www.dell.com/en-us/lp/dt/nvidia-ai)

### HPE

HPE is a leader in essential enterprise technology, bringing together the power of AI, cloud, and networking to help organizations achieve more. As pioneers of possibility, our innovation and expertise advance the way people live and work. We empower our customers across industries to optimize operational performance, transform data into foresight, and maximize their impact. Unlock your boldest ambitions with HPE.

NVIDIA Design Review Board-endorsed solutions:

* [HPE AI Factory With NVIDIA: NVIDIA RTX PRO™ Servers](https://www.hpe.com/psnow/doc/a00157780enw)

[Discover More](https://www.hpe.com/us/en/solutions/artificial-intelligence/nvidia-collaboration.html)

### Lenovo

Lenovo is a US$69B revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, our ongoing partnership with NVIDIA combines Lenovo servers with accelerated GPUs. The Lenovo Hybrid AI Advantage™ with NVIDIA boosts productivity and innovation with faster AI deployment, powered by the Lenovo AI Library and a full-stack portfolio of AI infrastructure, devices, solutions, and services.

NVIDIA Design Review Board-endorsed solutions:

* [Lenovo Hybrid AI 289 Platform Guide](https://lenovopress.lenovo.com/lp2286-lenovo-hybrid-ai-289-platform-guide)

[Discover More](https://www.lenovo.com/us/en/servers-storage/solutions/ai/)

### Supermicro

Supermicro is a global leader in application-optimized total IT solutions. Founded and operating in San Jose, California, Supermicro is committed to delivering first-to-market innovation for enterprise, cloud, AI, and 5G telco/edge IT infrastructure. We are a total IT solutions provider with server, AI, storage, IoT, switch systems, software, and support services. Supermicro’s motherboard, power, and chassis design expertise further enables our development and production, enabling next-generation innovation from cloud to edge for our global customers.

NVIDIA Design Review Board-endorsed solutions:

* [AI Factory Solutions With NVIDIA RTX PRO™ Servers](https://www.supermicro.com/datasheet/Datasheet_Supermicro_NVIDIA_AI_Factories_RTX_PRO_6000.pdf)
* [AI Factory Solutions With NVIDIA HGX™ B300 Datasheet](https://www.supermicro.com/datasheet/Datasheet_Supermicro_NVIDIA_AI_Factories_HGX_B300.pdf)
* [AI Factory Solutions With NVIDIA HGX™ B300 Reference Architecture](https://www.supermicro.com/en/products/supercluster/srs-48uac-b300sx)
* [AI Factory Solutions With NVIDIA HGX B200](https://www.supermicro.com/datasheet/Datasheet_Supermicro_NVIDIA_AI_Factories_HGX.pdf)

[Discover More](https://www.supermicro.com/en/accelerators/nvidia/ai-factory)