News & Analysis

Supercomputing as a Service

Yes, you read that right. Nvidia is set to offer AI supercomputing as a service to enterprises, giving them the infrastructure to train advanced generative AI models.

Chipmaker Nvidia is making a bold bet by offering users access to supercomputing power from a web browser. The service, called Nvidia DGX Cloud, lets enterprises run their workloads in the cloud, giving them the compute needed to train advanced models for generative artificial intelligence (AI).

Initially available only through Oracle Cloud Infrastructure (OCI), DGX Cloud aims to put an AI supercomputer within reach of any enterprise through a web browser, sparing customers the complexity of on-premises deployment and management.

What does it actually mean for users?

The company made the announcement via a blog post, which said “DGX Cloud provides dedicated clusters of NVIDIA DGX AI supercomputing, paired with NVIDIA AI software. The service makes it possible for every enterprise to access its own AI supercomputer using a simple web browser, removing the complexity of acquiring, deploying and managing on-premises infrastructure.” 

“We are at the iPhone moment of AI. Startups are racing to build disruptive products and business models, and incumbents are looking to respond,” said Jensen Huang, founder and CEO of Nvidia, in the company’s announcement. DGX Cloud, he added, gives customers instant access to Nvidia AI supercomputing in global-scale clouds.

But the service doesn’t come cheap

The service comes at a steep price: Nvidia will charge a whopping $37,000 per month per instance. Each DGX Cloud instance features eight Nvidia H100 or A100 80GB Tensor Core GPUs (both designed for AI computing), adding up to a massive 640GB of GPU memory per node.
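As a quick sanity check on those figures, the per-GPU cost works out as follows. This is a back-of-the-envelope sketch based only on the reported numbers; the 730-hour month is our own averaging assumption, not Nvidia's billing unit:

```python
# Back-of-the-envelope cost breakdown for a DGX Cloud instance,
# using the figures reported above ($37,000/month, 8 GPUs, 80GB each).
MONTHLY_PRICE_USD = 37_000
GPUS_PER_INSTANCE = 8
MEMORY_PER_GPU_GB = 80
HOURS_PER_MONTH = 730  # assumption: an average month is ~730 hours

total_memory_gb = GPUS_PER_INSTANCE * MEMORY_PER_GPU_GB
per_gpu_monthly = MONTHLY_PRICE_USD / GPUS_PER_INSTANCE
per_gpu_hourly = per_gpu_monthly / HOURS_PER_MONTH

print(f"GPU memory per node: {total_memory_gb} GB")        # 640 GB
print(f"Cost per GPU per month: ${per_gpu_monthly:,.0f}")  # $4,625
print(f"Cost per GPU-hour: ${per_gpu_hourly:.2f}")         # ~$6.34
```

In other words, the $37,000 sticker price amounts to roughly $6 and change per GPU-hour under continuous use, which is the kind of comparison enterprises would weigh against buying and operating their own DGX hardware.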

“A high-performance, low-latency fabric built with Nvidia Networking ensures workloads can scale across clusters of interconnected systems, allowing multiple instances to act as one massive GPU to meet the performance requirements of advanced AI training,” explains Nvidia’s blog post.

Reports suggest that several companies have already piloted the DGX Cloud service. A report published on SdxCentral says ServiceNow has deployed DGX Cloud with on-premises Nvidia DGX supercomputers to power its AI research on large language models (LLMs), code generation, and causal analysis.


What does it all include?

The DGX Cloud environment includes Nvidia AI Enterprise, the company’s software layer, which provides end-to-end AI frameworks and pre-trained models to accelerate data science pipelines and streamline the development and deployment of production AI. The OCI Supercluster provides a remote direct memory access (RDMA) network, bare-metal compute, and high-performance local and block storage that can scale to superclusters of over 32,000 GPUs.

Companies can rent cloud clusters on a monthly basis, scaling capacity as needed while developing large, multi-node training workloads. For its part, Nvidia claims the move will reduce wait times for highly sought-after accelerated computing resources, and expects Microsoft Azure to begin hosting DGX Cloud next quarter.

Given growing concerns over generative AI and its limitations around accuracy, scalability, and recall, supercomputing as a service could supply the processing power that future AI efforts will require. Long-term memory for AI applications, which would allow models to recall information over longer horizons, is one frequently discussed need that stands to benefit.
