During the recent NVIDIA GTC conference, the company unveiled what it described as the first server system to pack an exaflop, a billion billion, or one quintillion, floating point operations (FLOPS) per second, into a single rack. This breakthrough is based on the latest GB200 NVL72 system, which incorporates NVIDIA's newest Blackwell graphics processing units (GPUs). A standard computer rack is about 6 feet tall, a little over 3 feet deep and less than 2 feet wide.
Achieving an exaflop: from Frontier to Blackwell
A few things about the announcement struck me. First, the world's first exaflop-capable computer was installed only a few years ago, in 2022, at Oak Ridge National Laboratory. By comparison, the "Frontier" supercomputer, built by HPE and powered by AMD GPUs and CPUs, originally consisted of 74 server racks. The new NVIDIA system delivers roughly 73 times greater performance density in just three years, equivalent to performance more than tripling each year. This progression reflects remarkable advances in computing density, energy efficiency and architectural design.
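As a quick sanity check on that claim, here is a back-of-the-envelope sketch of my own (it uses only the rack counts cited above, not NVIDIA's or HPE's published benchmark data):

```python
# Rough compound-growth check: Frontier needed 74 racks for an exaflop
# in 2022; the GB200 NVL72 fits a (low-precision) exaflop in one rack
# roughly three years later.
density_gain = 74 / 1                    # ~74x more performance per rack
annual_factor = density_gain ** (1 / 3)  # compound annual growth over 3 years

print(f"{annual_factor:.2f}x per year")  # ~4.20x, i.e. more than tripling annually
```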
Second, it must be said that while both systems reach the exascale milestone, they are built for different challenges: one optimized for speed, the other for precision. NVIDIA's exaflop specification is based on low-precision math, specifically 4-bit and 8-bit floating point operations, considered optimal for AI workloads, including tasks such as training and running large language models (LLMs). These calculations prioritize speed over accuracy. By contrast, the exaflop rating for Frontier was achieved using 64-bit double-precision math, the gold standard for scientific simulations where accuracy is critical.
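To make that speed-versus-accuracy trade-off concrete, consider the minimal sketch below. It is my own illustration, not benchmark code from NVIDIA or Oak Ridge; NumPy has no 4-bit or 8-bit float types, so 16-bit half precision stands in for the low-precision end of the spectrum:

```python
import numpy as np

# Accumulate 10,000 small increments twice: in float64 (the double
# precision Frontier was rated on) and in float16 (a stand-in for the
# low-precision formats behind NVIDIA's AI exaflop figure).
increment = np.float16(0.001)

high = np.float64(0.0)
low = np.float16(0.0)
for _ in range(10_000):
    high += np.float64(increment)  # near-exact accumulation
    low += increment               # each step rounds to the nearest float16

print(high)  # ~10.004, essentially the true sum
print(low)   # 4.0: the running total stalls once every step rounds away
```

Low-precision arithmetic is dramatically cheaper in silicon, memory and bandwidth, which is why it suits AI training and inference; the sketch shows why it would be unacceptable for the simulations Frontier was built for.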
We've come a long way (very quickly)
This level of progress seems almost unbelievable, especially when I recall the state of the art at the start of my career in the computing industry. My first professional job was as a programmer on the DEC KL 1090. This machine, part of DEC's PDP-10 series, offered 1.8 million instructions per second (MIPS). Aside from its CPU performance, the machine connected to cathode ray tube (CRT) displays via hardwired cables. There were no graphics capabilities, just light text on a dark background. And, of course, no Internet. Remote users connected over telephone lines using modems running at up to 1,200 bits per second.
500 billion times more compute
While comparing MIPS to FLOPS gives a general sense of progress, it is important to remember that these metrics measure different computing workloads. MIPS reflects integer processing speed, which is useful for general-purpose computing, particularly in business applications. FLOPS measures floating point performance, which is crucial for scientific workloads and the heavy number-crunching behind modern AI, such as the matrix math and linear algebra used to train and run machine learning (ML) models.
While not a direct comparison, the sheer scale of the difference between MIPS then and FLOPS now provides a powerful illustration of the rapid growth in computing performance. Using them as a rough heuristic for work done, the new NVIDIA system is about 500 billion times more powerful than the DEC machine. A leap like that illustrates the exponential growth of computing power over a single professional career, and it raises the question: If this much progress is possible in 40 years, what might the next five bring?
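The "500 billion" figure is easy to reproduce. Here is the arithmetic as a short sketch, treating MIPS and FLOPS as loosely comparable operation rates (which, as noted above, they are not in any strict sense):

```python
# Back-of-the-envelope ratio behind the "500 billion times" claim.
dec_kl1090_ops = 1.8e6  # 1.8 MIPS: integer instructions per second
nvl72_flops = 1e18      # one exaflop of low-precision floating point ops

ratio = nvl72_flops / dec_kl1090_ops
print(f"{ratio:.2e}")   # ~5.56e11, i.e. roughly 500 billion times
```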
NVIDIA, for its part, has offered some clues. At GTC, the company shared a roadmap projecting that its next full-generation system, based on the "Vera Rubin" Ultra architecture, will deliver 14 times the performance of the Blackwell Ultra rack shipping this year, reaching somewhere between 14 and 15 exaflops in AI-optimized work over the next year or two.
Equally notable is the efficiency. Achieving this level of performance in a single rack means less physical space per unit of work, fewer materials and potentially lower energy use per operation, although the absolute power requirements of these systems remain immense.
Does AI really need all that compute power?
While such performance gains are indeed impressive, the AI industry is now grappling with a fundamental question: How much computing power is truly necessary, and at what cost? The race to build massive new AI data centers is being driven by the growing demands of exascale computing and ever more capable AI models.
The most ambitious effort is the $500 billion Project Stargate, which envisions 20 data centers across the U.S., each spanning half a million square feet. A wave of other hyperscale projects is underway or in planning around the world, as companies and countries scramble to ensure they have the infrastructure to support tomorrow's AI workloads.
Some analysts now worry that we may be overbuilding AI data center capacity. Concern intensified after the release of R1, a reasoning model from China's DeepSeek that requires significantly less compute than many of its peers. Microsoft subsequently canceled leases with several data center providers, sparking speculation that it might be recalibrating its expectations for future AI infrastructure demand.
However, The Register suggested that this pullback may have more to do with some of the planned AI data centers lacking sufficiently robust capacity to handle the power and cooling needs of next-generation AI systems. Already, AI models are pushing the limits of what current infrastructure can support. MIT Technology Review reported that this may be the reason many data centers in China are struggling and failing, having been built to specifications that are not optimal for today's needs, let alone those of the next few years.
AI inference requires more FLOPS
Reasoning models perform most of their work at runtime, through a process known as inference. These models power some of today's most advanced and resource-intensive applications, including deep research assistants and the emerging wave of agentic AI systems.
While DeepSeek-R1 initially spooked the industry into thinking that future AI might require less computing power, NVIDIA CEO Jensen Huang pushed back hard. Speaking to CNBC, he countered this perception: "It was exactly the opposite conclusion that everybody had." He added that reasoning AI consumes 100 times more compute than non-reasoning AI.
As AI continues to evolve from reasoning models to autonomous agents and beyond, demand for compute is likely to surge again. The next breakthroughs may come not just in language or vision, but in AI agent coordination, fusion simulations or even large-scale digital twins, each made possible by the kind of leap in computing capacity we have just seen.
Seemingly on cue, OpenAI just announced $40 billion in new funding, the largest private tech funding round on record. The company said in a blog post that the funding "enables us to push the frontiers of AI research even further, scale our compute infrastructure and deliver increasingly powerful tools for the 500 million people who use ChatGPT every week."
Why is so much capital flowing into AI? The reasons range from competitiveness to national security. But one particular factor stands out, as exemplified by a McKinsey headline: "AI could increase corporate profits by $4.4 trillion a year."
What comes next? Anyone's guess
At their core, information systems are about mastering complexity, whether through an emergency vehicle routing system I once wrote in Fortran, a student performance reporting tool built in COBOL or modern AI systems accelerating drug discovery. The goal has always been the same: to make greater sense of the world.
Now, with powerful AI beginning to appear, we are crossing a threshold. For the first time, we may have the computing power and the intelligence to tackle problems that were once beyond human reach.
New York Times columnist Kevin Roose recently captured this moment well: "Every week, I meet engineers and entrepreneurs working on AI who tell me that change (big change, world-shaking change, the kind of transformation we've never seen before) is just around the corner." And that does not even count the breakthroughs arriving week after week.
In the past few days alone, we have seen OpenAI's GPT-4o generate nearly flawless images from text, Google release what may be the most advanced reasoning model to date in Gemini 2.5 Pro, and Runway unveil a video model with character and scene consistency, something VentureBeat notes has eluded most AI video generators until now.
What comes next is truly anyone's guess. We do not know whether powerful AI will be a breakthrough or a breakdown, whether it will help solve fusion energy or unleash new biological risks. But with ever more FLOPS coming online over the next five years, one thing seems certain: Innovation will arrive fast, and with force. It is also clear that as FLOPS scale, so must our conversations about responsibility, regulation and restraint.
Gary Grossman is EVP of technology practice at Edelman and global lead of the Edelman AI Center of Excellence.