About 500 million years ago, Earth experienced a sudden explosion of species, marking a phase in biological evolution where development could occur not just gradually but in leaps. This phenomenon is reminiscent of the current trajectory of artificial intelligence (AI), which, after more than seventy years of winding development, finds itself at a pivotal juncture. Advances in foundational technologies such as models, data, and computational power have fueled an explosion of intelligent agents in many forms, evolving at an astonishing, leapfrog pace.

In this context, computational power plays a crucial and often underappreciated role in AI's leapfrog development. It not only sustains rapid innovation but is itself undergoing significant transformation and explosive growth. According to the latest Gartner report, global server sales reached $61.71 billion in the third quarter of 2024, a substantial year-on-year increase of 85.1%, with shipments of 3.032 million units, up 7.2% year over year. Notably, Inspur Information retained its position as the second-largest supplier globally with an 11.7% market share, making it the leader in China. Furthermore, Gartner forecasts that the global server market will surpass the $200 billion mark, reaching $216.4 billion with shipments of approximately 11.99 million units, a 6.5% increase over the previous year.

Delving deeper, demand for AI computational power is on a robust growth trajectory and is steering the future evolution of the computing industry. Gartner even forecasts that by 2028 the global server market will reach $332.9 billion, with AI servers commanding a remarkable 70% share. An AI-driven "big bang" is thereby unfolding.

The vigorous advancement of foundational large models stands out as the principal driver behind the explosive growth of the global server market in 2024. Looking back at the AI landscape of the past two years, foundational large models have undeniably been the stars of the show.

Bolstered by scaling laws, these foundational large models are evolving at unprecedented speed. Today they serve as the foundation for a wide range of intelligent applications, endowing them with human-like understanding and reasoning capabilities while accelerating their integration into specific industries, thereby becoming the bedrock of digital transformation across sectors.
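For reference, the scaling laws invoked here are usually stated as empirical power laws (in the form popularized by Kaplan et al., 2020): test loss falls predictably as model size, data, or compute grows. A simplified statement:

```latex
% Empirical power-law scaling (after Kaplan et al., 2020):
% loss L decreases predictably with model size N and training compute C
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
% N_c, C_c are fitted constants; reported exponents for language models
% are roughly \alpha_N \approx 0.076 and \alpha_C \approx 0.050.
```

It is this predictability that justifies the ever-larger compute investments the article describes: more compute reliably buys lower loss.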

If data can be seen as the raw material for foundational large models, then computational power serves as their engine. Consequently, the intense competition around foundational large models has directly led to a global increase in investment in computing power and related infrastructure, positioning computational power as the cornerstone of competitiveness in the age of AI-generated content (AIGC).

For instance, hyperscale cloud service providers are both deep participants in foundational large models and key drivers of demand for AI servers. Gartner's data indicates that companies like Microsoft, Meta, and Google are projected to invest $72.2 billion in AI server procurement in 2024, constituting 56% of the total AI server market and exemplifying the significance these tech giants place on AI infrastructure.

Looking at regional developments, North America and Greater China, home to concentrated hyperscale data centers, are also the leading regions in global server sales. Specifically, North America posted year-on-year growth of 149%, while Greater China saw a significant 116% increase. Both regions represent the current high ground for foundational large models and intelligent-application innovation, featuring not only internet giants but also a wave of AI startups such as OpenAI, Anthropic, Moonshot AI, and DeepSeek, all with burgeoning demands for AI computational power.

Moreover, the trend of vertical industries fully embracing AI should not be overlooked.

Sectors such as autonomous driving, smart transportation, intelligent finance, industrial AI, and healthcare AI are driving demand for AI servers. Gartner's findings show that global enterprise expenditure on AI server procurement is expected to surge by a staggering 184% in 2024. As foundational large models continue to improve, vertical-specific large models have rapidly emerged this year. Industry users are increasingly eager to possess large models tailored to their fields and to build a broad variety of intelligent applications on top of them, fundamentally reshaping their business scenarios, workflows, and user experiences.

It can be asserted that 2024 represents a milestone year in the evolution of artificial intelligence. We saw the release of numerous remarkable large models, such as OpenAI's o1 and o3 and Anthropic's Claude 3.5, alongside a wave of embodied-intelligence products, as well as vertical industries working to deepen the integration of AI with real-world scenarios. All of this dazzling activity ultimately rests on the support and propulsion of computational power.

As we look ahead to 2025, with the advancement of artificial intelligence reaching a crucial inflection point, what significant impacts will this have on the demand for computational power?

As we shift our focus from the pre-training of large models to post-training and inference, we acknowledge that the Scaling Law will remain critically significant for quite some time. Continuous training iterations across varied scenarios will be necessary to enhance model performance.

In 2024, the technology pathway for large models evolved from LLMs (Large Language Models) to LRMs (Large Reasoning Models), placing a pronounced emphasis on inference as the industry's next crucial breakthrough. According to OpenAI co-founder and former chief scientist Ilya Sutskever, the foundational large model domain is entering a new phase of "discovery and exploration."

On the flip side, as foundational large models permeate various business scenarios, their integration with software is accelerating, producing a rich array of AI applications and diverse AI agents and thereby substantially increasing the demand for inference.

Last year, AWS CEO Matt Garman stated at the 2024 re:Invent conference that inference is becoming increasingly vital, emerging as one of the core building blocks of application construction.

In 2024, the user base for generative AI products in China reached 249 million, indicating growing application scale and penetration. By 2025, enterprises across the board will face AI transformation, and the universal proliferation and implementation of AI will in turn drive explosive demand for inference computational power.

It is foreseeable that while training will remain a major consumer of computational power, inference will attain equal importance, becoming a principal player on the AI stage in the coming years. This shift is not only a focal point for innovation across the industry but will also sustain demand and innovation in computational power. Verified Market Research forecasts a compound annual growth rate of 22.6% for inference chips from 2024 to 2030. Gartner likewise predicts that by 2025 the computational power deployed for inference will exceed that for training, marking a pivotal year for the diffusion of AI, and that by 2028 servers used for inference will account for 70% of the overall market.
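As a rough illustration using only the forecast figure above, a 22.6% compound annual growth rate works out to roughly a 3.4x expansion over the six years from 2024 to 2030:

```python
# Compound annual growth: size_end = size_start * (1 + cagr) ** years
cagr = 0.226          # forecast CAGR for inference chips, 2024-2030
years = 2030 - 2024   # six compounding periods

growth_factor = (1 + cagr) ** years
print(f"Cumulative growth over {years} years: {growth_factor:.1f}x")  # ~3.4x
```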

With the deep integration of AI into business scenarios, complex training and inference tasks place new demands on computational infrastructure. The evolution of computational power must address a variety of new challenges, including the diversification of computational modalities, data center energy consumption, and operating models.

Today, the leapfrog evolution of artificial intelligence not only generates enormous and sustained demands for computational resources but also directly hastens the evolution of the computational industry.

First and foremost, propelled by rapid advancements in AI, the diversification of computational power is becoming an unmistakable trend.

The evolution of computational power will accelerate the dismantling of outdated closed ecosystems. Computational systems and chips will be updated and iterated on shorter cycles, necessitating a more open and diverse ecosystem. Recent years have seen significant industry emphasis on standards such as OAM and OCM, with chip manufacturers like NVIDIA and AMD, along with hyperscale data center operators such as Amazon, Meta, Google, and Alibaba, investing resources to unify computational resources on a common platform. This initiative aims to lower the costs of innovation and adaptation, allowing varied application scenarios to be matched swiftly with tailored solutions and thereby accelerating innovation in AI computational power.

Secondly, in light of growing AI demand, all forms of computation will need to incorporate AI capabilities, with even general-purpose computational power handling AI inference workloads. The operational boundaries of servers and computational devices will continually expand to accommodate burgeoning demand for AI computational power. In fact, exorbitant computational costs remain a prominent challenge, and the notion that "all computation is AI" holds promise as an effective strategy to alleviate the scarcity and high cost of computational resources. Inspur Information, for instance, exemplifies this with its Meta Brain servers, which can run models with hundreds of billions of parameters on just four CPUs, thanks to algorithmic optimizations such as tensor parallelism and NF4 model quantization. The recently launched Meta Brain server platform has also demonstrated considerable performance improvements in AI inference scenarios for the Llama large model.
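The NF4 quantization mentioned above compresses each weight to 4 bits by mapping it, block by block, onto a fixed 16-level codebook scaled by the block's absolute maximum. The sketch below illustrates that idea only; it uses a uniform codebook for brevity (the real NF4 codebook is derived from quantiles of a normal distribution) and is not Inspur's implementation:

```python
import numpy as np

# 16 levels -> 4 bits per weight. NF4 proper uses normal-distribution
# quantiles here; a uniform grid keeps the example short.
CODEBOOK = np.linspace(-1.0, 1.0, 16)

def quantize_block(weights: np.ndarray):
    """Map one block of weights to 4-bit codebook indices plus a scale."""
    scale = np.abs(weights).max() or 1.0   # absmax scaling per block
    normalized = weights / scale           # now in [-1, 1]
    indices = np.abs(normalized[:, None] - CODEBOOK[None, :]).argmin(axis=1)
    return indices.astype(np.uint8), float(scale)

def dequantize_block(indices: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct approximate weights from indices and the block scale."""
    return CODEBOOK[indices] * scale

block = np.array([0.9, -0.5, 0.05, -1.2])
idx, s = quantize_block(block)
restored = dequantize_block(idx, s)
# Reconstruction error is bounded by half the codebook spacing times the scale.
```

The payoff is memory: 4 bits per weight instead of 16 or 32 is what lets very large models fit on modest hardware, at the cost of the small reconstruction error above.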

Thirdly, the high power consumption of AI computational power is a pressing challenge; green computing standards will become increasingly refined, and the related technologies will gain traction. Large AI clusters, whether at the scale of ten thousand or a hundred thousand nodes, are now emerging constantly.

Coupled with rising AI computational performance, the power consumption of individual systems keeps climbing, making cooling a long-term challenge for data centers. As a result, specifications for AI liquid-cooling cabinets are being established, and products such as cold-plate liquid cooling, heat-pipe liquid cooling, and immersion cooling are expected to proliferate. There is even growing market interest in whole-lifecycle liquid cooling solutions, encompassing planning and consulting, equipment customization, and delivery and construction. For instance, Inspur Information recently assembled a 10MW Meta Brain "computational factory" from 119 containers in 120 days, building it like a Lego set. The factory deploys an intelligent computing resource system that includes air-cooled cabinets rated for 50kW loads and liquid-cooled cabinets rated for 130kW loads, achieving high-density deployment of intelligent computing resources while promoting green energy efficiency.
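To put those rack ratings in perspective, a back-of-the-envelope calculation (using only the figures above, and ignoring cooling overhead and PUE for simplicity) shows how many cabinets a 10MW IT load would fill at each density:

```python
# Rough cabinet counts for a 10 MW facility at the stated rack ratings.
facility_kw = 10_000     # 10 MW of IT load (illustrative simplification)
air_cooled_kw = 50       # rated load per air-cooled cabinet
liquid_cooled_kw = 130   # rated load per liquid-cooled cabinet

print(facility_kw // air_cooled_kw)     # 200 air-cooled cabinets
print(facility_kw // liquid_cooled_kw)  # 76 liquid-cooled cabinets
```

The 2.6x density gap between the two ratings is why liquid cooling dominates discussions of high-density AI deployment.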

Lastly, the usage patterns and service ecosystem surrounding computational power are evolving rapidly. Beyond the convenient computational services offered by public clouds, demand for GPU leasing and hosting is also surging, paving the way for a new AI computational services ecosystem. In the United States, for example, AI infrastructure investment is growing explosively, with startups successfully operating GPU leasing and hosting models. In China, enthusiasm for building intelligent computing centers remains high. Gartner forecasts that by 2028, 90% of Chinese enterprises will opt for hosted solutions rather than constructing their own AI infrastructure, underscoring the future importance of intelligent computing centers in the overall AI infrastructure landscape.

In summary, Gartner's latest global server market report indicates that, driven by the rapid development of foundational large models and AI applications, the computational power market experienced a comprehensive explosion in 2024.
