Let’s start with some new data points from last week:
First, Exelon’s CEO says the utility has seen what he terms “high probability” datacenter load jump from 6 GW to 11 GW this year.
Google’s CEO indicated that more than 25% of new code at the company is generated by AI and then reviewed by engineers. AI substitution for human labor is a huge part of the AI value proposition.
The Financial Times estimates spending on AI datacenters by the big four - Alphabet, Amazon, Meta, and Microsoft - will exceed $200 billion this year.
The FERC rejected a request from Amazon Web Services to expand a contract for power supplied to its data center co-located at Talen Energy's Susquehanna nuclear power plant in Pennsylvania.
Now, let’s talk compute: the goal for competitors here is to improve the quality of their large language models so they can handle more complex logic and deliver better overall accuracy. They do this by training on ever-larger amounts of data with increasingly powerful machines.
Research firm Epoch AI notes that training compute has recently been growing at a rate of about 4x per year. Will this growth continue at that torrid pace, and what are the implications for our power grids?
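To get a feel for what a 4x annual rate means if it holds, here is a quick compounding sketch (a minimal example; the 2024 baseline and five-year horizon are my assumptions, only the 4x figure comes from Epoch AI):

```python
# Back-of-envelope: compounding Epoch AI's ~4x annual growth in training compute.
# The 2024 baseline and five-year horizon are illustrative assumptions.
growth_rate = 4.0
baseline_year = 2024

for years_out in range(1, 6):
    multiple = growth_rate ** years_out
    print(f"{baseline_year + years_out}: ~{multiple:,.0f}x today's training compute")
```

Compounding is the whole story here: 4x a year is roughly a thousandfold increase in five years.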
Epoch AI looks at this issue by examining four underlying factors: 1) power availability – our sweet spot, which we will talk a lot more about later; 2) global chip capacity; 3) the “latency wall,” the delays inherent in increasingly complex computations; and 4) the availability of data to train on. Let’s look at 2 through 4 - we will deal with power in its own session.
Chips are in high demand. These graphics processing units – GPUs – bring massive parallel processing power to the game, performing highly complex calculations at rapid speeds. GPUs keep getting better, but they are scarce and expensive. Nvidia’s newest Blackwell chip cost about $10 billion to design and create, and buyers are paying $30,000 to $40,000 per GPU. That same Blackwell chip draws between 700 W and 1.2 kW depending on the configuration and cooling strategy.
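Those per-chip figures add up quickly at datacenter scale. Here is a rough sketch of what a large training cluster could demand from the grid (the cluster size and the cooling/overhead factor are illustrative assumptions; only the 700 W to 1.2 kW range comes from the paragraph above):

```python
# Rough cluster power estimate built on the per-GPU draw quoted above.
# Cluster size and PUE (power usage effectiveness, i.e. cooling and
# power-delivery overhead) are illustrative assumptions.
gpus = 100_000                  # hypothetical large training cluster
watts_per_gpu = (700, 1200)     # Blackwell range cited above
pue = 1.3                       # assumed facility overhead factor

for w in watts_per_gpu:
    it_load_mw = gpus * w / 1e6        # IT load in megawatts
    facility_mw = it_load_mw * pue     # total demand at the meter
    print(f"{w} W/GPU -> ~{it_load_mw:.0f} MW of IT load, ~{facility_mw:.0f} MW at the meter")
```

Even under these modest assumptions, a single cluster lands in the range of a mid-sized power plant’s output.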
Nvidia owns about 80% of GPU market share, followed by AMD, and the industry cannot keep up with current demand. But Google, Amazon, Meta, and Microsoft are all at work developing their own chips, so that strain may eventually ease.
Next, let’s look at the “latency wall.” It takes a certain amount of time (latency) for an AI model to process each datapoint, and that latency increases as models grow. Models train by splitting their data into batches, and a training run takes at least as long as it takes to process all of its batches, one after another. The more batches, the longer the run. Today’s latencies aren’t a big deal – batches can be processed quickly. But as training runs and models keep getting bigger, this could become an issue, and efficiency gains might fall off. This scaling limit may cap future growth rates.
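A toy version of that arithmetic (every number below is a made-up assumption, just to show the shape of the problem): if each batch carries some irreducible processing latency that extra hardware cannot remove, the number of batches in a run sets a hard floor on wall-clock time.

```python
# Toy illustration of the "latency wall": an irreducible per-batch latency
# puts a floor on how fast a training run can finish, no matter how many
# GPUs are thrown at it. All numbers here are made-up assumptions.
total_tokens = 1e15             # hypothetical future training run
tokens_per_batch = 6e7          # batch size (cannot grow without limit)
min_batch_latency_s = 0.5       # irreducible seconds per batch

batches = total_tokens / tokens_per_batch
floor_days = batches * min_batch_latency_s / 86_400
print(f"~{batches:,.0f} batches -> wall-clock floor of ~{floor_days:,.0f} days")
```

The point isn’t the specific numbers; it’s that once runs get big enough, the floor stops being negligible.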
Finally, let’s look at data. AI models train on data. Everything we ever posted to LinkedIn, Facebook or Insta. YouTube videos. Scientific papers, movies, TV shows, stupid clips on TikTok. All of it. To understand data, we must understand the concept of a token - the smallest element into which data can be broken down so an AI model can process it. One word is usually a single token. With images, audio clips, or videos, computers typically break them into smaller patches for tokenization (one picture or one second of video might represent 30 tokens).
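To make the token arithmetic concrete, here is a minimal sketch using the rules of thumb above (roughly one token per word, roughly 30 tokens per second of video); the sample sentence and clip length are arbitrary:

```python
# Rough token counts using the rules of thumb above:
# ~1 token per word of text, ~30 tokens per second of video.
sentence = "Data centers train AI models on enormous piles of text and video"
text_tokens = len(sentence.split())   # crude word count as a token proxy

clip_seconds = 90                     # an arbitrary short video clip
video_tokens = clip_seconds * 30

print(f"Sentence: ~{text_tokens} tokens; 90-second clip: ~{video_tokens} tokens")
```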
It’s estimated that the web holds about 500 trillion words of unique text, which may grow 50% by 2030. Add in images, audio, and video and you might get to 20 quadrillion tokens available for training by the end of the decade. BUT, projections are that with ever-faster computers and more efficient algorithms, we might actually run out of data to train on, perhaps as soon as 2026. Then machines may learn to generate their own synthetic data. Or they could find other ways to learn. Nobody really knows. This uncertainty leads to a critical question for utilities: what if we build all this infrastructure, and then by 2030 there’s less to do with it? The phrase “stranded assets” should come to mind.
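Running the numbers in that paragraph (a sketch; the one-token-per-word conversion comes from above, and the implied multimodal share is simple subtraction, not a forecast of my own):

```python
# Back-of-envelope on the training-data supply figures cited above.
text_words_today = 500e12                    # ~500 trillion words of unique web text
text_words_2030 = text_words_today * 1.5     # ~50% growth by 2030
total_tokens_2030 = 20e15                    # ~20 quadrillion tokens incl. images/audio/video

text_tokens_2030 = text_words_2030           # ~1 token per word
multimodal_share = (total_tokens_2030 - text_tokens_2030) / total_tokens_2030

print(f"Text by 2030: ~{text_tokens_2030 / 1e12:,.0f} trillion tokens")
print(f"Implied images/audio/video share of the 20 quadrillion: ~{multimodal_share:.0%}")
```

In other words, if the 20-quadrillion figure holds, the overwhelming majority of future training data would have to come from images, audio, and video rather than text.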
Meanwhile, chips keep getting more efficient, requiring less electricity both for processing and for dealing with the waste heat. Nvidia says its GPUs used in training have seen a 2,000x reduction in energy use over 10 years. Until now, such gains have simply allowed data centers to do more, and their appetite appears endless. But if future gains continue, how does that affect future datacenter power needs? Nobody truly knows, though the back-of-envelope sketch at the end of this section suggests why efficiency alone hasn’t bent the curve so far. What we do know is that the power grab continues unabated, and data centers are looking at all kinds of supply strategies to get the juice wherever they can. And that’s the topic we will focus on in the next session.
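As promised, a quick back-of-envelope on why those efficiency gains haven’t blunted demand so far: annualize the two trends and compare them (a sketch; treating the 2,000x figure as a smooth ten-year trend, and reusing Epoch AI’s 4x compute growth, are my simplifications):

```python
# Compare the two cited trends on an annual basis: Nvidia's ~2,000x
# energy-efficiency gain over 10 years vs. Epoch AI's ~4x annual growth
# in training compute. Smoothing both into constant annual rates is a
# simplification for illustration.
annual_efficiency_gain = 2000 ** (1 / 10)       # ~2.1x per year
annual_compute_growth = 4.0

net_power_growth = annual_compute_growth / annual_efficiency_gain
print(f"Efficiency: ~{annual_efficiency_gain:.1f}x/yr better; compute: ~{annual_compute_growth:.0f}x/yr more")
print(f"-> net power demand for training still grows ~{net_power_growth:.1f}x per year under these assumptions")
```

Under these assumptions, demand for compute outruns efficiency by a wide margin, which is consistent with the power grab we actually see.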