From Goldman trader Jia Wen Tuea:
AI’s Sputnik Moment.
Back in May 2017, AlphaGo (backed by Google) beat Ke Jie (the world’s #1 Go player) over the course of three marathon matches of more than 3h each. AlphaGo won ALL three matches. Much as the Soviet Union’s 1957 launch of the first human-made satellite had an instant and profound effect on the American psyche and govt policy, Ke Jie’s defeat spurred Beijing into action. Less than two months later, the Chinese govt issued an ambitious plan to build AI capabilities and projected that by 2030 China would become a leading centre of global innovation in AI.
In Dec 2024, China launched DeepSeek – its answer to GPT, Llama 3.1, etc. – alongside ByteDance’s Doubao-1.5 Pro and Moonshot’s Kimi k1.5 models, all within days of each other, illustrating Chinese players’ continued AI advancements. Performance-wise, these models are at least on par with, if not better than, the US ones and, more importantly, significantly lower in training and inference costs and computing power requirements – a key pushback to the AI spend so far. Goldman’s Ronald Keung expects the race for agentic AI to continue in China and outlines DeepSeek’s implications for the China internet giants.
I believe we are in the early-to-mid innings of this AI race and, this time round, Stargate is the US’s Sputnik moment. I took a quick look at the spending that went into NASA during the space race, which built rapidly in the late 50s/early 60s and peaked at ~4.4% of the federal budget (thank you, DeepSeek, for digging up this data for me!). Stargate plans to invest up to $500bn in AI infra by 2029 (on avg $100bn a year). If we assume an avg annual federal budget of around $8tn from 2025-29, $100bn a year is merely 1.25% of the federal budget – are we just scratching the surface here? We can clearly debate the exact dollars spent but, directionally, I don’t think it’s a stretch to imagine the US “overshooting” on AI spend much as it did during the space race… before a “Moon Landing” catalyzed the top. I wish I had a crystal ball to tell you what that moon landing is but, for now, the party continues: don’t fade Power Up America (GSENEPOW), AI Data Centres (GSTMTDAT), AI (GSTMTAIP).
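The comparison above is simple enough to check by hand, but a short sketch makes the gap explicit. It uses only the figures quoted in the note (the $500bn Stargate total, the assumed $8tn average annual budget and the ~4.4% NASA peak); the variable names are illustrative, and none of the inputs are independently verified here.

```python
# Back-of-envelope check of the spending comparison in the note.
# All inputs are the figures quoted in the text, not independently verified.
STARGATE_TOTAL_USD = 500e9          # "up to $500bn" by 2029
YEARS = 5                           # 2025 through 2029
ASSUMED_FEDERAL_BUDGET_USD = 8e12   # the note's assumed average annual budget
NASA_PEAK_SHARE = 0.044             # "~4.4% of federal budget" at the peak

# Average annual Stargate spend and its share of the assumed budget
annual_spend = STARGATE_TOTAL_USD / YEARS
stargate_share = annual_spend / ASSUMED_FEDERAL_BUDGET_USD

# Annual spend that would match NASA's peak share of the same budget
nasa_equivalent_spend = NASA_PEAK_SHARE * ASSUMED_FEDERAL_BUDGET_USD

print(f"Annual Stargate spend: ${annual_spend / 1e9:.0f}bn")
print(f"Share of assumed federal budget: {stargate_share:.2%}")
print(f"NASA-peak-equivalent spend: ${nasa_equivalent_spend / 1e9:.0f}bn a year")
```

On these assumptions, matching the space-race peak would take roughly $352bn a year, about three and a half times the planned Stargate run rate.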
* * *
Finally, for those who are just now getting up to speed, here are two threads, one from Dropbox AI VP Morgan Brown and the other from Chamath Palihapitiya.
First, here is Morgan…
Finally had a chance to dig into DeepSeek’s r1…
Let me break down why DeepSeek’s AI innovations are blowing people’s minds (and possibly threatening Nvidia’s $2T market cap) in simple terms…
0/ first off, shout out to @doodlestein who wrote the must-read on this here
All the reasons why Nvidia will have a very hard time living up to the currently lofty expectations of the market.
1/ First, some context: Right now, training top AI models is INSANELY expensive. OpenAI, Anthropic, etc. spend $100M+ just on compute. They need massive data centers with thousands of $40K GPUs. It’s like needing a whole power plant to run a factory.
2/ DeepSeek just showed up and said “LOL what if we did this for $5M instead?” And they didn’t just talk – they actually DID it. Their models match or beat GPT-4 and Claude on many tasks. The AI world is (as my teenagers say) shook.
3/ How? They rethought everything from the ground up. Traditional AI is like writing every number with 32 decimal places. DeepSeek was like “what if we just used 8? It’s still accurate enough!” Boom – 75% less memory needed.
4/ Then there’s their “multi-token” system. Normal AI reads like a first-grader: “The… cat… sat…” DeepSeek reads in whole phrases at once. 2x faster, 90% as accurate. When you’re processing billions of words, this MATTERS.
5/ But here’s the really clever bit: They built an “expert system.” Instead of one massive AI trying to know everything (like having one person be a doctor, lawyer, AND engineer), they have specialized experts that only wake up when needed.
6/ Traditional models? All 1.8 trillion parameters active ALL THE TIME. DeepSeek? 671B total but only 37B active at once. It’s like having a huge team but only calling in the experts you actually need for each task.
7/ The results are mind-blowing:
- Training cost: $100M → $5M
- GPUs needed: 100,000 → 2,000
- API costs: 95% cheaper
- Can run on gaming GPUs instead of data center hardware
8/ “But wait,” you might say, “there must be a catch!” That’s the wild part – it’s all open source. Anyone can check their work. The code is public. The technical papers explain everything. It’s not magic, just incredibly clever engineering.
9/ Why does this matter? Because it breaks the model of “only huge tech companies can play in AI.” You don’t need a billion-dollar data center anymore. A few good GPUs might do it.
10/ For Nvidia, this is scary. Their entire business model is built on selling super expensive GPUs with 90% margins. If everyone can suddenly do AI with regular gaming GPUs… well, you see the problem.
11/ And here’s the kicker: DeepSeek did this with a team of <200 people. Meanwhile, Meta has teams where the compensation alone exceeds DeepSeek’s entire training budget… and their models aren’t as good.
12/ This is a classic disruption story: Incumbents optimize existing processes, while disruptors rethink the fundamental approach. DeepSeek asked “what if we just did this smarter instead of throwing more hardware at it?”
13/ The implications are huge:
- AI development becomes more accessible
- Competition increases dramatically
- The “moats” of big tech companies look more like puddles
- Hardware requirements (and costs) plummet
14/ Of course, giants like OpenAI and Anthropic won’t stand still. They’re probably already implementing these innovations. But the efficiency genie is out of the bottle – there’s no going back to the “just throw more GPUs at it” approach.
15/ Final thought: This feels like one of those moments we’ll look back on as an inflection point. Like when PCs made mainframes less relevant, or when cloud computing changed everything.
AI is about to become a lot more accessible, and a lot less expensive. The question isn’t if this will disrupt the current players, but how fast.
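For readers who want to sanity-check the thread's headline ratios, here is a minimal sketch of the arithmetic. It takes the thread's numbers at face value and reads “32 decimal places” versus “8” as 32-bit versus 8-bit number formats, which is an interpretation rather than something the thread states; the helper `reduction` is purely illustrative.

```python
# Ratios implied by the figures claimed in the thread (not independently verified).

def reduction(before: float, after: float) -> float:
    """Fractional reduction when going from `before` to `after`."""
    return 1 - after / before

# Assumption: "32 decimal places" vs "8" means 32-bit vs 8-bit numbers
memory_saving = reduction(32, 8)

# Mixture-of-experts claim: 37B of 671B parameters active at once
active_fraction = 37e9 / 671e9

# Headline results from the thread
training_cost_cut = reduction(100e6, 5e6)   # $100M -> $5M
gpu_cut = reduction(100_000, 2_000)         # 100,000 -> 2,000 GPUs

print(f"Memory saved by 8-bit numbers: {memory_saving:.0%}")
print(f"Parameters active at once: {active_fraction:.1%}")
print(f"Training cost reduction: {training_cost_cut:.0%}")
print(f"GPU count reduction: {gpu_cut:.0%}")
```

The first figure reproduces the thread's “75% less memory”, and the second shows that only about 5.5% of the model's parameters are active at once under the stated totals.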
* * *
And here is Chamath.
Several important questions/comments come to my mind as I read more about DeepSeek. Listing them here:
1) Let’s give 1% probability to all the conspiracy theories upfront so we can address it and move on. If it is possible for China/Chinese companies to use shell companies in Singapore or other countries to be a “beard” to buy otherwise export controlled chips from Nvidia and use them for AI training, this likely needs to be investigated and adjudicated.
2) The battle of usage is now more about AI inference vs Training. We always knew this day would come but it probably surprised many that it could be this weekend. With a model this cheap, many new products and experiences can now emerge trying to win the hearts and minds of the global populace. Team USA needs to win here. To that point, while we may still want to export control AI Training chips, we should probably view Inference chips differently – we should want everyone around the world using our solutions over others. I can explain my reasoning as follows: we should never export our knowledge of enriching uranium to be weapons grade to other countries but we should export our ability to build nuclear energy (which requires far less sophistication) if it can help advance American priorities and leadership abroad. Training and Inference can be roughly equated this way. (Disclaimer: Groq, of which I’m a shareholder, is in this game so this benefits me tbf.)
3) We need to cooperate with our allies (especially those in the ME) to stand up the necessary infrastructure to enable Inference – Data centers, subsidized energy etc. all around the world ASAP. They pay to build it, we supply the Inference hardware and the software to run the clouds. We need this buildout to happen ASAP. This is clearly our version of Belt and Road and we need to take it as seriously as China took their version, similarly named.
4) There will be volatility in the stock market as capital markets absorb all of this information and re-price the values of the Mag7. Tesla is the least exposed, the rest are exposed as a direct function of the amount of CapEx they have publicly announced. Nvidia is the most at risk for obvious reasons. That said, markets will love it if Meta, Microsoft, Google etc can win WITHOUT having to spend $50-80B PER YEAR.
5) The innovation from China speaks to how “asleep” we’ve been for the past 15 years. We’ve been running towards the big money/shiny object spending programs (AI is not the first and it likely won’t be the last) where we (Team USA) have thrown hundreds of billions of dollars at a problem vs thinking through the problem more cleverly and using resource constraints as an enabler. Let’s get our act together. We need all the bumbling middle managers out of the way – let the engineers and the brilliant folks we have actually working on this stuff cook! More spending, more meetings, more oversight, more weekly reports and the like do not equate to more innovation. Unburden our technical stars to do their magic.
6) Startups need to realize that they are “default dead” companies. This means that they must, by definition, grasp victory from the jaws of defeat. Meanwhile, VCs are asleep at the switch – massively overfunding marginal ideas. We need to get better at taking huge shots on goal and allocating capital to the best of these ideas. I worry that in this current melee, we’ve overspent billions on dumb features which these next-gen models will roll over in the next 12 months or earlier. Lots of capital losses are coming.