Summary of the Goldman analysis:
Goldman's analysis is very bullish on agentic AI and sees the technology as the next major phase after generative AI. Where generative AI is typically associated with chatbots and standalone answers to user prompts, agentic AI is about systems that can plan, act, validate, iterate, and carry out work tasks more autonomously.
The central economic point is that agentic AI could drive a dramatic increase in token consumption while the cost per token falls faster than prices. If that trend continues, the AI sector could move from a phase where rising usage was seen primarily as a cost burden to a phase where rising usage also becomes a margin and profitability story for hyperscalers, model providers, chipmakers, and parts of the software sector.
1. Agentic AI is not just "more chatbot"
The analysis draws a clear distinction between ordinary chatbots and true AI agents.
A chatbot typically answers a prompt. An agent, by contrast, executes a sequence of actions: it understands the user's goal, gathers information, uses tools, evaluates results, corrects errors, and only finishes when the task is solved or handed off to a human.
Goldman describes the agent as a workflow system in which LLMs are embedded across many different steps. An enterprise agent might, for example, use one model for classification, a stronger reasoning model for planning and debugging, a code model for programming, and a vision or document model for reading PDFs, screenshots, or forms.
It is precisely this sequential, iterative structure that makes agents far more token-intensive than traditional chatbots.
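To make that architecture concrete, here is a minimal sketch of multi-model routing in Python. It is illustrative only: the step names, model tiers, and helper functions are placeholders of ours, not anything specified in the report.

```python
# Illustrative sketch (not from the report): an enterprise agent as a
# workflow that routes each step to a model suited for that step.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    model: str                      # hypothetical model tier for this step
    run: Callable[[str], str]

def classify(task: str) -> str: return f"category for: {task}"
def plan(task: str) -> str: return f"plan for: {task}"
def write_code(task: str) -> str: return f"code for: {task}"
def read_document(task: str) -> str: return f"extracted fields from: {task}"

# One model per capability, mirroring the report's description:
workflow = [
    Step("classification", "small-cheap-model", classify),
    Step("planning/debugging", "strong-reasoning-model", plan),
    Step("coding", "code-model", write_code),
    Step("document parsing", "vision-model", read_document),
]

def run_agent(task: str) -> None:
    for step in workflow:
        result = step.run(task)
        print(f"[{step.model}] {step.name}: {result}")

run_agent("process supplier invoice")
```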
2. Token consumption could explode
Goldman's most important quantitative message is that agentic AI could trigger dramatic growth in token demand.
The analysis estimates:
| Area | Goldman estimate |
|---|---|
| Total global token growth through 2030 | ~24x |
| Global token consumption in 2030 | ~120 quadrillion tokens/month |
| Consumer agents in 2030 | ~60 quadrillion tokens/month |
| Enterprise agents in 2030 | ~56 quadrillion tokens/month |
| Enterprise agents at peak adoption in 2040 | ~278 quadrillion tokens/month |
| Long-run enterprise-agent token lift | ~55x |
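A quick back-of-envelope check (ours, using only the numbers in the table above) shows the estimates are internally consistent and imply a current baseline of roughly 5 quadrillion tokens per month:

```python
# Back-of-envelope consistency check on the table's figures.
Q = 1e15  # one quadrillion

total_2030 = 120 * Q
growth_2030 = 24
baseline = total_2030 / growth_2030          # implied 2026 baseline

consumer_2030 = 60 * Q
enterprise_2040 = 278 * Q
print(baseline / Q)                          # 5.0 quadrillion/month
print(consumer_2030 / baseline)              # 12x consumer lift
print(enterprise_2040 / baseline)            # ~55.6x long-run enterprise lift
```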
Goldman stresses that this growth is not primarily driven by people using chatbots a bit more. It comes from a shift from episodic usage to persistent agentic activity. Rather than a user asking a question and closing the app, agents will be able to run in the background, monitor context, react to changes, and act continuously.
A chart on page 5 illustrates how both consumer and enterprise agents could drive a sharp rise in token consumption through 2030. The enterprise share grows especially strongly in the long-run scenario, because business workflows require more precision, more validation loops, and more integrations.
3. The economic key: token prices stabilize while token costs fall
One of the analysis's most central points is that token economics appear to be turning in favor of hyperscalers and model providers.
Previously, the concern was that more AI usage would simply mean more inference load, more GPUs, more power, and higher capex. Goldman now argues that the picture is changing:
- Prices for leading LLM tokens have fallen sharply but have begun to stabilize, or in some cases rise.
- The underlying compute cost per token continues to fall quickly.
- Semiconductor companies and specialized accelerators are, according to the analysis, driving annual token-cost declines of roughly 60-70%.
- If the price level stays above the compute cost, rising token consumption can produce positive margin expansion.
This is a crucial difference. Goldman sees agentic AI not just as a demand story, but as a possible margin inflection for the AI value chain.
The chart on page 2 (repeated on page 5) shows exactly this point: token prices stabilize while the cost curves for, among others, Nvidia, AMD, Google TPU/Broadcom, and Trainium/Marvell keep falling.
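The mechanics of that margin inflection can be sketched in a few lines: hold the token price flat and let compute cost fall at the midpoint of the 60-70% annual decline the analysis cites. The starting cost level below is an illustrative assumption of ours, not a figure from the report.

```python
# Sketch of the price-vs-cost dynamic: flat token price, compute cost
# falling ~65%/year (midpoint of the 60-70% range cited above).
price = 1.00        # normalized price per token, held flat
cost = 0.80         # assumed starting cost per token (illustrative only)

for year in range(5):
    margin = (price - cost) / price
    print(f"year {year}: cost={cost:.3f}, gross margin={margin:.0%}")
    cost *= 0.35    # 65% annual cost decline
```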
4. The self-reinforcing AI flywheel
Goldman describes a potential economic flywheel effect:
- Lower compute cost per token makes it possible to build richer, more complex agents.
- More complex agents consume far more tokens via longer context, more decision loops, validation, and monitoring.
- Higher utilization improves the economics of AI infrastructure.
- Better economics let providers keep investing in model quality, distribution, and infrastructure.
The point is that falling unit costs do not necessarily shrink the market. On the contrary, lower costs can make far more applications economically viable, so volume rises more than prices fall.
The analysis also stresses the risk: the margin inflection is not guaranteed for all workloads. Simple text chatbots in particular could face fierce competition, with prices falling faster than costs.
5. Consumer agents: from chat to everyday digital assistant
Goldman divides consumer AI into three categories:
| Category | Description |
|---|---|
| Chatbots | The user opens a tool and asks a question, e.g. ChatGPT, Gemini, or Claude. |
| Embedded tools | AI embedded in existing products and apps, e.g. shopping assistants or personal intelligence systems. |
| Agents | More autonomous systems that can run in the background across devices and apps with minimal user input. |
Goldman sees consumer adoption as a shift from conversation and information retrieval toward actual workflows. Examples include search, shopping, travel booking, email, calendar, personal productivity, and digital life-assistant functions.
The analysis estimates that AI queries could grow from roughly 5 billion per day in 2025 to roughly 23 billion per day in 2030. Up to 30% of consumer queries could, according to Goldman, be directed at LLM agents. That could yield roughly 60 quadrillion tokens per month from consumer agent workloads in 2030.
6. On-demand agents versus always-on agents
Goldman distinguishes between two types of consumer agents:
On-demand agents start when the user asks for something. They plan, act, iterate, and finish the task. The examples in the analysis include browser-based or task-based agents.
Always-on agents run more permanently in the background. They can monitor email, calendars, travel plans, messages, or other contextual data and act when needed.
Goldman believes the biggest token step-up comes from the always-on model, because the agent is not only activated by a single prompt but continuously monitors the situation.
Pages 11-14 illustrate this with a travel-booking flow and tables showing how token consumption accumulates step by step: the user's intent must be understood, missing information clarified, flights and hotels searched, results normalized, validated, and ranked, user feedback incorporated, and finally the booking completed.
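The same step-by-step accounting can be sketched in code. The per-step token counts below are placeholders chosen for illustration, not the figures from Goldman's tables; only the loop structure follows the flow described above.

```python
# Illustrative token accounting for the travel-booking flow.
# Step-level token counts are placeholder assumptions, not report data.
steps = [
    ("parse user intent",         2_000),
    ("clarify missing inputs",    1_500),
    ("search flights/hotels",     6_000),
    ("normalize & rank results",  4_000),
    ("incorporate user feedback", 2_000),
    ("validate availability",     3_000),
    ("complete booking",          1_500),
]

validation_loops = 2  # extra search/validation passes before the task passes
total = sum(tokens for _, tokens in steps)
total += validation_loops * (6_000 + 3_000)   # repeated search + validation

print("single chat answer:   ~1,000 tokens")
print(f"agentic booking flow: ~{total:,} tokens")
```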
7. Why agents consume so many tokens
Goldman emphasizes that an agent's token consumption does not come only from the final answer, but from the entire workflow.
An agent typically has to:
- understand the task,
- retrieve context,
- ask clarifying questions,
- call external tools,
- analyze results,
- weigh alternatives,
- validate output,
- handle errors,
- repeat the process,
- and produce a usable end result.
In the analysis's consumer examples, simple LLM chatbots are estimated to use around 1,000 tokens per session, while embedded copilots can use more than 5,000 tokens per day. More active agents can exceed 100,000 tokens per day. Goldman stresses that the biggest driver is not just task complexity, but how continuously the agent is active.
8. Enterprise agents: the largest long-term potential
Even though consumer agents may gain many users quickly, Goldman sees enterprise agents as the largest long-term source of token demand.
Enterprise agents are more token-intensive because they have to operate in complex, precise, and often regulated processes. They must not just give a "good enough" answer but deliver output that can be used in real business processes. That requires more checks, more context, auditability, system integrations, and error handling.
Goldman estimates that roughly 37% adoption among global knowledge workers at peak adoption could drive roughly 278 quadrillion tokens per month, a total token increase of roughly 55x from current levels.
The analysis also cites a very large long-term economic potential: at peak workflow adoption of around 35-40%, agentic AI is estimated to touch roughly 1.4 trillion labor-hours, generate around $220 billion annually for AI infrastructure, and support a software TAM of roughly $5.4 trillion.
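As a sanity check on that peak-adoption math, dividing the 278 quadrillion tokens per month across the adopted workforce implies roughly 25 million tokens per worker per day, the same order of magnitude as Goldman's data entry agent in section 13. The ~1 billion global knowledge workers figure below is our assumption for illustration; the report does not state it.

```python
# Sanity check on the peak-enterprise math.
Q = 1e15
peak_tokens_month = 278 * Q
knowledge_workers = 1.0e9          # ASSUMPTION: not a figure from the report
adoption = 0.37                    # Goldman's peak adoption share

adopted = knowledge_workers * adoption
tokens_per_worker_day = peak_tokens_month / adopted / 30
print(f"~{tokens_per_worker_day / 1e6:.0f}M tokens per adopted worker per day")
# ~25M/day, the same order as Goldman's data entry agent (section 13)
```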
9. Enterprise adoption is still early
Goldman judges that companies are in an early phase. Surveys indicate that 70-90% of enterprises are experimenting with AI agents, but fewer than a quarter are actually scaling them. Adoption is typically limited to one or two workflows, often in customer support, IT operations, sales enablement, and internal knowledge search.
Goldman highlights two milestones in the enterprise agent landscape:
- Salesforce Agentforce in August 2024, which raised the ambition from RAG chatbots to more advanced customer workflows.
- Anthropic's Claude Cowork in January 2026, which added a knowledge-worker wrapper to its coding tool.
The analysis stresses that the biggest barriers are no longer necessarily model-technical. They are organizational and institutional. Companies are not only asking whether agents can act, but when and how they are allowed to act.
10. The two biggest enterprise barriers: trust and ROI
Goldman highlights two key constraints:
Trust at scale: Companies need to get data governance, security boundaries, deterministic versus non-deterministic controls, auditability, and change management under control. This becomes especially hard when agents must operate across systems of record and when accountability for output is unclear.
ROI visibility: Many gains show up as productivity improvements rather than direct cost reductions. That makes budgeting and ownership harder. In addition, inference costs make clear cost guardrails necessary.
Goldman's short conclusion: agentic AI is technologically ready but economically and institutionally constrained.
11. Adoption is expected to follow an S-curve
Goldman uses historical technology-adoption data from Comin and Hobijn's CHAT dataset to assess how enterprise agentic AI might spread.
The analysis considers three possible curve shapes:
| Curve | Description |
|---|---|
| J-curve | Rapid diffusion after a breakthrough; the most bullish for demand. |
| S-curve | Slow start, acceleration at a tipping point, and later flattening. |
| Linear | More even adoption, often for capital-constrained or less viral technologies. |
Goldman judges the S-curve most likely for enterprise agents. We are in an early trial and adoption phase, and Goldman expects adoption to accelerate toward roughly 2030 as more companies move from POCs to production.
Goldman assumes a time-to-peak adoption of roughly 15 years, faster than the median for historical technologies in the dataset, but still not instantaneous. The reasoning is that agentic AI is a non-physical good, yet still requires data hygiene, process management, governance, and institutional adjustment.
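For intuition, a logistic curve calibrated to the report's framing (roughly 37% peak adoption, adoption close to peak around year 15) looks like the sketch below; the steepness parameter is our choice, not Goldman's.

```python
# A logistic (S-shaped) adoption path consistent with the report's framing.
import math

peak = 0.37          # Goldman's peak adoption share
t_mid = 7.5          # inflection at the midpoint of a 15-year run-up
k = 0.6              # steepness: ASSUMPTION chosen so adoption is near peak by year 15

def adoption(t: float) -> float:
    return peak / (1 + math.exp(-k * (t - t_mid)))

for t in [0, 5, 7.5, 10, 15]:
    print(f"year {t:>4}: {adoption(t):.1%}")
# slow start, acceleration around the midpoint, flattening near 37%
```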
12. Not all workflows will be automated equally fast
An important point in the analysis is that high "AI exposure" does not automatically mean fast adoption.
Goldman judges that enterprise adoption depends on four variables:
| Variable | Meaning |
|---|---|
| Token volume | How many tokens the workflow requires. |
| API cost | The actual price of replicating the work process. |
| Modality mix | Whether the workflow involves text, voice, video, images, documents, or system data. |
| Implementation complexity | How hard it is to integrate the agent into the company's systems and processes. |
Text-heavy workflows with mature tooling are expected to scale first. Voice-heavy or deeply integrated back-office processes may take longer, even if they are theoretically highly automatable.
13. Three concrete enterprise examples: coding, call center, and data entry
Goldman models three central agent workloads.
Coding agent
A coding agent can, according to Goldman, consume around 7 million tokens per day yet cost only about $13 per day. That explains why software development has already seen relatively fast agent adoption: the work is high-value, mostly text-based, and the tooling ecosystem is mature.
Call center agent
A call center agent may use fewer tokens, roughly 2 million per day, but cost around $92 per day if it relies heavily on real-time voice. That makes full voice-based automation more complex and expensive. Goldman therefore expects many customer-service workflows to move toward text-based automation first, with voice used selectively.
Data entry agent
A data entry agent can consume around 25 million tokens per day and cost roughly $59 per day. That reflects a low-modality but high-volume workflow, which can still be substantially cheaper than human labor per day.
These examples capture perhaps the analysis's most important operational point: it is not token volume alone that determines adoption. What matters is the all-in API cost, the quality of the output, the modality, and the implementation complexity.
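Dividing Goldman's cost-per-day figures by the token volumes makes the modality point explicit: the implied price per million tokens is roughly 25x higher for the voice-heavy workflow. The tokens/day and $/day inputs are Goldman's; the division is ours.

```python
# Implied cost per million tokens for the three modeled agents.
agents = {
    "coding":      (7_000_000,  13.0),   # tokens/day, $/day (Goldman figures)
    "call center": (2_000_000,  92.0),
    "data entry":  (25_000_000, 59.0),
}
for name, (tokens, usd) in agents.items():
    print(f"{name:12s}: ${usd / (tokens / 1e6):6.2f} per 1M tokens")
# coding ~ $1.86/M (text), call center ~ $46/M (real-time voice),
# data entry ~ $2.36/M: modality, not volume, drives the cost gap.
```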
14. Labor is not necessarily replaced one-for-one
Goldman argues that agentic AI should not be understood only as substitution of human labor. The technology can also expand the total amount of work or service delivered.
The analysis uses customer service as an example. When call-center capacity is constrained, queues form and customers give up. AI agents can serve some of that unmet demand, meaning companies can both become more efficient and deliver more service. Human labor will therefore often evolve toward more complex tasks rather than simply being replaced outright.
Goldman cites Navan as an example, where AI has achieved over 50% call deflection with high customer satisfaction while humans handle more complicated customer needs.
15. Investment implications: hyperscalers and model providers
For hyperscalers and model providers, Goldman sees agentic AI as a major driver of compute demand and as an opportunity to unlock value through distribution, intent capture, and better monetization.
The analysis notes that these players remain supply-constrained on compute and that capex intensity remains high. Investors will therefore increasingly demand visibility into the return on AI investments.
In this group, Goldman prefers:
| Company | Goldman rationale |
|---|---|
| Amazon | AWS growth, large backlog, and momentum around custom silicon such as Trainium and Graviton. |
| Alphabet | Cloud momentum, multimodal Search, and a full-stack AI position. |
| Meta | Strong digital ad business, AI-driven engagement, and better ads monetization. |
16. Investment implications: semiconductors
For the semiconductor sector, Goldman sees a clearly positive effect from hyperscaler and model-provider capex. Falling token costs make more agentic use cases viable, expanding the addressable compute market. Volume elasticity can more than offset falling compute costs.
Goldman prefers:
| Company | Goldman rationale |
|---|---|
| Broadcom | Leader in custom computing and ASIC solutions for hyperscalers and model providers. |
| Nvidia | Continued leadership in AI performance across training and inference. |
| AMD | Strengthened position in data-center GPUs and enterprise CPUs; possible advantage as CPU attach rates rise in agentic workloads. |
17. Investment implications: software and IT services
For software and IT services the picture is more nuanced, but Goldman sees long-term tailwinds.
Lower token costs make it easier for software vendors to build in agents without destroying gross margins. At the same time, pricing may move from seat-based models toward payment for outcomes, productivity, or "units of work".
Goldman prefers:
| Company | Goldman rationale |
|---|---|
| Microsoft | Copilot feedback is improving; Copilot can coexist with domain-specific agents and app software. |
| Cloudflare | Can take share of AI inference workloads via performance and cost advantages in edge/network architecture. |
| Accenture | Can benefit from the enterprise shift from AI pilots to scaled agent deployments, especially integration, governance, and change management. |
Overall assessment
Goldman's analysis presents agentic AI as a potential economic inflection rather than just a new product category. The core idea is that AI agents can drive a massive volume increase in tokens at exactly the moment token economics are improving. AI infrastructure could thus become more profitable, not less, even as usage explodes.
The key points:
- Agentic AI is far more token-intensive than chatbots.
- Consumer agents can lift usage via search, shopping, travel, email, and personal productivity.
- Enterprise agents will likely become the largest long-term token source.
- Token prices appear to be stabilizing while compute costs fall.
- That can create a positive margin inflection for hyperscalers and model providers.
- Enterprise adoption will likely be S-shaped and take roughly 15 years to peak.
- The biggest barriers are trust, governance, ROI visibility, and implementation complexity.
- The first winning workflows will be text-heavy, high-value, and relatively mature, with coding first among them.
- Voice and multimodal workflows may take longer due to higher API costs.
- Goldman especially prefers Broadcom, Nvidia, AMD, Alphabet, Amazon, Meta, Microsoft, Cloudflare, and Accenture.
In short: Goldman sees the agentic economy as the next phase of the AI cycle, in which value moves from one-off chatbot interactions to continuous, autonomous workflows. If Goldman's cost and adoption assumptions hold, the AI cycle could turn out larger, more profitable, and more durable than the market has priced in so far.
———————–
Excerpts from Goldman Sachs – with charts:
Step aside Generative AI: Agentic AI has burst onto the ever-evolving AI scene with significant promise, capturing the imagination of industry practitioners and investors alike (and certainly countless grifters). As Goldman writes in its in-depth - and very bullish - report on the topic, "Decoding the Agentic Economy: The Coming Inflection in AI Usage and Margins", published earlier this week: "at its fullest, Agentic AI can handle a wide range of tasks currently done by humans in a fully autonomous way. On the other hand, Agentic AI could be misdirected and counterproductive, consuming vast resources with little return." In this report, Goldman outlines some of the likely use cases for Agentic AI across the enterprise and consumer sphere - and quantifies potential upside to business outcomes, along with the investment levels required. It concludes with its top trade recommendations for the sector.
Moving Past Concepts to Numbers: Goldman sees Agentic AI driving a dramatic increase in token consumption of 24X or 120 quadrillion tokens per month by 2030. The bank thinks enterprise agents will be the largest token multiplier, lifting token consumption 55X by 2040. Consumer agents will broaden usage away from episodic chats to utility beyond traditional search, driving 12X token consumption by 2030.
An Economic Inflection Point for Hyperscalers and Model Makers: Critically, Goldman outlines why Agentic AI is an inflection in token economics ahead for hyperscalers and model providers, enabled by continued token cost declines (powered by leading semi companies driving 60%-70% lower annualized cost per token) and stabilizing token prices (which have moved from ~40% annual declines to flat or increasing pricing). These improving economics are likely to lift margins and the overall economic model for hyperscalers and model providers – with a positive gross margin inflection likely over the next 3-12 months – thus making infrastructure spending more sustainable for the entire ecosystem.
Where Goldman’s view is differentiated: The bank has taken a bottom-up approach to arrive at its forecasts, building “real world” implementable agents in pseudo-code to estimate token consumption and overall costs. Similarly, on the compute cost side, Goldman used chip performance, benchmark, and pricing data to arrive at its estimates of all-in token costs and estimate margin inflection points for the industry.
Investment views: In Semiconductors, Goldman prefers Broadcom (leader in custom silicon), Nvidia (high-performance merchant solutions leader), and AMD (enterprise CPU leader & emerging GPU player). In Internet, the bank prefers Alphabet (cloud & consumer computing utility), Amazon (leader in cloud computing, eCommerce), and Meta (digital advertising & spatial computing). In Software, the preference is Microsoft (leader in broad enterprise workflows), Cloudflare (leader in edge compute), and Accenture (AI business transformation).
Executive Summary
AI Unit Economics and Margins Inflect as Agentic AI Usage Takes Off
Token explosion meets margin inflection. According to Goldman, Agentic AI could drive a step-function change in token consumption, just as token economics are beginning to improve. Although the industry is compute-capacity constrained in 2026, the bank’s inference price vs. cost analysis curve suggests that unit economics of tokens are set to improve. Leading LLM token prices have now started to stabilize – but underlying compute cost per token across Nvidia, AMD, Google TPU, and Trainium continues to fall significantly faster. In other words, the Agentic AI explosion in token consumption modeled by 2030 may not only reflect a demand story, but also a margin expansion and profitability story across the AI value chain.
Sizing the Consumer Agent Opportunity
Goldman estimates consumer AI agents can lift global token consumption 12X by 2030. Consumer AI queries are already a large and growing market, but the mix is shifting quickly as AI overviews and LLM agentic queries take share from traditional search. At ~23 billion AI queries per day by 2030, up from ~5 billion in 2025, Goldman believes up to 30% will be directed to agents across search, shopping, travel, email, and other personal productivity functions. This would add 60 quadrillion tokens per month by 2030, or 12X current global token consumption as of 2026. Simulations of basic consumer agents suggest the largest token step-up will come when agents move from user-initiated tasks to persistent, “always-on” background activity that continuously monitors context and acts when needed.
Sizing the Enterprise Agent Opportunity
Goldman estimates enterprise AI agents can lift global token consumption 24X by 2030 and 55X by 2040. Enterprise adoption of agentic AI is still early: while surveys suggest 70–90% of enterprises are experimenting, less than one-quarter are scaling agents. Ultimately, the curve will likely be S-Shaped. To validate token intensity, Goldman built simulated agents across AI-exposed occupations to estimate the minimum tokens required for an AI agent to replicate core workflows. It found that some agents can consume very high token volumes but remain relatively low cost if the workflow is mostly text-based (such as coding), while others can consume fewer tokens yet carry much higher API cost because they require multi-modal processing (such as real-time calls and video). This tension means that Goldman does not expect agentic ROI to unlock evenly across all enterprise workflows.
Unit economics are inflecting for Agentic AI: Implications for margins, ROI, and CapEx
Token volume explosion may finally give way to a positive margin inflection for the hyperscalers
The Bottom Line: Agentic AI is likely to drive a step-function change in token consumption, just as token economics are beginning to improve for hyperscalers and LLM providers. By 2030, Goldman’s bottom-up framework (built on simulated agents that validate real-world token intensity) suggests that global token demand could rise by 2,400% versus 2026 levels, with consumer agents potentially reaching ~60 quadrillion tokens per month and enterprise agents reaching ~56 quadrillion tokens per month (or 278 quadrillion by peak adoption in 2040), assuming adoption broadens.
More important, the bank’s inference token price vs. cost curve suggests that the underlying unit economics of tokens are set to improve significantly for hyperscalers and LLM providers. Leading LLM token prices have declined rapidly, but have now started to stabilize or even increase in some cases. At the same time, Goldman’s calculated all-in token compute costs for hyperscalers and LLM providers – powered by Nvidia, Broadcom, AMD, and Marvell – appear to be falling faster. Simply put – the dramatic increase in token consumption modeled by 2030 may not only reflect a demand and revenue narrative, but also a margin expansion and profitability story across the AI value chain.
In the first phase of the AI cycle, investors largely viewed compute and tokens as a cost driver: more usage meant more inference load, more accelerators, more power, and more CapEx. But the shape of the inference price vs. cost curve suggests that this trend is changing. Although leading LLM token prices have declined meaningfully over time, they have now started to stabilize or even increase in some cases. At the same time, the calculated all-in cost of compute per token for Nvidia, Google TPU (Broadcom), AMD, and Trainium (Marvell) continues to fall rapidly and more consistently. If token prices stabilize at levels higher than token costs, this implies that an increase in agentic AI adoption could produce positive margin expansion, not just revenue growth. In other words, the industry could be moving from a phase where inference economics were uncertain and potentially dilutive to margins to a phase where token growth increasingly drops through at attractive incremental margins.
Agentic AI may also create a self-reinforcing economic flywheel as compute costs decline:
- lower compute cost per token enables richer, more complex agents;
- richer agents consume considerably more tokens through longer context, more loops, more validation, and more persistent monitoring;
- higher utilization improves the economics of AI infrastructure and better economics allow providers to keep investing in model quality and distribution.
This flywheel is very different from the prevailing market narrative that AI usage will simply drive an increasing and unsustainable cost burden. However, there continues to be a significant risk that this positive margin inflection is not guaranteed across all AI workloads. Competition could still force token prices down faster than compute costs, especially for more commoditized text-only chatbots.
But even assuming some level of pricing pressure, Goldman believes that the overall price vs. cost trend suggests that the industry has significant room for economic improvement as accelerator efficiency, model optimization, routing, caching, and utilization continue to scale. The key investment conclusion is not that every token will be profitable, but that the marginal economics of agentic AI may improve at the same time that token volumes accelerate. That combination is what makes the agent economy potentially much larger, more profitable, and more durable than a simple extrapolation of today’s chatbot usage would imply.
Agent ROI: Falling token costs pull more enterprise use cases into the money
The Bottom Line: The margin story is not just about cheaper tokens; it is about cheaper tokens expanding the set of enterprise AI use cases that can deliver positive ROI to the ultimate AI end consumer: enterprises. To test that, Goldman translated its simulated agent workloads into estimated cost per task and compared those costs against the human labor they could augment or replace across coding, call center, and data-entry workflows. The key conclusion is that even workflows that consume millions of tokens per day can still compare favorably against human labor costs, particularly as token prices continue to deflate.
However, Goldman does not expect this ROI to unlock evenly across all enterprise workflows yet. A coding agent could consume roughly 7mn tokens per day at only ~$13/day, which helps explain why software development has already seen faster agent adoption: the workflow is high-value, largely text-based, and the tooling ecosystem is comparatively mature. By contrast, a call center agent could consume roughly 2mn tokens per day but cost ~$92/day if it relies heavily on real-time voice, making full voice-based automation materially more expensive and operationally complex than outsourced human labor presently.
Still, the direction of travel is increasingly favorable. As token costs fall, the breakeven threshold for enterprise agents moves lower, pulling more workflows into positive ROI over time. This is the core economic bridge between Goldman’s bottom-up token estimates and enterprise adoption: agentic workflows may be token-intensive, but if the cost per token declines faster than the complexity of the workflow rises, then many agents can generate attractive returns well before they reach full autonomy. For investors, this means the enterprise agent opportunity should not be evaluated only through today’s product maturity or near-term deployment friction. It should be evaluated through a declining cost curve, where each step down in compute costs expands the set of workflows that software, services, and infrastructure providers can economically automate and generate positive ROI.
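A minimal sketch of that declining-cost breakeven logic: project each agent's daily cost forward at a 65% annual token-cost decline (the midpoint of the 60-70% range cited earlier) and report when it falls below a human-labor benchmark. The $40/day benchmark below is a hypothetical placeholder of ours, not a figure from the report.

```python
# Sketch of the breakeven-threshold logic under a declining cost curve.
def years_to_breakeven(cost_per_day: float, human_cost_per_day: float,
                       annual_decline: float = 0.65) -> int:
    """Years until the agent's daily cost falls below the benchmark."""
    years = 0
    while cost_per_day > human_cost_per_day:
        cost_per_day *= (1 - annual_decline)
        years += 1
    return years

# $/day starting costs are Goldman's; the $40/day benchmark is hypothetical.
for name, cost in [("coding", 13.0), ("call center", 92.0), ("data entry", 59.0)]:
    print(f"{name:12s}: breakeven in {years_to_breakeven(cost, 40.0)} year(s)")
```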
The Consumer Agent Landscape
What does the consumer landscape look like today?
Goldman sees trends of increased consumer utilization and a more robust product innovation landscape that continues to reflect progress of evolving consumer habits against the backdrop of an evolving computing landscape. This shift is characterized as a transition from consumers using AI for conversation and tool-based functions (prompting chatbots for information retrieval/action) and towards a potential end state where LLM agents and AI systems may play a more involved role in consumers’ lives day-to-day, given the autonomy to execute multi-step workflows directly within a user’s computing environment without being prompted by the user.
To level-set, consumer AI adoption continues to be broad-based and accelerating, as users increasingly turn to generative AI products with queries (away from traditional search) and interact with AI applications more frequently due to embedded features within consumer-facing products, as companies seek to implement more features in existing products and interfaces (i.e. shopping assistants such as AMZN’s Rufus) while developing and launching AI-native products.
Monthly active users (MAUs) continue to scale across leading AI platforms (both on web and in-app). Goldman notes that consumer preference for AI platforms continues to broaden and diversify as well, both as a function of operators pushing product/platform developments that capture user intent and as consumers become more comfortable with AI, finding incremental value in specific platform strengths. Gemini has established itself as the second-largest application by leveraging Google’s extensive distribution channels, showing substantial growth in both user acquisition and engagement.
Given the spectrum of consumer-facing AI applications and use cases available today, Goldman identifies three emerging categories of AI products: a) conversational chatbots, b) embedded tools, & c) agents. These are delineated by increasing levels of inference/autonomous functionality performed by an LLM, with decreasing reliance on instruction/prompts from the user:
1. Chatbots. Today, chatbots (i.e. ChatGPT, Gemini, Claude) represent the lowest-friction entry point for consumers – tools that the user accesses voluntarily and prompts to create an output, typically as a medium for discovery and with the goal of identifying information.
2. Embedded tools. Embedded tools, such as Gemini’s Personal Intelligence system, sit one layer deeper in the stack and are adopted by users who seek to use AI in a more interactive role in their lives and still operate on a set of instructions from the user. These tools operate autonomously/independently (beyond the generalist chatbot function) given a set of instructions, coordinating with integrated apps/personal data history to execute on and suggest tasks.
3. Agents. The third and most autonomous degree today, agents are characterized by minimal direct user input, running persistently in the background across devices or applications.
Where are we going?
The key enablers to broader consumer agent adoption are a) mitigation of user friction and value realization on the consumer end & b) operators overcoming supply constraints to meet compute demand, both at current levels and at scale. Historical technology cycles (Web 1.0 – first iteration of Internet, Web 2.0 – mobile/interactive computing) show that durable consumer adoption takes hold when technology fundamentally reduces user friction.
Goldman frames forward consumer AI adoption through the lens of daily query volume and shifting query share mix as consumer computing habits evolve – with traditional search engine volume growth moderating and LLM query volume accelerating as consumers increasingly utilize generative AI products/tools over time. The bank’s base-year view reflects a market where traditional search still dominates query volumes, but AI has already introduced incremental demand. Looking to 2030, Goldman assumes continued growth in total consumer queries, driven by a mix shift towards LLM mediated and AI-native interactions, partially offset by ongoing compute costs (albeit decreasing).
As per-query costs decline, absolute compute demand rises as AI expands the surface area of consumer interaction.
Consumer Agents: Sizing the Token Impact
Consumer agents could expand AI demand from answering questions to running everyday workflows
The Bottom Line: Goldman estimates that consumer AI agents could become a significant driver of token volumes as consumer AI behavior shifts increasingly from episodic chatbot sessions to on-demand agents and always-on personal assistants (e.g. OpenClaw). Assuming broad adoption of these agents by 2030, the bank estimates that consumer agent workloads could reach ~60 quadrillion tokens monthly and global token consumption could rise ~12X from current levels.
Goldman estimates the potential for ~23 billion AI queries per day by 2030 (up from ~5 billion queries per day in 2025), with up to 30% of consumer queries directed to LLM agents across search, shopping, travel, email, and other personal productivity functions. This would result in an additional ~60 quadrillion tokens processed per month by 2030, or ~12X the current global token consumption as of 2026.
In short, the key consumer token multiplier is not simply “more chatbot usage”, but instead a major shift in usage patterns. Today, most consumer AI usage is still episodic: a user asks a question or gives a prompt, receives an answer from the LLM, and exits. Goldman sees evidence of this behavior in the average tokens consumed per query, which as of 2025 averaged 1,715 tokens (equivalent to a ~3-5 minute session with an AI chatbot). However, in an agentic consumer model, the AI tool plans, executes, monitors, and acts on the user’s behalf across a range of daily tasks. The single largest token step-up is likely to come as agents move from user-initiated tasks to persistent or “always on” background activity, where the agent continuously monitors context and decides when action is needed.
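Working backwards from the figures just quoted gives a sense of how token-intensive agent-directed queries would have to be; the division and rounding below are ours, the inputs are Goldman's.

```python
# Implied token intensity per agent-directed query, from the quoted figures.
queries_per_day_2030 = 23e9        # ~23bn AI queries/day by 2030
agent_share = 0.30                 # up to 30% directed to agents
tokens_per_month = 60e15           # ~60 quadrillion consumer-agent tokens/month

agent_queries_per_day = queries_per_day_2030 * agent_share      # ~6.9bn
tokens_per_query = tokens_per_month / 30 / agent_queries_per_day
print(f"~{tokens_per_query / 1e3:.0f}k tokens per agent-directed query")
# ~290k tokens, versus the ~1,715-token average 2025 query: agentic
# interactions roughly 170x more token-intensive than today's chats.
```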
Consumer agents: two distinct models, divergent token demand
To frame a realistic forecast of the consumer AI agents in coming years, consumer agents can be segmented into “on-demand” and “always-on” use cases – and today’s agentic products already appear to follow this line. Tools such as OpenAI Operator, Claude Code, and other browser-based agents are predominantly on-demand. This means users must initiate a task – and from there, the agent then plans, takes actions, loops through execution, and finally stops once the task is complete and approved. By contrast, Goldman believes the next wave of consumer agents may look more like persistent email monitors, digital life assistants, or schedule managers that perpetually run in the background and act only when needed (seen with OpenClaw). Understanding the token consumption model of this latter always-on consumer agent category matters most because enterprise agents are also persistent by design, having to operate continuously across business workflows rather than waiting for discrete user prompts.
Inside the agent: why agents consume significantly more tokens
To ground its consumer token forecast, Goldman built simulated agents that show how token usage accumulates across a full workflow rather than a single chat response. The travel booking agent flow (Exhibit 14) illustrates the core mechanism: an agent parses intent, clarifies missing inputs, searches external systems, ranks options, incorporates user feedback, validates availability, and loops before completing the task.
The accompanying tables translate that logic into token counts by step, showing why even simple consumer agents can become materially more token-intensive as they add context, tools, validation, and repeated decision loops (Exhibit 15).
The bank observed through its simulated agents how token demand rises materially as agents shift from discrete, user-initiated tasks to persistent background operation. Based on the agents tested, LLM chatbots consume roughly 1,000 tokens per session, while embedded copilots consume >5,000 tokens per day, or several times more per user before accounting for broader adoption. More active, always-on agents can reach >100,000 tokens per day, highlighting that the biggest driver of token demand is not just task complexity, but how continuously the agent remains active.
The Enterprise Agent Landscape
Goldman sees two key milestones in the Enterprise Agent landscape over the past two years: 1) Salesforce announced Agentforce in August 2024, which raised the ceiling for AI implementations (at least in theory) in customer experience from text-based RAG chatbots to more sophisticated processes; and 2) Anthropic announced Claude Cowork in January 2026, which added a knowledge worker wrapper to its coding tool. In practice, enterprise adoption of agentic AI is early. Headline survey data suggests upwards of 70–90% of enterprises are now experimenting with AI agents, but functional penetration is limited (McKinsey; PwC). Fewer than one quarter of firms are scaling agents and adoption is typically confined to one or two workflows. These are generally bounded, high‑ROI use cases: customer support triage, IT operations, sales enablement, and internal knowledge workflows. (conversations with WRITER showcase their success with targeting very specific workflows). Additionally, these are often under the banner of “copilots” rather than fully autonomous agents.
The key enablers are organizational rather than technical. Advances in foundation models, tool‑calling, and memory architectures have removed much of the core technical challenges, such that cloud software vendors are shipping agents that increasingly can do more. The main area of focus in industry conversations has shifted from model performance to orchestration; enterprises are less focused on whether agents can act, and more focused on how and when they should be allowed to act. As a result, today’s deployments are overwhelmingly human‑in‑the‑loop, workflow‑embedded, and tightly permissioned, reflecting a bias toward augmentation rather than autonomy. There are two limiting factors:
- Trust at scale: data governance, security boundaries, deterministic vs. non-deterministic controls, auditability, and change management. Enterprises struggle most where agents must operate across systems of record, particularly when accountability for outcomes is diffuse.
- ROI visibility: benefits often show up as productivity gains rather than hard cost take‑out, which complicates budgeting and ownership. In addition, new highs on inference costs illustrate the importance of guardrails on cost.
In short, agentic AI is technologically ready, but economically and institutionally constrained.
While Goldman stops short of predicting an inflection in the next 12 months, it does believe enterprises will see gradual adoption in agent use cases as risk management frameworks improve and POCs move to production. The next phase of enterprise adoption is likely to center on domain‑specific agents embedded deeply within vertical workflows (finance, customer support, etc.) where success criteria are well-defined and error costs are understood; the bank models examples of these in the following section. Over time, enterprises that invest early in agent governance, observability, and exception handling will unlock higher levels of autonomy, shifting agents from task execution to goal‑level orchestration.
What will adoption look like over the next 10 years?
Goldman introduces an adoption curve for enterprise agentic AI to inform investors’ forecasts of compute demand, software demand, and IT budgets, which in turn serve as inputs for the companies across Goldman’s TMT coverage. The two questions to focus on are: i) what is peak adoption, and ii) how long does it take to get there? To guide its forecasts, the bank leveraged Comin and Hobijn’s 2009 dataset on Cross-country Historical Adoption of Technology (CHAT). This project compiles data on the adoption of 101 technologies in 161 countries from 1800-2000. Many of these technologies are primarily consumer-driven, but 39 are inputs to enterprises and have a timeseries robust enough for analysis. Understanding the diffusion of these prior technologies is a helpful guide to how technologies become adopted over time. Ultimately, Goldman predicts that the curve will be S-Shaped and that agentic demand will reach 20x-25x today’s levels, based on these historical analogues and what we know about the technology today.
1. What is peak adoption?
- Peak adoption is more complicated than simply measuring the percent of enterprises using agents. Some technologies work this way; Comin and Hobijn’s data on the number of telephones per capita shows no country went above a 1:1 ratio, while data for radios shows ~2 is the natural ceiling. Given homes, offices, and some public places are the most likely places to have phones and radios, it makes sense that the ratio was roughly ~1-2 per person. Other technologies, however, do not have an obvious ceiling. The number of mail parcels sent per capita varied widely by country, and the range of 7-665 per person does not appear constrained by an inherent physical limit such as the number of households. Instead, economic factors mattered more, with richer countries generally having more mail sent per capita. Similarly, the number of telegrams sent clearly increased over time, was higher in wealthier countries, and declined during the Great Depression, meaning it was impacted by real-world economic factors. But the peak quantity sent (~1,000-~5,000) varied between countries the way mail penetration did, and did not appear governed by a physical ceiling.
- There are ways to theorize/contextualize a logical upper limit. In this context, the most logical way to measure and forecast peak agentic penetration is the way software companies are increasingly pricing their offerings: in terms of “units of work.” The two inputs here are 1) how much knowledge work there is to do at enterprises/in the economy; and 2) how much of it ends up being done by agents. The analysis in the following section is essentially the product of these two variables (with additional assumptions on incrementality/substitution), with the result being ~35-40% peak workflow adoption, implying ~1.4 trillion labor-hours, ~280 quadrillion tokens, ~$220bn flowing to AI infrastructure annually at peak, and a $5.4trn software TAM.
- Another conclusion from the data is that peak adoption of a new technology can be higher than prior technologies. This is because new technologies do not necessarily replace existing technologies. The various forms of communication technologies (mail, telegrams, radios, telephones, and television) were not substitutes for each other; instead the overall TAM of communication/information technology expanded. Applying this framework to agentic AI, agents do not necessarily need to substitute 1:1 for human labor; the amount of knowledge work that is done in the economy should expand due to agentic AI. Consider the example of a call center (also explored below). A simple framework that does not factor in an expanding pie would suggest that at a certain ratio of agent cost to human wages, firms will switch all their labor from human call center workers to AI agents (100% peak adoption). But as anybody who has ever been on hold with customer support can attest, there is unmet demand in the form of long wait times and customers who give up on waiting and hang up. Without resource constraints, wait times would be zero, and no customers would ever give up. AI agents can theoretically serve this unmet demand, which means the total amount of customer service provided increases. Even if firms decide to retain some of the savings of automation, some of the leftover budget becomes consumer surplus. The net of this means that firms become more efficient, more of the service is provided, and human labor is likely to evolve rather than be replaced. We have seen real-life examples of this: Navan (travel and expense management for enterprises) has >50% AI call deflection rates with high customer satisfaction. AI handles most of Navan’s basic queries, and humans support customers with more complicated needs.
2: How long does it take to get there?
- The speed of adoption varies. Rail, steamships, and landline phones all took 100 years or more to peak. Cable TV, ATMs, and new surgery techniques took 20 years or less. The median technology in the dataset took 29 years.
- The shape of the curve varies. Some technologies very quickly diffuse following their invention (and/or an up-front investment period), following a J-Curve (internet in the 1990s, the blast oxygen furnace method for steel production). Some follow an S-curve (at-home kidney dialysis, the telegram) where adoption starts slowly, then accelerates rapidly following a tipping point (often driven by network effects), before plateauing. Finally, some technologies lack virality or face significant capital constraints, such that the spread of the technology is more constant over time (postal volumes/electricity output). We can see elements of all of these factors for agentic AI, given: a) the significant capital constraints the sector faces implies linear adoption; b) token economies of scale/network effects from figuring out use cases with ROI implies an S-curve adoption; c) the rapid trial from enterprises and consumers ~3 years on from ChatGPT’s popularization implies virality and hence J curve adoption. The J-curve scenario is the most bullish for demand, while the linear scenario implies a more tempered outlook.
- The speed of adoption is independent of the shape of the curve. For example, both at-home kidney dialysis and the telegram roughly followed S-Curves, but at-home dialysis took ~25 years to reach peak penetration while telegrams took 70. Because agentic AI is a non-physical good, and because the peak is bounded by economics and not per capita constraints (e.g. number of households), the peak of demand might be so large that the technology could take a long time to diffuse. On the other hand, the pace of change may be speeding up, with physical mail diffusing for millennia before peaking (in 2006 in the US), telegrams/the radio each taking ~75 years, and the internet at ~75% penetration in 2025 according to the United Nations, 36 years after the advent of the World Wide Web. Goldman therefore thinks a <20 year time-to-peak is not unreasonable for agentic AI.
Goldman lays out its baseline enterprise agentic forecast:
1. An S-Curve is most likely: We are in an initial trial/adoption period. As we near 2030, Goldman expects a critical mass of firms will begin adopting agents at the steepening part of the curve followed by a leveling out as the laggards adopt. It will take time for the technology to diffuse given: 1) There are upfront costs outside of tokens that only need to be spent once, such as data hygiene and process management; 2) there are institutional barriers that need to be overcome; 3) enterprises are still figuring out the use cases, which in the short term will skew domain-specific and more likely to leverage SLMs than LLMs. Further out, though, data hygiene and process management will be less of a barrier as once an enterprise has leveled up its data warehouse strategy, new data created will by default live within an agent-friendly framework. As the use cases are figured out and workers get used to the technology, adoption will follow.
2. Goldman sees a time to peak adoption of 15 years. It expects the time to peak adoption to be faster than the median technology (26 years), with diffusion peaking at year 15. There are plausible arguments for a faster/elongated time to peak adoption. On the side of faster adoption: a) newer technologies appear to be reaching mass adoption more quickly than before; b) a key input in building software (the cost of code) is going down, enabling more experimentation/finding use cases faster; c) we know that startups are reaching milestones of $100mn and $1bn much faster than before because of their success in monetizing specific use cases; d) enterprises are acutely aware of the need to show progress with AI roadmaps to minimize the risk of market share loss to more tech-forward competitors. On the side of a longer path to peak adoption: a) the enterprise technologies that took longer to diffuse tended to be commodity technologies with wide applications like air freight, electricity, or telegrams; b) the long tail of laggards may take a very long time to spread.
Enterprise Agents: Sizing the Token Impact
Enterprise agents could drive the greatest token consumption
The Bottom Line: By building model agents which simulate workflows across AI-exposed occupations, Goldman estimates the minimum tokens required for an AI agent to replicate core workflows is likely to be far greater than for basic chatbots, given agents’ always-on, persistent nature. Assuming ~37% knowledge-worker adoption by 2040 (peak adoption), global token consumption could rise by ~55X from current levels.
Goldman estimates that ~37% adoption across global knowledge workers could drive total token consumption to ~55X current levels (or ~278 quadrillion tokens processed per month) by 2040, with enterprise workloads representing >70% of total usage. Although consumer agents have already reached a larger user base than enterprise agents in 2026, enterprise agents should be materially more token-intensive per user and eventually become the primary source of incremental token demand by 2030. Critically, this is because enterprise workflows require agents to perform considerably more complex and precise actions (e.g. monitor tasks, retrieve context, reason through exceptions, validate outputs, update systems, and escalate issues throughout the workday). Enterprise agents may also involve heavier multi-modal inputs (voice, images, documents, screen activity, application data, logs, and structured system records) in order to replicate the complex and varied workflows of real knowledge workers, which can materially increase token intensity versus text-only consumer prompts. In short, the core difference is rigor: a consumer agent can often stop at a “good enough” answer, but an enterprise agent may need multiple loops to produce work that is accurate, auditable, and usable for real business processes.
Where is AI being used in today’s enterprise?
To estimate token intensity, Goldman built model agents for the most exposed occupations (according to Anthropic’s AI usage observations) in order to gauge where enterprise token demand could scale first. Specifically, Anthropic defines “observed exposure” based on actual Claude usage patterns mapped to occupational tasks, rather than a purely theoretical assessment of whether a job could be automated. In Goldman’s analysis, these occupations represent roles where users are already observed applying AI to meaningful parts of the workflow, including software development, customer service, data entry, administrative support, QA, security analysis, and other knowledge-work functions.
Goldman built its own model agents to move beyond abstract “job exposure” and directly estimate tokens consumed by task steps, reasoning loops, validation checks, tool calls, and multi-modal inputs. It then used these bottom-up agent builds to triangulate how total enterprise token demand could scale through 2030 as adoption broadens across exposed occupations.
Why roles exposed to potential automation may not translate to rapid agentic AI adoption
The Bottom Line: In the enterprise, agentic AI adoption will depend less on token volume and more on the full API cost of replicating a user’s workflow. The radial charts (Exhibit 26) show a key tension: some agents can consume very high token volumes but remain relatively low cost if the workflow is mostly text-based, while others can consume fewer tokens but carry much higher API cost because they require real-time voice, cadence, latency, and multi-modal processing.
This is clearest in the contrast between coding and call center agents. Goldman estimates a coding agent could consume roughly 7mn tokens per day at only ~$13/day, which helps explain why software development has already seen faster agent adoption: the workflow is high-value, largely text-based, and the tooling ecosystem is comparatively mature.
By contrast, a call center agent could consume roughly 2mn tokens per day but cost ~$92/day if it relies heavily on real-time voice, making full voice-based automation materially more expensive and operationally complex. As a result, many customer-service workflows may initially shift toward text-first automation, with voice used selectively rather than continuously. Finally, Goldman estimates a data entry agent could consume roughly 25mn tokens per day and cost ~$59/day, reflecting a low-modality but high-volume workflow that may still be considerably cheaper than the cost of a human per day.
Overall, adoption speed by occupation will depend on four variables: token volume, API cost, modality mix, and implementation complexity. Text-heavy workflows with mature tools should scale first. Voice-heavy or deeply integrated back-office workflows may scale more slowly, even when the underlying tasks are highly exposed to potential AI automation. As a result, adoption may be uneven across occupations because the most exposed roles are not necessarily the cheapest to automate. Enterprises may first optimize existing workflows around current price points before full-scale human-equivalent agents become cost-effective. However, as model costs fall and performance improves, organizations may increasingly integrate agents into production.
Inside the agent: What simulating Goldman’s own agents taught us about enterprise agent architecture
The Goldman methodology consisted of building occupation-specific, minimally-functional agents on paper by decomposing each role into workflow steps, model calls, tool calls, validation checks, and retry loops. This is an appropriate way to measure token intensity because enterprise agents will rarely run as a single end-to-end model, and Goldman’s agents represent the likely minimum token intensity for such workflows.
What distinguishes an AI agent from a chatbot? A chatbot typically responds to a discrete user prompt; an autonomous enterprise agent must ingest context, decide what to do, retrieve relevant information, execute the task, check its own work, resolve errors, update systems, and produce a usable output. That means the agent is not a single model call, but a sequence of model calls across a workflow. Different steps may require different model capabilities. For example, a lower-cost model may classify or extract information, a stronger reasoning model may plan, debug, reconcile conflicts, or resolve exceptions, a code model may write or test software, and a vision or document model may parse images, PDFs, screenshots, or forms. The resulting architecture looks less like a chatbot and more like a workflow system with LLMs embedded across each decision point.
What drives token intensity for AI agents? One key driver of token intensity is the loop structure required to make the output reliable enough for enterprise use. For a software coding agent, loops are needed to read the codebase, generate an implementation plan, write code, run tests, debug errors, review security or performance issues, and then revise until the output passes functional tests. In the data entry agent, loops are needed to ingest documents, extract fields, compare against source systems, flag inconsistencies, correct errors, and validate that the final entry is complete. These loops increase but do not guarantee accuracy, as they simply create more opportunities for the agent to catch and correct mistakes before the output reaches production.
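A hedged sketch of such a loop for a coding agent is shown below; the per-round token figures and the test-pass stub are assumptions chosen only to show how retries compound token use.

```python
import random

def generate(task: str) -> str:
    """Stand-in for a code-model call."""
    return f"candidate patch for {task!r}"

def run_tests(candidate: str) -> bool:
    """Stand-in for executing the test suite; fails ~60% of attempts here."""
    return random.random() > 0.6

def coding_agent(task: str, max_rounds: int = 5) -> tuple[str | None, int]:
    tokens_used = 0
    for _ in range(max_rounds):
        candidate = generate(task)
        tokens_used += 8_000            # assumed tokens per generate+review pass
        if run_tests(candidate):
            return candidate, tokens_used
        tokens_used += 3_000            # assumed tokens to analyze the failure
    return None, tokens_used            # escalate to a human after max_rounds

patch, tokens = coding_agent("fix pagination bug")
print(patch, f"({tokens:,} tokens)")
```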
Bottom line: The token consumption curve can steepen quickly in complex workflows. A clean task may pass through the loop once; however, a messy task with incomplete inputs, failed tests, or additional human guidance may trigger multiple rounds of reasoning and validation.
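A simple way to see the steepening: if each pass through the loop succeeds independently with probability p, the expected number of rounds is 1/p, so expected token use scales inversely with how clean the task is. The probabilities below are illustrative.

```python
# Expected token use under a geometric retry model (illustrative numbers).
def expected_tokens(tokens_per_round: int, p_pass: float) -> float:
    """Expected rounds = 1/p_pass, so expected tokens = tokens_per_round / p_pass."""
    return tokens_per_round / p_pass

print(expected_tokens(8_000, 0.90))  # clean task:  ~8,900 tokens
print(expected_tokens(8_000, 0.25))  # messy task: ~32,000 tokens
```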
* * *
Investment Implications
For hyperscalers and model providers, the rise of consumer agents and agentic computing represents a growing driver of compute demand while also creating an opportunity for value unlock, marking a shift where value lies in distribution and monetization of intent/utilization across both consumer and enterprise landscapes. Overall, operators remain supply constrained in their ability to meet current/forward compute demand (both internally and externally) and continue to invest in the necessary infrastructure to support an evolving computing landscape and broad-based AI adoption. As capex intensity remains elevated (with GOOGL & META raising FY2026 capex estimates and AMZN mgmt reiterating their strategy of maintaining elevated capex coming out of Q1’26 earnings), Goldman expects investors to increasingly look for evidence of scope/visibility for returns, amid the broader market debate over whether large-scale AI usage can become economically attractive enough to justify the capex cycle. Goldman prefers the following stocks (covered by Eric Sheridan):
- AMZN (Buy, $325 12-m PT): Goldman continues to see visibility into returns as AWS revenues compound, supported by a reported $364bn revenue backlog and driven by both AI workloads and rising momentum around its custom silicon (Trainium, Graviton, etc.).
- GOOGL (Buy, $450 12-m PT): Alphabet is seeing momentum across its Cloud business and Search multi-modality, leveraging a full-stack approach as management continues to see AI repositioning the company for sustained growth.
- META (Buy, $830 12-m PT): Meta remains a leader in its core advertising business (significantly outpacing total digital ad industry growth) as the application of AI-related compute is driving momentum around engagement and ads monetization.
For semiconductor companies, Goldman sees a clearly positive impact from ongoing CapEx spending by hyperscalers and LLM providers. Falling token costs enabled by merchant GPU and ASIC leaders will make token-intensive use cases economically viable and hence expand the addressable compute market, as volume elasticity more than offsets lower per-token compute costs (a simple numerical sketch of this dynamic follows the stock list below). Just as important, the margin inflection at hyperscalers and LLM providers creates substantially more headroom for increased CapEx, making today’s elevated infrastructure investments sustainable. Goldman prefers the following stocks (covered by Jim Schneider):
- AVGO (Buy, $480 12-m PT): As the market leader in custom computing, Goldman sees more hyperscalers (Google) and LLM model providers turning to Broadcom to deliver cost-optimized chip solutions tailored to their specific workloads.
- NVDA (Buy, $250 12-m PT): Nvidia can retain its dominant market leadership in the medium term as it remains the leader in AI performance across a broad range of training and inference workloads.
- AMD (Buy, $450 12-m PT): AMD’s market position is strengthening as the company scales its high-performance datacenter GPU offerings over the next two years. Importantly, AMD is also poised for an increasing share of agentic AI workloads in the enterprise as it gains share in x86 server CPUs and the CPU attach rate increases.
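As flagged above, here is a simple numerical sketch of the elasticity-and-margin argument. All growth and decline rates below are illustrative assumptions, not Goldman forecasts, chosen only to show how revenue can grow and margins expand even as per-token prices fall, provided costs fall faster and volume responds.

```python
# Illustrative only: how volume elasticity can offset falling per-token
# prices while costs fall faster. All rates below are assumptions.
volume_growth = 1.9      # assumed annual token volume multiplier
cost_decline  = 0.65     # assumed annual fall in compute cost per token
price_decline = 0.30     # assumed annual fall in price per token

price, cost, volume = 1.0, 0.8, 1.0   # normalized starting points
for year in range(1, 4):
    volume *= volume_growth
    price  *= (1 - price_decline)
    cost   *= (1 - cost_decline)
    revenue = price * volume
    margin  = 1 - cost / price
    print(f"year {year}: revenue x{revenue:.2f}, gross margin {margin:.0%}")
```

Under these assumptions, revenue grows roughly 2.4x over three years while gross margin widens from 20% to about 90%, which is the mechanism behind the claim that elevated infrastructure spending can become self-funding.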
For software and IT services companies, the margin story is more nuanced, but there are longer-term tailwinds. Lower token costs make it easier for software vendors to embed agents into existing products without significantly impacting gross margins, while also allowing them to price around outcomes, productivity, or units of work rather than seats alone. This supports Goldman’s argument that agentic AI can expand software TAM: if the cost of delivering an automated workflow falls while the value of the completed work remains tied to labor substitution or productivity gains, software companies can capture a spread between falling AI delivery costs and the much larger value of the task being automated. Meanwhile, IT services companies stand to benefit as agents shift AI consumption from standalone tools to enterprise-wide, integration-heavy workflow transformation, increasing demand for integration, governance, and managed orchestration to levels not seen with standalone tools. Goldman prefers the following stocks:
- MSFT (Buy, $610 12-m PT, covered by Gabriela Borges): Copilot feedback is improving, and the E7 upgrade cycle may drive further acceleration in Microsoft 365. The most likely scenario may be an ecosystem where Copilot coexists alongside domain-specific agents and domain-specific app software, with usage of each pulling through usage of the others.
- NET (Buy, $250 12-m PT, covered by Gabriela Borges): Goldman expects Cloudflare to take outsized share of AI inference workloads because of its performance and cost advantages, in turn driven by its architectural network advantages and the sophistication of its isolates software.
- ACN (Buy, $300 12-m PT, covered by Jim Schneider): Goldman expects Accenture to see growing tailwinds from agentic adoption as enterprises increasingly move from AI pilots to scaled agent deployments, driving demand for integration, workflow redesign, governance, and change management.