
Agent swarms are quickly moving from theory to practice, with early 2026 model releases making coordinated, multi-agent work feel like a real shift rather than a niche experiment. This episode focuses on Moonshot’s Kimi K2.5, what its agent swarm design reveals about the future of AI work, and why this may mark a transition from single assistants to teams of AI operating in parallel. In the headlines: Anthropic’s huge new funding round and revised revenue forecasts, Nvidia chip sales reopening in China, a UK-wide AI upskilling initiative, and new agentic features from Google and Chinese labs.
Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Zencoder - From vibe coding to AI-first engineering - http://zencoder.ai/zenflow
Optimizely Opal - The agent orchestration platform built for marketers - https://www.optimizely.com/theaidailybrief
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Section - Build an AI workforce at scale - https://www.sectionai.com/
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Robots & Pencils - Cloud-native AI solutions that power results - https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? [email protected]
Today on the AI Daily Brief, is 2026 going to be the year of AI agent
swarms? Before that in the headlines, some big jumps in Anthropic's fundraising and
revenue. The AI Daily Brief is a daily podcast and video about the most
important news and discussions in AI.
Alright friends, quick announcements before we dive in. First of all, thank you to
today's sponsors, KPMG, Zencoder, and Superintelligent. To get an ad-free
version of the show, go to patreon.com slash AI Daily Brief, or you can
subscribe on Apple Podcasts. To learn more about sponsoring the show, send us a
note at sponsors at aidailybrief.ai. Also, if you were interested in the
research that we did at the end of last year, we have our next research
kicking off soon. To keep track of all that, as well as to hear about future
products we have coming, AI maturity maps, AI opportunity radars, and much more,
go to aidebintel.com where you can sign up to get that information as soon as it comes out.
Now with that out of the way, let's dive in. Welcome back to the AI Daily Brief
headlines edition, all the daily AI news you need in around five minutes.
We kick off today with some fundraising and business news out of Anthropic.
The company is close to finalizing their latest funding round, which could raise more than 20
billion dollars. Reports state that Anthropic has between 10 and 15 billion dollars in
firm commitments that could be finalized early next week, including large investments
from Singapore's sovereign wealth fund and Sequoia. Anthropic has also recently
doubled the size of the round from 10 to 20 billion in response to excess investor interest.
One investor told the Financial Times that the round was five to six times oversubscribed
before the size increase. In addition to venture capital and sovereign wealth,
Microsoft and Nvidia have also committed to invest a total of 15 billion in the company,
which is on top of the 20 billion from investment firms. The round would reportedly value Anthropic
at $350 billion, almost double the valuation from their Series F, which closed in September.
The fundraising frenzy firmly cements Anthropic's momentum. Last year, remember,
OpenAI raised 40 billion anchored by 30 billion from SoftBank, meaning that Anthropic is now
neck and neck with those figures. In addition to fundraising news, The Information has an update
on Anthropic's revenue growth forecasts. They report that Anthropic updated investors in December
and hiked forecasts across the board. 2026 revenue is now expected to come in at 18 billion,
around a 4x increase from last year's numbers, and up 20% from estimates made last summer.
In 2027, Anthropic expects to generate 55 billion in revenue. For 2029, their most optimistic
forecast calls for 148 billion. That forecast is particularly notable as it's 3 billion more than
OpenAI's last forecast, which was made during the summer. OpenAI, of course, may have hiked
expectations since then, but still very notable that Anthropic believes they could overtake OpenAI
within three years. The other big number from the financial update was Anthropic's increasing
training costs. They expect to spend 12 billion on training this year, which is a 50% increase
from summer projections. Their forecasts also project training costs to exceed 100 billion by
2029. These increased costs push back Anthropic's timeline for profitability by a year,
with the company now expecting to flip cash flow positive by 2028.
Now, one of the things that Dario and Anthropic have, of course, been weighing in on a lot,
is chip exports to China, with Anthropic being firmly in the camp that we should not be
exporting chips to China. An update on that front, as Beijing has approved
the first batch of NVIDIA chip imports. Reuters reports that Chinese officials have
approved the import of several hundred thousand H200s, allowing access to the advanced chips for
the first time. Sources said the first batch of approvals were primarily allocated to three
unnamed tech giants. The Wall Street Journal later named Alibaba and ByteDance as two of the
three receiving approval. Other enterprises are still in the queue awaiting a subsequent round of
approvals, presumably including high-flying startups like DeepSeek, who may have to wait in line
to set up their H200s. Reports stated that Chinese AI firms will be required to support local
chip makers as well, using their chips for some training tasks and most AI inference.
Basically, it seems like officials are trying to strike a balance, allowing Chinese companies
to train advanced models while also protecting domestic chip makers.
Now, this could be a huge boon to NVIDIA's first quarter financials. Several hundred thousand
H200s is in the ballpark of 10 billion in sales, and that's only the first round of approvals.
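For a rough sense of the math, and this is just a back-of-the-envelope check rather than a figure from any report: if the first batch is something like 300,000 units, and H200s sell for somewhere in the low tens of thousands of dollars each, say roughly $33,000, then 300,000 times $33,000 comes out to about $10 billion, which is how you get to that ballpark.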
In Q2 of last year, when chip exports to China were shut down by the US government,
NVIDIA reported a $5.5 billion write-down associated with losing Chinese sales.
That implies NVIDIA could see record Chinese sales this quarter simply based on this first round
of approvals. NVIDIA CEO Jensen Huang is currently visiting China to meet with local employees,
but reports suggest that he hasn't met with any senior officials. That said, his next stop is
Taiwan, where people familiar with the trip say he plans to ask suppliers to bump up H200 production
to meet Chinese demand. Moving over to the training side of the house, the UK government
has expanded their AI training initiative with an ambitious new goal to upskill every worker
in the country. The Department for Science, Innovation, and Technology announced on Tuesday that
free AI training will be made available to every adult worker. The training will come in the form
of 20-minute online courses with modules covering use cases like drafting text, content creation,
and automation of administrative tasks. Technology Secretary Liz Kendall said,
We want AI to work for Britain, and that means ensuring Britons can work with AI.
Change is inevitable, but the consequences of change are not. We will protect people from the risks
of AI while ensuring everyone can share its benefits. New partners including Cisco,
Cognizant, and the National Health Service will join existing partners including Amazon,
Google, Microsoft, and Salesforce in the upskilling initiative. The Department claimed this would be
the largest targeted training program since the establishment of the Open University in the late 1960s,
which delivers distance learning for higher education. They said the program had already
delivered a million courses and the government would aim to retrain 10 million workers by the
end of the decade. Workers that complete the training will be certified with an AI foundations
badge to give employers confidence they have basic AI skills. Now, there is a lot that we could
say about this. The cynic in me of course sees all of the potential challenges with this program,
most of which sort of amount to a question of whether this is too little to move the needle,
but we've got to start somewhere. Governments need to get involved in a way that is actually
helpful to people adapting to a new world rather than just trying to pretend that they have control
over whether that new world exists. And so for that reason, I think this is a good thing,
and I'm excited to see it hopefully go even farther than they're thinking right now.
Now, our main episode today is about a new model out of China and its agent swarm capabilities,
but Alibaba's Qwen team also released a new model earlier this week, specifically called
Qwen 3 Max Thinking. Now, as you can probably tell from the naming convention, this is the
big flagship model from the Qwen team, their equivalent of GPT-5.2 Pro, Gemini 3 Pro, or Opus 4.5.
The model makes use of an inference technique that the Qwen team is calling heavy mode.
Qwen is doing things slightly differently from existing approaches to test-time scaling,
generating a response, then feeding it back into the model for improvements in a recursive loop.
It appears to be generating some pretty significant gains. Qwen said that this method
improved benchmark scores on GPQA, which is a PhD-level science test, from 90.3% to 92.8%,
while LiveCodeBench scores jumped from 88% to 91.4%. Overall, the benchmarking looks pretty strong.
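For a concrete sense of what that kind of recursive refinement loop can look like, here is a minimal sketch in Python. This is purely illustrative: the generate function is a stand-in for any chat-completion call, the round count is arbitrary, and Qwen has not published the actual heavy mode implementation, so treat the shape of the loop, not the details, as the point.

```python
# Illustrative sketch of a generate-then-refine test-time loop.
# `generate` is a placeholder for a single chat-completion call; it is
# not a real Qwen API, and the details of "heavy mode" are not public.

def heavy_mode(generate, question: str, rounds: int = 3) -> str:
    answer = generate(question)
    for _ in range(rounds):
        # Feed the previous answer back in and ask the model to improve it.
        answer = generate(
            f"Question:\n{question}\n\n"
            f"Current answer:\n{answer}\n\n"
            "Point out any mistakes or gaps, then write an improved answer."
        )
    return answer
```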
Now, the cost is a little beefy for a Chinese open source model.
Qwen 3 Max Thinking comes in at around the same cost as Claude Haiku 4.5,
meaning that it's still much cheaper than models like Gemini 3 Pro or GPT-5.2,
but about 10 times more expensive than DeepSeek V3.2.
Now, Qwen 3 is already being used by many American companies.
Airbnb CEO Brian Chesky, for example, recently said that his company was relying on Qwen 3 as a
more affordable alternative to US models, meaning that you've got to think that they will be watching
this model release closely, although again, how it stacks up compared to Kimi K2.5,
which we will talk about in our main episode, remains to be seen.
Lastly today, it's not just the Chinese labs with some interesting new products to show off.
Google has released a new feature for Gemini 3 Flash called Agentic Vision.
The feature, writes Google, leverages Gemini's state-of-the-art multimodal reasoning with code to
execute unique capabilities. Agentic Vision introduces an agentic think, act, observe loop into
image understanding tasks. Think: the model analyzes the user query and the initial image,
formulating a multi-step plan. Act: the model generates and executes Python code to actively
manipulate images, such as cropping, rotating, or annotating them, or to analyze them, such as
running calculations, counting bounding boxes, and so on. Observe: the transformed image is
appended to the model's context window, which allows the model to inspect the new data with
better context before generating a final response. Overall, this promises to improve
Gemini's ability to annotate images, perform data visualization tasks, and help with basic image
analysis. Google said that the loop improves model performance by between 5% and 10% across
most vision benchmarks.
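To make the think, act, observe pattern a bit more concrete, here is a rough sketch of what such a loop could look like. Everything here is an assumption for illustration: the model object and its plan, write_code, and answer methods are hypothetical stand-ins, and the image handling uses the Pillow library rather than anything Google has published about its internals.

```python
# Hypothetical sketch of a think / act / observe loop for image tasks.
# The `model` object and its plan / write_code / answer methods are
# invented for illustration; image handling uses Pillow (PIL).

from PIL import Image

def agentic_vision(model, image_path: str, query: str) -> str:
    image = Image.open(image_path)
    context = [image]

    # Think: analyze the query and the initial image, form a multi-step plan.
    plan = model.plan(query, image)

    for step in plan:
        # Act: the model writes Python that crops, rotates, annotates,
        # or measures the current image, leaving its output in `result`.
        code = model.write_code(step, context)
        scope = {"Image": Image, "image": context[-1]}
        exec(code, scope)

        # Observe: append the transformed image to the context so the
        # model can inspect it before the next step or the final answer.
        context.append(scope.get("result", context[-1]))

    return model.answer(query, context)
```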
Still, developer experience lead Omar Sanseviero hinted at the most exciting unlock from the new
feature. He showed an output of an annotated image of a table containing a spill.
Gemini had identified the spill, a piece of cloth, and several other items. The annotations appear
to be instructions for a robot to clean up the spill by first clearing away the items in the way,
then using the cloth to wipe up the spill. The implication, of course, is that this
feature could be used to give robots on-the-fly analysis and reasoning ability, allowing them
to tackle tasks that they've never seen before. Ultimately, as I said, when it comes to new models,
the big conversation is around Kimi K2.5, and so with that, we will wrap up the headlines and move
on to the main episode. Hello friends, if you've been enjoying what we've been discussing on the show,
you'll want to check out another podcast that I've had the privilege to host, which is called
You Can With AI from KPMG. Season 1 was designed to be a set of real stories from real leaders,
making AI work in their organizations, and now season 2 is coming and we're back with even bigger
conversations. This show is entirely focused on what it's like to actually drive AI change
inside your enterprise, and has case studies, expert panels, and a lot more practical goodness that
I hope will be extremely valuable for you as the listener. Search You Can With AI on Apple,
Spotify, or YouTube, and subscribe today. If you're using AI to code, ask yourself,
are you building software, or are you just playing prompt roulette? We know that unstructured
prompting works at first, but eventually it leads to AI slop and technical debt. Enter ZenFlow.
ZenFlow takes you from vibe coding to AI first engineering. It's the first AI orchestration
layer that brings discipline to the chaos. It transforms freeform prompting into spec-driven workflows
and multi-agent verification, where agents actually cross-check each other to prevent drift.
You can even command a fleet of parallel agents to implement features and fix bugs simultaneously.
We've seen teams accelerate delivery 2x to 10x. Stop gambling with prompts. Start orchestrating
your AI. Turn raw speed into reliable, production-grade output at zencoder.ai/zenflow.
Today's episode is brought to you by my company Superintelligent. In 2026, one of the key themes
in Enterprise AI, if not the key theme, is going to be how good is the infrastructure into which you
are putting AI and agents. Superintelligent's agent readiness audits are specifically designed to
help you figure out, one, where and how AI and agents can maximize business impact for you,
and two, what you need to do to set up your organization to be best able to leverage those
new gains. If you want to truly take advantage of how AI and agents can not only enhance
productivity, but actually fundamentally change outcomes in measurable ways in your business this
year, go to besuper.ai. Welcome back to the AI Daily Brief. Today we're talking about something
that has been of interest to people for quite some time. When I first started this show, all the way
back in April of 2023, already there were people who were extremely interested in the way
that LLMs could generate code. Now it would take a couple of years and some significant advances
in the models to actually unleash vibe coding in the way that had happened over the course of 2025,
but the idea was there very early. We've similarly had interest in teams of agents
that can coordinate amongst themselves to accomplish more things,
even if the capability set hasn't fully been there. Which isn't to say that people haven't
been experimenting. Lindy released their agent swarm tool back in April of 2025, and the concept
is related to something that I've talked about on this show, the Doctor Strange Theory of AI Agent
Work. Now the specific point that I've made is actually about the difference in how enterprises
think agents will play out versus how I think they will play out, with the difference being that
I don't think that agents are going to be one to one replacements for existing human work.
I think that we're going to be able to deploy lots and lots of agents to scenario plan and war game
different types of work, which while not exactly the same as agent swarms, which are more about
breaking down complex tasks into specific subtasks, is in some ways still part of the same larger
conversation about how agents will actually work in the future. Over the last couple of days,
we have started to get the first big model releases of 2026, and maybe the most significant so far
is Moonshot's Kimi K2.5. While it is the agent swarm feature of K2.5 which has generated the most
chatter, it's worth checking out the broader model as a whole. Artificial Analysis sums up the shift
when they write, Moonshot's Kimi K2.5 is the new leading open weights model, now closer than
ever to the frontier, with only OpenAI, Anthropic, and Google models ahead. And indeed the benchmarks
are impressive. K2.5, for example, claims 50.2 on Humanity's Last Exam, which would put them ahead
of GPT-5.2 running on high settings, Opus 4.5, and Gemini 3. On a variety of other benchmarks as well,
they claim performance that matches or exceeds these premier western models.
On the overall independent Artificial Analysis index, Kimi jumps from
11th place overall with their K2 Thinking model into 5th, only behind two iterations of GPT-5.2,
Opus 4.5, and Gemini 3 Pro, and of course the cost is cheaper than any of those models.
In Artificial Analysis' tests, Kimi K2.5 was about four times cheaper than Opus 4.5 or GPT-5.2, but was still
much more expensive than, for example, DeepSeek V3.2. One of the things that Moonshot
emphasized in their launch is the model's native multimodality. Artificial Analysis again writes,
Kimi K2.5 is the first flagship model from Moonshot to support image and video inputs.
This is the first time that the leading open weights model has supported image input,
removing a critical barrier to the adoption of open weights models compared to proprietary models
from the frontier labs. They point out that this makes a significant difference as compared to
other open weights leaders like DeepSeek's V3.2. Now anytime we get a model out of China,
of course one aspect of the discourse is what it says for the state of the AI race.
On that front, there were a number of people who took to Twitter slash X to share examples of
Kimi K2.5 claiming that it was Claude. Enrico from Big-AGI asks, identity crisis or training set?
Still overall, even with some of the suspicion of distillation of Western models,
the release of K2.5 certainly validates the recent arguments from people like Demis Hassabis
that Chinese models are very, very close to the US when it comes to performance,
if not yet having had an example of actually pushing the frontier. As Balazs Nemethi points out,
however, the real value in K2.5 is not, as he puts it, pure IQ dominance. It's about how it does
in an actual work environment. He calls it less chatbot and more employee. And indeed there are
a couple things that stood out to me about the K2.5 announcement that are really impressive.
One is the way that they're using this multimodal input capability in the context of coding.
They show an example of taking a screen recording of a website, dumping it into Kimi and asking
it to clone it with Kimi shipping that code, including UX and interactions. If this actually works
like that, it opens up a significant new frontier in AI coding that you have to imagine everyone
will race to copy very quickly. Another thing that Moonshot emphasized is how good K2.5 is at
office skills, things like financial modeling in Excel or creating high-quality PowerPoints.
Now again, this could be incredibly valuable when it comes to work, although I haven't really
been able to find a ton of examples yet of people testing this out that don't just feel like
paid influencer posts. One that I found that did seem to positively test out these features came
from Shafi. He wrote, this new AI model Kimi from China created a full slide deck from my journal
article in one single shot prompt. I just gave it the keyword and journal name not even the link
or PDF to the article. It searched the article and found the correct one, developed the contents
after reading the paper, created contents for 12 slides including searching images from internet,
asked for suggestions to make edits which I declined and asked it to go ahead and generated slides
in a PowerPoint format. Everything happened inside my phone in 5 to 6 minutes. Since it's my own
article, I know it got most of the things right. And yet, as I said at the beginning, probably the
feature that people are most excited about is this agent swarm parallelization. An example that
Kimi gave was adapting O. Henry's short story, The Gift of the Magi, into a 10-minute short film.
They asked it to generate a highly consistent storyboard script and embed it into an Excel file,
which they said from a single prompt created a 100 megabyte Excel file generated with images
with a total of 55 scenes. Simon Willison writes, the self-directed agent swarm paradigm claim
there means improved long-sequence tool calling and training on how to break down tasks for
multiple agents to work on at once. He gave it the prompt, I want to build a Datasette plugin
that offers a UI to upload files to an S3 bucket and stores information about them in a SQLite table.
Break this down into 10 tasks suitable for execution by parallel coding agents. He said the
response was pretty good. It produced 10 realistic tasks and reasoned through the dependencies between
them. Global Soul writes, tried Kimi Moonshot agent swarms and it is quite magical. Basically,
they gave Kimi a list of stocks and asked it to create a report that analyzes each from a variety
of different factors. They said it created individual files for each company and an overall summary
and finished the output for all companies in 10 minutes. Swix also had an interesting experience
in his testing. He writes, little detail from exploring the K2.5 agent swarm preview today.
I asked it to make a custom website for the Latent Space podcast and despite it being trained
to parallelize eagerly and having full permission to do so, it recognized that this was a newb task
and did a highly competent job with one agent and refunded my credits. This thing might be AGI,
I've never expected a parallel agent lab to use less than what it was trained or opted into use.
In other words, just because it could use a parallel agent structure, it recognized that
for certain tasks it doesn't need that. Cline founder Saoud Rizwan explains a little bit about
what's going on in the background. He writes, LLMs are trained on sequential reasoning,
breaking tasks down step by step, one to-do after another. When you ask them to orchestrate parallel
work, they don't know how to split tasks without conflicts. Moonshot calls this serial collapse and
solved it with reinforcement learning. They used PARL, parallel agent reinforcement learning,
where they gave an orchestrator a compute and time budget that made it impossible to complete
tasks sequentially. It was forced to learn how to break tasks down into parallel work for sub-agents
to succeed in the environment.
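To illustrate that budget idea in the most stripped-down way possible, here is a toy sketch. The numbers, the run_subagent stand-in, and the threading are all invented for illustration and are not Moonshot's training environment; the point is simply that a tight wall-clock budget makes a sequential plan fail and a parallel one succeed, which is the pressure the training reportedly applies.

```python
# Toy illustration of why a time/compute budget forces parallelism.
# All numbers and the run_subagent stand-in are invented for this sketch.

import time
from concurrent.futures import ThreadPoolExecutor

TIME_BUDGET_S = 10.0          # episode budget
SUBTASK_COST_S = 4.0          # each subtask "costs" this much agent time

def run_subagent(task: str) -> str:
    time.sleep(SUBTASK_COST_S)
    return f"done: {task}"

def orchestrate(tasks: list[str]) -> list[str]:
    start = time.monotonic()
    # Three 4-second subtasks run one after another would take ~12s and
    # bust the 10s budget; fanned out in parallel they finish in ~4s.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(run_subagent, tasks))
    if time.monotonic() - start > TIME_BUDGET_S:
        raise TimeoutError("episode failed: budget exceeded")
    return results

if __name__ == "__main__":
    print(orchestrate(["research", "draft", "review"]))
```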
Simon Smith from Klick Health did a full test as well and came
away pretty impressed. He writes, I've been thinking about the best way to organize agents in
step-by-step workflows where each agent has skills defined by an agent skills file and to then
scale this across an enterprise. Today, Kimi dropped its K2.5 model along with agent swarms, and
I thought, could this be it? The answer? Mostly. He then walks through how you do this. First,
using Kimi, you actually use the model selector to select agent swarm in the same way that you would
select between, for example, instant or thinking mode. For Simon's task, he gave agent swarm,
the task of responding to an RFP, which included in his words, research, strategy, creative,
brainstorming, and concept development, media planning, analytics planning, high-level project
planning, and consolidating everything into a final written response and a Word document.
He continues, as would be familiar to users of agent coding tools like Claude Code and Codex,
Kimi turns your request into a step-by-step plan and then proceeds to work through it.
Where things get interesting, however, is how it executes the plan with multiple agents.
For each step in the plan, he writes, Kimi creates a set of relevant agents. And importantly,
these aren't generic agents. Agents each have roles and names. Each agent, he writes, plays a
specific role, defined for it in a prompt, and even gets a name and avatar. The role description
ensures the agent focuses on a specific job to be done, and the name and avatar make this extremely
user friendly. The model is then smart enough to figure out which agents can work in parallel,
or, in the case that an agent requires the output of a different agent, how to run them sequentially.
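As a sketch of what that kind of dependency-aware scheduling looks like under the hood, here is a minimal example. The agent names, roles, and dependency graph are made up for illustration, loosely echoing the RFP example, and are not Kimi's actual internals; the point is just that agents whose inputs are ready run in parallel, while an agent that needs another agent's output waits for it.

```python
# Minimal sketch of dependency-aware agent scheduling: agents with no
# unmet dependencies run in parallel; dependent agents wait their turn.
# All names, roles, and outputs here are invented for illustration.

from concurrent.futures import ThreadPoolExecutor

# agent name -> (agents it depends on, work function taking their outputs)
AGENTS = {
    "researcher": (set(), lambda deps: "market research notes"),
    "strategist": ({"researcher"}, lambda deps: f"strategy built on {deps['researcher']}"),
    "copywriter": ({"researcher"}, lambda deps: f"creative concepts from {deps['researcher']}"),
    "editor": ({"strategist", "copywriter"}, lambda deps: "consolidated final response"),
}

def run_swarm(agents):
    done = {}
    with ThreadPoolExecutor() as pool:
        while len(done) < len(agents):
            # Every agent whose dependencies are all finished is ready to run now.
            ready = [name for name, (deps, _) in agents.items()
                     if name not in done and deps <= done.keys()]
            futures = {
                name: pool.submit(agents[name][1],
                                  {d: done[d] for d in agents[name][0]})
                for name in ready
            }
            for name, future in futures.items():
                done[name] = future.result()
    return done

if __name__ == "__main__":
    for name, output in run_swarm(AGENTS).items():
        print(f"{name}: {output}")
```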
Simon writes that you can monitor agents overall via a dashboard with progress indicators,
and also select individual agents to monitor their work. One of the important things that Simon
points out is that part of the big upgrade here is not just the performance, but the user experience.
He writes, when I think about something that would scale up to an enterprise which will include a lot
of users who won't be comfortable in something like Claude Code in the terminal, this feels like
it would be easily adopted. It's extremely clear and intuitive. The model gave Simon not only
the final output but also all of the intermediate outputs from each of the distinct agents.
Now, Simon's big request, and his caveat, is that he wants access to connectors or MCPs as well
as agent skills, to be able to fully sync this with the larger ecosystem of data that people work in.
Overall, though, he says, I'm impressed. I've been waiting for something like this that makes it
easy for anyone, regardless of technical expertise, to ask AI to do something and have it complete
the task with multiple agents playing different roles and working collaboratively. This feels like
the emerging future of humans managing teams of AI agents the way they currently manage teams of
other humans. I honestly don't understand how Kimi got here first. There are other solutions out
there for agents to work together on tasks, but everything I've seen is too technical for the
average user, requiring you to use the terminal or too rigid, requiring you to pre-build workflows.
How did Kimi create such a great model with such excellent agentic capabilities and build such an
intuitive interface? Now, this is the interesting question, and why it makes me feel like we are very
much seeing the beginning of a broader phenomenon around these agent swarms. In addition to K2.5,
I've seen a couple of people talking about Claude Code's new task system in the same context,
and so it seems like something that's probably on the minds of those folks as well.
LangChain developer Sydney Runkle is also talking about this sub-agent architecture,
all of which makes me feel like 2026 might be the year of the agent swarm.
Indeed, there's enough chatter that Ethan Mollick is making one last, perhaps vainglorious, attempt
to steer us away from using the swarm terminology. On Monday, he tweeted,
let's not call groups of agents swarms. Swarms are both terrifying and not a useful analogy.
Groups of agents should be called teams or organizations. It both describes how to structure them
and also how to use them. Don't let the weird AI folk naming win again.
I'm not sure where we'll land when it comes to terminology, but it really does feel like this is
something new happening, and I'm excited to see how it develops. I will be testing out K2.5,
maybe we'll do a special bonus operator's episode about that. For now, however,
that is going to do it for today's AI Daily Brief. Appreciate you listening or watching.
As always, and until next time, peace!
