
This episode argues that the most important AGI threshold has already been crossed. As coding agents learn to reason, iterate, and operate autonomously over long horizons, they unlock a form of functional general intelligence that matters for real work. Coding isn’t just another domain—it’s a universal lever that collapses the distance between idea and execution, reshaping how companies build, decide, and compete. The result isn’t a gradual improvement, but a structural shift in how work gets done.
Readings from:
https://x.com/gradypb/status/2011491957730918510
https://x.com/danshipper/status/2011617055636705718
Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Zencoder - From vibe coding to AI-first engineering - http://zencoder.ai/zenflow
Optimizely Opal - The agent orchestration platform built for marketers - https://www.optimizely.com/theaidailybrief
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? [email protected]
Today on the AI Daily Brief, why Code AGI is Functional AGI and why Functional AGI is here.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right friends, quick announcements before we dive in.
First of all, thank you to today's sponsors: Zencoder, Robots & Pencils, Section, and Superintelligent. To get an ad-free version of the show, go to patreon.com/AIDailyBrief. And if you are interested in sponsoring the show, send us a note at [email protected]. So we are back now with another long-read slash big-think episode. And this week
we're getting into a topic that I have been kind of obsessing about for the last several weeks.
It feels to me quite clear that something dramatic has shifted. Obviously I don't mean
some new model that changes everything, but more it feels as though we've digested what the latest
round of models is actually capable of. We've had enough time with them for them to start to shift
our behaviors. And the implication of all of that is fundamentally speaking some different new era
in the story of AI and more broadly in the story of work. It is a shift which I am still trying
to figure out how to put words around, but one that I am convinced has profound implications
for how companies do what they do. To some extent, the shift is starting to come home to roost
in a concerted conversation around whether we are finally at AGI. I will argue that we are with
some nuance. But what I'm going to do first is read some excerpts from a recent piece by Sequoia's
Pat Grady called 2026: This Is AGI, followed by a more skeptical piece by Every's Dan Shipper,
called Toward a Definition of AGI. And then I'm going to add my own thoughts,
steel-manning both perspectives, and trying to end with what I think is the most useful place to be.
Let's start with Pat's piece. It's actually by Pat Grady and Sonya Huang, and begins: Years ago, some leading researchers told us that their objective was AGI. Eager to hear a coherent definition, we naively asked, how do you define AGI? They paused, looked at each other tentatively,
and then offered up what's become something of a mantra in the field of AI.
Well, we each kind of have our own definitions, but we'll know it when we see it. The vignette
typifies our quest for a concrete definition of AGI. It has proven elusive. While the definition
is elusive, the reality is not. AGI is here now. Coding agents are the first example. There are more
on the way. Long horizon agents are functionally AGI, and 2026 will be their year.
Now, in the next section, Pat and Sonya make sure to qualify that they do not have any
sort of scientific authority to propose this definition. And yet, with that said, they offer
what they call a functional definition of AGI. AGI, they write, is the ability to figure things out.
That's it. A human who can figure things out has some baseline knowledge, the ability to reason
over that knowledge, and the ability to iterate their way to the answer. An AI that can figure things out has some baseline knowledge (pre-training), the ability to reason over that knowledge (inference-time compute), and the ability to iterate its way to the answer (long-horizon agents).
The first ingredient, knowledge and pre-training, is what fueled the original ChatGPT moment in 2022. The second, reasoning and inference-time compute, came with the release of o1 in late 2024. The third, iteration and long-horizon agents, came in the last few weeks, with Claude Code and other coding agents crossing a capability threshold. Generally intelligent people can work
autonomously for hours at a time, making and fixing their mistakes and figuring out what to do
next without being told. Generally intelligent agents can do the same thing. This is new.
So what's an example of this new capability that they're talking about? They provided
an example of a founder telling his agent that he needs a developer relations lead. He gives
a set of qualifications, including the fact that this person needs to enjoy being on Twitter.
The agent starts in an obvious place: LinkedIn searches for "developer advocate," for example. Unfortunately, it finds hundreds of examples, so it has to iterate. It pivots, they write, to
signal over credentials. It searches YouTube for conference talks. From there, it finds 50 plus
speakers and filters for those with talks that have strong engagement. Next, because of that
Twitter qualification, it cross references those speakers with Twitter. The total number is now
whittled down to a dozen with real followings and posting real opinions. Homing in even further on who's been most engaged in the last few months, that total list, which was hundreds and then 50 and then a dozen, is now down to three. Now it can home in on those three. One just announced
the new role. One is the founder of a company that just raised funding. The third was a senior
devrel at a Series D company that just did layoffs in marketing. The agent, they write,
drafts an email acknowledging her recent talk, the overlap with the startup's ICP,
and a specific note about the creative freedom a smaller team offers. It suggests a casual
conversation, not a pitch. Total time, 31 minutes. The founder has a short list of one,
instead of a JD posted to a job board. This, Pat and Sonya write, is what it means to figure things out: navigating ambiguity to accomplish a goal, forming hypotheses, testing them, hitting dead ends, and pivoting until something clicks. The agent didn't follow a script. It ran the same loop a great recruiter runs in their head, except it did it tirelessly, in 31 minutes, without being told how. To be clear, agents still fail. They hallucinate, lose context, and sometimes charge confidently down exactly the wrong path. But the trajectory is unmistakable, and the failures are increasingly fixable. So what? Well, soon, they say, you'll be able to hire an agent, which, with a hat tip to Sarah Guo, they call one litmus test for AGI. You can hire GPT-5.2 or Claude or Grok or Gemini today. More examples are on the way. In medicine, OpenEvidence's DeepConsult functions as a specialist. In law, Harvey's agents function as an associate. They go through examples in cybersecurity, DevOps, go-to-market, recruiting, math, semiconductor
design, and AI research. All of this, they say, has profound implications for founders.
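Before we get to those implications, just to make the shape of that loop concrete, here is a minimal, self-contained Python sketch of the hypothesize-test-pivot pattern. To be clear, this is my own toy illustration, not anything from Pat and Sonya's piece: the candidate pool and the filters are invented stand-ins, and a real agent would be making live tool calls and reasoning with an LLM rather than applying fixed predicates. The point is the control flow: apply a signal, count what's left, and keep tightening until the list is small enough to act on.

    import random

    random.seed(0)

    # A fake candidate pool standing in for LinkedIn, YouTube, and Twitter results.
    POOL = [{"gave_talk": random.random() < 0.2,        # spoke at a conference
             "talk_engagement": random.random(),        # how well the talk landed
             "followers": random.randint(0, 20000),     # real Twitter following?
             "active_recently": random.random() < 0.3}  # engaged in recent months
            for _ in range(400)]

    def narrow_down(pool, target=3):
        # Ordered pivots, from coarse credentials toward fine-grained signal.
        strategies = [
            lambda c: c["gave_talk"],
            lambda c: c["talk_engagement"] > 0.5,
            lambda c: c["followers"] > 5000,
            lambda c: c["active_recently"],
        ]
        active, candidates = [], pool
        for strategy in strategies:
            active.append(strategy)
            candidates = [c for c in pool if all(f(c) for f in active)]
            print(f"after {len(active)} filter(s): {len(candidates)} candidates left")
            if len(candidates) <= target:
                break   # small enough to research individually
        return candidates

    shortlist = narrow_down(POOL)

Run it and you watch the funnel collapse from 400 to a handful, which is the whole trick: the intelligence is in choosing the next filter, and that's the part the LLM supplies.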
The AI applications of '23 and '24 were talkers. Some were very sophisticated conversationalists, but their impact was limited. The AI applications of '26 and '27 will be doers. They will feel like
colleagues. Usage will go from a few times a day to all day every day, with multiple instances
running in parallel. Users won't save a few hours here and there. They'll go from working as an
IC to managing a team of agents. Remember all that talk of selling work? Now it's possible.
What work can you accomplish? The capabilities of a long-horizon agent are drastically different from those of a single forward pass of a model. What new capabilities do long-horizon agents unlock in your domain? What tasks require persistence? Where is sustained attention the bottleneck?
Saddle up, they say. It's time to ride the long-horizon agent exponential. Today your agents can probably work reliably for around 30 minutes, but they'll be able to perform a day's worth of work very soon, and a century's worth of work eventually. Ultimately, they write, the ambitious version of your roadmap just became the realistic one. Let's move over to Dan Shipper's Toward a Definition of AGI. Dan writes: When an infant is born, they are completely dependent on their
caregivers to survive. They can't eat, move, or play on their own. As they grow, they learn
to tolerate increasingly longer separations. Gradually the caregiver occasionally and intentionally
fails to meet their needs. The baby cries in their crib at night, but the parent waits to see if
they'll self-soothe. The toddler wants attention, but the parent is on the phone. These small, manageable disappointments, which the psychologist D.W. Winnicott called good-enough parenting, teach the child
that they can survive brief periods of independence. Over months and years, these periods extend from
seconds to minutes to hours until eventually the child is able to function independently.
AI is following the same pattern. Today we treat AI like a static tool we pick up when needed
and set aside when done. We turn it on for specific tasks, writing an email, analyzing data,
answering questions, then close the tab. But as these systems become more capable,
we'll find ourselves returning to them more frequently, keeping sessions open longer and
trusting them with more continuous workflows. We already are. So here's my definition of AGI.
Artificial general intelligence is achieved when it makes economic sense to keep your agent running
continuously. In other words, we'll have AGI when we have persistent agents that continue thinking,
learning, and acting autonomously between your interactions with them, like a human being does.
I like this definition because it's empirically observable. Either people decide it's better to
never turn off their agents or they don't. It avoids the philosophical rigmarole inherent in trying to define what true general intelligence is. And it avoids the problems of the Turing test and OpenAI's definition of AGI. In the Turing test, a system is AGI when it can
fool a human judge into thinking it's human. The problem with the Turing test is that it sets
up movable goal posts. If I interacted with GPT-4 10 years ago, I would have thought it was human.
Today, I'd simply ask it to build a website for me from scratch, and I'd instantly know it was not human. OpenAI's definition of AGI, which is AI that can outperform humans at most
economically valuable work, suffers from the same problem. What constitutes economically valuable work
constantly changes. We will invent new economically valuable work that we can perform in conjunction
with AI. These hybrid roles then become the new benchmark that AI will need to learn to do before it counts as AGI. So the definition is an ever-receding target. By contrast, the definition I proposed, that AGI is achieved when it makes economic sense to keep your agent running continuously, is a binary, irreversible, and immovable threshold. I like this definition because in order to
meet it, we will need to develop a lot of necessary but hard-to-define components of AGI.
One, continuous learning. The agent must learn from experience without explicit user prompting.
Two, memory management. The agent needs sophisticated ways to store, retrieve, and forget
information efficiently over extended periods. Three, generating, exploring, and achieving goals.
The agent requires the open-ended ability to define new, useful goals and maintain them across days,
weeks or months while adapting to changing circumstances. Four, proactive communication.
The agent should reach out when it has updates, questions, or requires input,
rather than only responding when summoned. It must also be able to be interrupted and redirected
by the user. Five, trust and reliability. The agent must be safe and reliable. Users will not
keep agents running unless they are confident the system will not cause harm or make costly errors
autonomously. While I've described these capabilities, I'm deliberately avoiding the
trap of trying to specify exact technical criteria for each one. What precisely constitutes
continuous learning or trust is difficult to pin down. Instead, my AGI definition entails that all of these capabilities are present to some extent, and these capabilities already are present in limited ways. ChatGPT, for example, has rudimentary forms of memory and proactive communication. The length of time during which an AI can run on its own is increasing, gradually and consistently. When GPT-3 first came out, the primary use case for AI was GitHub Copilot. The best it could do was complete the line of code you were already writing.
ChatGPT lengthened the amount of time the AI could run from the amount required for you to press tab to complete a line of code to the time required to deliver a full response in a chat conversation. Now, agentic tools like Claude Code, Deep Research, and Codex can run
for between five and twenty minutes at a stretch. The trajectory is clear, from seconds to minutes
to hours and days and beyond. Eventually, the cognitive and economic costs of starting fresh
each time will outweigh the benefits of turning AI off.
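Now, Dan deliberately avoids exact technical criteria, and I'll respect that. But just to give a feel for how his five ingredients fit together, here is a toy Python skeleton of an always-on agent. Every name in it is a hypothetical stand-in of mine, not Dan's; a real system would put an LLM, real memory infrastructure, and real tools behind each of these stubs.

    from collections import deque

    SAFE_ACTIONS = {"draft_update"}                 # 5. trust: a whitelist of low-risk actions

    class PersistentAgent:
        def __init__(self):
            self.memory = deque(maxlen=100)         # 2. memory: bounded store, forgets the oldest
            self.goals = ["keep the inbox triaged"] # 3. goals held across sessions
            self.outbox = []                        # 4. proactive messages to the user

        def observe(self, event):
            self.memory.append(event)               # 1. learning, crudely: accumulate experience

        def act(self, action, payload):
            if action not in SAFE_ACTIONS:          # refuse anything off the whitelist
                return
            self.outbox.append(payload)

        def step(self):
            for goal in self.goals:
                note = f"{goal}: {len(self.memory)} events seen so far"
                self.act("draft_update", note)      # reach out without being summoned

    agent = PersistentAgent()
    for tick in range(3):                           # stands in for an agent that is never turned off
        agent.observe(f"event {tick}")
        agent.step()
    print(agent.outbox)

The interesting engineering, of course, is everything these stubs wave away, which is exactly Dan's point about why the threshold hasn't been crossed yet.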
If you're using AI to code, ask yourself: are you building software, or are you just playing prompt roulette? We know that unstructured prompting works at first, but eventually it leads to AI slop and technical debt. Enter ZenFlow. ZenFlow takes you from vibe coding to AI-first
engineering. It's the first AI orchestration layer that brings discipline to the chaos. It transforms freeform prompting into spec-driven workflows and multi-agent verification, where agents actually cross-check each other to prevent drift. You can even command a fleet of parallel agents to implement features and fix bugs simultaneously. We've seen teams accelerate delivery 2x to 10x. Stop gambling with prompts. Start orchestrating your AI. Turn raw speed into reliable production-grade output at zencoder.ai/zenflow. Today's episode is brought to you by Robots & Pencils, a company that is growing fast. Their work as a high-growth AWS and Databricks
partner means that they're looking for elite talent ready to create real impact at velocity.
Their teams are made up of AI native engineers, strategists, and designers who love solving hard
problems and pushing how AI shows up in real products. They move quickly using RoboWorks,
their agentic acceleration platform, so teams can deliver meaningful outcomes in weeks, not months.
They don't build big teams. They build high-impact nimble ones. The people there are
wicked smart with patents, published research, and work that's helped shape entire categories.
They work in velocity pods and studios that stay focused and move with intent.
If you're ready for career-defining work with peers who challenge you and have your back,
Robots & Pencils is the place. Explore open roles at robotsandpencils.com/careers. That's robotsandpencils.com/careers.
Here's a harsh truth. Your company is probably spending thousands or millions of
dollars on AI tools that are being massively underutilized. Half of companies have AI tools,
but only 12% use them for business value. Most employees are still using AI to summarize
meeting notes. If you're the one responsible for AI adoption at your company, you need Section.
Section is a platform that helps you manage AI transformation across your entire organization.
It coaches employees on real use cases, tracks who's using AI for business impact,
and shows you exactly where AI is and isn't creating value.
The result? You go from rolling out tools to driving measurable AI value.
Your employees move from meeting summaries to solving actual business problems,
and you can prove the ROI. Stop guessing if your AI investment is working.
Check out Section at sectionai.com. That's S-E-C-T-I-O-N-A-I dot com.
Today's episode is brought to you by Superintelligent. Superintelligent is a platform that, very simply put, is all about helping your company figure out
how to use AI better. We deploy voice agents to interview people across your company,
combine that with proprietary intelligence about what's working for other companies,
and give you a set of recommendations around use cases and change management initiatives that add up to an AI roadmap that can help you get value out of AI for your company.
But now we want to empower the folks inside your team who are responsible for that transformation
with an even more direct platform. Our forthcoming AI Strategy Compass tool is ready to start being tested. This is a power tool for anyone who is responsible for AI adoption or AI transformation inside their companies. It's going to allow you to do a lot of the things that we do at Superintelligent,
but in a much more automated, self-managed way, and with a totally different cost structure.
If you are interested in checking it out, go to aidailybrief.ai/compass, fill out the form, and we will be in touch soon.
So, both good entries into the canon of what is AGI. But as I indicated at the beginning, what I think is actually most relevant about them right now is the fact that we are having this conversation right now. We are having this conversation because people have a sense that something big has shifted, but something big does not necessarily mean AGI. Indeed, one of the best steelman arguments against what we have now being AGI is the need to separate wow-level competence from general autonomy. The argument would go along these lines:
AGI isn't just about being able to generate impressive outputs across domains. It's about robust,
self-directed competence under real-world constraints. AGI could be dropped into novel situations, define success criteria, manage long-horizon execution, and reliably converge, without a human acting as the external executive function. And as much as things have changed,
what both of the pieces we just read have in common is that more than anything else,
they're disagreeing about which point we're at on an agreed-upon trajectory.
The funny thing, in fact, about that This Is AGI piece is that when you actually read it closely, it's not so much saying that this is AGI. It's saying that we're really, really close; that what is AGI, i.e., these long-horizon agents, is available-ish now and just getting better; and that because we're now within, call it, months rather than years of AGI, you better start preparing. Dan isn't really disagreeing with that, although he doesn't get into
timelines. Instead, he's pointing out all of these things that need to happen to get to a certain
point of indispensability, which he is arguing is the key thing. But what about what we've seen over
the last couple of weeks? The sense among some of the most enfranchised and powerful users of AI,
that we really are in a fundamentally different moment. To take one example of a type of testimony
we've seen lots of, Midjourney founder David Holz tweeted on January 3rd:
I've done more personal coding projects over Christmas break than I have in the last 10 years.
It's crazy. I can sense the limitations, but I know nothing is going to be the same anymore.
And honestly, this brings up a more interesting and nuanced take on "it's not AGI yet."
That argument would go something like, yes, Claude code and similar tools have crossed the
threshold for coding specifically, but generality is the whole point of general intelligence.
There's still so much that current AI fails at, like novel reasoning and multi-step planning in unfamiliar domains. These new big breakthroughs that everyone is sensing happened in a domain that's really well suited to LLMs: well-documented, pattern-rich, with verifiable outputs. That's not, the argument would go, evidence of general intelligence; it's evidence of domain fit. This would in some
ways be an argument about the jaggedness of AI, the idea that it can be superhuman in one area
and infantile in another. And indeed it is the case that this sense of what has shifted is about
AI's capacity to code. But I keep coming back to the essay from Shawn Wang, aka swyx, written when he decided to join Cognition. It contains the line which absolutely wins the award for the couple of sentences that have lived most rent-free in my head since they were written. Shawn wrote: The central realization I had was this: code AGI will be achieved in 20% of the time of full AGI and capture 80% of the value of AGI. Now, for him, this is an argument to simply do code AGI now rather than later, but I
think what I would argue is that code AGI doesn't quote unquote capture 80% of the value of AGI.
I think code AGI is more or less just functional AGI. The argument here is that coding is
effectively a universal lever in the modern world. Most economically valuable work, to reference OpenAI's terminology, has been computer-shaped for a long time. If your job touches a screen, an API, a database, a spreadsheet, a ticketing system, a CRM, a repo, a dashboard, or a docs tool,
then in principle it's addressable by software. So if an AI can understand intent, translate
intent into procedures, write and modify code, run tools, inspect outputs, and iterate until
it meets acceptance criteria, then it has a meta skill that can simulate competence in many
domains by building the missing tool. And in that framing, coding isn't one domain, it instead
is closer to instrumental generality. Want data analysis? Write SQL or Python notebooks,
run them, interpret the results, generate charts, and build pipelines. Want operations?
Automate workflows across systems, tickets, approvals, audits, alerts. Want finance?
Pull data, reconcile, generate variance analysis, draft narratives. Want product? Spin up prototypes, instrumentation, A/B analysis, telemetry pipelines. Basically, the idea is that if you can
program you can create capabilities. And if you can create capabilities on demand, you're not
narrow, you're general in a way that matters for real work. You could take this argument even
farther. Coding doesn't just help you build general capabilities. It is also, in some ways, a test of general reasoning. Non-trivial coding forces abstraction, decomposition, causal reasoning, adversarial thinking, and iterative debugging. Those are, ultimately, indicators of general intelligence. And I would argue that a lot of what feels different about
building with AI coding tools now as opposed to six months ago is in that set of general
reasoning capabilities rather than just how good it is at knowing a bunch of different coding
languages. We recently had an issue come up where a company that we were producing an AI and agent readiness audit for found, contained in one of the recommendations, a tool or platform that was at odds with the tech stack that they currently have. And this is a problem that we are extremely conscious of. It would be very easy to recommend a bunch of platforms that an
enterprise is never going to use. It's much more difficult and much more valuable to let them know
how to work with the band of tools they have or the things that they would consider to actually
solve the problems that are clear and present for them. And so we spent a lot of time making sure
that that type of recommendation doesn't make it through, but it did. And so as the team was talking
about new processes and procedures for making sure this didn't happen anymore, I was following
the conversation on Slack from a haircut. It struck me that this might not actually be all that
difficult, at least if you removed it from the domain of the human. And so as I was sitting there,
I fired up the mobile web version of Lovable, and by the time the haircut was done,
I had a checker to run final reports through that would make sure to compare the tech stack
of the company to all the recommendations in the report. And in literally a matter of seconds,
make sure before the final deliverable was sent that that sort of thing didn't happen.
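For the curious, the core of that checker is genuinely simple. Here is a hedged sketch of the idea in Python; the tool names, the report format, and the string-matching approach are all hypothetical simplifications of mine, not what the Lovable app actually generated.

    # Tools the client has told us they actually use or would consider.
    APPROVED_STACK = {"salesforce", "slack", "snowflake"}

    # Tools our recommendations might mention, approved or not.
    KNOWN_TOOLS = {"salesforce", "slack", "snowflake", "hubspot", "asana"}

    def flag_off_stack(recommendations):
        """Return (recommendation, offending tools) pairs for anything off-stack."""
        flagged = []
        for rec in recommendations:
            mentioned = {tool for tool in KNOWN_TOOLS if tool in rec.lower()}
            off_stack = mentioned - APPROVED_STACK
            if off_stack:
                flagged.append((rec, sorted(off_stack)))
        return flagged

    report = ["Automate lead routing in Salesforce with Slack alerts",
              "Adopt HubSpot sequences for outbound follow-up"]
    for rec, tools in flag_off_stack(report):
        print(f"FLAG: {rec!r} recommends off-stack tools: {tools}")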
Now, of course, there are even more sophisticated ways to do this with code.
Basically, we could just build that capability that I built as a standalone into the overall
processing pipeline. But the point that I'm trying to make is that this wasn't coding to solve
a technical problem. It was coding to solve a business problem. Increasingly, for the people who are most adept at working with AI, that's what they're doing. The question is decreasingly, what's the best AI tool for this? And increasingly,
can I build some custom software to solve or enable this? So what's happening right now at all these different startups is not some cute little example of non-technical folks being able to
vibe code and prototype features. What's happening is a complete collapse in the distance between idea
and execution for everyone. And that's amazing for those startups. We are going to see complete
and utter re-evaluations of how to do everything. And startups and small companies are going to be
the incubators of that change. The thing that would worry me right now, if I were an enterprise leader,
is that this Rubicon that we've crossed starts to feel more like a shift in kind than a shift in
scale. What I mean by that is that for the three years since ChatGPT was launched, obviously
enterprises were behind more nimble companies and AI users relatively speaking. There's all
sorts of systems inertia, there's compliance issues, governance issues, etc. But the patterns of
what they were doing were still similar to the patterns of what other people were doing,
just maybe with a little bit slower adoption and a little bit more process along the way.
Still, they were running on a parallel track. I now believe that the tracks have diverged.
The frontier of what's possible and the median of what's deployed in enterprises have decoupled in a way in which I believe they are increasingly pointed in different directions.
The standard enterprise invocations at this point to audit and automate your workflows,
experiment with AI, will ultimately contain the transformation possible within existing power
structures, more or less keeping the org chart intact. The reality is, in a world of code AGI,
a world of functional AGI, the org chart is broken. Bottlenecks shift from who can code to who has good ideas, the role of management shifts from resource allocation to taste and judgment, competitive advantage shifts from execution capability to speed of iteration, and the gap increasingly isn't linear, it's compounding. Every month someone is building in the new paradigm, they are getting comparatively farther ahead of those who aren't.
So is the answer as simple as letting everyone on your team vibe code?
Honestly, I think you could do worse. I think we are at a moment where, increasingly, the modality by which things are produced in this world looks different to the way that it did just a few years ago, even just a few months ago. The message is less upskill your workforce and
audit your AI use cases, and much closer to your entire organizational model is built for a world
where execution was the bottleneck, and that world is over. I worry for enterprises because this
new set of shifts involves accepting a loss of control, a restructuring of incentives,
and a total transformation of process that is even harder than the AI transformation that has come
so far. And to the extent that you are in one of those enterprises and looking for a bright spot in
this, it's that at least most of you are in this together, and that it is unlikely that many
enterprises are going to get comfortable fast with the types of change they really should be making.
But the change, I believe, has happened. And I think that the rewards for the companies,
not just startups, but enterprises too, who can lean into this new capability set and can live
on the other side of this inflection will be immense. So like I said at the beginning,
I think code AGI is functional AGI, and I think it's here. This is something that I will be
exploring a lot more in the weeks to come. For now, though, that is going to do it for today's
AI daily brief. Appreciate you listening or watching as always. And until next time, peace!
