
In this episode, we provide insights into the investment implications of Anthropic’s Claude Design. We discuss potential investment strategies in the evolving AI market.
Whether it's Slots or Live Dealers, SpinQuest.com has the fun and action you're looking for,
with SpinQuest exclusives, Blackjack, Roulette, Baccarat, and even Live Dice with Craps and
Bubble Craps.
The games never stop, so you don't have to.
And right now, new users get $30 coin packs for just 10 bucks.
Play now at SpinQuest.com.
SpinQuest is a free to play social casino.
Void where prohibited. Visit SpinQuest.com for more details.
Liberty Mutual customizes your car and home insurance, and now we're customizing this
ad for your morning commute to wake you up,
which could help your driving.
Science says that stimulating the brain increases alertness.
So here's a pop quiz.
How many months have 28 days?
What gets wetter as it dries?
What has keys but can't open locks?
If you don't want to hear the answers, turn off this Liberty Mutual ad.
Now.
12 months, a towel, a piano.
Enjoy being fully alert.
Liberty, Liberty, Liberty.
Welcome to the podcast.
I'm your host, Jayden Schaefer.
Today on the show, I want to talk about OpenAI fighting back against the giant onslaught
of features Anthropic has been pushing, some negative PR Anthropic has been getting, but
at the same time, an incredible new tool called Claude Design that just came out.
I also want to talk about where some VC dollars are going in the AI space, some surprisingly
interesting things there, and a new term called token maxxing.
In addition, OpenAI just massively beefed up Codex with desktop
control, memory, an in-app browser, and over 100 plug-in integrations.
Basically, this is them swinging directly at Anthropic's Claude Code and Claude Cowork.
And I think it matters a lot for where coding agents are going in the future.
So let's get into it.
Before we do, I wanted to mention AI box.
The thing I keep hearing from people is that they're paying for ChatGPT, Claude, Gemini,
Perplexity, even Midjourney.
And by the time you get all of that added up, you're likely $70 or $80 a month across
a bunch of different logins.
AI box gives you access to over 80 different AI models in one interface.
All of this is just $8.99 a month.
If you want to get access to it, there's a link in the description to AIbox.ai.
And in addition, we have something called the AI box builder where you can essentially
link together multiple AI models.
It builds the entire workflow for you,
and you can vibe-build tools without needing to know any code at all.
I'm not a developer.
I built this for other people that are not developers.
So if you want to check it out, there's a link in the description to AIbox.ai.
OK, the first thing I want to talk about is a company called Factory.
So this is an AI coding startup that is focused specifically on enterprise engineering
teams.
They just closed $150 million series A at a $1.5 billion valuation.
Khosla Ventures led the round.
Sequoia, Insight Partners, and Blackstone all participated in this.
The founder is named Matan Grinberg.
He was a physics PhD student at Berkeley.
He basically cold emailed Sequoia partner Shaun Maguire in 2023.
And they apparently were good friends.
They bonded over physics research.
Maguire convinced him to drop out, and Sequoia seeded the company.
Their customer list already includes Morgan Stanley, Ernst & Young, and Palo Alto Networks.
So obviously this is a very enterprise focused, you know, they're not targeting individual
developers in any way.
They're focused on the enterprise.
Grinberg's pitch for why Factory is differentiated, I think, is their model flexibility.
They basically let you switch between Claude, DeepSeek, whatever makes sense.
Although honestly Cursor does that too, as do, I think, most of the serious players at this
point.
What I think this does tell us is that even with an anthropic and open AI and cursor
already in the market, enterprise AI coding still has room for some category specific
players.
Morgan Stanley isn't going to let some random developer tool run inside their network
unless it's built with their compliance and security posture in mind.
And I think that's basically the gap that Factory is filling.
And I think $1.5 billion in their current valuation says that VCs are believing there is a real
gap here.
Of course, Claude Code is trying to get in there as well.
And you can look at things like Cognizant just getting all of their employees, 350,000
of them, onto Claude and all the Anthropic tools.
So I think there's probably competition from a lot of players, but it's interesting
that they're carving out a niche there.
Okay, the next thing I want to talk about is a brand new tool from Anthropic called
Claude Design.
It's a research preview right now.
It's available to Pro, Max, Team, and Enterprise subscribers and is powered by Claude Opus
4.7, the model that just came out a day or two ago.
So this is what Anthropic just shipped.
And basically you can describe what you want, a pitch deck, a one-pager, a landing-page
prototype, and Claude generates a first draft.
You've probably seen it kind of make web pages before.
So it's interesting because you actually can use Claude Design to kind of come up with
the mockups ahead of time.
And then you can refine it by either directly editing it or just talking to it.
I actually appreciate both of these options, as I've used a lot of tools like, I mean,
I don't want to throw too much shade at Lovable, because I know they do have some direct
editing features.
It actually never worked super well for me in the past.
Maybe they're better now.
But in the past, Lovable would have something where you could, you know, describe the website
or whatever you're trying to build.
It would generate the design.
And then you're supposed to be able to click on it and edit directly.
It never worked.
And sometimes when I'd go back to the chat after doing that, it would, like, undo my edits.
It's just kind of bad.
So I think Claude has cracked this a little bit better.
Maybe Lovable is there as well.
But you're able to export as PDFs, URLs, PPTX files.
And you can send the outputs straight to Canva.
So Canva has a big integration with them.
And you can keep all of your collaborations there.
It can also read your company's code base and design files to apply a consistent design
system across all of your outputs, which is actually, I think, the more interesting piece
if you kind of look at this technically.
Anthropic is positioning this as kind of complementary to Canva.
They're saying, like, look, we're not going to compete with them.
This is a complement.
The target audience is specifically people who aren't designers,
so founders, PMs, startup operators who need to make something look presentable really
fast.
What I think about this is that Anthropic is continuing to move up the stack.
Earlier this year, they launched Claude Cowork, then agentic plugins for specific departments.
And I think where they're going with this is that they're not just trying
to be an API company.
They want to actually own actual workflows and surface area.
It's the same play that OpenAI has been making.
And I think you're going to hear more why this matters when we go into the deep dive later
on in the episode.
But also, shout out to Google, who's been doing this in basically every vertical.
Google has Stitch, which is a very similar design tool as well.
So yeah, I think we're going to see a lot of these players get more into the software
itself beyond just the models, which is pretty interesting.
Okay, the next thing we want to talk about is token maxxing.
So there's a funny story on TechCrunch recently where they're talking about token
maxxing. Basically, it's the pattern of companies and developers bragging about
how many tokens their AI coding tools burned through, as if, you know, using more tokens
means that they're more productive.
I think the actual data that they've been crunching is very different.
So tools like Claude Code, Cursor, Codex, they're all generating way more accepted code on
the first pass, 80 to 90% initial acceptance rates.
But when you look at the same code two weeks later, the effective acceptance drops to 10
to 30% because engineers were constantly rewriting it, right?
So basically what's going on here is a lot of companies are like, look, 50% of our
code is written by Claude, and it sounds amazing.
And they're like, and it's all perfect and great, or, you know, some high level of it.
And maybe that's true.
And I'm not saying that that's necessarily bad.
I use Claude Code heavily as well with my startup.
But what's interesting is people pretending it's, you know, basically a marker
of productivity, because GitClear was looking at this specifically, and they
found that AI users have 9.4 times higher code churn than non-AI users.
And Faros AI found code churn increased 861% under high AI adoption.
And then you also had Jellyfish, which was looking at 7,500 engineers and found that
the teams with the biggest token budgets got two times the throughput at ten times the
token costs.
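As a quick sanity check on that last stat, the implied cost per unit of shipped work is easy to work out; here's a rough sketch using only the numbers quoted above (the variable names are mine, not from any of those reports):

```python
# Rough arithmetic on the Jellyfish figures quoted above:
# the biggest token budgets bought ~2x throughput at ~10x token cost.
throughput_multiple = 2.0   # shipped throughput vs. baseline teams
cost_multiple = 10.0        # token spend vs. baseline teams

# Token cost paid per unit of shipped throughput, relative to baseline.
relative_cost_per_unit = cost_multiple / throughput_multiple
print(relative_cost_per_unit)  # → 5.0, i.e. 5x the spend per unit shipped
```

In other words, doubling throughput at ten times the token bill still means each unit of output costs five times as much in tokens as before.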
And I think basically it's just a reality check, right?
The productivity gains from AI coding are real, but they're also a fraction of what the
output number suggests, right?
If you're like, I can write a million, you know, lines of code a day now and I used to
only be able to write this amount.
Okay.
Well, you definitely are getting real productivity gains.
It's not an argument, but I just think it's important to check ourselves, because, you know,
the million lines of code are not all good; if you look at it three weeks later,
a big chunk of it has to be rewritten or fixed, which is fine.
I mean, a normal developer writes code and, you know, works on it and optimizes
it and fixes it.
I think senior engineers, interestingly, are less accepting of AI code than
juniors, probably because they know which parts are subtly wrong, right?
So when there's a code push, they're less likely to accept that code push.
If you're a manager thinking about how to measure AI ROI, I think counting merged and
shipped is important and not just how much is generated.
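A minimal sketch of what counting merged-and-shipped rather than generated could look like (the data shape, field names, and numbers here are purely hypothetical, not from any tool mentioned in the episode):

```python
# Hypothetical sketch: measure AI coding ROI by lines that were merged
# and shipped, not by raw generated volume. Field names are illustrative.

def shipped_ratio(prs):
    """Fraction of AI-generated lines that actually got merged and shipped."""
    generated = sum(pr["ai_lines_generated"] for pr in prs)
    merged = sum(pr["ai_lines_merged"] for pr in prs)
    return merged / generated if generated else 0.0

# Two made-up pull requests: lots generated, far less survives review.
prs = [
    {"ai_lines_generated": 1000, "ai_lines_merged": 250},
    {"ai_lines_generated": 400,  "ai_lines_merged": 200},
]
print(round(shipped_ratio(prs), 2))  # → 0.32
```

The point of a metric like this is that a team "generating" a million lines a week can still have a low shipped ratio, which is the churn problem the studies above are describing.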
Okay.
Let's get into physical intelligence.
This is a robotics foundation model startup.
It just published research on a new model called Pi 0.7.
And I think this might be the most novel kind of technical story of the day, but basically
what they're claiming right now is that Pi 0.7 can perform tasks it was never specifically
trained on by composing skills it learned in other contexts.
So the example that they're kind of highlighting with all this: they have an air fryer, and
the robot had only briefly seen this air fryer in training.
So, you know, it wasn't trained to operate the air fryer.
It had only briefly seen it in training, and I think it only saw like two short clips
of an air fryer.
Then they gave it step-by-step verbal instructions, and it figured out how to operate it.
And in like some broader testing, the generalist model actually matched specialized models on
jobs like making coffee, folding laundry, and assembling boxes.
Researcher said that the generalization ability was really surprising to them, right?
So basically, you train the robot specifically to fold laundry, and
yeah, it does good.
And then they have a new model where it's just kind of a generalist at everything.
It's not trained to fold laundry, but you explain step-by-step how to fold laundry,
and it does it just as good, or almost just as good, as the robot that was
specifically trained on this.
That is fascinating.
And I think especially when you look at, you know, the quote-unquote general models
that OpenAI and Anthropic are building that can do a lot of different things
generally well,
it's kind of good news for them, because, you know, you may not have to have models specifically
trained on just a specific task when you're talking about, you know, kind of like physical
robotics and stuff.
So on the business side, physical intelligence has already raised over a billion dollars.
They were last valued at 5.6 billion.
They're reportedly in talks to nearly double that to 11 billion.
Their co-founder Lachy Groom has a track record of backing Figma, Notion, and Ramp.
So I think obviously this is why VC dollars are going to Lachy.
I think something that is interesting, a caveat that I'll put on this if I'm trying to
be honest: basically, Pi 0.7 still can't handle a lot of multi-step tasks, right?
It's not doing this autonomously without any coaching.
I think the robotics field doesn't really have a lot of clean benchmarks like LLMs do.
You know, we have Humanity's Last Exam and all these different benchmarks that
we give AI models on, you know, engineering and math and other areas, and we can tell exactly
how good they are at those tasks.
There's not a lot of that with robotics.
I'm sure there's going to be more as we get more into it.
So for robotics, you kind of have to just trust whatever their demo is.
But I think if this kind of generalized behavior holds up, it's a pretty significant step
towards robots that actually work in really messy real-world environments.
So this is something I'll be closely watching over the next six months or so.
Okay.
OpenAI has just released a whole bunch of new features for their desktop app, which
is called Codex.
I have tried it in the past, but have opted for Anthropic's Claude Code and Claude
Cowork in recent weeks, in the last month.
And I think OpenAI sees that and they really want to make a big push to win people back
or to have people try OpenAI Codex for the first time.
So huge upgrade to Codex.
I'll walk through a couple of things that are new, because I think OpenAI is basically
swinging directly at Anthropic's Claude Code, which honestly has been really crushing it,
right?
So this is what they added.
Codex can now run in the background on your Mac, which is phenomenal, right?
It can open up applications, it can click around, it can type into your desktop while you
keep working on something else.
This is actually something I like.
Claude sort of does this, but I'm going to be honest, even with Claude Cowork a lot of times,
if I have like an automated task running, like I've got a bunch of things that I'm just
like, you know, every day at 9 a.m. do this, every day at noon do this and it's like
grabs like analytics or grabs data or goes and gets me a report on something.
So actually the thing that I love using it for is if there's no API for a service, I'll
just have it go log into the account and go grab the data I need and bring it back to
me.
Hopefully those companies offer APIs in the future, but for now, that's what I do.
In any case, it is annoying with Claude Cowork that lots of times, when those automated tasks
start happening,
all of a sudden this Chrome browser pops up on my screen, if it can't do it
in the background, and all of a sudden it's, like, clicking on things right in front
of me, and I'm, like, swatting flies, trying to get this thing away while I keep working
on something different.
Maybe I should give it its own computer, possibly, but a lot of the time I just have it running
on the side.
In any case, OpenAI is trying to combat that and have it work on things in the background.
So it's not just writing code in an editor.
It's actually operating your entire machine.
So this is what I'm excited about: computer use from OpenAI, which they sort of have done
for a long time.
They were the OGs, way before Anthropic was shipping things in this space, but it really just
felt stale and bad.
I've tried it a lot in the past, and trust me,
if OpenAI's agents back in the day, like six months to a year ago, were as good as what
Anthropic is doing now, I'd be shouting them from the rooftops.
But it seems like now they're making a comeback.
They can also run multiple agents in parallel without interfering with your desktop, which
means that you can have one fixing a bug, and you can also have one running tests, one
writing docs, all at the same time.
They also have a new in-app browser so it can hit web applications directly.
They have 111 plug-in integrations, so CodeRabbit, GitHub, GitLab issues.
They have a bunch of new exciting things with their memory feature, so it can remember
previous sessions.
They have image generation that is now inside of Codex, which, to be fair, Claude does not
have any sort of image generation.
They also rolled out a pay-as-you-go pricing specifically for enterprise and business
customers.
So, while I think Anthropic is definitely ahead right now,
I would say that the plug-in ecosystem is probably one of the most underrated
pieces of this entire announcement, because they have 111 different plug-ins at launch.
They're going to be adding more.
With Claude Cowork, I mean, it's awesome, but I have maybe four things, my Google Calendar
and Chrome synced up, so a bunch of Google tools and GitHub, but beyond that, there are
so many different tools I use that don't integrate very well with it.
It has to go and use my Chrome browser to access them.
So anyways, I think a lot of these integrations that OpenAI is pulling in are going to be
very useful.
Alright, that is the show.
If you're getting value from these episodes, please drop a comment over on Apple Podcasts
or leave a couple of stars over on Spotify.
You can hit the About tab on Spotify to drop a review.
Basically, the reviews help the show reach way more people.
It boosts it in the algorithm.
It helps it out a ton.
If you haven't done it already, I would be eternally grateful.
Also, if you want to consolidate the AI subscriptions you're already paying for, go check
out AIbox.ai.
There is a link in the description, 80-plus models, plain-English automations, $8.99 a month.
I'll catch you guys all in the next episode.
Forget everything you had planned for this weekend because you are sitting on your couch
and winning from the comfort of your own home.
I'm here with SpinQuest where you can play hundreds of slot games, all of the table games
you love, and you could even win real cash prizes.
New users, $30 coin packs are on sale for $10 at SpinQuest.com.
SpinQuest is a free-to-play social casino. Void where prohibited.
Visit SpinQuest.com for more details.

AI Investing: for the AI Investor
