
At KubeCon + CloudNativeCon Europe 2026 in Amsterdam, Alex Kestner, principal product manager for Amazon Elastic Kubernetes Service (EKS), discussed how Amazon EKS Auto Mode aims to reduce the operational burden of running Kubernetes at scale. While Kubernetes delivers significant power, it also introduces complexity—particularly through repetitive, day-to-day tasks like managing node lifecycles, ensuring security updates, and selecting optimal infrastructure.
Kestner emphasized that much of this “undifferentiated heavy lifting” distracts platform teams from delivering business value. Amazon EKS Auto Mode addresses this by automating infrastructure operations across the full node lifecycle, shifting responsibility for key operational components outside the cluster and into AWS-managed services.
Built in collaboration with the EC2 team and leveraging technologies like Karpenter, Auto Mode dynamically provisions right-sized compute resources based on workload requirements. While it doesn’t eliminate all challenges—such as unpredictable workloads or diverse deployment needs—it provides a more application-focused approach to scaling and cost optimization. Ultimately, Auto Mode represents a meaningful step toward simplifying Kubernetes operations in increasingly complex cloud-native environments.
Learn more from The New Stack about the latest developments around Amazon Elastic Kubernetes Service (EKS):
2026 Will Be the Year of Agentic Workloads in Production on Amazon EKS
How Amazon EKS Auto Mode Simplifies Kubernetes Cluster Management (Part 1)
A Deep Dive Into Amazon EKS Auto (Part 2)
Since its inception, Amazon Web Services has been the best place for customers to build
and run open source software in the cloud. AWS is proud to support open source projects,
foundations, and partners.
Hi there, I'm Adrian Bridgewater. I'm here with The New Stack at KubeCon + CloudNativeCon
Europe 2026; we're in Amsterdam, here in the Netherlands, and I'm with Alex
Kestner. Let me make sure I get your job title right, Alex: you're principal product manager for
Amazon Elastic Kubernetes Service, otherwise known as EKS, and you're here throughout the
whole convention. Specifically, we're going to look at EKS Auto Mode, I know, but there's
a sort of intro into that. I think it's probably worth covering the fact that
Kubernetes and complexity have always gone together, haven't they? As it's become
the de facto orchestration standard within cloud native environments, we've been talking about all
of the self-service and automation layers that arrive, incrementally, every
year to make things easier. Is Auto Mode part of the answer to making things that much better?
Certainly for part of it. I mean, I think the challenge with Kubernetes is that because it
is so powerful, there's a certain amount of complexity that just comes with that space and
with solving that problem. One of the ways that we think Auto Mode can help with
that complexity is at the infrastructure layer. While there are all kinds of complexity
in Kubernetes that we can't directly address through this kind of feature for Amazon EKS,
there is a very healthy portion of undifferentiated heavy lifting that we can take on for customers.
Where are the difficulty points? Is it scaling that's hard? Is it interconnections to new services?
Is it preparing for quantum? I'm probably jumping ahead of the game. Is it going the other way?
Is it trying to make firm connections to legacy systems? Where are all the real difficulties
in operational terms? So these difficulties come from sort of the day-to-day tasks that take
platform teams time away from delivering true value for their business, like unique and
differentiated value, helping them ship their applications faster, serve their own users better,
and these often take the form of repeated and ongoing operational tasks. So handling the life
cycle of the nodes in the cluster, making sure that they're secure, up-to-date, the right
instance types are selected for performance and cost, making sure that all of the software
in the cluster that helps it operate is consistent, is up-to-date, the right fit for the workloads
in that cluster, and all of this adds up to a very real amount of work for platform teams.
And this kind of infrastructure toil is where we really focused for Auto Mode: to help
alleviate some of that, mainly by giving it to us to take on. Let's get into Auto Mode in a
second, but before we do, you just said, you know, node lifecycle, and without being too simplistic:
for most of the last 20 years we've been talking about the software development lifecycle,
but, even though cloud and cloud native have been around since before the millennium,
we haven't been talking about node lifecycle. Well, all of that software has to run somewhere,
right? And that's infrastructure that needs to be managed by the teams that are running these
Kubernetes-based platforms. And so node life cycle is just one of those examples of things that
are critical for these Kubernetes platform teams to manage, but not necessarily unique to their
business or the problems that they're facing. And so this was a kind of an obvious place for us to
start. When we were talking with customers about the things that they really didn't like doing,
that didn't provide a lot of value for their businesses, that was one example that we just heard
repeatedly: making sure that the instances where their software runs are secure
and up-to-date is just not something that's differentiating for them. Got you. So,
how long has Auto Mode been around? And how would you do the elevator sell,
which probably everyone asks for? I've done it a couple of times, yeah. You might find people getting out
early. No, it's complex technology. How long has it been around, and how do you encapsulate it
for, you know, someone that has no idea? So, like a lot of services and features at AWS,
we launched it at re:Invent in 2024. So it celebrated its first birthday last year; it's roughly, you know,
15 months old at this point. Fundamentally, Auto Mode is meant to take on a lot of this
undifferentiated, heavy lifting that we're seeing platform teams do just to get the benefits of
this incredible ecosystem that we see here through Kubernetes and the Cloud Native Computing Foundation.
It's almost a tax that you have to pay to get all of these benefits. And it struck us as something
that we could take on for our customers, letting them focus on things that are particularly valuable
for their businesses. So, what's been happening over the 18 months, or almost
that time period? Yeah, just about. Well, you've said that you've seen commonalities in node execution and behavior,
and for everything, I presume, from spinning up to retiring a node, there are key activities
that you can codify. Can you put specific names to those? What are those?
Yeah, so I think there are two main things that Auto Mode takes care of for customers. One is,
for a Kubernetes cluster to be truly useful in, like, a production environment, there are key
sorts of operational software that need to run on that cluster, that help it interact with all kinds
of other infrastructure primitives, or in the case of EKS, other AWS services. And then, for that
part, we take those over for customers: we run them outside of the cluster and basically take
the maintenance of that software off of their plates. These are key things for enabling
compute, storage, and networking in the cluster: table stakes for any kind of Kubernetes environment.
The other side of this is a sort of unique innovation that we worked on with the EC2 team in AWS
called EC2 Managed Instances. So one of the ways that we're able to let customers offload all of
this work to us is that every instance that auto mode launches into a Kubernetes cluster is an EC2
Managed Instance. And this is an instance that looks like any other kind of Amazon EC2 instance.
You get the rich sort of portfolio of 850 or more EC2 instance types that you could leverage,
all of the different ways of purchasing this capacity: on-demand, Spot, Reserved Instances.
But it is ours operationally to manage. So we're trying to thread this needle of letting customers
have their cake, access to all of these amazing capabilities from the AWS portfolio, while
eating it too, by handing a lot of that kind of heavy lifting to us. And are there
enough managed services within these offerings to really overcome the major difficulties?
It seems that because modern workloads are so dynamic and unpredictable, almost sounds like
I'm reading a marketing brochure, but that's what everyone always says. And because
you've got agentic services, some of which will enjoy rapid uptake and some of which will be
prototyped and fall off and skew and die, you've still got this kind of workload provisioning
difficulty. Yep. And there's a lot of wastage, and you know, that's something
you want to avoid. I know that at AWS, every re:Invent, you talk about, I can't remember whether it's called
the shared responsibility model or something like that. It's like the customers have to take some of
this responsibility for themselves to be efficient. And at the same time at the back end, you're
the hyperscaler. You're going to be doing this much. You'll provide enough tooling to make it
as manageable as possible. But we're still going to have unpredictable workloads.
Is it something we'll never get around? I think this is certainly not something that
will likely change in the near future, just by virtue of the diversity of use cases
that customers bring to Kubernetes and EKS. But it is a place where Auto Mode really excels.
So, you know, just as we are taking on the operational responsibility for common
sorts of like maintenance tasks for this infrastructure, we also bring a very application oriented
perspective to scaling and cost optimization. So one of the things that we want to get customers out
of the need to do is capacity planning. Auto Mode is built on a series of open source
standards and products, one of which is the Karpenter project. This is a project that we launched in 2021
as an open source project. Hasn't that become part of the CNCF now? That is right,
it's also part of the CNCF here, which is such a nice way that we've seen that
kind of fledgling project emerge into something else. It's not only yours anymore.
Yeah, of course. Sorry, it's an open source project that we also build and operate as part of
EKS Auto Mode. So, as a result, what this means is that customers can have their workloads
specify the kinds of infrastructure they need. Their compute requirements, essentially.
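In Kubernetes terms, "workloads specifying the kinds of infrastructure they need" boils down to standard resource requests and scheduling constraints on the pod spec. The sketch below is purely illustrative: the aggregation helper is hypothetical, not Auto Mode's or Karpenter's actual code, but it shows the signals a provisioner reads when choosing an instance type.

```python
# Hypothetical sketch of how a workload "specifies" its infrastructure needs.
# The pod spec uses standard Kubernetes fields; required_capacity() is an
# illustration of the aggregation a provisioner performs, not real product code.

def required_capacity(pod_spec):
    """Sum CPU (millicores) and memory (MiB) requests across containers,
    the core signal used to pick a right-sized instance type."""
    cpu_m, mem_mi = 0, 0
    for c in pod_spec["containers"]:
        req = c.get("resources", {}).get("requests", {})
        cpu = req.get("cpu", "0")
        cpu_m += int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)
        mem = req.get("memory", "0Mi")
        # Handles only Mi and Gi quantities for brevity.
        mem_mi += int(mem[:-2]) if mem.endswith("Mi") else int(mem[:-2]) * 1024
    return {"cpu_m": cpu_m, "memory_mi": mem_mi}

pod_spec = {
    "containers": [
        {"name": "web", "resources": {"requests": {"cpu": "500m", "memory": "512Mi"}}},
        {"name": "sidecar", "resources": {"requests": {"cpu": "0.25", "memory": "1Gi"}}},
    ],
    # Standard scheduling constraints a provisioner also honors:
    "nodeSelector": {"kubernetes.io/arch": "arm64"},
}

print(required_capacity(pod_spec))  # {'cpu_m': 750, 'memory_mi': 1536}
```

A provisioner such as Karpenter aggregates these requests across pending pods, then bin-packs them onto the cheapest instance types that satisfy the constraints.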
And behind the scenes, Auto Mode will go and look for the optimal and most cost-effective
infrastructure to meet those requirements. So when I say you don't have to think about capacity
planning anymore, it means that we'll let a system like EKS Auto Mode figure out what the workloads
need and then deliver that infrastructure at the right time to allow them to perform their job
as they need to. And there's so much misconfiguration in cloud. And not just, I mean, that's probably
the preserve of the cybersecurity companies; they're in there talking about where things break,
often down to simple misconfigurations. But there are so many other reasons that you can have instances
corrupt, authentication issues. Or just inconsistencies across clusters in general. Like,
one of the things that we see a lot is that it's not that a customer is running one singular
Kubernetes cluster. They have a fleet of clusters, maybe hundreds. For all kinds of different
reasons, they maybe want to segment things by the way their organization is structured, or they want
to keep things separate for technical reasons. But suffice it to say that keeping a fleet of clusters
consistent and operating them with a level hand is really a challenging task. And so
this is one of the things I think we're also able to help customers with: by having all of
these best practices built into the product in Auto Mode, and this includes security, for what
it's worth, you don't have to try to achieve that kind of
consistency through effort; you get it by default through the system. And this is just one
of those things that, you know, in aggregate adds up to less and less work, less and less sort of
maintenance effort that teams need to take on themselves that they get just as part of the offer.
And, I know we were talking about trying to codify commonalities: are they common
across industries? Are there patterns, or is that too broad a question? I mean, certainly there are
unique things, industry to industry. There are, you know, even kinds of uniqueness within individual
organizations, particularly the larger ones. And, you know, one of the
balances that we tried to strike with Auto Mode, and I think EC2 Managed
Instances are part of this story, is letting customers have the abstraction that takes away this
kind of effort that is not valuable for them to be doing and spending, you know,
valuable engineering time on, while also giving them the configurability and customization they need
to meet their use cases' requirements. And it seems to be a key feature. I'm sure I'm going to
say it wrong. Is it a 21-day maximum node runtime? Yeah, that's right. And so it seems to make a lot of
sense for hygiene, system hygiene. That's right. And are we coming back to configuration drift and
all of the reasons that you want to enforce that? That's exactly right. I mean, certainly one
of the benefits that the 21-day maximum lifetime of Auto Mode instances brings
is that you can effectively rest assured that within 21 days, all the instances in your cluster
will have been updated with whatever the latest, you know, Amazon machine image or configuration you
have. What if nothing needs to be changed? Well, there's always something going on, you know,
whether that's a CVE that gets patched or just the latest performance improvements,
all the way down through to, like, the Linux kernel. And so typically with Auto Mode, we see that
about once a week there's a good opportunity for us to release a new version of the Amazon machine
image that powers all of the Auto Mode instances. And, you know, to the point of not letting the
abstraction get in the way of customers' real-world use cases, there are all kinds of ways that they
can configure how those new instances, those new images, are rolled out across their cluster.
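The node lifetime and rollout behavior described here maps onto configuration of the sort Karpenter exposes. The fragment below is a shape sketch only: the field names follow Karpenter's NodePool API, the values are assumptions for illustration, and Auto Mode's exact defaults and supported fields may differ.

```python
# Illustrative sketch only: a Karpenter-style NodePool fragment showing the
# kinds of knobs discussed above -- a maximum node lifetime, after which nodes
# are recycled onto the latest machine image, and a disruption budget that
# limits how many nodes may be replaced at once during a rollout.
# Verify field names against your Karpenter / EKS Auto Mode version.

MAX_NODE_LIFETIME_DAYS = 21  # Auto Mode's stated ceiling

node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "general-purpose"},
    "spec": {
        "template": {
            "spec": {
                # Nodes live at most this long, expressed in hours.
                "expireAfter": f"{MAX_NODE_LIFETIME_DAYS * 24}h",
            }
        },
        "disruption": {
            # Replace at most 10% of nodes at a time during rollouts.
            "budgets": [{"nodes": "10%"}],
        },
    },
}

print(node_pool["spec"]["template"]["spec"]["expireAfter"])  # 504h
```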
Right. Yeah. So, over the last almost 18 months of its existence, have there been
significant, I mean, there must be tiers where you've seen the whole service graduate and step
up. Have there been significant moments? I think, you know, with the kind of beginning of any new
product, you certainly see adoption come in fits and spurts. We're really excited about,
you know, the amount of use that we've seen customers bring to Auto Mode. I think, you know,
the thing that we've found really encourages those kinds of, you know, big
upticks in adoption is often that we've delivered some critical functionality that, you know,
for whatever reason we didn't have the feedback on, or we just had to be, you know, diligent about what
the launch scope was going to be. When we delivered those kinds of features, for example,
a recent one: one of the things that we knew customers would need is the ability to not only
trust us that we were going to be, like, running things as effectively as possible,
but to be able to verify that. And so just recently we launched the ability for customers to get
logs from all of the managed components that we're running behind the scenes on our side of the
fence. A really popular ask from customers that, you know, obviously then results in, you know,
a big surge of adoption now that that sort of gap has been closed.
And is that for them to then take and analyze for vulnerabilities, or mainly for troubleshooting?
So, customers are so used, especially in Kubernetes, with as transparent
an ecosystem as it is, you can go dig in as deeply as you'd want, right?
Customers are used to being pretty hands-on to understand why things maybe aren't
working the way they'd expect, whether there's a misconfiguration or, you know, some
kind of update they could make that would make things work better.
And when we took on the responsibility for running a lot of that software,
you know, they wanted to see whether they could still have the same kind of visibility that they
used to have. And so that was feedback that we heard pretty early on. It was actually something
we wanted to include when we launched the service, but, you know, we just couldn't get time for it.
It's called Auto Mode. You know, you're almost imagining clicking a button.
How long does it take to get up and running, for that matter? Well, I think one of the things
that is most exciting about when we launched Auto Mode is that we also took a pretty hard look
at what it took to get an EKS cluster up and running.
Right. And so as we were bringing Auto Mode into the final stages of development,
we decided it'd be really great if we could use this as a way
to simplify what it took to get an EKS cluster. And so with Auto Mode, whether that's through the EKS APIs
or through the AWS Management Console, you can get started with an Auto Mode cluster with effectively
a single click: a single-click, production-ready cluster up and running, you know, ready to integrate
with all of the various Kubernetes ecosystem tooling that you use in all
environments where you run Kubernetes. And we think that's one of the things that
helps remove some of that complexity, or at least the burden of getting started.
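The "single click" also exists as a single API call. The sketch below assembles a CreateCluster request with the Auto Mode capabilities switched on; the parameter shapes reflect the EKS API as best understood here (verify against current AWS documentation before use), and all names and ARNs are placeholders.

```python
# Sketch of the "single call" idea: an EKS CreateCluster request with the
# Auto Mode capabilities enabled. Field names are believed to match the EKS
# API but should be verified; the ARNs, subnet IDs, and names are placeholders.

request = {
    "name": "demo-auto-cluster",
    "roleArn": "arn:aws:iam::111122223333:role/eks-cluster-role",  # placeholder
    "resourcesVpcConfig": {
        "subnetIds": ["subnet-aaa", "subnet-bbb"],  # placeholders
    },
    # The Auto Mode switches: managed compute, block storage, and load
    # balancing, all operated on the AWS side of the fence.
    "computeConfig": {
        "enabled": True,
        "nodePools": ["general-purpose", "system"],
        "nodeRoleArn": "arn:aws:iam::111122223333:role/eks-node-role",  # placeholder
    },
    "storageConfig": {"blockStorage": {"enabled": True}},
    "kubernetesNetworkConfig": {"elasticLoadBalancing": {"enabled": True}},
}

# With boto3, this would be submitted as:
#   boto3.client("eks").create_cluster(**request)
print(request["computeConfig"]["enabled"])  # True
```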
But when you provide this type of service or product or technology, you know,
a naysayer might say, that's another level of abstraction; you know, my BMW is over-engineered
(I don't drive a BMW), you're not allowing me to look into the engine room enough. And
as a hyperscaler, as much as we love you guys, you're providing an encapsulated service
at so many levels that you're going to get engineers
who will want to see a few more of the guts and gears working. How do you
respond to that? So, we've been trying to create a really intentional balance between the amount of
configurability and the amount of hands-offness that Auto Mode has. So one of the
other ways that you can get started using Auto Mode is to enable it in an existing cluster and
decide which workloads in that cluster are a good fit, or are ready to be migrated over,
ready to be handed off to us to operate, so that you can gradually adopt this new
model for infrastructure and Kubernetes clusters. And are you purely
talking to developers, or do you find you're reaching systems and
general operations people? Yeah, platform engineers are primarily who we think will get the most
benefit out of this. These are platform engineers who weren't even a label until about two years ago.
I know. And they were, you know, DevOps engineers before, perhaps. Yeah. The reality is that
these are folks who are deeply devoted to building golden paths at their companies,
helping their companies increase the velocity with which they're able to deploy
applications to their end users. And the ratio of that kind of work that they want to
be doing to this operational, day-to-day maintenance
of Kubernetes clusters isn't right, in my view. And so what is the actual user feedback?
"Great, you have automated something," or, you know, almost, "why weren't we doing
this two years ago?" Has it been really welcomed in practical use cases? Yeah, absolutely.
I think that, you know, one of the things that's been really exciting is to see the
variety of ways that folks have been using Auto Mode and achieving not only its
operational benefits but also its cost benefits. You know, I think the easiest way to
get someone to take a really hard look at an offering like this is to be able to say,
concretely, there are true cost benefits to using this. And one of the things that we do
see with customers using Auto Mode is that they're able to achieve pretty meaningful
reductions in their infrastructure costs. You speak to the actual users. There's, in fact,
a company called StormForge, who is notably itself a Kubernetes cost optimization company,
who was able to achieve 30% infrastructure cost savings with Auto Mode.
Right. So it's a kind of pays-for-itself message.
Correct, in multiple ways. Yeah. And so, I know we're not here to sort out
the forward progression of the roadmap, but
do you see this one day becoming subsumed as a utility service
into the wider EKS offering? Or is that being too
suggestive? Our vision for Auto Mode is that it becomes the way that the vast majority of EKS
customers use Kubernetes: they're able to get away from having to do all of this
infrastructure maintenance and management and delegate that to us at AWS. And only for the
most specialized and unique use cases will they have to drop down a layer in the stack
and dig into, you know, that kind of level of infrastructure configuration and management.
Of course, to do that means that we'll have to continue to build in the same
ways that we always have for Auto Mode to date, you know, continuing to strike that balance
between configurability and abstraction, looking for ways to make it an even better fit for the kinds
of emerging use cases that we're seeing, particularly in AI/ML. Yeah. And bringing that same sort of
application-centricity or orientation to infrastructure, so that customers can think about
the applications that they want to run, not how they're going to run them on the infrastructure
that's available through Kubernetes. It's almost like we should classify use cases,
like we should have a nomenclature for them. You've got an embryonic, you know, edge use case
(not as in edge computing) that's potentially very dynamic in terms of
the way it executes. So, you know, your provisioning should be at the most abstracted layers possible,
because you just don't know what's going on. But customer B, you know, you sell
shoes in a big city, so you can pretty much tell what you're doing, or, you know, you sell
tickets for a concert or something. There must be almost a spectrum of use cases.
That's right. I think we don't talk about that much, do we? No, and to be honest, you can see all
kinds of companies and all kinds of use cases on Kubernetes. That's one of the great things about
it as a platform technology. And I think, you know, in order to meet all of our customers' needs,
we're going to have to have this kind of spectrum of offerings that lets them sort of place
themselves according to how much they want to be involved and how much they want to be able to hand off
to someone like AWS. Great. Okay. Listen, thanks so much. Yeah. Thanks for giving us the insight,
and thank you for being sort of, you know, colorful and illustrative with your definitions.
So I'm going to wrap up and say, thanks very much. This is us at KubeCon + CloudNativeCon
Europe, Amsterdam. I'm Adrian Bridgewater with The New Stack. And if you want to see more of
this kind of content, please go to thenewstack.io.
The New Stack Podcast
