
Welcome to Total Network Operations, where we come together with great guests to talk about
great ideas and net ops.
I'm your friendly neighborhood podcast host, Scott Robohn, and today we have a sponsored
episode with Statseeker.
We want to welcome Dylan Hensler, a customer solution specialist with Statseeker.
Statseeker is a network intelligence platform that helps net ops teams detect faster, investigate
root cause using complete historical data, and operate with confidence without the noise
of SaaS, data loss, or surprise costs.
I really love to have products and solutions on the show that are new to me, so you can
get exposed to them too and see how they could work for you.
So let's get into the discussion.
Dylan, welcome to the show today.
Hey Scott, thanks for having me.
I'm happy to be here.
You bet.
You've got a pretty grounded background to talk about all this.
Can you tell us a little bit about what you've been doing in networking and network operations
and how that enables you in your current role?
Sure.
So I've been working in networking and IT for about 20 years now; it'll be 21 this year.
I actually started my career in the Marine Corps.
I went in as infantry, but really discovered a knack for communications and technology.
Ended up building communication networks, which was pretty cool because it was a mixture
of analog and digital around that time.
That was during Operation Iraqi Freedom.
And since then, I've just worked across the range of roles including network engineering,
infrastructure operations and project delivery.
Before joining Statseeker, I was actually a customer.
I used it while managing and analyzing network infrastructure for a large restaurant chain
here in the United States.
We had a lot of distributed locations and needed reliable visibility into what was happening
across the network.
And I really fell in love with Statseeker as a customer.
I got to know the product really well.
And now I work on helping other Statseeker customers deploy and get value from the platform.
So really I've seen both sides, the operator side and now the vendor side.
Supporting a restaurant chain, you're making me really hungry.
We're recording in the morning.
I had my breakfast.
I shouldn't be hungry at this point.
But we'll use those examples and others as we go through the conversation today.
Okay.
Sounds good.
If I were to sum all that up, you've been in the operator seat in some high-pressure situations
and maybe less high-pressure situations, but it gives you a really practical perspective
on what users need from network intelligence and how they're trying to use it to troubleshoot
network issues.
Right?
Fair enough.
Yep.
That's exactly where I sit.
Cool.
So, we want to use the theme of detect, prove, predict today:
turning network monitoring into operational intelligence.
We hear a lot about that.
I want your help unpacking it.
And as we go through this today, we'll hit on the challenges of troubleshooting
distributed networks.
Think about that as we go.
Why accurate historical telemetry matters during incident investigations and having it
at the ready versus having to request months of data back and waiting to get it back.
And how you see network teams separating real problems from noise and false alarms and
trying to get to the real issues.
But there's one thing that really connected with me when we started talking.
A long time ago in a galaxy far, far away, let's put it in the mid to late '90s,
I was at one of the Baby Bells and I had responsibility for a couple of large campuses in our multi-state
corporate IT network.
We had a lot of pride in delivering network connectivity to our customers.
That was our job, providing network connectivity.
So IT was kind of this little, little brother, little sibling to the big brother of providing
connectivity to our customers.
I constantly had to spend significant time and effort detecting and proving what the
issues were, especially proving when it wasn't the network.
When that was true, especially with different groups who are putting a lot of pressure and
stress on the network for remote training and other applications.
So your detect and prove messaging has my attention right away.
And back then, I was never really able to get to the predict part of your objectives.
So you'll help us move forward three decades later.
So why don't you tell us a little bit about Statseeker as a company and give us a high-level product
overview?
What are we talking about here?
Sure.
So we like to say we are a network intelligence platform.
And like you said: detect, prove, prevent. Detection is all about visibility and
speed.
Then the second step is proving what actually happened, and then prevent is recognizing
those patterns early.
But to really give you an idea of what the product is: we are a network monitoring
platform, and our bread-and-butter protocols, the things that we really excel at,
are SNMP and ICMP monitoring.
And we scale really well.
We go out, we look at every device on your network.
We poll them through SNMP every 60 seconds.
And then we are doing ICMP pings every 15 seconds. Then we're combining all that data
into a database where we keep all your data indefinitely. You own the data, and you
own the platform.
It's not SaaS. You install it, either on physical hardware on-prem or in the cloud.
And we don't roll anything up, we don't aggregate it, we just have all your data from
all your network devices there.
And you can combine data in a lot of really cool ways with reports and dashboards.
And then you can also set up thresholds and alerts.
And really what we're all about is gathering all the data from your network and then giving you
a lot of ways to work with it.
So you know what's happening, why it's happening, when it's happening.
And then because we store everything in full resolution, you can always go back and look
at what happened in the past as well, and look into the future with predictions that
give you an idea of what we expect to happen.
So your ability to keep all that historical data sounds pretty important to me.
Yes, that's huge.
That's definitely one of the things our customers love the most about Statseeker.
So give me some examples of where that's really been a huge benefit, either from
having access to the Wayback Machine for what's happened in the network or access speed.
How does that manifest itself with your customers?
Let me go back to the example you were talking about with the Baby Bells.
We've had customers that have gotten into those fights with their ISPs, their service providers.
And everyone always wants to blame the network. As network engineers, and I'm sure your listeners
have experienced this, the job is always to prove that
it's not the network.
So that's where Statseeker comes in.
We had a customer who was able to go back through all the historical data and prove
that with their bundled connections, only one was passing traffic at a time.
We had every data point, every poll that happened every 60 seconds, and they were able
to send over three months of CSV files to the carrier, showing exactly what happened.
And then the carrier was like, yeah, you're right, it was us.
And they were able to get a big service credit from that ISP.
It's just an example of something you can do with historical data.
People don't have to take your word for it.
You have all the data at your fingertips and you can make a lot of decisions and outcomes
with that data once you have it in that kind of format.
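As a rough illustration of the bundled-link story above, here is a minimal Python sketch of how per-poll data could show that only one bundle member carried traffic at a time. The CSV layout, column names, and idle threshold are assumptions made up for this example; they are not Statseeker's export format.

```python
import csv
import io

# Hypothetical per-poll export: one row per 60-second poll, with the octets
# each bundle member carried in that interval. Column names are assumptions.
SAMPLE_CSV = """timestamp,member_a_octets,member_b_octets
2024-01-01T00:00:00,950000,0
2024-01-01T00:01:00,0,910000
2024-01-01T00:02:00,870000,0
2024-01-01T00:03:00,450000,460000
"""

def single_member_fraction(csv_text, idle_threshold=1000):
    """Return the fraction of polls in which exactly one member carried traffic."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    one_sided = 0
    for row in rows:
        a_active = int(row["member_a_octets"]) > idle_threshold
        b_active = int(row["member_b_octets"]) > idle_threshold
        if a_active != b_active:  # exactly one of the two members is active
            one_sided += 1
    return one_sided / len(rows) if rows else 0.0

fraction = single_member_fraction(SAMPLE_CSV)
print(f"{fraction:.0%} of polls had traffic on only one member")
```

With unaveraged samples, a high fraction over months of polls is the kind of evidence a carrier can check for itself.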
When you were back in the operator seat, before you had touched Statseeker, I'm sure
you had some moments like, man, I wish I had what happened a month ago or six months ago.
What were you doing pre-Statseeker to actually keep track of things?
I was taking a lot of screenshots, because we had other tools, but what they would do is
average the data over time, or they'd only poll every five minutes, or they'd aggregate
things. With some vendor APIs, data disappears after three months because they don't want
to be paying those data costs. But with Statseeker, it's your data, and you store it for however
long you want.
So, yeah, it was that depth of historical data that really stood out to me as a customer.
Nothing gets rolled up.
And then let's say I was working on a case and it sounded kind of like something
that happened before, but there are so many incidents, so much happens.
I was able to go back using the time filters to look at, say, every Wednesday, or
every third Wednesday of the month, something like that. Or when I was doing the restaurants,
I was able to go back and look at every Mother's Day, every Valentine's Day, to see exactly
what happened on those days. That gave me a really good idea, not just guesses but actual
intelligence, of what to expect in the present based on
what happened in the past.
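The recurring-date filtering described here (every third Wednesday, every Mother's Day, every Valentine's Day) can be sketched in a few lines of Python. This is only an illustration of the idea using the standard library; the function names and sample data are hypothetical, and it says nothing about how Statseeker's time filters are implemented.

```python
import calendar
from datetime import date, datetime

def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday (0 = Monday) in the given month."""
    matches = [d for d in calendar.Calendar().itermonthdates(year, month)
               if d.month == month and d.weekday() == weekday]
    return matches[n - 1]

def recurring_days(year):
    """Dates of interest for one year: third Wednesdays plus two holidays."""
    days = {nth_weekday(year, m, calendar.WEDNESDAY, 3) for m in range(1, 13)}
    days.add(date(year, 2, 14))                         # Valentine's Day
    days.add(nth_weekday(year, 5, calendar.SUNDAY, 2))  # US Mother's Day
    return days

def filter_samples(samples, days):
    """Keep only the (timestamp, value) samples that fall on the chosen dates."""
    return [(ts, value) for ts, value in samples if ts.date() in days]

# Synthetic samples: (poll time, utilization percent).
samples = [
    (datetime(2024, 1, 17, 12, 0), 71.0),  # third Wednesday of January 2024
    (datetime(2024, 1, 18, 12, 0), 40.0),  # an ordinary Thursday
    (datetime(2024, 5, 12, 18, 0), 93.0),  # Mother's Day 2024
]
selected = filter_samples(samples, recurring_days(2024))
print(len(selected), "samples fall on the recurring dates")
```

Running the same filter across several years of stored samples is what makes a comparison like "every Mother's Day" possible.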
And then, of course, owning your data and storing it where you want certainly has security
implications. I'm guessing some of your customers really appreciate not having to send
the data elsewhere and worry about that, for either regulatory reasons or privacy reasons.
Yeah, of course, that's huge.
By default, we store 18 months of data, but there's no limit on that.
We have customers who have been using Statseeker for 10, 15 years and they have all that data saved.
And you're right: defense, finance, sectors like that, they want the data in-house. They don't want
it going over insecure connections, they don't want it going into something they don't control,
and there are a lot of compliance things to consider there.
And that's where a great on-prem solution like this really stands out compared to some other
things out there, and it's why a lot of people choose Statseeker.
Sure.
One of the things you mentioned too was that you rely on trusted and proven polling mechanisms
with SNMP and ICMP, and you can create averages from that data over time if you want,
but you have the actual data, so you can both see what happens in real time and then do whatever
manipulation you need to after the fact.
Yeah, exactly, that's the huge thing.
Statseeker does have those options in the tool. We're figuring out baselines, we're figuring out
trend lines for your data, so if that is something you're interested in, you don't have to go to
a separate tool. We're trying to give you the best of both worlds in Statseeker:
full data, but we also make it readable and give you high-level views to
work off of. It really just depends on your use case, and we try to solve
as many use cases as possible. Because we've got so much data coming in, there are a lot of
really cool things you can do with it.
Sure. Do you have a couple of top use cases that you see very, very typically with your customers?
What are some of the biggest things that you tackle?
I'd say the big things are just going to be those core network metrics, especially for
giant organizations that have just a ton of things to worry about, or monitor, so for example,
with enough hardware, we have a pretty low footprint, so it doesn't take a huge amount of hardware,
but a single Statseeker server can monitor a million interfaces every 60 seconds, so
that's where really we see customers looking at utilization, setting up alerts, thresholds.
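For readers who want to see what a utilization threshold looks like in practice, here is a minimal sketch. It assumes two successive polls of a 64-bit octet counter (such as IF-MIB's ifHCInOctets) taken 60 seconds apart; the function names and the 90 percent threshold are illustrative assumptions, not anything taken from Statseeker.

```python
COUNTER64_MAX = 2 ** 64

def utilization_percent(prev_octets, curr_octets, interval_s, speed_bps):
    """Percent utilization between two polls of a 64-bit octet counter."""
    # The modulo tolerates a single counter wrap between the two polls.
    delta_octets = (curr_octets - prev_octets) % COUNTER64_MAX
    bits = delta_octets * 8
    return 100.0 * bits / (interval_s * speed_bps)

def breaches(value, threshold):
    """True when a polled value meets or exceeds its alert threshold."""
    return value >= threshold

# Synthetic example: a 1 Gbit/s link polled 60 seconds apart.
util = utilization_percent(1_000_000_000, 7_750_000_000, 60, 1_000_000_000)
print(util, breaches(util, 90.0))
```

The same comparison applies to error or discard counters: take the difference between polls, then compare it to a limit.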
We like to give our customers the tools to automate a lot of things. So,
for example, critical WAN links: if the utilization on those starts creeping up, we like
to set up thresholds for that, or for error counts, discards, anything like that, the kind of thing that
is going to signal that a problem is coming up before it turns into an outage. And then you take
those alerts and thresholds and export them out to a platform of your choice. Statseeker can
send you emails, and we do webhooks out to ServiceNow, Slack, Teams, whatever your preferred
platform is. So yeah, it's really about making sense of all that data and
getting ahead of it. Our customers love the alerting and the thresholding; things like that
are what I see the most call for. That seems to be pretty important for the predict
part of the equation, right? If you're watching error rates go up, for example. Let's talk more about that,
because back in my ancient example, that was definitely an issue for me, the lack of predictability.
Sure. So we do have a very cool analytics solution built into Statseeker. It's all machine
learning, it's not AI, it's algorithm based. Because we store
your data in full resolution and don't average or aggregate it, and because we're polling
every 60 seconds, that's a ton of data to work off of. Sure. We take all those metrics, and by default,
we're looking at the past six months of your network data, and that's for everything in the database.
So for every metric that's coming in, we figure out what the network normally looks like
on this day of the week and at this time of day, and then we apply an anomaly score to
that. We can call out those anomalies in reports and dashboards, just put them front and center
for you, or you can set up alerts and thresholds so that when something's anomalous,
you get an alert on it. That's where having all that data matters: the longer Statseeker has
been running on your network, the more it learns. Then you get these anomaly scores where, say,
my primary WAN link for my data center suddenly has an anomaly score that's really high,
and you ask what's causing that. And it will be on that metric, so it might be anomalous for that time of
day, which comes in really handy because you might care about business hours, Monday through Friday.
If it's happening after hours, that's also a big sign that something's out of the ordinary.
Before Statseeker, that was really something where you were relying on your
network engineers and your admins and your analysts to really know the network and
have it memorized. When you do work on a network for a time, you do get to know it, you get
a feeling for it, but with this, Statseeker is doing that for you. It knows what's normal,
and it tells you when something's not normal. That seems to be one of the good extensions of
traditional monitoring, getting to operational intelligence. That's why we like to say we're
an intelligence platform, we are a network monitoring tool, but we're a lot more than that too.
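To make the "normal for this day of the week and this time of day" idea concrete, here is a minimal sketch of one common approach: bucket the history by (weekday, hour) and score a new reading by how many standard deviations it sits from that bucket's mean. This is an assumed, simplified technique chosen for illustration; the conversation does not describe Statseeker's actual algorithm, and all names and numbers below are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev

def build_baseline(samples):
    """Group (timestamp, value) samples by (weekday, hour) and summarize each bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}

def anomaly_score(baseline, ts, value):
    """Distance from the bucket mean, in standard deviations (a z-score)."""
    avg, spread = baseline[(ts.weekday(), ts.hour)]
    if spread == 0:
        return 0.0 if value == avg else float("inf")
    return abs(value - avg) / spread

# Synthetic history: four Mondays at 09:00 with utilization near 40 percent.
start = datetime(2024, 1, 1, 9, 0)  # a Monday
history = [(start + timedelta(weeks=w), v)
           for w, v in enumerate([38.0, 40.0, 42.0, 40.0])]
baseline = build_baseline(history)

next_monday = datetime(2024, 1, 29, 9, 0)
normal = anomaly_score(baseline, next_monday, 41.0)
spike = anomaly_score(baseline, next_monday, 95.0)
print(f"normal reading: {normal:.2f}, spike: {spike:.2f}")
```

Because each bucket has its own baseline, the same utilization can score low at Monday 09:00 and high at 03:00 on a Sunday, which matches the after-hours point made above.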
Tease that out a little bit. What does it mean to be delivering operational intelligence
in different ways? We've talked about the anomaly score and letting you know when things
are not normal. And by the way, let's just call this out: you don't know if something's
abnormal unless you know what normal is. Baselining has got to be part of the
equation. Yes, that's where everything starts with our analytics. The longer you've been
using Statseeker, the more value you're going to get out of this. Like I said, we take six months
of data to establish baselines and trend lines and figure out what's normal, but if you're a long-time
Statseeker customer and you have a really static network, or there are only certain things you
care about that have stayed static, you can go back even further than those six months.
But for the first part of your question, what do we mean by operational intelligence?
I'd say traditional monitoring platforms are just telling you: is it up or down?
To me, operational intelligence is what's happening across the network and why. With Statseeker,
you combine fast polling, historical data, and analytics. It all comes together, and that
lets our customers detect problems earlier, validate root causes faster, and get to a point
where they can start preventing issues instead of constantly firefighting. So, in addition to all
that, Dylan, the data fidelity impacts are really important. What other operational
impacts have you seen from being able to have that real data and not just averages and
summary information? Sure. So, a big thing you see with tools that average or don't have the data
fidelity is you're going to miss things like brief outages, flapping, microbursts, things like that.
If you're only polling every five minutes, something looks good at the start of that window,
and then it looks good five minutes later. You're missing everything that happens in between.
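A small worked example shows why this matters. The numbers below are synthetic, and the roll-up function is a generic averaging sketch rather than a model of any particular tool.

```python
# Ten 60-second utilization samples (percent); the values are synthetic.
one_minute = [20, 22, 21, 98, 23, 20, 21, 22, 20, 21]

def rollup(samples, window):
    """Average consecutive samples into fixed windows, as a roll-up would."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples), window)]

five_minute_avg = rollup(one_minute, 5)  # what an averaging tool would keep
five_minute_poll = one_minute[::5]       # what a 5-minute spot check would see

print(max(one_minute))    # 98
print(five_minute_avg)    # [36.8, 20.8]
print(five_minute_poll)   # [20, 20]
```

The 98 percent burst is obvious at one-minute resolution, shrinks to 36.8 percent once averaged, and disappears entirely if the link is only sampled every five minutes.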
So, that's where high-fidelity historical data retention really comes in. You need
evidence of what's going on in your system. If your monitoring system rolls up or only keeps a few days
of history, your troubleshooting really becomes guesswork. Statseeker's historical data lets you
go back days, weeks, months, and see exactly what the network was doing. Something else that I really like,
that I don't think I've mentioned, is that it allows you to put multiple data points from different types
of devices right next to each other. So, you're not just looking at one device to figure out what
happened there. You can get a big picture for a ton of different things on your network all at once,
and because you have the data and it was all polled at the same time, you can really correlate things
to figure out how to make use of that data, and that is really valuable. You're not
flipping back and forth between a bunch of devices on the CLI. You're not digging through text files. We're
combining them and putting it together for you. And yeah, that's just really valuable for
figuring out exactly what's going on. I think you said earlier that your standard install
can support up to a million interfaces, so that can cover a pretty large network, I imagine.
So, what are some of the biggest networks you've seen being monitored with Statseeker?
We're monitoring the biggest retailer in the United States. Statseeker is a company based out
of Australia, but we operate around the globe, so a lot of the big retailers in Australia and Europe too.
As I said, I was a Statseeker customer. I worked for the biggest casual dining restaurant
organization in the United States and elsewhere. So, really, that's where we live
and where some of our biggest customers are. We do great with small to medium-sized
networks, but we also scale up to the huge enterprise customers that have millions of
devices, thousands of locations, places like that. Sure. You also talked about the importance of
integrations. I think you mentioned ServiceNow. You also have some good vendor integrations.
Can you talk about those? Let's start with ServiceNow. What does it look like for you to
cooperate and collaborate with ServiceNow in IT service management? Sure. So, that's a huge
thing. There are so many tools, and everyone's doing something different out there. So,
the approach we've taken in our most recent releases is we're building in a lot of support
for HTTP webhooks, and those tie right into our existing alerting features. It's just another
option to get your data out there. So, a threshold event happens and an alert is created, and then
we generate an HTTP webhook that integrates right into ServiceNow,
for example. It's got all the key values there and everything that ServiceNow needs to create
a ticket for that, prioritize it, and get it up and running. We export it out of Statseeker
and it goes into those platforms. Some of the other options I mentioned were Slack and
Teams. PagerDuty is surprisingly popular still. But we also have a custom option for that. So,
if you have a platform that can take in an HTTP webhook, that option is there. As for other ways
of getting data out, we do support custom Python scripts. So, if you want to do something really
advanced, we'll work with you on that. That support is built into Statseeker. We've got a full
REST API to access every bit of data that's in there. And then of course, just standard emails
with PDFs and CSVs of your data. So, we give you a lot of options to get it out there and integrate it
with other tools. Yeah, this is the fuel for automating remediation in networks. I spend a lot
of time talking about network automation in other things that I do in my career, and you need
the basic information about what's happening to trigger either a person to take some action or
to fire off a script, you know, and tell an orchestrator: look, we need you to do something about this
WAN link or about this switch interface, right? Yeah. And just that right there, what you're talking
about: we try to automate a lot of the process, so you can spend more time actually working the
issue. We do automated discovery and then re-walks, so Statseeker is always keeping your
network up to date, learning what's new, what's changed. And once it's been set up, there's not a lot
of configuration you have to do. Auto-grouping is another big one: it throws
all your data into groups that make sense automatically. So, a lot of that admin and configuration
that takes away from the actual engineering and problem solving, Statseeker is doing that in
the background so that you can really focus on what matters and not fighting with your tool.
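The threshold-to-webhook flow discussed in this part of the conversation can be pictured as a JSON document sent by HTTP POST to a receiving platform. The sketch below builds such a request with Python's standard library without sending it. Every field name, the severity rule, and the URL are assumptions for illustration; this is not Statseeker's webhook schema or ServiceNow's intake format.

```python
import json
import urllib.request

def build_alert_request(url, device, metric, value, threshold):
    """Build (but do not send) an HTTP POST carrying a threshold alert as JSON."""
    payload = {
        "device": device,
        "metric": metric,
        "value": value,
        "threshold": threshold,
        # Hypothetical severity rule: critical once the threshold is reached.
        "severity": "critical" if value >= threshold else "info",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

request = build_alert_request(
    "https://example.invalid/hooks/network-alerts",  # placeholder URL
    device="wan-edge-01",
    metric="utilization_pct",
    value=93.5,
    threshold=90.0,
)
# urllib.request.urlopen(request) would deliver it; omitted so the sketch runs offline.
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

A receiver such as a ticketing system or an orchestrator would parse the JSON body and decide whether to open a ticket, page someone, or start a remediation script.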
You also called out good vendor equipment integrations, but you are using completely
standards-based mechanisms with SNMP polling and ICMP. So, I imagine you don't really have any
limitations on the equipment you can monitor and work with. That's correct. If a device supports
SNMP and ICMP, Statseeker can do something with it. We do support what we call custom data types.
So, let's say we have a funky vendor that's not really doing standard SNMP stuff. Sure.
We'll work with you to develop a custom data type that takes those SNMP tables that
are coming in from the devices and turns them into something more useful for Statseeker. But
the good thing about that is when we work with customers to build out those custom data types,
it's not just for that customer. We roll it into our next release. So, Statseeker is constantly
growing to learn about new vendors, new data types. And even though SNMP has been around for a
long time, it's still the gold standard for monitoring. It gives you everything you need to know.
And it's very efficient, which really is what we like Statseeker to be. We don't want to have
this huge footprint. We don't want you to have a ton of servers and resources behind it. We want
to be as efficient as possible. And that's where that comes in. You did mention some of our vendor
integrations. And that is something we've seen: some vendors don't have 100% SNMP support
anymore. So, we built modules into Statseeker where Statseeker can talk directly to the
API for Cisco ACI and Cisco Meraki, and pull in that data and combine it with what we're getting
from SNMP from your other network devices. And it's all seamless. You don't have a
separate view like you would when you go to those vendors' API portals, where
you're only seeing the stuff that's there. If you're an operator with a mixed network, which most
people are, Statseeker gives you a way to integrate those APIs with everything that's coming in from
SNMP and elsewhere. Cool. On the custom data type front, you mentioned, very graciously, that
there are some vendors that do interesting things and don't have full SNMP support. Have you seen
any really interesting examples there, maybe from the IoT space or anything else with a
funky stack, that you were able to do something interesting with? Sure. IoT is a huge one
because it's still kind of the Wild West right there. Sure. Every IoT manufacturer thinks they
can do SNMP better than anyone's done it before. Right. Instead of sticking to the tried
and true, they'll put their own things out there and just kind of ship it and hope it works,
which is great if you're only working with that product. But from a network engineering standpoint,
the data might look a little weird. So, a lot of the things we see are environmental sensors.
Sure. UPSs are a big one that we're really building out. Okay. UPSs, PDUs.
Sure. Every vendor basically has their own SNMP schema in that space, and there have been so many
vendor acquisitions and mergers that it's kind of a mess. You can look at one UPS from one
manufacturer, then look at a model from a year later, and the SNMP data is going to be completely different.
Interesting. And so, what we've done with one of our big retailers, who does a lot of acquisitions
of other brands: they have this huge sprawl of different UPSs and PDUs.
Sure. They really had no way of knowing what was going on with them because there wasn't a single tool
that could make sense of all that. Right. So, they had people who were going in every day,
manually logging into hundreds or thousands of UPSs just to check the metrics on there.
And so, we worked with them to build out a custom data type that takes in all those SNMP
tables coming in from the different devices, figures out what they actually mean, and then puts
them together into Statseeker reports and dashboards where they can actually see all the data and make
sense of it. Sure. That's interesting. And I guess it's not completely unexpected. It's not like
working with switch and router manufacturers, where SNMP behavior is very well known and expected
and is part of building a router or switch. With a UPS or a sensing device, their focus is on
sensing technology or how the power supply actually works, and the networking and SNMP support may be
more of an afterthought. Exactly. That's definitely what I've seen, and that's one of the problems
we're trying to solve. Sure. And I say that without judgment; I get it, people focus on their
core business. Well, you give them the ability to adapt, right? And again, even from the same UPS
vendor, you can deal with different formats of data from release one to release two.
That's very interesting. Even different firmware versions can give you different data.
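One way to picture what a custom data type has to do is a per-model mapping from vendor-specific fields to a common set of names and units. The sketch below is a generic normalization pattern under assumed, invented field names and scaling factors; it does not represent any real UPS MIB or Statseeker's custom data type format.

```python
# Hypothetical per-model mappings: raw field -> (common name, divisor).
# The field names and units are invented for illustration only.
VENDOR_MAPS = {
    "model_x": {
        "battChargePct": ("battery_percent", 1),
        "outLoadPct": ("load_percent", 1),
    },
    "model_y": {  # this model reports tenths of a percent
        "batteryCapacityTenths": ("battery_percent", 10),
        "loadTenths": ("load_percent", 10),
    },
}

def normalize(model, raw):
    """Translate one device's raw readings into common names and units."""
    mapping = VENDOR_MAPS[model]
    return {common: raw[field] / divisor
            for field, (common, divisor) in mapping.items()
            if field in raw}

a = normalize("model_x", {"battChargePct": 87, "outLoadPct": 42})
b = normalize("model_y", {"batteryCapacityTenths": 870, "loadTenths": 420})
print(a)
print(b)
```

Once both models land in the same shape, a single report or threshold can cover the whole fleet, which is the outcome described for the retailer's UPSs and PDUs.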
For sure. Well, if you could see around the corner, get out your crystal ball or your
palantír, whatever you prefer (a Tolkien reference there, not a military contractor reference).
Where is this going? How do you see Statseeker and the market for observability and
operational intelligence going? Not the next five years, maybe not even the next two or three years,
but 12 to 18 months from now, where do you expect things to be heading?
Sure. So I think we're going to see a lot more focus on data consolidation and integration.
Network teams are using a lot of different tools and a lot of different
devices, and the challenge is really bringing all the signals together and giving those operators a
way to see the full picture. Longer term than that, I think we will see more automation and AI-assisted
operations. Those systems are only going to be as good as the data feeding into
them. That's really why accurate historical data and accurate real-time telemetry are going to be
so important. AI and automation, everyone wants to talk about them like they're a magic fix for
everything, but they're only as good as their data sets, and they can only identify
patterns and make recommendations. So I think we will be moving in that direction,
but it's all going to be dependent on having good data, and at least as far as the Statseeker
side goes, that's where I see things going in the future.
Yeah, I don't want to drive your product roadmap, but I could see real value again in having
that real data, not summarized, being extremely valuable for specialized model training,
where I might want to say: here's the last six, 12, 18 months of performance of my network,
here are all my configs, and here's my Cisco infrastructure, here's my Juniper infrastructure,
here's my Nokia infrastructure, pick your favorite vendor names. You could do some really powerful
things with that, I would imagine. You can, and that does tie into another product integration
we've recently introduced: Statseeker now has a Kafka exporter module.
Wow. We've taken all that data, and we've got customers that are taking it and dumping
all their network data in real time, using Kafka, into these great big data lakes to do
just the kind of AI things that you're talking about. And I'm really excited about that Kafka
data exporter, and I think that is going to tie into a lot of the future use cases you're
talking about there. Let's come back in a year and see what's happened there. I'm not kidding,
right? I know, I'd be happy to. I'm really excited. I think everyone's,
everyone's got their best guesses about what's going to happen in the next year, the next
year to three years, but we just got to wait and see. Now, I think those focused,
focused models for network behavior and performance are going to be critical. So,
I'm calling it, that's what my crystal ball says. So, it's definitely on the right track.
Sounds like we're similar. Well, look, before we wrap this up: you've been doing this
for a while, right? A variety of environments. Zoom out. This doesn't just have to be
Dylan and Statseeker. Do you have any final words of wisdom for listeners from being
embroiled in net ops for as long as you have? How would you guide our listeners?
Sure. If I had to give one guiding thought, it would be that Statseeker gives you confidence.
You want to know what's happening on your network, and you want a tool that's going to help you
understand it better, a tool you don't have to fight with. At the end of the day,
network operators want confidence. You want to know what's happening in your network,
you want to be able to prove what occurred, and you want to have the visibility needed to
prevent problems before users are impacted. You don't want to be on that phone call at 2 a.m.
where everyone's blaming the network and you're struggling to gather data from a bunch of
sources. People want answers right away. That's where Statseeker comes in. It's a tool
and a platform that really gives you the confidence to solve problems and find problems before
they happen. Yeah, I don't want to be on that 2 a.m. call anymore. We've all been on the 2 a.m. call.
I can't do that anymore. I just cannot do that anymore. But we've talked about a lot here,
but people can actually go see what you're talking about, Dylan, where should they go to learn more?
Yeah, we've set up a page specifically for people listening to this podcast, and that's just
off of our homepage. That's going to be statseeker.com slash net ops. It's got all the information
there. It's going to hit on some of the key things I talked about today. And really,
that's the next step for finding out more about
Statseeker after listening to this podcast. Well, that's one of the big things that we try to do on Total
Network Operations: expose you to new tooling. You should go check out statseeker.com slash net ops
and see what Dylan's been talking about here. Dylan, thank you so much for coming on
Total Network Operations today. Thanks for having me, Scott. One thing I forgot to mention
about statseeker.com slash net ops, and I know our marketing team will kill me if I don't
throw it out there, is that we do offer a full 30-day free trial, no credit card needed, no sales call.
You just go there, you download it, we give it to you, you do whatever you want with it,
you put your own data in there. And really, the best way to
get a feel for how Statseeker can serve you and all the value it can bring is to get your
hands on it. Very short setup time. And I really recommend anybody that was interested
today try out that demo. And thank you for the no credit card required. I'm totally serious.
So yes, I hate trying out new tools where it's required. I just want to get my hands on the
thing. And that's what I've tried to get with Statseeker. Well, very good. So any
listeners out there, operators, thank you for spending time with us today on another
conversation at Total Network Operations. Again, that's statseeker.com slash net ops for that
30-day free trial. Thanks again. And we'll see you next time on Total Network Operations.

The Fat Pipe - Most Popular Packet Pushers Pods
