Channel: THWACK: All Content - All Communities

Let's just eliminate alerting altogether, okay?


Hi folks, I'm Robert Novak, a longtime sysadmin, once and future sysadmin manager, new Thwack ambassador, and a guy who gets really tired of his phone buzzing out of control.

 

Having had oncall duty be on my plate for much of the past two decades, I know the pain of one easy-to-fix problem causing your phone to actually leave the room on its own.

 

Maybe your monitoring system is housed outside the production network, and when that network link fails, all 1500 of your alerts go red (or brown, if your alerting system is honest). Or perhaps someone restarts the DNS server with a pound-sign comment in a zone file (zone files expect a semicolon for comments), and the same thing happens.

 

Or maybe it's an honest lolcat-astrophe. A storage server crashes, or the NFS mount your web farm lives on goes bad. Still, you're getting a couple hundred (or more) alerts that could be traced back to one. And if you'd had that structure in mind when you set everything up, you might have had one alert and fixed it in 5 minutes, rather than having management on the phone asking why you're talking on the phone instead of fixing the problem.
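
To make the "traced back to one" idea concrete, here's a minimal sketch in Python of how a dependency tree can collapse a pile of alerts down to the root cause. Everything in it is an assumption for illustration: the check names, the `DEPENDS_ON` map, and the `alerts_to_page` helper are all hypothetical, and it isn't how any particular monitoring product does it. Each check names the one check it depends on, and a failing check only pages if nothing above it in the tree is also failing.

```python
from typing import Dict, Optional, Set

# Hypothetical check names. Each check maps to the single check it depends on
# (None means it sits at the top of the tree).
DEPENDS_ON: Dict[str, Optional[str]] = {
    "prod-network-link": None,
    "storage-server": "prod-network-link",
    "dns-server": "prod-network-link",
    "nfs-mount": "storage-server",
    "web01-http": "nfs-mount",
    "web02-http": "nfs-mount",
}


def has_failing_ancestor(
    check: str, failing: Set[str], depends_on: Dict[str, Optional[str]]
) -> bool:
    """Walk up the tree from `check`; True if any ancestor is also failing."""
    seen = {check}  # guards against an accidental loop in the config
    parent = depends_on.get(check)
    while parent is not None and parent not in seen:
        if parent in failing:
            return True
        seen.add(parent)
        parent = depends_on.get(parent)
    return False


def alerts_to_page(
    failing: Set[str], depends_on: Dict[str, Optional[str]]
) -> Set[str]:
    """Keep only failing checks whose ancestors are all healthy."""
    return {
        check
        for check in failing
        if not has_failing_ancestor(check, failing, depends_on)
    }
```

With the storage server down and everything beneath it failing too, `alerts_to_page({"storage-server", "nfs-mount", "web01-http", "web02-http"}, DEPENDS_ON)` comes back with just `{"storage-server"}`. The suppressed checks could still go to email or the dashboard; the sketch only decides what's worth a page.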

 

I know what I've done in the past, and I'm not that proud of it, but I'd like to hear your thoughts...

 

  • How do you make your alerting system work for you, rather than against you?
  • Do you implement dependency trees, or just loosely adjust which alerts page you, which go to email, and which only show up on the web dashboard?
  • If you use dependency trees, do you build the dependencies in from the start, or start by alerting on everything and work your way back from that?
  • What's the biggest wish you have for your current alerting system?

 

When you're done here, come see the next post... You do love oncall duty, don't you?


Reply to this post to get an entry in the March Ambassador Engagement contest. An iPod Nano sits in the balance!

