Are We Ready to Kill Thresholds?

obfuscurity.com

26 points by obfuscurity_ 12 years ago · 3 comments

Pewpewarrows 12 years ago

Forgive me if this is a dumb comment to make, as I'm just barely starting to get into monitoring and the statistics knowledge that goes along with it, but adaptive fault detection does tend to scare me a bit. If a problem isn't a spike and instead builds up gradually over hours, days, or weeks, I wouldn't be confident in something picking a dynamic threshold for me. I'd be afraid of it deeming the ever-rising resource usage normal behavior, if the rise happens slowly enough, and of not being alerted before it's too late (servers becoming unresponsive).

  • obfuscurity_ (OP) 12 years ago

    That's not at all a dumb comment. As I alluded to in the post, I think it's important that we understand how these systems determine what is - or isn't - an abnormality or fault. Unfortunately, that often means vendors revealing their "secret sauce" and risking their product differentiation. It's going to be interesting to see how these products earn our trust.

    • jonlives 12 years ago

      Absolutely - this is one of the reasons we open-sourced Kale: so that people can see what we consider an anomaly and adapt it for their own use cases if needed. If your anomaly detection contains secret sauce, it'll be very hard for people to have confidence in it.
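
To make the slow-ramp worry from the top of the thread concrete, here is a minimal sketch. It assumes the "dynamic threshold" is a rolling mean plus k standard deviations over the previous `window` samples; that is only one simple way to build an adaptive check, not how Kale or any particular product works. The function names, the window size, and the synthetic data are all hypothetical.

```python
from statistics import mean, pstdev


def adaptive_alerts(values, window=60, k=3.0):
    """Return indices whose value exceeds mean + k * stddev of the
    preceding `window` samples (an assumed, simplistic adaptive check)."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        threshold = mean(history) + k * pstdev(history)
        if values[i] > threshold:
            alerts.append(i)
    return alerts


def static_alerts(values, limit):
    """Return indices whose value exceeds a fixed, hand-picked limit."""
    return [i for i, v in enumerate(values) if v > limit]


def noise(i):
    """Deterministic +/-1 jitter so the example is reproducible."""
    return 1.0 if i % 2 == 0 else -1.0


# Hypothetical "leak": usage climbs 0.05 units per sample, from ~50 to ~100.
ramp = [50.0 + 0.05 * i + noise(i) for i in range(1000)]

# Hypothetical spike: flat usage around 50 with one sudden jump to 70.
spike = [50.0 + noise(i) for i in range(200)]
spike[100] = 70.0

print("adaptive on spike:", adaptive_alerts(spike))
print("adaptive on ramp: ", adaptive_alerts(ramp))
print("static on ramp:   ", len(static_alerts(ramp, limit=90.0)), "samples over limit")
```

With these made-up numbers the rolling check catches the spike but never fires on the ramp, because the baseline rises along with the data, while a plain fixed limit of 90 does eventually fire. Pairing an adaptive check with a hard ceiling, or alerting on the long-term slope, are possible ways to cover both failure modes.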
