Ask HN: How did you learn to debug production incidents?

1 point by binora 4 months ago · 3 comments


dhavalt 4 months ago

While debugging a crash or incident, I used to get lost in the specific line of code and try to fix symptoms. TLA+ gave me a new perspective on debugging, and I started treating system crashes as state problems rather than just code errors. I stopped asking 'Why is this line failing?' and started asking 'How did the system get into this state?'.

And whenever a fix or patch feels like duct tape to me, I start looking at the architecture and keep asking myself how it can be refactored to make it more resilient.

I realized my brain naturally wants to 'fill in the gaps' with assumptions. Learning to suppress that urge and force myself to verify what is actually happening rather than what I think is happening has been the most important part of my growth.

It's a continuous process; you never stop learning.

marco_z 4 months ago

Through tears and mud, how else?
