Even Small Configuration Changes Can Cause Bigger Problems

I've got a small app in production that had been running without a glitch until the other day.  The glitch turned out to be nothing more than the server not having enough disk space.  That should have been the end of it, but I also decided to add myself to the error emails.  The next day someone reported not getting their regular email, and I couldn't figure out why.

Why would something that had been working perfectly and had not been changed just fail randomly?  I looked everywhere in the code and could find no reason, and nothing was logged about the missing email.  There was a logged exception, but it was about my email, not the other one.  My email address was on another domain, and that wasn't allowed, but I just shrugged it off.  Then it happened again today, and all of a sudden I realized what should have been very obvious.

That small configuration change I had made to send myself error emails was the actual problem!  My code sent the error emails first and the routine emails after them, all inside one try/catch.  So when I added myself to the error emails, the exception thrown for my address skipped every send after it, including the routine emails.  This was now easy to fix: put each email send in its own try/catch instead of sharing one.
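
Here is a rough sketch of that fix in Python.  It is not the app's actual code: `send_all`, `fake_send`, and the addresses are made up for illustration, and `fake_send` only imitates a mail server that rejects addresses on other domains.  The point is just that each send gets its own try/except, so one rejected recipient is logged and the loop moves on.

```python
import logging

logger = logging.getLogger("mailer")


def send_all(messages, send):
    """Send every message; a failure on one must not skip the rest.

    `messages` is a list of (recipient, body) pairs and `send` is any
    callable that delivers one message and raises on failure.  Both are
    hypothetical stand-ins for the real mail code.
    """
    failures = []
    for recipient, body in messages:
        try:
            send(recipient, body)
        except Exception:
            # Log the failure and keep going with the remaining emails.
            logger.exception("Failed to send email to %s", recipient)
            failures.append(recipient)
    return failures


# Illustration only: a fake sender that rejects other domains.
delivered = []


def fake_send(recipient, body):
    if not recipient.endswith("@example.com"):
        raise ValueError("recipient domain not allowed")
    delivered.append(recipient)


failed = send_all(
    [
        ("me@other.example", "error report"),    # error email, rejected
        ("user@example.com", "routine report"),  # routine email, still sent
    ],
    fake_send,
)
```

With a single try/except wrapped around the whole loop, the first rejection would have ended the run and the routine report would never have gone out.  Returning the list of failed recipients also makes it harder to shrug off a logged exception the way I did.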

And the lesson learned:  any change, even a tiny configuration change, can introduce problems.  Even bigger lesson:  when a problem occurs right after a small change, your change is the likely cause.  Both should really have been obvious, but I managed to convince myself otherwise.
