When Good Programs Go Spamming
Whether it’s unit testing or behavior-driven, I like testing, and I usually go overboard on it. I’m also a big stickler for handling exceptions. I just completed a fix to a very small program that turned out to be a huge spam engine because the original programmer had included neither tests nor exception handling. A large number of spam complaints were coming in, and this program was the final endpoint.
This program is in production, and as I evaluated the complaints and faults it really blew me away that spam was being sent at all. The program was written in C# and ran as a Windows service, so whenever it faulted, it was restarted a minute later. That’s all well and good as far as keeping it running after faults, but why was it sending spam?
All it had to do was query a database and grab a list of groups. For each group, it then queried the database to find all the emails to send. So what we had here were two loops: for each group, grab the associated emails, and for each email, send. If a fault in either of the two killed the program, it should not have sent the wrong email to a list. Right?
Well, in practice that is not what happened. Even when the email retrieval faulted, the program continued to run for the current group with the previous query’s emails, if only for a few emails before crashing. Those emails were the spam, and this is where the complaints came from.
Testing aside, this is why handling exceptions is extremely important to the sanity of our programs. If a fault occurs during the initial retrieval of the group list, it is safe to let the program crash, as the Windows service will restart it. Around the retrieval of the associated emails, however, I added a try/catch block. If the database returns the emails, processing continues as normal; otherwise the program logs the failure and uses “continue” to move on to the next group iteration. Simple as that!
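The shape of that fix can be sketched in a few lines. The original service was C#; what follows is a minimal Python stand-in, not the production code, and every name in it (`send_group_mail`, `fetch_groups`, `fetch_emails`, `send`, the sample groups and addresses) is an assumption made up for illustration.

```python
import logging

logger = logging.getLogger("mailer")  # hypothetical logger name


def send_group_mail(fetch_groups, fetch_emails, send):
    """Send each group's mail; skip any group whose address lookup fails."""
    # A failure here is deliberately not caught: without the group list there
    # is nothing safe to do, so the process exits and the service restarts it.
    groups = fetch_groups()

    skipped = []
    for group in groups:
        try:
            # Build a fresh list for every group so that nothing left over
            # from an earlier iteration can be reused by accident.
            emails = list(fetch_emails(group))
        except Exception:
            # Log the failure with its traceback and move on to the next group.
            logger.exception("Email lookup failed for group %r; skipping", group)
            skipped.append(group)
            continue

        for address in emails:
            send(group, address)
    return skipped


# Usage with in-memory fakes standing in for the database and the mail server.
def fake_fetch_emails(group):
    if group == "beta":
        raise TimeoutError("database timed out")
    return {"alpha": ["a1@example.com", "a2@example.com"],
            "gamma": ["g1@example.com"]}[group]


sent = []
skipped = send_group_mail(
    lambda: ["alpha", "beta", "gamma"],
    fake_fetch_emails,
    lambda group, address: sent.append((group, address)),
)
```

Catching a broad `Exception` keeps the sketch short; a real service would probably catch only the database errors it expects, so that programming mistakes still surface as crashes.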
What was the result of this seemingly simple fix? Not only did the cross-sending of emails to the wrong group list stop, but the program also no longer faulted out every time it ran.
While fault tolerance is testable (insert your method of choice), it really is a programming path in itself, and that is what I am showing here. A test could have said “given that the database timed out, the program should continue normally,” but the exception handling written to satisfy that test can change the entire logic or operation of the program. Proper exception planning, along with program operation planning, greatly enhances a program’s usefulness.
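As a rough illustration of what such a test might look like, here is a given/when/then sketch in Python rather than the service's C#. The function `process_groups` is a hypothetical stand-in for the main loop, and the group names and addresses are invented. The point is only that the "then" clause pins down what "continue normally" means: nothing goes out for the failed group, and every message reaches an address from its own group.

```python
def process_groups(groups, fetch_emails, send):
    """Hypothetical stand-in for the service's main loop."""
    for group in groups:
        try:
            emails = list(fetch_emails(group))
        except TimeoutError:
            continue  # skip this group and keep going with the next one
        for address in emails:
            send(group, address)


def test_timeout_skips_group_and_continues():
    # Given: the address lookup for "beta" times out.
    addresses = {"alpha": ["a@example.com"], "gamma": ["g@example.com"]}

    def fetch_emails(group):
        if group not in addresses:
            raise TimeoutError("database timed out")
        return addresses[group]

    sent = []

    # When: the loop runs over all three groups.
    process_groups(["alpha", "beta", "gamma"], fetch_emails,
                   lambda group, address: sent.append((group, address)))

    # Then: nothing was sent for "beta", and every message went to an
    # address that belongs to its own group.
    assert sent == [("alpha", "a@example.com"), ("gamma", "g@example.com")]


test_timeout_skips_group_and_continues()
```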
I’m still in shock and awe that a simple program like this somehow continued to operate after crashing, albeit incorrectly. Oh well, it’s fixed now.

