I’m hanging my head in shame right now. A part of me doesn’t want to have to write this blog post, but the lesson is too important not to share. When I published version 0.3 of ConTabs 4 days ago, a big chunk of a headline new feature didn’t work. Wanna see how I messed up? Read on…
What broke?
The big new feature in v0.3 was the ability to specify how long strings should be handled. The old behaviour was that a column would simply grow as wide as needed, which often led to broken tables: long columns can push the overall width of the table beyond what the console can accommodate.
The old “do nothing” approach is still the default, but a couple of alternative behaviours were added:
- Truncation
- Wrapping
The former worked fine: strings were reliably truncated to the specified width, optionally with an ellipsis. This can be seen in my end-of-year review post, or in the screenshot below.
The wrapping behaviour, on the other hand, was only half-baked:
Notice how the long strings are wrapped, but the newlines break the whole row rather than wrapping within the cell. Clearly not the desired outcome.
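For reference, opting into these behaviours looks something like the sketch below. (The property and enum names here are my paraphrase rather than a verified reference, so check the ConTabs README for the exact API.)

```csharp
using System;
using System.Collections.Generic;
using ConTabs;

// Illustrative sketch only: LongStringBehaviour, Truncate and Wrap are
// paraphrased names, not a verified reference for the v0.3 API.
public class LongStringDemo
{
    public class Row
    {
        public string Name { get; set; }
        public string Description { get; set; }
    }

    public static void Main()
    {
        var data = new List<Row>
        {
            new Row { Name = "A", Description = "A description long enough to push the table past the console width" }
        };

        var table = Table<Row>.Create(data);

        // Truncate long strings to the column width (this worked in v0.3)
        table.Columns[1].LongStringBehaviour = LongStringBehaviour.Truncate;

        // Or wrap them within the cell (broken in v0.3, fixed in v0.3.1)
        // table.Columns[1].LongStringBehaviour = LongStringBehaviour.Wrap;

        Console.WriteLine(table.ToString());
    }
}
```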
How it happened
The first thing to say is that I didn’t break the build. The codebase at v0.3 compiles fine and all the tests passed. I know this because AppVeyor gave me a green light. Moreover, I know my tests are comprehensive because Coveralls reported that 100% of my code was covered.
So what went wrong? Put simply, I wasn’t testing the right things. My unit tests were too low-level and I didn’t do the level of manual testing that a new release deserves.
I did have a test to cover the method that applies the long string behaviour to the data in a cell. This test passed because this method was working as specified. Unfortunately, I didn’t have a test for the way the output of this method was used within the table.
This means I didn’t catch the fact that I’d failed to implement an important piece of the puzzle: I hadn’t made the OutputBuilder capable of handling multiline cells. D’oh!
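To illustrate, here’s a sketch of the kind of table-level test that would have caught it. It uses NUnit; the Row type and the assertion strategy are illustrative, not lifted from the real ConTabs test suite.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using ConTabs;
using NUnit.Framework;

// Sketch of a table-level test: render the whole table, not just a cell.
[TestFixture]
public class WrappingAtTableLevelTests
{
    public class Row
    {
        public string Name { get; set; }
        public string Description { get; set; }
    }

    [Test]
    public void WrappedCell_StaysInsideTheRow()
    {
        // Arrange: a cell long enough to need wrapping
        var data = new List<Row>
        {
            new Row { Name = "A", Description = "a string long enough to be wrapped onto several lines" }
        };
        var table = Table<Row>.Create(data);
        table.Columns[1].LongStringBehaviour = LongStringBehaviour.Wrap;

        // Act: render the full table output
        var lines = table.ToString()
            .Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);

        // Assert: if wrapping happened inside the cell, every rendered line
        // has the same width; a broken row would produce ragged lines
        Assert.That(lines.Select(l => l.Length).Distinct().Count(), Is.EqualTo(1));
    }
}
```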
OK, so my automated testing had failed to catch a defect. Not great, admittedly, but not the end of the world. I’d catch it with manual testing before publishing, right? Wrong! Somehow, I failed to realise that all my manual tests had used the truncation behaviour rather than the wrapping behaviour. In fact, the first time I saw it broken was when I was using the NuGet package in a demo. I nearly died with embarrassment.
All fixed now
ConTabs users will be pleased to hear that this is fixed in v0.3.1, which is already available on NuGet. As you can see below, the new version works as you’d expect:
Lessons learnt
There are two lessons here. Firstly, unit tests only test what they’ve been written to test: 100% coverage and a full set of passing tests is great, but it doesn’t necessarily mean that the code behaves as you expect. Secondly, it’s important to be thorough and methodical when verifying the system as a whole.
What’s daft is that we do this really well where I work. We consciously follow a “testing V” and plan our testing as we plan our projects and write specs. If I’d been as methodical with my side project as I am at work, I’m pretty sure this wouldn’t have slipped through.
As the ConTabs project grows, the approach to testing will have to become a lot more grown up. As more and more people consume the NuGet package, the potential impact of this sort of mistake grows.
With that in mind, I suppose I’m grateful that I botched this release. Perhaps the embarrassment at this stage will mean I’m less likely to screw up in front of a bigger audience later on.
Comments

Hi Tom, thanks for sharing. You may find approval (also known as snapshot or golden) testing a useful technique: http://approvaltests.sourceforge.net/
Basically, instead of building up your expected string and comparing against it, you’d have a single assertion:
```csharp
Approvals.Verify(tableObj.ToString());
```
The framework produces a plaintext file containing the table and runs a diff against the last ‘known good’ output. If the diff is empty, the test passes; if not, the test fails.
I think this is great for your particular case because it lets you quickly set up different attributes in your table objects and test their outputs trivially, without having to write the expected outputs by hand.
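To make the suggestion concrete, here’s a rough sketch of how such a test might look with ApprovalTests.Net and NUnit. The table setup is illustrative; only Approvals.Verify and the reporter attribute are the library’s actual API.

```csharp
using System.Collections.Generic;
using ApprovalTests;
using ApprovalTests.Reporters;
using ConTabs;
using NUnit.Framework;

// On the first run, ApprovalTests writes a *.received.txt file; once you
// bless it as *.approved.txt, later runs diff the rendered table against
// that known-good snapshot. The table setup below is illustrative.
[TestFixture]
[UseReporter(typeof(DiffReporter))]
public class TableApprovalTests
{
    public class Row
    {
        public string Name { get; set; }
    }

    [Test]
    public void TableOutput_MatchesApprovedSnapshot()
    {
        var table = Table<Row>.Create(new List<Row> { new Row { Name = "A" } });

        // One assertion covers the entire rendered output
        Approvals.Verify(table.ToString());
    }
}
```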
Have you considered using BDD to define and automate your requirements?