chalakanth

Early error detection is paramount

Is early error detection, while developing software, really all that important?

This question has probably been asked and answered a thousand times over in the last 50 years.  Let me try to rehash the argument in my own words.

Catch the error while the cement is wet

Construction, of the brick and mortar variety, sometimes provides an apt analogy for software development, so I am going to give that a whirl.

See if you can correlate the story I lay out below, to the various stages in a software development task – DEV, QA, Bug Fixing, and the Aftermath.

Say you decide to build a house. You create some kind of design, and completely construct your house.

After the construction is all finished, and only after this, you call in an inspector to see if the construction is up to code. The inspector finds that your electrical wiring is the wrong gauge.  You must change it.

The wiring is inside the walls.  You have to tear into the walls to get to the wiring.

You have already spent a lot of money and time on the construction.  You want to move in already.  The extra expense is a burden on the pocket book, and the mind.  You are not at your patient, nor enthusiastic best.

The builder had other projects scheduled.  He wants to be done with your house yesterday. He can no longer giving his best attention to your problems.

Some of the folks that worked on this house are needed on the county commissioner’s lake side cottage. The builder brings in some temporary help to make the fixes. These are snot-nosed college kids, who don’t know many of the small, but consequential technical decisions that went into your house’s construction.  They are going to trip over these and make mistakes. Worse, these kids know they will never see you and your house after this summer.

After the wiring is changed, you notice that a couple of windows do not close well. The lights in the stairwell flicker randomly, but noticeably.  The re-painting of the walls in your guest room is not the right shade.  By this time you have no other place to live, so you suck it up, and move in.

After you move in you notice that your water heater does not work well with the new wiring.  You have to replace the water heater.   More aggravation; more time wasted; more expense; send in the plumber.

There is worse.  While monkeying with the walls and the wiring, a construction worker accidentally rammed a 100 pound sander into a load bearing beam.  It now has a crack it in.  No one notices.

Let’s see how this correlates to software development.

DEV

Say you decide to build a house. You create some kind of design, and completely construct your house.

Listen pilgrim, I just write code. I don’t test.

QA

After the construction is all finished, and only after this, you call in an inspector to see if the construction is up to code. The inspector finds that your electrical wiring is the wrong gauge. You must change it.

This happens all the time in enterprise software development. Developers do not test their code effectively. Infrastructure, which enables developers to adequately test the code they write, often does not exist. Everyone, including management, is happy to leave serious testing till after all of the code is turned in.

The bug fixing

The wiring is inside the walls.  You have to tear into the walls to get to the wiring.

The bug is buried somewhere deep in a few thousand lines of code that you blithely turned in. You go digging, make a change some place, with little knowledge of everything else that might now be affected by your change. It is hard to know, there is too much code, code that you don’t even remember exists. The bug fix is a risk.

You have already spent a lot of money and time on the construction.  You want to move in already.  The extra expense is a burden on the pocket book, and the mind.  You are not at your patient nor enthusiastic best.

The builder had other projects scheduled.  He wants to be done with your house yesterday. He is no longer giving his best attention to your problems.

I’ve seen this in just about every large project I have been part of. Developers use all of a sprint to write code and turn it in. Testing of this code happens in the next sprint, when both the stakeholders and the developers are also assigned to the development tasks scheduled for this second sprint. Nobody is able to bring their best selves to the bug fixing.

Some of the folks that worked on this house are needed on the county commissioner’s lake side cottage. The builder brings in some temporary help to make the fixes. These are snot-nosed college kids, who don’t know many of the small, but consequential technical decisions that went into your house’s construction.  This is going to trip them up, and they will make mistakes.   Worse, these kids know they will never see you nor your house after this summer.

You designed and wrote the original code. However bugs are assigned to someone else, who knows little of the business requirements, the design decisions that went into the solution, and the contours of the code base you created.

Sometimes this someone else is a consultant. And we know consultants can be a mixed blessing, don’t we?

The new and not so improved aftermath

After the wiring is changed, you notice that a couple of windows do not close well. The lights in the stairwell flicker randomly, but noticeably.  The re-painting of the walls in your guest room is not the right shade.  By this time you have no other place to live, so you suck it up, and move in.

A bug fix can fundamentally improve the solution you constructed. Or it may just be a jerry-rigged whatchamacallit that Rube Goldberg would look down his nose at. Often it is the latter, which gets you past today’s problem, and sows the seeds for several others.

But you have no choice. The show must go on.

After you move in you notice that your water heater does not work well with the new wiring.  You have to replace the water heater.   More aggravation; more time wasted; more expense; send in the plumber.

There is worse.  While monkeying with the walls and the wiring, a construction worker accidentally rammed a 100 pound sander into a load bearing beam.  It now has a crack it in.  No one notices.

Like I mentioned earlier, you often have no idea what damage you did while making your bug fix.

Lesson Learned

You want to catch the wiring issue in DEV:

  • Before a whole bunch of stuff was built around it
  • Before you build a whole bunch of stuff that depends on the error
  • While the construction crew was focused exclusively on this task
  • When the folks who made the error are available to rectify the error
  • When you have the least risk if you make another mistake
chalakanth

Who verifies the blueprint?

You have a business problem (sometimes called a ‘business requirement’).   Someone devises a solution.  You create a blueprint (sometimes called a ‘specification‘) of the solution.   Then you construct the solution that the blueprint specifies.  This workflow suggests that there are at least two things to verify.

  • Does my construction adhere to the blueprint?
  • Is the blueprint correct in the first place?

Here is an example.

Business Requirement

To determine the monies that we may have to refund a customer (say an auto-insurance holder), perform the following calculation.

The monies that the customer owes at the moment
minus
the monies that the customer has already paid us

If the customer has paid us more than she owes at the moment, the customer is due a refund.

However, we cannot count all of the payments that we have received from the customer.  There are rules that tell us which payments must be ignored while calculating refunds.  Here is one of them.

The ‘My Dog Died’ rule

Customers that are dis-enrolled because they failed to pay their premiums, can ask for and receive a sort of ‘grace period’, of usually 2 to 3 months, in which they can catch up (or perhaps even pay ahead).  If they hit all the payment targets within this period, the customer is re-instated.  Let’s call this the ‘My Dog Died, So Have Pity On Me‘ rule.

Specification of the solution – The Blueprint

So how will we satisfy the business requirement?  What are we going to build?

The brain trust (the business analyst, the architect, the DBA, the resident loudmouth), go off into their huddle, and produce these instructions for the construction crew (a developer, and a tester).

  • Using already known methods, determine if the customer is dis-enrolled because of failure to pay premiums.
  • Determine if the customer was granted a ‘My Dog Died‘ grace period, and if so how long that period is for.  In particular, look for the following attributes.
    • An attribute called ‘GriefStricken‘.  Its ‘Effective Date’ is the start of the ‘My Dog Died’ period.
    • An attribute called ‘GriefStrickenExpiration‘,  Its ‘Effective Date’ is the end of the ‘My Dog Died’ period.
  • When calculating refunds, ignore all payments received during the ‘My Dog Died’ period.

Verification

Does construction match blueprint

As I mentioned earlier, in my environment the construction crew consists of a developer, and a tester.

The developer writes computer code that implements the blueprint.  The developer and the tester verify that the computer code does indeed do what the blueprint lays out.

They discover bugs – mismatches between the construction and the blueprint.  The developer fixes the bugs – removes the mismatches.  At some reasonable point, the construction crew turns in the solution.

But.

Who verifies the blueprint

Yea, you guessed it – the blueprint was wrong.

It turns out that the ‘My Dog Died’ period is stipulated to start on the day that the customer is dis-enrolled.

What we thought was the start date, the ‘EffectiveDate’ of the ‘GriefStricken’ attribute, is only the day on which the customer was approved for the ‘My Dog Died’ grace process, which is often several days after the dis-enrollment.

There was no way for the developer to know this.  The tester did not know this either.  The construction crew only knows what is in the specification of the solution.

Yet, this is a significant bug.

chalakanth

The Enterprise Programmer Blues

Others, like Bob Martin, have made the point, that coding, in fact, is writing.  So I was not surprised when I found that the venerable writing guide, “The Elements of Style“, which has been around for about a 100 years now, had something to say about computer programming.

Before beginning to compose something, gauge the nature and extent of the enterprise and work from a suitable design.  Design informs the simplest structure, whether of brick and steel or of prose. You raise a pup tent from one sort of vision, a cathedral from another.  This does not mean you must sit with a blueprint always in front of you, merely that you had best anticipate what you are getting into.  To compose a laundry list, you can work directly from a pile of soiled garments, ticking them off one by one. But to write a biography, you will need at least a rough scheme; you cannot plunge in blindly and start ticking off fact after fact about your subject, lest you miss the forest for the trees and there be no end to your labors.

I write computer code in the enterprise.  There have been no cathedrals in my past, much less a solid hut.   The only promise that the big bad enterprise makes to me is spaghetti, or with poetry now, ‘soiled garments’.  How many clumsy hands have roughed you up, you poor code?

Ever wonder what ‘analysis’ is, which is merely the door that leads to ‘design’?  I haven’t heard a more melancholy answer than this, “… lest you miss the forest for the trees and there be no end to your labors“.   I’ve labored, Lord, how I’ve labored.   

In enterprise programming, very little is complex.  Dissect a bug, and 9 times out of 10, you will find, to borrow the Captain’s words from Cool Hand Luke, “a failure to communicate”.   Bugs have made me laugh. They have made me want to pull my hair out in frustration, and angry enough to challenge the miscreant to a duel. But I have never felt like crying, till I saw “16. Be Clear”.

Muddiness is not merely a disturber of prose, it is also a destroyer of life, of hope: death on the highway caused by a badly worded sign, heartbreak among lovers caused by a misplaced phrase in a well-intentioned letter, anguish of a traveller expecting to be met at a railroad station and not being met because of a slipshod telegram.  Think of the tragedies that are rooted in ambiguity, and be clear! When you say something, make sure you have said it.  The chances of your having said it are only fair.

Lovely.