
Nailing Jell-O to a Wall

Saving Software from the Slippery Slope

Oddly, the engineering director didn’t seem as impressed as he should have been.  Perhaps he couldn’t see how well the software project was going already?  Only days into the project and it looked like it was already 80% complete or so.  The next six months would be a cake-walk.  The team obviously would finish all the required functionality as well as Marketing’s “nice to have” list.  The team leader started wondering what extra goodies the engineers would be able to slip in during their spare time.

A month later, things were still looking great.  Not quite as great as before, of course.  The original UI had to be scrapped because it wasn’t implemented on the right framework.  Also, most of the feature implementations had been just stubs initially, and now coding full and robust implementations of them was taking considerably more time.  That was all expected, though.  With a month to go before functionality freeze and alpha start, the project was well on track.  There was really nothing new visible in a demo, so the engineering director was left out of the loop at this juncture.  The manager decided to wait until next month to bring him in again – at the alpha milestone.

With two days to go before alpha, the manager was feeling the frenzied pace.  Marketing had come in with a couple more feature requests, and some of the less important features had been tabled while the important bugs were fixed.  The team had been bringing a lot of new capabilities online, and a number of serious bugs had crept in that had to be fixed before more features could be added.  The manager rationalized that it was better to get the system more stable before alpha and postpone the final few features until after functionality freeze.  He reasoned that the alpha customers would need to be able to do some kind of useful work with the software.  It wouldn’t help much to have each and every feature coded if most of the important ones just crashed.

Evidently, alpha test had sneaked up on marketing, so they didn’t have any actual customers lined up for the event.  Instead, they compromised by bringing in a few applications engineers for a “preview and bash” session.  The manager had gone into the lab the day before and installed the alpha version of the software on all the machines.  It was a nightmare, because the real installation packages weren’t written yet.  Directories had to be manually created and libraries hand-loaded into place.  The install was done at midnight and the software was still not working perfectly on all the machines.

After the bloodbath of the alpha bash, the development team rolled up their sleeves and got down to business.  The applications engineers hated the organization of the user interface.  Most of them hit major crashes and bugs, and none had successfully completed the exercise they brought with them.  They each had envisioned different paths through the system that didn’t match what the development team had implemented.  Over 100 bugs and feature requests had been filed, and major re-work was in order.  In other words, it was a pretty typical alpha test.

The next two months were made up of long hours, time away from family, pizzas slipped under the office door, and considerable stress among the developers.  QA had kicked into high gear, and the number of open bug reports had skyrocketed.  The AE manager had complained to the engineering director that the alpha bash was a waste of his people’s time and that the project was seriously off track.  Vacations were being cancelled, features were being scrapped, and the team was instructed to spend the time just “stabilizing the current system.”

Finally, and somewhat ironically, almost four months after the first “demo,” the system looked almost identical to what the team had shown the director four months earlier.  Of course, under the hood, there was a wealth of difference.  Most of the features were now fully implemented and working, and the UI was much more intuitive.  The database was now live and the error handlers were mapped to actual messages.  The persistent file storage was now working so user data could be saved and restored successfully – most of the time, anyway.  On the calendar, the project was supposed to be at code freeze and beta test start.  Marketing, however, wanted a four week delay.  They didn’t want to expose any customers to the state of the system today.  A “guru” technical marketing engineer was even appointed to provide oversight to the development team’s efforts.  Marketing no longer trusted engineering to develop autonomously.

The two months that had been planned for beta were quickly eaten up by a seemingly endless cycle of testing and debugging.  Instead of getting shorter, the bug list was growing, and the engineering team felt that many of the so-called bugs were “enhancement requests” disguised as bug reports.  Tensions between marketing and engineering grew.  The sales channel was angry because the product was now delayed.  Company executives were considering replacing the manager, but they didn’t want to further de-stabilize the project. 

Twelve months after the beginning of the “six month” development project, the first production version of the software was shipped.  It had only a meager subset of the originally intended functionality and bore little resemblance to the fantasy prototype that had been demonstrated almost an entire year earlier.  The development team was demoralized and decimated – losing several key members during the process. 

If this sounds like a typical software development project in your company, you are not alone.  Countless talented and well-intentioned software development teams repeatedly fall victim to the subtle but serious software engineering traps that snagged our heroes.  Fortunately, if you watch for the signs, you can steer clear of these colossal sneaker waves and keep your team safe on the beach instead of watching helplessly as they wash into the sea of endless debug cycles.

First, if you were to graph it, software follows one of the world’s strangest curves of “apparent completion” versus “actual completion.”  What does this mean?  Often, when software appears to be 90% complete, it is only 10% complete in reality.  For applications with a graphical user interface (GUI), it takes almost no time to prototype the GUI and create something that looks dangerously similar to the final product.  This demo typically sets unreasonable expectations in management and even sometimes lulls the development team themselves into a false sense of security. 

Second, despite the best intentions in the world, software specifications almost never define a usage model that works well.  The only reliable way to capture and refine the user experience with software is to iteratively expose realistic users to the system and make refinements in the process based on their feedback.  Of course, those users have to understand that they are acting as part of the development team and that they are refining software that is far from production ready.  Setting proper expectations among alpha testers is key.

Third, most teams are woefully naïve in the classification of their bug reports.  Typically, a single rating scale is used that goes from something like “Critical” (this bug must be fixed right now) to “Low” (this bug will never be fixed).  There is a tendency in such a system for severe inflation of the classifications as stress mounts on the team.  In some cases, teams even define new, higher levels of criticality on the fly – “Super-Critical” and “Mega-Super-Critical” come to mind.

Of course, the goal of a bug tracking and classification system is to keep the development team working on the most important issues first, and to track and document the known issues that need to be addressed.  The problem is that with a new and complex piece of software, the list quickly outgrows the complexity-handling capability of the rating and tracking system.

Realistically, there are several distinct axes on which problems should be rated.  The first is the likelihood of a customer encountering the bug.  Some problems may occur every time a user fires up the software.  Others may happen only in extremely rare corner cases or in theoretical cases that may never occur at all in practice.  A second is the impact of that problem on a user of the system.  Some problems may be only cosmetic – a few pixels that aren’t aesthetically pleasing on the screen.  Others may be catastrophic – errors that could cause (in the case of safety-critical systems) loss of life or property damage, or, in more conventional applications, loss of critical data.  A third axis is the amount of engineering effort required to fix the bug.  There is generally no correlation between this assessment and the other two.  However, the effort required is certainly a big factor in scheduling the work in engineering.  A fourth factor to consider in assessing bug reports is the likelihood that the fix will destabilize some other aspect of the system.
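
To make that a little more concrete, those four axes map naturally onto a simple record in a bug database.  The sketch below is one possible shape, written in Python; the name BugReport and the 1-to-5 integer scales are illustrative assumptions, not something the article or any particular bug tracker prescribes.

```python
from dataclasses import dataclass

@dataclass
class BugReport:
    """One bug report, rated independently on the four axes described above."""
    title: str
    likelihood: int       # 1 = theoretical corner case ... 5 = hit on every run
    impact: int           # 1 = cosmetic blemish ... 5 = loss of critical data (or worse)
    fix_effort: int       # 1 = trivial change ... 5 = major re-work
    regression_risk: int  # 1 = isolated fix ... 5 = likely to destabilize other areas
```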

The actual “importance” of addressing a particular bug may be something like the product of the first two ratings: obviously, the most important bugs to fix are those that happen frequently and carry severe consequences.  The decision to go ahead with a fix, however, must also weigh the engineering effort required and the risk of destabilizing the rest of the system.  Educating an entire product team (even those marketing folks) on these factors can be extremely valuable for keeping a project on track, for maintaining reasonable expectations across marketing, engineering, and management, and for avoiding irrational abuse of a single-score bug tracking system.
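
Building on the BugReport sketch above, one plausible way to turn those ratings into a working order is shown below.  Importance comes from likelihood and impact; effort and regression risk then break ties so the team burns down the cheap, safe fixes first.  The exact weighting is an assumption made for illustration; every team will tune this differently.

```python
def importance(bug: BugReport) -> int:
    # The bugs worth fixing first are the ones customers hit often and that
    # hurt the most when they do: frequency times severity.
    return bug.likelihood * bug.impact

def triage(bugs: list[BugReport]) -> list[BugReport]:
    # Highest importance first; among equally important bugs, prefer the ones
    # that are cheaper and safer to fix.
    return sorted(bugs, key=lambda b: (-importance(b), b.fix_effort + b.regression_risk))

worklist = triage([
    BugReport("Crash when saving user data", likelihood=5, impact=5, fix_effort=3, regression_risk=2),
    BugReport("Toolbar icon misaligned by two pixels", likelihood=4, impact=1, fix_effort=1, regression_risk=1),
])
```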

All of these techniques are really aimed at maintaining a realistic, shared understanding among everyone associated with the project of how much work has been done, how much remains, and what the final product will do when it is completed.  The difficulty of accurately estimating software product development effort and realistically tracking progress has caused the downfall of many intelligent, capable, and talented software teams.  Keeping expectations about those issues in line can often be more important for the careers of those involved than actual productivity.
