WaterFail

Just over a decade ago, 17 frustrated software engineers made a pilgrimage to the top of a mountain in Utah. (OK, they really just went to a resort for a boondoggle, but bear with us here). After several days of soul-searching, they descended from the mountain with a manifesto that would forever change the face of software development. The Agile Manifesto codified what coders had been thinking and saying for years: The waterfall development process – the prime directive for professional-grade software development for most of the history of software development itself – was badly broken.

The traditional waterfall development process does pretty much what the metaphor implies. While a near-infinite number of variations are used, waterfall development starts at the top with a documentation of customer requirements, drops through the creation of a functional specification for a system to meet those requirements, continues through the creation of development and testing and project plans that describe how a group of people will work together to realize the system described by that functional specification, and ends with design, verification, manufacturing, testing, and deployment of the production system.

Taken to the literal extreme, waterfall development would be completely linear. There would be no loops or iterations. Each step of the engineering process would be completed, reviewed, verified, and signed off before the next step commenced. Marketing would gain perfect insight into the needs of the target customer and would articulate those needs clearly and unambiguously in a requirements document. Engineering would respond with a functional specification for a system that met each and every one of those customer requirements and did so with robustness and elegance. Verification engineers would construct bomb-proof tests that would confirm that the target system was performing perfectly, and the realization of the design would – using the entire array of best practices – be delivered and verified on schedule ready for production and distribution.

Then we all woke up.

As systems become more complex (waterfall proponents claim), we must develop a set of best practices to assure that our product works correctly and does what our customers need it to do. Over the decades, the waterfall approach to engineering evolved, expanded, and exploded. At its root were the three steps: 1) Decide exactly what you’re going to build. 2) Build it. 3) Verify that you built what you intended. These humble three steps, of course, expanded into a chronic case of best-practice bloat.

Have you ever thought about all the things in life that you’re “supposed” to do? It is pretty much agreed that we should brush our teeth at least once per day. Aerobic exercise is considered good. Getting the proper number of hours of sleep is important, of course. Planning and eating a well-balanced and nutritious diet isn’t considered eccentric by most. Each of these practices takes a little time out of the day. As one evolves one’s daily routine to include more and more sophisticated versions of these best practices, an interesting effect kicks in. There is no longer enough time to do them all. If we actually took the time to plan and execute all of the things considered “best practices” for our lives, we’d have no time left for life. As a result, we are left in a state of constant guilt for failing to keep up with everything we “should be doing” without ever stopping to notice that complete and perfect conformance is physically and temporally impossible. Between recycling, balancing our checkbooks, sweeping the floors, flossing, drinking 8 glasses of water, cleaning the rain gutters, changing the oil, conditioning our hair, applying sunscreen (30 minutes BEFORE we go into the sun), and keeping a proper checklist showing us that we did all these things, how do we find time for life?

This same effect has invaded our engineering processes. By the time we do everything we are supposed to do in a professional-grade waterfall development flow, we have no time left for engineering. (You DID remember to update the PERT chart in your project plan this week, didn’t you?)

Waterfall broke down for software engineering. Agile proponents argued that software development was a creative process, and thus by its very nature defied rigor such as specification and scheduling. Perhaps the true culprit is not creativity but complexity. Modern software systems are the most complex things ever created by humans. Software is, in fact, almost pure complexity. When a system reaches a certain level of complexity, documents like functional specifications become so large and complex that our minds can no longer objectively evaluate them.

I once worked on a software project with about a dozen engineers. We were using a waterfall development process, so our first step (as an engineering team) was to develop a functional specification based on the marketing requirements document (which had been reviewed, signed, sealed, and delivered previously). As we developed the 1600-page functional specification, we began to see hints that we had lost control. The market requirements themselves turned out to be rife with contradictions and conflicts. None of us caught these inconsistencies during the extensive review process, of course. To make matters worse, we had all “signed off” on the market requirements document – setting ourselves up for at least shared organizational blame when problems were subsequently found.

The problems did not stop with the requirements translation step. After an agonizing specification review and approval process, we began developing code. Almost immediately, the coding process turned up inconsistencies and problems with major aspects of the functional specification. Page 124 of the spec said the system should do X, but page 1024 assumed that the system was instead doing Y. Even the most preliminary experimentation with a prototype showed that vast sections of the specification described a system that would be completely unusable, despite the fact that the specification had received the “Good to Go” sign-off from a large team of verified experts.

Our system was simply too complex for a strict application of the waterfall model. We needed an iterative process that would allow human interaction with the system as it evolved and course corrections based on the discoveries made during the development and prototyping process. Our solution was something akin to a modern agile development process, although the term had not yet been coined (this was the 1980s).

We like to think of hardware design as complex, and, with chips now containing billions of transistors, we can certainly make a good argument for complexity. However, even at today’s scale, hardware designs are at least an order of magnitude behind software in complexity. Software development hit the wall over a decade ago, and it has adapted by shifting away from the waterfall model toward something more like an agile development process. Today, I’d argue, hardware can’t be far behind. Our hardware development projects are already showing the signs of failure from excessive complexity, with teams and schedules spinning out of control.

When our project falls victim to these effects, the natural instinct is to hunker down and waterfall harder. Project managers push for even more detailed specifications, and management intervenes with new and impressive levels of sign-off requirements. If everybody signs off on each step (the reasoning goes), then we will have a perfect audit trail of accountability when things go off track. We’ll know whom to fire.

Of course, the problem is not accountability. The problem is that nobody in the entire organization (or perhaps on the planet) is intelligent enough to hold the entire concept of a complex product in their head and spot all the inconsistencies and issues. We can review and edit and approve and sign off until the cows come home, but nothing short of allowing actual users to interact with a prototype of the product will shake out all of the bugs and busts in a new, complex, electronic product. Until we have the technology to formally verify our design against the “Do what I want” property, some aspect of successful design must be an iterative loop involving the user.

One of the main ideas with agile is to create a tight, rapid, feedback and iteration loop. Prototypes are built, tested, and evaluated – and ideas for improvement are fed back and integrated into future iterations. Hardware, however, does not adapt as easily to this iterative design and implementation style as software. Software has only the compiler between the code and the reality. Make a change, run the compiler, and you’re ready to test a new prototype. In hardware, particularly if your process involves design of a complex SoC, turning the iteration crank could require months and millions each time.

One of the least-touted benefits of programmable logic technology is the enablement of agile development practices for hardware. Even with designs where FPGAs are not planned as part of the final product, FPGA-based prototypes are used to facilitate rapid iteration on the functionality of the hardware portion of the design. Whether we admit we are doing agile development or not, use of FPGA-based prototyping is an agile-like behavior.

Increasingly, our systems are highly complex mixtures of software and hardware. Agile methods from the software side of our projects will inevitably bleed over into the hardware side and will take a seat in the overarching system-level design flow. As our hardware platforms become ever-more-standardized SoCs, this system-level agility will be enabled even further.

As with many technical problems, the human and cultural side of this design flow transformation will be the slowest and most difficult to manage. The true value of a revolutionary process change cannot be realized by top-down mandate to the masses. Engineers have to first understand and embrace the concepts, and then adapt their work practices and culture to accommodate them, before the big administrative switch can be thrown.

4 thoughts on “WaterFail”

kevin says:

July 10, 2012 at 2:36 pm

Is some variant of agile development practical for hardware? Does waterfall fail as system complexity rises? What is your experience?

jjn1ch0ls says:

July 11, 2012 at 10:36 am

This is an interesting and timely article. With the increasing adoption of Agile methodologies for Software development, I have wondered when the topic of Agile would be discussed by Hardware engineers. We have been using Agile-type methodologies for FPGA development for many years, and focusing on Scrum recently. We have found that FPGA development with its re-programmable devices is a great fit for Agile in many FPGA design and verification projects. The ability to define, design, test, build, and release product to customers in short periods allows both our customers and ourselves to work with the product much earlier than with a waterfall methodology. In almost every project, our customers have decided the original requirements are not exactly what they want, and so they modify the requirements for future releases. If we had used a waterfall methodology and not delivered product until it was completely finished, the customer would not have found the need to change the product requirements until much later and at a much higher cost. There are some Hardware projects which are not a good fit for Agile development, and there are definitely challenges to using Agile, but the variety of Agile methodologies (including XP, Scrum, Crystal, Lean Development, FDD, and others) means there is often a methodology that would work for a specific FPGA development project and team, and will decrease schedule and budget.

Great article Kevin, and thanks for starting this discussion.

Jeff

studleylee says:

July 11, 2012 at 1:22 pm

I agree: FPGAs have allowed a more Agile, quick-turn development cycle for hardware designs. One does have to keep versatility in mind when hanging locked-function hardware off the FPGA, but that’s just a mindset to have prior to layout. You can then have the same reviews and feedback discussions for HDLs as for software design (in a parallelism sense).
We live in interesting times for electronics. At 48, I’ve played with everything from vacuum tubes to SystemC.
I look forward to seeing if more analog-like functions (PSoC?) will be integrated in a “sea-of-analog-units” once materials sciences evolve it. I loved that delta-sigma A/D and audio subsystem a guy implemented using the LVDS inputs on an FPGA. Great thinking! Thanks for clearing up what ‘Agile’ is in this sense. It was a keyword that I had not given weight to, thinking it was just another fad.
-Lee Studley

yaronb@ethernitynet.com says:

July 12, 2012 at 10:35 am

Hi Jeff,
What you are doing sounds exciting! We are designing switches on FPGA. We have been looking for an Agile methodology for a long time and have encountered many implementation issues. How long are your iterations? Does an iteration always end with an FPGA version in the lab, or can it be a simulation of a specific feature? How do your customers feel about your agile process, which is not common in HW?

Yaron
