feature article
Subscribe Now

Robot Penguins Zap Cyber-Threat!

University Research Reveals Animal-Based Coding Practices

“It’s practically impossible to look at a penguin and feel angry.” – Joe Moore

The headline was straight out of click-bait heaven. “Hungry penguins keep self-driving cars safe from hackers.” It had everything: cute animals, cyber-threats, and scary autonomous vehicles. Throw in a few UFOs and a long-dead mobster and you’d have the perfect Facebook meme.

The reality proved somewhat less sensationalistic, though more useful to actual programmers. A university research team led by Prof. Yiannis Papadopoulos had been exploring ways to make safety-critical software safer and more reliable, and they had found a useful example in… penguins.

Here’s the deal. Safety-critical code is often developed, designed, and regulated under strict guidelines that (we hope) help to prove the code’s reliability. We want to make sure that all possible failures have been anticipated and accounted for. That’s hard enough when you’re banging out “Hello, world!,” but it’s darn near impossible when you’ve got dozens of programmers working on millions of lines of code, all running on different pieces of hardware. How can you possibly prove that everything will work as expected? And, even if it does, how can you ensure that your system fails gracefully in the event of a component failure?

These aren’t hypothetical problems. Programmers working on cars and aircraft and other safety-critical systems deal with them all the time. Thus, we have safety frameworks like ISO 26262 and others. They don’t tell us exactly how to write our code, but they do help to cut the problem into bite-sized pieces so that we can sensibly allocate risk. The standards allow us to ask questions like, “If this piece over here fails, how bad is the outcome?” or “How many things have to fail simultaneously for the system to crash?”

The concept is simple enough – divide and conquer – but the implementation can be devilishly difficult. It doesn’t take a very big program before the number of potential failure points (i.e., software subroutines) skyrockets. Exploring each and every potential weak spot becomes a challenge all by itself – never mind fixing the problems your analysis uncovers.

ISO 26262 and other standards employ the concept of safety integrity levels (SILs). Basically, every component of your system (either a software “component” or a physical hardware component) is assigned a level of importance, from SIL 1 through SIL 4. The failure of an SIL 1 component would be a nuisance, but not catastrophic, while an SIL 4 failure would be life-threatening. Again, simple enough, at least in concept.

But that’s just for single-point failures. Plenty of real-world problems arise only when two or more components fail simultaneously. It’s the kind of “Come on; you’ve got to be kidding!” failure that we often can’t predict. You get a divide-by-zero error at exactly midnight when the power supply glitches and a black cat breaks a mirror under a ladder. That kind of once-in-a-lifetime failure that’s impossible to anticipate. Now the problem gets thorny.

Assigning the risk and predicting the relative likelihood of multiple-point failures is tricky, but, again, ISO 26262 comes to our rescue. It defines an algebraic method of combining the SILs from multiple components to come up with a composite SIL for interrelated groups. Is a failure of Component A more serious than simultaneous failures of Components B and C? And which is more likely to happen? Now you’ve got some calculus on your hands.

Normally, you’d do an exhaustive and recursive analysis through all the branches of the fault tree. It’s just math, right? We have computers for that. Okay, but that just gives you a depressingly large number that enumerates your problems. What you really want to know is how to minimize that number. Or, better yet, how to partition your system in a way that reduces the potential sources of failure. Does combining these two subsystems result in a safer system overall? And if so, is it worth the effort to do that? Decisions, decisions…

Cue the penguins.

A research team at the University of Hull (UK) and the University of Mohammed Cherif Messaadia (Algeria) wanted to find a way to reduce the bone-crushing complexity of searching the entire fault tree looking for optimal solutions. Real-world developers have given up on simple recursive traversals, since those methods don’t scale well with large numbers of nodes. Instead, so-called “meta-heuristic” optimizations like Genetic Algorithm and Tabu Search have cut analysis times by an order of magnitude. But maybe there’s an even better way?

Turns out, penguin colonies do much the same thing all the time. Penguins dive into the water in groups, searching for fish to eat. Maybe they’ll find fish; maybe they won’t. But, as a group, they’re quite efficient at either finding fish or moving on to better hunting grounds. They abandon fish-free areas quickly without wasting a lot of time before moving on to more abundant waters.

Individual penguins are also good at communicating their success (or failure) to other penguins, which helps to rapidly diffuse knowledge to the group. These characteristics make a good template for a weighted risk-assessment algorithm. No potential faults here? Move on quickly. Lots of potential faults over here? Keep digging, at least for a while, searching for even better “fish.”

Moreover, the penguins’ hunting strategy is qualitative, which maps nicely onto the SILs and the analysis of alternatives. The quality of the fishing isn’t binary; some areas have more fish than others, but some are also harder to exploit than others. If the fish are particularly deep or fast-swimming, the penguins have to expend more precious oxygen to catch them. Less fecund waters might be better for the penguins if the fish there are easier to catch. Similarly, when two or more approaches to system partitioning produce equally safe and reliable solutions, one might be cheaper/easier to implement because it involves fewer components.

The resulting algorithm is called Penguins Search Optimization Algorithm (PeSOA).

To test their hypothesis, the research team used an already-available model of an electric car’s hybrid braking system to compare their simulated penguins to popular alternative methods.

(It’s “hybrid” because the simulated car combines conventional mechanical (friction) brakes with electric motors. Normally, the motors power the car, but, when the car is braking, the motors reverse their function, becoming generators that create mechanical resistance (drag) on the wheels, while also conveniently generating a bit of electricity that can be put back into the car’s battery. Why use friction brakes at all and not just the motor/generators exclusively? Two reasons. First, the generators are nearly useless if the battery is already fully charged; and second, the electrical drag isn’t enough to stop a heavy vehicle from high speed. Electric braking is fine for gradual deceleration but not for panic stops.)

This was a test of simple brake function, not exotic auto-navigation or collision avoidance. You just press the brake pedal, the car slows. Simple, right?

Not at all. The researchers found that in the simulated braking system that they and previous researchers had used, there were 60 potential component failures that could cause a fault, most of them serious. Those might represent broken wires, loose connectors, faulty semiconductors, buggy code, and so on.

Furthermore, a fault-tree analysis revealed 11,573 “cut sets,” meaning the number of different ways that two or more components could fail simultaneously to create a fault, even when a single failure wouldn’t.

The bad news? The number of potential re-factorings (that is, the number of different ways the system could be partitioned to deliver the same functionality) meant that something in the neighborhood of 1041 different solutions had to be explored. A recursive search was clearly not feasible.

The good news? The penguins triumphed. The table below shows how the four different algorithms fared using four different weighting methods, measured in relative CPU execution time.



Genetic Algorithm

Tabu Search






















PeSOA was generally about an order of magnitude faster than either the GA or TS algorithms, which were, in turn far faster than a simple recursive minimum-cost search. Each row represents a different way of weighting equivalent solutions. For example, are two SIL 1 components better or worse than a single SIL 2 component, or is one SIL 3 equal to an SIL 1 plus an SIL 2? Different designers will apply different weights to these decisions based on local requirements.

Significantly, all three advanced algorithms (PeSOA, GA, and TS) did find optimal solutions. That is, you would end up with an unequivocal answer to the question of, “What is the best way to partition this complex system?” And you won’t have to go diving in icy water to find it. That ought to be worth a few fish.

Leave a Reply

featured blogs
Apr 22, 2021
President George H. W. Bush famously said that he didn't do "the vision thing". Well, here at Cadence we definitely do the vision thing. In fact, the Tensilica Vision DSP product line... [[ Click on the title to access the full blog on the Cadence Community si...
Apr 21, 2021
Robotics are everywhere, or so it seems. Robot combat, STEM outreach, robotic arms and advanced surgical devices are but a few applications that touch our everyday life. As a student, I was the President of the Santa Barbara City College Robotics Club and a FIRST Robotics com...
Apr 21, 2021
Introducing PrimeLib, an SoC design tool that maps the latest chip technologies & enables correct-by-construction design for SoCs at advanced process nodes. The post Advanced Nodes Drive Demand for Advanced Library Characterization and Validation Solutions appeared firs...
Apr 20, 2021
By Sherif Hany and Abdellah Bakhali Regardless of which technology node they're using, design houses… The post Give me my space! Why high voltage and multiple power domain designs need automated context-aware spacing checks appeared first on Design with Calibre....

featured video

The Verification World We Know is About to be Revolutionized

Sponsored by Cadence Design Systems

Designs and software are growing in complexity. With verification, you need the right tool at the right time. Cadence® Palladium® Z2 emulation and Protium™ X2 prototyping dynamic duo address challenges of advanced applications from mobile to consumer and hyperscale computing. With a seamlessly integrated flow, unified debug, common interfaces, and testbench content across the systems, the dynamic duo offers rapid design migration and testing from emulation to prototyping. See them in action.

Click here for more information

featured paper

From Chips to Ships, Solve Them All With HFSS

Sponsored by Ansys

There are virtually no limits to the design challenges that can be solved with Ansys HFSS and the new HFSS Mesh Fusion technology! Check out this blog to know what the latest innovation in HFSS 2021 can do for you.

Click here to read the blog post

featured chalk talk

Silicon Lifecycle Management (SLM)

Sponsored by Synopsys

Wouldn’t it be great if we could keep on analyzing our IC designs once they are in the field? After all, simulation and lab measurements can never tell the whole story of how devices will behave in real-world use. In this episode of Chalk Talk, Amelia Dalton chats with Randy Fish of Synopsys about gaining better insight into IC designs through the use of embedded monitors and sensors, and how we can enable a range of new optimizations throughout the lifecycle of our designs.

Click here for more information about Silicon Lifecycle Management Platform