
Robot Penguins Zap Cyber-Threat!

University Research Reveals Animal-Based Coding Practices

“It’s practically impossible to look at a penguin and feel angry.” – Joe Moore

The headline was straight out of click-bait heaven. “Hungry penguins keep self-driving cars safe from hackers.” It had everything: cute animals, cyber-threats, and scary autonomous vehicles. Throw in a few UFOs and a long-dead mobster and you’d have the perfect Facebook meme.

The reality proved somewhat less sensational, though more useful to actual programmers. A university research team led by Prof. Yiannis Papadopoulos had been exploring ways to make safety-critical software safer and more reliable, and they had found a useful example in… penguins.

Here’s the deal. Safety-critical code is typically designed, developed, and regulated under strict guidelines that (we hope) help to prove the code’s reliability. We want to make sure that all possible failures have been anticipated and accounted for. That’s hard enough when you’re banging out “Hello, world!” but it’s darn near impossible when you’ve got dozens of programmers working on millions of lines of code, all running on different pieces of hardware. How can you possibly prove that everything will work as expected? And, even if it does, how can you ensure that your system fails gracefully in the event of a component failure?

These aren’t hypothetical problems. Programmers working on cars, aircraft, and other safety-critical systems deal with them all the time. That’s why we have safety standards like ISO 26262. They don’t tell us exactly how to write our code, but they do help to cut the problem into bite-sized pieces so that we can sensibly allocate risk. The standards allow us to ask questions like, “If this piece over here fails, how bad is the outcome?” or “How many things have to fail simultaneously for the system to crash?”

The concept is simple enough – divide and conquer – but the implementation can be devilishly difficult. It doesn’t take a very big program before the number of potential failure points (i.e., software subroutines) skyrockets. Exploring each and every potential weak spot becomes a challenge all by itself – never mind fixing the problems your analysis uncovers.

ISO 26262 and related standards such as IEC 61508 employ the concept of safety integrity levels (SILs; ISO 26262’s automotive flavor is the ASIL). Basically, every component of your system (either a software “component” or a physical hardware component) is assigned a level of importance, from SIL 1 through SIL 4. The failure of an SIL 1 component would be a nuisance, but not catastrophic, while an SIL 4 failure would be life-threatening. Again, simple enough, at least in concept.

But that’s just for single-point failures. Plenty of real-world problems arise only when two or more components fail simultaneously. These are the “Come on; you’ve got to be kidding!” failures: a divide-by-zero error at exactly midnight, just as the power supply glitches and a black cat breaks a mirror under a ladder. That kind of once-in-a-lifetime coincidence is nearly impossible to anticipate. Now the problem gets thorny.

Assigning the risk and predicting the relative likelihood of multiple-point failures is tricky, but, again, ISO 26262 comes to our rescue. It defines an algebraic method of combining the SILs from multiple components to come up with a composite SIL for interrelated groups. Is a failure of Component A more serious than simultaneous failures of Components B and C? And which is more likely to happen? Now you’ve got some calculus on your hands.
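
To make that concrete, here’s a minimal sketch (in Python) of the kind of SIL algebra involved, assuming the commonly cited decomposition rule: failures that must all occur together to cause a hazard can share the burden, while any failure that causes the hazard on its own must carry the full requirement. The component names and numbers are invented for illustration; this is not the standard’s complete rule set.

```python
# A minimal sketch of the "SIL algebra" idea. Assumed rule: failures that
# must occur TOGETHER (an AND gate / cut set) may share the burden, so their
# SILs need only sum to the requirement, while alternative single failures
# (an OR gate) must each meet it alone. Names and numbers are made up.

def and_requirement_met(requirement, allocated_sils):
    """All of these failures are needed to cause the hazard,
    so the requirement may be 'decomposed' across the group."""
    return sum(allocated_sils) >= requirement

def or_requirement_met(requirement, allocated_sils):
    """Any one of these failures causes the hazard on its own,
    so every component must carry the full requirement."""
    return all(sil >= requirement for sil in allocated_sils)

# A SIL 3 hazard that occurs only if both a sensor AND its monitor fail:
print(and_requirement_met(3, [2, 1]))   # True  -> 2 + 1 covers SIL 3
print(and_requirement_met(3, [1, 1]))   # False -> under-allocated

# The same hazard reachable through either of two independent failures:
print(or_requirement_met(3, [3, 2]))    # False -> the SIL 2 path is too weak
```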

Normally, you’d do an exhaustive and recursive analysis through all the branches of the fault tree. It’s just math, right? We have computers for that. Okay, but that just gives you a depressingly large number that enumerates your problems. What you really want to know is how to minimize that number. Or, better yet, how to partition your system in a way that reduces the potential sources of failure. Does combining these two subsystems result in a safer system overall? And if so, is it worth the effort to do that? Decisions, decisions…
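
Here’s a toy brute-force version of that search, just to show the shape of the problem and why it blows up. The components, costs, and cut sets below are invented; a real tool would work from the actual fault tree.

```python
# Exhaustive SIL allocation for a made-up four-component system: try every
# assignment, keep the cheapest one that satisfies every cut-set constraint.
from itertools import product

components = ["sensor", "controller", "actuator", "monitor"]
cost_per_sil = {0: 0, 1: 10, 2: 20, 3: 40, 4: 80}   # assumed cost weights

# Each cut set lists components that must ALL fail to cause the hazard, plus
# the SIL the hazard demands; their SILs must sum to at least that requirement.
cut_sets = [
    (["sensor", "controller"], 3),
    (["actuator"], 2),
    (["sensor", "monitor"], 3),
]

best_cost, best_alloc = None, None
for sils in product(range(5), repeat=len(components)):   # 5**4 = 625 candidates
    alloc = dict(zip(components, sils))
    if all(sum(alloc[c] for c in cs) >= req for cs, req in cut_sets):
        cost = sum(cost_per_sil[s] for s in sils)
        if best_cost is None or cost < best_cost:
            best_cost, best_alloc = cost, alloc

print(best_cost, best_alloc)
# Fine for four components. At 60 components the space is 5**60 assignments,
# which is why nobody searches it exhaustively.
```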

Cue the penguins.

A research team at the University of Hull (UK) and the University of Mohammed Cherif Messaadia (Algeria) wanted to find a way to reduce the bone-crushing complexity of searching the entire fault tree looking for optimal solutions. Real-world developers have given up on simple recursive traversals, since those methods don’t scale well with large numbers of nodes. Instead, so-called “meta-heuristic” optimizations like the genetic algorithm (GA) and tabu search (TS) have cut analysis times by an order of magnitude. But maybe there’s an even better way?

Turns out, penguin colonies do much the same thing all the time. Penguins dive into the water in groups, searching for fish to eat. Maybe they’ll find fish; maybe they won’t. But, as a group, they’re quite efficient at either finding fish or moving on to better hunting grounds: they abandon fish-free areas quickly, without wasting time, and head for more abundant waters.

Individual penguins are also good at communicating their success (or failure) to other penguins, which helps to rapidly diffuse knowledge to the group. These characteristics make a good template for a weighted risk-assessment algorithm. No potential faults here? Move on quickly. Lots of potential faults over here? Keep digging, at least for a while, searching for even better “fish.”

Moreover, the penguins’ hunting strategy is qualitative, which maps nicely onto the SILs and the analysis of alternatives. The quality of the fishing isn’t binary; some areas have more fish than others, but some are also harder to exploit than others. If the fish are particularly deep or fast-swimming, the penguins have to expend more precious oxygen to catch them. Less fecund waters might be better for the penguins if the fish there are easier to catch. Similarly, when two or more approaches to system partitioning produce equally safe and reliable solutions, one might be cheaper/easier to implement because it involves fewer components.

The resulting algorithm is called the Penguins Search Optimization Algorithm (PeSOA).
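
To give a flavor of how that behavior turns into a search procedure, here’s a rough sketch: groups of candidate solutions “dive” (make local tweaks), each dive spends oxygen, and oxygen is re-allocated toward the groups reporting the best catches, so barren regions of the search space are abandoned quickly. This is an illustration of the idea only, not the researchers’ PeSOA implementation, and the objective function is a trivial stand-in.

```python
# Penguin-style search sketch (illustrative only, not the published PeSOA code).
import random

def penguin_search(objective, dims, groups=4, group_size=5, iterations=50):
    # Each group holds candidate SIL allocations (vectors of values 0..4).
    colonies = [[[random.randint(0, 4) for _ in range(dims)]
                 for _ in range(group_size)] for _ in range(groups)]
    oxygen = [group_size] * groups                # dives each group gets per round
    best = min((s for col in colonies for s in col), key=objective)

    for _ in range(iterations):
        catches = []
        for g, colony in enumerate(colonies):
            improvements = 0
            for _ in range(oxygen[g]):            # one dive = one local tweak
                s = random.choice(colony)
                trial = s[:]
                trial[random.randrange(dims)] = random.randint(0, 4)
                if objective(trial) < objective(s):
                    colony[colony.index(s)] = trial
                    improvements += 1
                    if objective(trial) < objective(best):
                        best = trial
            catches.append(improvements)
        # Share the news: groups that found "fish" get more oxygen next round.
        total = sum(catches) or 1
        oxygen = [max(1, round(groups * group_size * c / total)) for c in catches]
    return best

# Stand-in objective: minimize the total of the allocated SILs.
print(penguin_search(lambda s: sum(s), dims=10))
```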

To test their hypothesis, the research team used an already-available model of an electric car’s hybrid braking system to compare their simulated penguins to popular alternative methods.

(It’s “hybrid” because the simulated car combines conventional mechanical (friction) brakes with electric motors. Normally, the motors power the car, but, when the car is braking, the motors reverse their function, becoming generators that create mechanical resistance (drag) on the wheels, while also conveniently generating a bit of electricity that can be put back into the car’s battery. Why use friction brakes at all and not just the motor/generators exclusively? Two reasons. First, the generators are nearly useless if the battery is already fully charged; and second, the electrical drag isn’t enough to stop a heavy vehicle from high speed. Electric braking is fine for gradual deceleration but not for panic stops.)

This was a test of simple brake function, not exotic auto-navigation or collision avoidance. You press the brake pedal, and the car slows. Simple, right?

Not at all. In the simulated braking system that they and previous researchers had used, the team found 60 potential component failures that could cause a fault, most of them serious. Those might represent broken wires, loose connectors, faulty semiconductors, buggy code, and so on.

Furthermore, a fault-tree analysis revealed 11,573 “cut sets”: combinations of two or more components that, by failing simultaneously, would create a fault even though no single failure would.
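
If “cut set” sounds abstract, here’s a tiny, invented example (nothing to do with the braking-system model in the study) that enumerates the minimal combinations of basic failures that are jointly enough to trigger the top-level fault.

```python
# Enumerate minimal cut sets for a toy fault tree.
from itertools import combinations

basic_events = ["wire_break", "loose_connector", "bad_chip", "sw_bug"]

def top_event_occurs(failed):
    # Toy structure: the brakes fail if the chip fails together with either a
    # broken wire or a software bug, OR if both the connector and the wire fail.
    return ("bad_chip" in failed and ("wire_break" in failed or "sw_bug" in failed)) \
        or ("loose_connector" in failed and "wire_break" in failed)

cut_sets = []
for r in range(1, len(basic_events) + 1):         # smallest combinations first
    for combo in combinations(basic_events, r):
        failed = set(combo)
        if top_event_occurs(failed) and not any(cs <= failed for cs in cut_sets):
            cut_sets.append(failed)               # keep only minimal cut sets

print(cut_sets)
# Three minimal cut sets: chip+wire, chip+bug, and connector+wire. No single
# failure triggers the fault, but these pairs do.
```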

The bad news? The number of potential re-factorings (that is, the number of different ways the system could be partitioned to deliver the same functionality) meant that something in the neighborhood of 10^41 different solutions had to be explored. A recursive search was clearly not feasible.

The good news? The penguins triumphed. The table below shows how the four different algorithms fared using four different weighting methods, measured in relative CPU execution time.

Weighting          MinCost    Genetic Algorithm    Tabu Search    Penguins
Linear                 770              15.51             4.44        0.27
Experiential-I         830              14.63            14.70        0.59
Experiential-II        620               3.31             5.72        0.89
Logarithmic         51,150               3.04             1.91        0.32

PeSOA was generally about an order of magnitude faster than either the GA or TS algorithms, which were, in turn, far faster than a simple recursive minimum-cost search. Each row represents a different way of weighting equivalent solutions. For example, are two SIL 1 components better or worse than a single SIL 2 component, or is one SIL 3 equal to an SIL 1 plus an SIL 2? Different designers will apply different weights to these decisions based on local requirements.
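
To see why the weighting matters, here’s a small sketch comparing how different cost curves answer the “one SIL 3 versus a SIL 1 plus a SIL 2” question. The cost formulas and values below are invented for illustration; they are not the figures used in the study.

```python
# Compare invented weighting schemes for pricing a SIL allocation.
import math

weightings = {
    "linear":       lambda sil: sil,                      # equal steps per level
    "logarithmic":  lambda sil: math.log10(1 + 9 * sil),  # early levels cost relatively more
    "experiential": lambda sil: [0, 10, 20, 40, 50][sil], # assumed lookup table
}

def allocation_cost(sils, weight):
    return sum(weight(s) for s in sils)

for name, weight in weightings.items():
    split  = allocation_cost([1, 2], weight)   # decompose into SIL 1 + SIL 2
    single = allocation_cost([3], weight)      # keep a single SIL 3 component
    verdict = "splitting wins" if split < single else "a single SIL 3 wins (or ties)"
    print(f"{name:12}: SIL1+SIL2 = {split:.2f}   SIL3 = {single:.2f}   -> {verdict}")
```

Under these made-up numbers, the experiential curve favors the split while the logarithmic one favors the single component; the point is simply that the “best” partition depends on how you price each level.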

Significantly, all three advanced algorithms (PeSOA, GA, and TS) did find optimal solutions. That is, you end up with an unequivocal answer to the question, “What is the best way to partition this complex system?” And you don’t have to go diving in icy water to find it. That ought to be worth a few fish.
