
Embedded Vision

Systems are becoming more and more human. We are gradually endowing them with the ability to do some of what we do so that we no longer have to. A part of that involves giving them some version of the five senses. (If they get a sixth sense, we’re in trouble.)

Arguably the most complex of those senses is the ability to see. Actually, seeing is no big deal: add a camera and you’re there. It’s making sense out of what you see that’s so dang difficult. Work on vision technology has presumably gone under the radar for years in the spooks-and-spies world (either that or Hollywood has been lying to us, which couldn’t possibly be the case). The computation required has kept it on beefy machines even as it came out of the shadows and into plain view.

But now we want our little devices to be able to act like they have their own visual cortexes (not to be confused with ARM Cortexes, although one might be involved in the implementation of the other). Which means not just computation, but computation with performance and low power. In a small footprint. For a low price. No problem.

The topic of embedded vision is the explicit charter of the recently formed Embedded Vision Alliance, a group that held its first public conference in conjunction with Design East in Boston last month. Various players in this space, all members of the alliance, presented different aspects of the state of the art – discussions that largely highlighted challenges rather than ready solutions.

Many technology areas have their seminal moment or quasi-religious touchstone. For semiconductors it’s Moore’s Law. For touch and other sensor technologies (just to mention a few), it’s the iPhone. Embedded vision has its own such moment of revelation: the Kinect from Microsoft.

While the Kinect was preceded by numerous other sophisticated interactive gaming systems, it was the first to track players on a large scale using only vision – no accelerometers or other motion detectors. Not only did it bring embedded vision into the mainstream, it did so at a reasonable cost. And, once the system was hacked, it became a garage-level platform for experimentation.

So the Kinect is vision’s iPhone (with less salivating and swooning). And, just as chip presentations must reference Moore’s Law, and anything relating to phones has to reference the iPhone, the Kinect is the point of departure for folks in the Embedded Vision Alliance.

It would appear that vision technology can be bifurcated at a particular level of abstraction. Below that point are found relatively well-understood algorithms for such things as face recognition or edge detection. These algorithms are often compute-intensive – or, worse yet, memory-bandwidth-intensive. Not that there’s no more work to do here; presumably people will keep coming up with new ideas, but much of the work is in optimizing the performance of these algorithms on various hardware architectures.
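To make that cost concrete, here's a minimal, deliberately unoptimized sketch of one such building block – a 3x3 Sobel edge filter in plain C++. It's not from any particular product or presentation; it just shows why each output pixel costs a neighborhood of reads and a pile of multiply-accumulates, exactly the kind of regular arithmetic that people then race to optimize on various hardware.

#include <cstdint>
#include <cstdlib>

// 3x3 Sobel edge filter over an 8-bit grayscale image. Every output pixel
// reads 9 input pixels and does ~18 multiply-adds, which is why kernels
// like this stress both compute and memory bandwidth.
void sobel3x3(const uint8_t* src, uint8_t* dst, int width, int height)
{
    static const int gx[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
    static const int gy[3][3] = { {-1,-2,-1}, { 0, 0, 0}, { 1, 2, 1} };

    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int sx = 0, sy = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    int p = src[(y + ky) * width + (x + kx)];
                    sx += gx[ky + 1][kx + 1] * p;
                    sy += gy[ky + 1][kx + 1] * p;
                }
            }
            int mag = std::abs(sx) + std::abs(sy);  // cheap gradient-magnitude estimate
            dst[y * width + x] = (uint8_t)(mag > 255 ? 255 : mag);
        }
    }
}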

Above this level, things become a lot foggier. This is the realm of high-level interpretation. You’ve identified lots of edges in a frame: so what? What do they mean? This is where the world becomes much less strictly algorithmic and more heuristic. And it’s where a lot of original research takes place. Those two edges that intersect: are they part of the same structure? Is one in front of the other? Is one or both of them moving? Is the ambient light creating shadows that could be misinterpreted as objects in their own right?

While one could argue about where this point of separation between the algorithmic and the heuristic lies, a de-facto dividing line seems to have settled at a convenient place: the OpenCV library. This is a highish-level API and library of routines that takes care of the algorithmic bits that have reasonably solid implementations. Those routines then become the building blocks for the fuzzier code doing the high-level work.
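As an illustration – a minimal sketch against the standard OpenCV C++ API, with "frame.png" standing in for whatever the camera delivers – the building blocks chain together along these lines:

#include <opencv2/opencv.hpp>   // umbrella header for the OpenCV C++ API
#include <cstdio>
#include <vector>

int main()
{
    // "frame.png" is just a placeholder input for this sketch.
    cv::Mat gray = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) { std::fprintf(stderr, "couldn't read frame.png\n"); return 1; }

    // Algorithmic building blocks: smooth, find edges, group edges into contours.
    cv::Mat blurred, edges;
    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.5);
    cv::Canny(blurred, edges, 50, 150);

    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    // Everything above is well-trodden library code; deciding what these
    // contours *mean* is where the fuzzier, higher-level work begins.
    std::printf("found %zu contours\n", contours.size());
    return 0;
}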

While OpenCV forms a convenient rallying point, and while it abstracts away a lot of compute-intensive code, it's no panacea. The library was developed with desktop (or bigger) machines in mind. So, for instance, it requires a C++ compiler – something you're not likely to see much of in the creation of deeply-embedded systems. It was developed on the Intel architecture; adapting it to smaller or different embedded architectures will require significant optimization. And many routines rely on floating-point math, a capability missing in many embedded systems.
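The usual workaround on FPU-less targets is fixed-point arithmetic – something a porting effort has to retrofit by hand, since OpenCV won't do it for you. A minimal sketch of the idea in Q15 format (my own illustration, not OpenCV code):

#include <cstdint>
#include <cstdio>

// Q15 fixed point: a 16-bit integer represents value/32768, giving a
// fractional range of roughly [-1, 1) with no floating-point hardware needed.
typedef int16_t q15_t;

static inline q15_t  q15_from_double(double v) { return (q15_t)(v * 32768.0); }
static inline double q15_to_double(q15_t q)    { return q / 32768.0; }

// Multiply two Q15 values: the 32-bit product is Q30, so shift back by 15.
// The rounding constant (1 << 14) reduces truncation bias.
static inline q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (q15_t)((p + (1 << 14)) >> 15);
}

int main()
{
    q15_t a = q15_from_double(0.5);
    q15_t b = q15_from_double(0.25);
    std::printf("0.5 * 0.25 = %f\n", q15_to_double(q15_mul(a, b)));  // prints ~0.125
    return 0;
}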

One of the participating companies, Videantis, has taken OpenCV's role as a transition layer one step further: they've built hardware-acceleration IP that operates at the level of the OpenCV API. This lets them optimize the implementation of many of the OpenCV routines while letting designers write algorithm code against OpenCV that, in a manner of speaking, requires no porting.

While the guys in white coats work on the intelligent algorithms in one room, guys in greasy dungarees are off in the next room trying to figure out the best hardware for running these things. And numerous presentations pointed to the need for a heterogeneous architecture for the job. That means work can be distributed among a standard CPU, a highly parallel GPU, a highly efficient DSP, and an FPGA – or some partial combination thereof.
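To make the partitioning idea concrete, here's a skeletal C++ sketch. The offload functions are hypothetical stand-ins – in a real system they'd be whatever OpenCL, DSP-runtime, or FPGA-driver calls the platform actually provides – but the division of labor is the point:

#include <cstdint>
#include <vector>

struct Frame      { std::vector<uint8_t> pixels; int width; int height; };
struct Features   { /* edges, corners, descriptors... */ };
struct SceneModel { /* objects, motion, depth hypotheses... */ };

// Hypothetical offload hooks: placeholder stubs for whatever vendor
// mechanism the platform provides. Real implementations would dispatch
// to a GPU, DSP, or FPGA rather than return empty results.
Features   run_pixel_kernels_on_gpu(const Frame&) { return Features(); }
Features   refine_on_dsp(const Features&)         { return Features(); }
SceneModel interpret_on_cpu(const Features&)      { return SceneModel(); }

SceneModel process(const Frame& frame)
{
    // Regular per-pixel work (convolution, color conversion) maps well to a
    // GPU; streaming signal math suits a power-efficient DSP; the branchy,
    // heuristic interpretation stays on a conventional CPU.
    Features raw     = run_pixel_kernels_on_gpu(frame);
    Features refined = refine_on_dsp(raw);
    return interpret_on_cpu(refined);
}

int main() { Frame f; process(f); return 0; }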

The need to identify such an architecture is reflected in the Heterogeneous System Architecture (HSA) initiative. In fact, in a separate discussion, Imagination Technologies, one of the HSA founding companies, indicated that pure multicore has run its course – it doesn’t scale beyond the kinds of quad-core engines you see now. That doesn’t necessarily square with the many-core efforts underway, but it’s hard to argue that a single core replicated many times is the optimal answer for all problems. And if you take power into account – critical for embedded and mobile applications – then you pretty much have to tailor the engine to the problem for efficiency.

And power isn’t the only constraint. We saw vision subsystems packed into a cube 1” on a side. And then there’s the price question. The Kinect has thrown down the gauntlet here with a rough $80 bill of materials (BOM). That’s a high-volume, consumer-oriented device. The first of its kind. Which means that cost has one way to go: down. These days, systems are getting closer to $50, but the next generation of systems will need to target a BOM of $15-20 in order to achieve extremely high volumes.

All of which brings the promise of lots of machine eyes all around us. Watching our every move. Wait, I’m starting to freak myself out here – I’m sure they’ll be used only for good. And much of that could probably already be deployed by the Qs and James Bonds of the world if they wanted to. This will move the all-seeing capability out of the hands of only the government and into the corporate and consumer mainstream. OK, now I’m really freaking out. [Deep breaths… oooooooooommmmmm – wait, I guess that, by law, that would be ooooooohhhhhhhhmmmmm in Silicon Valley] As usual, promise and peril going hand-in-hand. Ample fodder for further discussion as events warrant.


More info:

Embedded Vision Alliance

HSA

