July 19, 2011

Embedding Vision

Creating Devices That See

by Kevin Morris

Remember sitting down at a DEC VT52 terminal?  The screen held 24 lines of text, at 80 characters each.  The font was built in.  The VT52 proudly boasted support for all 95 ASCII characters, including the desirable but somewhat superfluous lower case letters.  Some special graphics characters were available as well, but the terminal did not support graphics per se.  There was no mouse, no windows, and text editing was only marginally WYSIWYG - mostly using the "vi" text editor.

Today, it's hard to think of interacting with a computer, or even a smartphone, without a GUI and some sort of pointing device.  Even for those of us old enough to remember working on the VT52, trying to communicate with a machine exclusively via a keyboard would seem awkward and arcane at best.

Since most of our mobile and tablet devices have used touchscreen interfaces for a while, it is interesting to watch a young person poke expectantly at the screen of a desktop or laptop computer and then become confused when the machine ignores their input.  Once you've grown accustomed to a level of sophistication in human-machine interface, it's hard to go back.

As engineers, most of us know what the next steps are, and we know they're really difficult problems.  Our machines need to be able to see and hear us and to understand what they're seeing and hearing.  Voice interaction has been with us for a while now, but it really hasn't caught on in the mainstream.  The accuracy of voice recognition/understanding is still low, and the idea of a work or office environment with a sea of cubes - where everyone is talking aloud to their computers simultaneously - sounds a bit chaotic at best.  The problem here, of course, is that deriving meaning from spoken language is a much harder problem than simply recognizing words and phrases in an audio stream.  The secondary problem is that people seem to want private interactions with their devices - even in public places.  Spoken communication doesn't facilitate that very well.

Likewise, there has been a gigantic amount of research into machine vision.  Video has been commoditized to the point that only relatively inexpensive hardware is required to add video cameras, storage, and playback capability to an embedded device.  However, making a machine understand what is going on in that video stream is a significant challenge - one that researchers have been grappling with for decades.  The limiting factors for machine vision have always been computing power (massive amounts of computing power are required to run most machine vision algorithms in real time on a video stream) and the algorithms themselves.  While it's true that there is a vast repository of research available on machine vision algorithms, those algorithms tend to be tailored for very specific problems.  A number of sophisticated algorithms exist for facial recognition, for example, but those algorithms are different from those required for locating people in a scene, those required for understanding human gestures and movement, and so forth.  Since the algorithms are so specific to the type of information being extracted from the scene, creating fixed-hardware accelerators to solve the computing problem becomes impractical.  Instead, programmable hardware like FPGAs and/or huge amounts of parallelism in conventional processors (such as graphics processors) are required.
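
To put a rough number on that computing requirement, here is a back-of-the-envelope sketch.  The resolution, frame rate, filter size, and cost per tap below are illustrative assumptions, not figures from any particular vision system, and ops_per_second is a hypothetical helper written only for this estimate.

```python
def ops_per_second(width, height, fps, kernel_size, ops_per_tap=2):
    """Estimate arithmetic operations per second for one 2-D filter pass.

    Each output pixel needs kernel_size * kernel_size taps, and each tap is
    assumed to cost one multiply plus one add (ops_per_tap=2).
    """
    pixels_per_second = width * height * fps
    return pixels_per_second * kernel_size * kernel_size * ops_per_tap


# Assumed, illustrative stream: 640x480 grayscale video at 30 frames/s.
single_pass = ops_per_second(640, 480, 30, kernel_size=3)
print(f"one 3x3 filter pass: {single_pass / 1e6:.0f} million ops/s")
```

Under these assumptions, a single small filter pass already costs on the order of 166 million operations per second.  A practical pipeline would typically chain many such passes and then add the higher-level analysis on top, which is why the estimate grows so quickly.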

This year, however, machine vision went mass market with the introduction of the Kinect interface for Microsoft's Xbox 360.  In case you've been under a rock for the past year (which many of us working on complex engineering problems tend to be from time to time), Microsoft built a low-cost device that enables a video game console to get its input from watching the players move, rather than from a dedicated controller.  The system can locate people in the scene, interpret their gestures, and even recognize which individuals it is "seeing."  With the retail cost of the system being less than $150 USD, one can imagine that the bill of materials cost must be very low.  Granted, Kinect cheats a bit by borrowing some of the Xbox 360's massively parallel processing power to accomplish its magic, but even with that, Kinect sets a new bar for cost-effective machine vision.

Kinect has kicked off a virtual revolution in hacking, which apparently has been warmly welcomed by Microsoft.  There are websites and forums dedicated to sharing information on using and adapting the Kinect hardware for a huge variety of applications.  With the broad-based adoption of Kinect, the door to machine vision has been blown open, and the next few years should see remarkable progress in adding a sense of sight to our intelligent devices.

Unfortunately, adding machine vision to your next embedded design isn't as simple as dropping in WiFi or USB.  You can't just add a camera and a piece of machine vision IP to your embedded device and end up with a functional machine vision interface.  Vision, as we mentioned, is an incredibly complex problem that has already experienced decades of research, and the average - or even the far-above-average - electronic designer isn't going to just pick it up with some spare weekend reading.  To get our intelligent devices to see and understand the world around them, we're going to need some serious help.

Fortunately, a new group has been formed with the intent of doing just that.  The Embedded Vision Alliance was founded with the goal of "Inspiring and empowering engineers to design systems that see and understand."  Jeff Bier, President of Berkeley Design Technology (BDTi) and founder of the Embedded Vision Alliance, sees huge market potential for embedded vision applications in the near future in areas like consumer electronics, automotive, gaming, retail, medical, industrial, defense, and many others.  Embedded vision systems will be doing things like gesture-based control of devices, active driver safety and situational awareness, active digital signage, and point-of-sale transaction assistance - just to name a few.  

"The engineer who wants to add vision to his or her embedded design will have both great news and bad news," explains Bier.  "First, they will discover that there are hundreds of papers, books, and other resources with volumes of research on the topic.  Then, they will discover that the vast majority of that work is not particularly useful for real-world engineering applications.  Much of the material is heavily theoretical - books with 800 pages filled with multi-variable calculus - and very little of it is in a form that engineers could use, like block diagrams and code."  One of the goals of the Embedded Vision Alliance is to sift through that mountain of information and extract that which will be practically useful for adding vision to embedded designs.

The Embedded Vision Alliance already has over a dozen companies participating - from semiconductor suppliers to distributors to software companies - all of whom see a big future for embedded vision and who have products or technology that they feel will play a significant role in deployment of that capability.  The Alliance is already building a website (www.embedded-vision.com) with resources and community to assist engineers in development of embedded vision capabilities.  With efforts like this, the path to embedded vision will be far less treacherous.

Embedded vision is one of the most significant and exciting engineering challenges to come along in decades, and it will happen.  There will be a time when interacting with a machine that can't see you will seem as strange as trying to compute with a VT52 would today.  Once our intelligent devices gain a proper set of senses, a vast range of new applications and capabilities will emerge.  If we want to be part of that revolution, we'd better start catching up now.  If embedded vision were easy, everybody would already have it.

Comments:


weiwei2

Total Posts: 8
Joined: Dec 2010

Posted on August 07, 2011 at 1:12 PM

i do see some companies in my state is trying to do embedded vision. Myself is trying too