Computer Vision 101

Making Machines See the World Is Harder Than It Looks

“Vision is the art of seeing what is invisible to others.” — Jonathan Swift

What started out as an experiment has become an obsession. I wanted to see what happened to the dinosaurs.

I confess, I have (or had) two big concrete dinosaurs in my backyard. They’re awkward and massive and weigh about 200–300 pounds each. They look great hiding amongst the ferns, and they scare off rodents. And startle houseguests.

But one of them disappeared without a trace. Just like the real dinosaurs, he was apparently abducted by aliens who teleported him up to their spaceship for invasive examinations. What other explanation could there be? He’s far too heavy for one or two people to move unaided. There were no drag marks in the dirt and no broken pieces anywhere, so he wasn’t removed piecemeal. He just… vanished.

Time for some security cameras. If the other one disappears, I want to get this on video.

Buying and installing outdoor high-resolution cameras is easy. The hard part is training them to do something useful. As with a lot of projects, you think you're mostly finished when the hardware works, but it's really the software that eats up all the time.

The cameras themselves have rudimentary motion detection burned into their firmware, but it isn't very useful. So instead, they're configured to squirt raw video streams at about 1 Mbps to a dedicated PC running specialized camera software. There's an 8-core CPU with 16GB of RAM crunching away on the video, looking for alien dinosaur snatchers. This ought to make great YouTube fodder.

And it would, if you're into birds, clouds, spiderwebs, and light summer breezes. The cameras triggered on everything. They were overly sensitive to changes in light, irrelevant background motion, random wildlife flying/hopping/crawling past their field of view – you name it, they recorded it.

Fortunately, the camera software has lots of parameters you can tweak. It’s a fantasyland of virtual knobs and dials and switches you can adjust to fine-tune your own surveillance system. The NSA has nothing like this, I’m sure.  

Trouble is, cameras and computers see the world as flat, so they have lousy depth perception. It's all just pixels to them. A ten-pixel object up close is the same as a ten-pixel object at 50 paces; the software can't tell the difference between near and far, so it has no idea of relative sizes. A bug crawling on the camera lens looks yuuuge, not unlike a Sasquatch in the distance. How to tell them apart?
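For the arithmetic-minded: under a simple pinhole-camera model, an object's width in pixels is (focal length × real width ÷ distance). Here's a quick back-of-the-envelope sketch; the focal length and object sizes are made-up numbers for illustration, not measurements from any real camera:

```python
# Pinhole-camera arithmetic: why a bug on the lens can pass for Bigfoot.
# The focal length and object sizes here are illustrative assumptions.

def pixel_width(real_width_m: float, distance_m: float,
                focal_length_px: float = 1000.0) -> float:
    """Projected width in pixels under the pinhole model: w_px = f * W / Z."""
    return focal_length_px * real_width_m / distance_m

# A 5 mm bug at 5 cm and a 2 m Sasquatch at 20 m both project to ~100 px.
print(pixel_width(0.005, 0.05))  # bug on the lens   -> 100.0
print(pixel_width(2.0, 20.0))    # Sasquatch at 20 m -> 100.0
```

Same hundred pixels either way, which is exactly the camera's problem.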

Similarly, they're not good at shapes. The software can detect "blobs," in the sense that it knows when contiguous pixels all move and change together. It isn't fooled by dust or snow, for example, because those pixels are scattered all over the image. It can detect a person or a bird (or a spaceship?) because those pixels are clustered. So there's that. But you can't teach it to tell good shapes from bad ones. All it knows is the size of the blob (in pixels), which doesn't correlate to anything useful in the real world. A cardboard box is treated the same as a face. This is not ideal.
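For anyone playing along at home, here's roughly what that blob logic looks like, sketched in Python with OpenCV. The filenames and the 500-pixel area threshold are illustrative guesses on my part, not settings from any actual camera software:

```python
# Sketch of frame-differencing blob detection: scattered specks (dust,
# snow) become many tiny contours; a person or bird becomes one big one.
import cv2

prev = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frames
curr = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Mark pixels that changed between the two frames.
diff = cv2.absdiff(prev, curr)
_, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

# Group changed pixels into contiguous regions and keep the big ones.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    if cv2.contourArea(c) > 500:  # "blob size" is just pixel area
        x, y, w, h = cv2.boundingRect(c)
        print(f"blob at ({x},{y}), {w}x{h} px")
```

Note the catch: that area test is the entire "shape" model. A 600-pixel cardboard box and a 600-pixel face pass it identically.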

It understands color, but not the same way we do. In fact, color is all it understands: the relative light value of each pixel, on a scale from 0 to some large number (typically 0–255 per channel). Your job is to teach it what color(s) or light/dark combinations are worthy of notice. Again, harder than it sounds.
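As a concrete (if simplistic) sketch of what that teaching looks like, here's a common OpenCV idiom: keep only the pixels inside a chosen color range. The greenish HSV bounds below are my illustrative guesses, not calibrated values from any real setup:

```python
# Sketch: flag pixels whose color falls in a "worthy of notice" range.
import cv2
import numpy as np

frame = cv2.imread("snapshot.png")      # hypothetical frame; BGR, 0-255
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

lower = np.array([35, 40, 40])          # lower hue/sat/value bound
upper = np.array([85, 255, 255])        # upper bound (roughly "green")
mask = cv2.inRange(hsv, lower, upper)   # 255 wherever the color matches

print(cv2.countNonZero(mask), "pixels match the target color range")
```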

On the plus side, the cameras have remarkably good night vision. They may actually work better at night than during the daytime, because the cameras have their own infrared LEDs that emit illumination invisible to us, like a built-in spotlight. That lighting is more consistent and controlled than miserable ol' sunlight, which changes position and varies in brightness all day long.

It's also eerily good at spotting nocturnal wildlife. Mammalian eyes have a reflective layer (the tapetum lucidum) that produces that spooky glow-in-the-dark effect familiar from TV nature documentaries. The critters think they're being sneaky when they're actually all lit up. Clearly, our ancient animal ancestors didn't have flashlights, or this unfortunate characteristic would've evolved away eons ago.

A few months into this, I’ve collected enough video for a third-rate nature documentary. Want to see how grass blows in the wind? I’ve got hours of footage. Bugs? We got bugs of all shapes and sizes. Annoyingly, spiders seem attracted to the cameras’ IR LEDs and build webs across the lens. It’s giant 1950s B-movie monsters all over again, but in color.

The state of the art hasn't progressed much in the past few decades. We're still trying to get Von Neumann machines to "see" things for what they are. I worked for a robotics company that offered machine vision as an option. This consisted of black-and-white cameras that could tell whether a gear, gizmo, or gadget on a conveyor belt was out of place or misaligned. Even then, it took a bit of training and tweaking to make it reliable.

One large baking company used our robots to assemble sandwich cookies, and wanted the machines trained to deliberately misalign the two halves by 3 degrees, so that the cookies would look more “homemade.” In that case, the machines were capable of being too accurate. A rare exception.

Today, we have a big tool bag of new approaches to machine vision. Cognex, National Instruments, Sensory, Optotune, and others are all tackling this, although most use conventional hardware, with some ASICs or FPGAs thrown in. Google’s TensorFlow also makes an appearance, taking the “big data” approach to teaching your machine what’s what. There’s a one-day class coming up in October if you’re in the San Jose area.
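For the curious, here's a bare-bones sketch of what that "big data" approach looks like with TensorFlow's Keras API. Everything here is an assumption for illustration's sake: the hypothetical snapshots/ folder of labeled images (say, snapshots/dino and snapshots/other), and every hyperparameter:

```python
# Minimal sketch: learn "dino vs. not-dino" from labeled snapshots.
# Directory layout and hyperparameters are illustrative assumptions.
import tensorflow as tf

# Labels are inferred from the subdirectory names under snapshots/.
train = tf.keras.utils.image_dataset_from_directory(
    "snapshots", image_size=(128, 128), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),            # pixel values to 0-1
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # dino / not-dino
])

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train, epochs=5)
```

The point isn't this particular network; it's that instead of hand-tuning a fantasyland of knobs, you feed the thing thousands of labeled examples and let it find its own knobs.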

While you go do that, I'll go back to looking for aliens. Somehow, they've eluded me all this time. I've got tons of spider footage, though. Maybe the spiders are in on it…
