
Computer Vision 101

Making Machines See the World is Harder Than it Looks

“Vision is the art of seeing what is invisible to others.” — Jonathan Swift

What started out as an experiment has become an obsession. I wanted to see what happened to the dinosaurs.

I confess, I have (or had) two big concrete dinosaurs in my backyard. They’re awkward and massive and weigh about 200–300 pounds each. They look great hiding amongst the ferns, and they scare off rodents. And startle houseguests.

But one of them disappeared without a trace. Just like the real dinosaurs, he was apparently abducted by aliens who teleported him up to their spaceship for invasive examinations. What other explanation could there be? He’s far too heavy for one or two people to move unaided. There were no drag marks in the dirt and no broken pieces anywhere, so he wasn’t removed piecemeal. He just… vanished.

Time for some security cameras. If the other one disappears, I want to get this on video.

Buying and installing outdoor high-resolution cameras is easy. The hard part is training them to do something useful. As with a lot of projects, you think you’re mostly finished when the hardware works, but it’s really the software that takes up all the time.

The cameras themselves have rudimentary motion-detection features burned into their firmware, but they’re not very useful. So instead, the cameras are configured to squirt raw video streams at about 1 Mbps to a dedicated PC running specialized camera software. There’s an 8-core CPU with 16GB of RAM crunching away on the video, looking for alien dinosaur snatchers. This ought to make great YouTube fodder.
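
For a sense of scale, 1 Mbps per camera around the clock adds up quickly. A quick back-of-the-envelope calculation (plain arithmetic in Python, not part of any camera software):

```python
# Daily storage per camera for a continuous 1 Mbps video stream.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def gb_per_day(mbps: float) -> float:
    """Gigabytes of video per day for a stream of the given megabits/second."""
    return mbps * SECONDS_PER_DAY / 8 / 1000  # megabits -> megabytes -> gigabytes

print(gb_per_day(1.0))  # ~10.8 GB per camera, per day
```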

And it would, if you’re into birds, clouds, spiderwebs, and light summer breezes. The cameras triggered on everything. They were overly sensitive to changes in light, irrelevant background motion, random wildlife flying/hopping/crawling past their fields of view – you name it, they recorded it.

Fortunately, the camera software has lots of parameters you can tweak. It’s a fantasyland of virtual knobs and dials and switches you can adjust to fine-tune your own surveillance system. The NSA has nothing like this, I’m sure.  

Trouble is, cameras and computers see the world as flat, so they have lousy depth perception. It’s all just pixels to them. A ten-pixel object up close is the same as a ten-pixel object at 50 paces; they can’t tell the difference between near and far, so they have no idea of relative sizes. A bug crawling on the camera lens looks yuuuge, not unlike a Sasquatch in the distance. How to tell them apart?
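
To see how baked-in that ambiguity is, consider the simple pinhole-camera projection: apparent size scales as real size divided by distance. A minimal sketch (the 800-pixel focal length is an assumed round number, not a spec from my actual cameras):

```python
# Pinhole projection: an object's width on the image, in pixels, is
# (focal length in pixels) * (real width) / (distance). A close bug and a
# distant Sasquatch can produce exactly the same answer.
FOCAL_LENGTH_PX = 800  # assumed focal length, in pixel units

def apparent_width_px(width_m: float, distance_m: float) -> float:
    """Projected width of an object on the image, in pixels."""
    return FOCAL_LENGTH_PX * width_m / distance_m

print(apparent_width_px(0.005, 0.02))  # 5 mm bug at 2 cm      -> 200.0 px
print(apparent_width_px(2.0, 8.0))     # 2 m Sasquatch at 8 m  -> 200.0 px
```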

Similarly, they’re not good at shapes. The software can detect “blobs,” in the sense that it knows when contiguous pixels all move and change together. It isn’t fooled by dust or snow, for example, because those pixels are scattered all over the image. It can detect a person or a bird (or a spaceship?) because those pixels are clustered. So there’s that. But you can’t teach it to tell good shapes from bad ones. All it knows is the size of the blob (in pixels), which doesn’t correlate to anything useful in the real world. A cardboard box is treated the same as a face. This is not ideal.
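
For the curious, the blob idea boils down to connected-component labeling on a thresholded frame difference. Here’s a minimal sketch using OpenCV; the threshold and minimum-area values are illustrative guesses, not whatever my camera software actually runs:

```python
# A sketch of blob detection: difference two grayscale frames, threshold,
# then group contiguous changed pixels into connected components. Scattered
# specks (dust, snow) form tiny components; clustered motion forms big ones.
import cv2
import numpy as np

def find_blobs(prev_gray: np.ndarray, curr_gray: np.ndarray,
               thresh: int = 25, min_area: int = 50):
    """Return bounding boxes (x, y, w, h) of clustered motion."""
    diff = cv2.absdiff(prev_gray, curr_gray)            # per-pixel change
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    boxes = []
    for i in range(1, n):                               # label 0 = background
        x, y, w, h, area = stats[i]
        if area >= min_area:                            # ignore scattered noise
            boxes.append((x, y, w, h))
    return boxes
```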

It understands color, but not the same way we do. In fact, color is all it understands: the relative light value of each pixel, on a scale from 0 (black) to some maximum (typically 255 for 8-bit video). Your job is to teach it what color(s) or light/dark combinations are worthy of notice. Again, harder than it sounds.
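
As a concrete example, the usual way software collapses a color pixel into a single “how bright is this” number is a weighted sum of its channels. The weights below are the common ITU-R BT.601 convention:

```python
# A pixel is just three channel intensities; "brightness" is a weighted sum.
# Weights are the ITU-R BT.601 luma coefficients, a common convention.
def luminance(r: int, g: int, b: int) -> float:
    """Perceived brightness of an 8-bit RGB pixel, 0 (black) to 255 (white)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

print(luminance(255, 0, 0))  # pure red   -> ~76  (fairly dark, to a camera)
print(luminance(0, 255, 0))  # pure green -> ~150 (much brighter)
```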

On the plus side, the cameras have remarkably good night vision. They may actually work better at night than during the daytime, because they have their own infrared LEDs that emit illumination invisible to us, like a built-in spotlight. That lighting is far more consistent and controlled than miserable ol’ sunlight, which changes position and varies in brightness all day long.

It’s also eerily good at spotting nocturnal wildlife. Mammalian eyes reflect light back toward its source, producing that spooky glow-in-the-dark effect familiar from TV nature documentaries. The critters think they’re being sneaky when they’re actually all lit up. Clearly, our ancient animal ancestors didn’t have flashlights, or this unfortunate characteristic would’ve evolved out ages ago.

A few months into this, I’ve collected enough video for a third-rate nature documentary. Want to see how grass blows in the wind? I’ve got hours of footage. Bugs? We got bugs of all shapes and sizes. Annoyingly, spiders seem attracted to the cameras’ IR LEDs and build webs across the lens. It’s giant 1950s B-movie monsters all over again, but in color.

The state of the art hasn’t progressed much in the past few decades. We’re still trying to get von Neumann machines to “see” things for what they are. I worked for a robotics company that offered machine vision as an option. This consisted of black-and-white cameras that could tell whether a gear, gizmo, or gadget on a conveyor belt was out of place or misaligned. Even then, it took a bit of training and tweaking to make it reliable.

One large baking company used our robots to assemble sandwich cookies, and wanted the machines trained to deliberately misalign the two halves by 3 degrees, so that the cookies would look more “homemade.” In that case, the machines were capable of being too accurate. A rare exception.

Today, we have a big tool bag of new approaches to machine vision. Cognex, National Instruments, Sensory, Optotune, and others are all tackling this, although most use conventional hardware, with some ASICs or FPGAs thrown in. Google’s TensorFlow also makes an appearance, taking the “big data” approach to teaching your machine what’s what. There’s a one-day class coming up in October if you’re in the San Jose area.
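
To give a flavor of that “big data” approach, here’s a minimal TensorFlow/Keras sketch. The layer sizes, input shape, and two-class “interesting or not” labeling are all illustrative assumptions, not anyone’s production pipeline:

```python
# A toy convolutional classifier: feed it labeled frames ("interesting" vs.
# "not") and let it learn the features, instead of hand-tuning knobs.
# All sizes here are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),        # assumed frame size
    tf.keras.layers.Rescaling(1.0 / 255),            # pixel values -> 0..1
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),  # interesting / not
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(frames, labels, epochs=10)  # given frames you'd label yourself
```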

While you go do that, I’ll go back to looking for aliens. Somehow, they’ve eluded me all this time. I’ve got tons of spider footage, though. Maybe they somehow conspired to work together…
