
Watch Out! It’s the Smallest Stereo Camera in the World!

I cannot believe how fast things are moving when it comes to artificial intelligence (AI) and machine learning (ML), especially in the context of machine vision, which refers to the ability of a computer to use sight to perceive what’s going on in the world around it (see also What the FAQ are AI, ANNs, ML, DL, and DNNs?).

Are you familiar with eYs3D Microelectronics? This is a fabless design house that focuses on end-to-end software and hardware systems for computer vision, including advanced vision-processing System-on-Chip (SoC) devices. Meanwhile, eCapture is a brand of eYs3D that focuses (no pun intended) on the camera business.

I was just chatting with the folks at eCapture, and I must admit that I’m rather excited about their recent announcement regarding the availability for pre-order of their G53 3D Vision Depth-Sensing camera, which — at only 50 x 14.9 x 20 mm — they claim to be the smallest stereo camera in the world.

Meet the G53 (Image source: eCapture)

In addition to the lenses and optical sensors, this 3D camera boasts an eYs3D vision-processing SoC that computes stereo depth data, thereby off-loading this computationally intensive task from the main system’s CPU/GPU host processor.
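Just to give a feel for the math being off-loaded here, the classic pinhole-stereo relation says that depth equals focal length times baseline divided by disparity. Below is a minimal Python sketch of that relation; the focal length and baseline values are illustrative assumptions on my part, not published G53 specifications.

```python
# Classic pinhole-stereo depth-from-disparity relation: a minimal sketch.
# The focal length and baseline below are illustrative placeholders, not
# published G53 specifications.

def depth_from_disparity(disparity_px: float,
                         focal_length_px: float = 700.0,  # assumed focal length, in pixels
                         baseline_m: float = 0.03         # assumed sensor-to-sensor baseline
                         ) -> float:
    """Return depth (meters) for a matched pixel pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_length_px * baseline_m / disparity_px

# Features shifted by 21 pixels between the left and right images would sit
# roughly 1 meter away with these (assumed) parameters.
print(depth_from_disparity(21.0))  # 1.0
```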

The G53 is of interest for a wide variety of tasks, including simultaneous localization and mapping (SLAM), which refers to the computational task of constructing or updating a map of an unknown environment while — at the same time — keeping track of the mapping agent’s location within it. In turn, visual SLAM (vSLAM) leverages 3D vision to perform localization and mapping functions.
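For a sense of what a vSLAM front end actually consumes, here’s a minimal sketch that back-projects a depth map (like the one the G53’s SoC produces) into a 3D point cloud using the standard pinhole camera model. The intrinsic parameters (fx, fy, cx, cy) are made-up values for illustration, not the G53’s actual calibration.

```python
import numpy as np

# Back-project a depth map into a 3D point cloud -- the kind of input a
# vSLAM pipeline consumes. The intrinsics below are invented for
# illustration, not the G53's calibration.

def depth_to_points(depth: np.ndarray,
                    fx: float = 700.0, fy: float = 700.0,
                    cx: float = 320.0, cy: float = 240.0) -> np.ndarray:
    """Convert an HxW depth map (meters) to an (H*W, 3) array of XYZ points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx  # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat scene 1 meter away produces a planar point cloud at Z = 1.
cloud = depth_to_points(np.ones((480, 640)))
print(cloud.shape)  # (307200, 3)
```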

Often enhanced by sensor fusion in the form of odometry (the use of data from motion sensors to estimate change in position and orientation over time), SLAM is used for tasks such as robot navigation and robotic mapping, and also for virtual reality (VR) and augmented reality (AR) applications (see also What the FAQ are VR, MR, AR, DR, AV, and HR?).
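If you’re wondering what “estimating change in position and orientation over time” looks like in practice, the simplest form is planar dead reckoning: integrate each measured step of forward travel and heading change into a running pose. The little sketch below is a toy illustration of the idea, not any particular robot’s implementation.

```python
import math

# Dead-reckoning odometry in the plane: integrate measured forward distance
# and heading change into a running pose. A toy illustration only.

def update_pose(x: float, y: float, theta: float,
                distance: float, dtheta: float) -> tuple[float, float, float]:
    """Move `distance` meters along the current heading, then turn by
    `dtheta` radians."""
    x += distance * math.cos(theta)
    y += distance * math.sin(theta)
    return x, y, theta + dtheta

# Drive 1 m forward, turn 90 degrees, drive 1 m again: ends near (1, 1).
pose = (0.0, 0.0, 0.0)
pose = update_pose(*pose, distance=1.0, dtheta=math.pi / 2)
pose = update_pose(*pose, distance=1.0, dtheta=0.0)
print(pose)  # approximately (1.0, 1.0, 1.571)
```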

The point is that the high-performance eYs3D vision-processing chip in the G53 camera system allows it to capture frames on both sensors at exactly the same instant, thereby reducing the errors that arise from unsynchronized frames while gathering data for use by vSLAM algorithms.

In the context of digital sensors, various ways of capturing the image may be employed. In the case of a rolling shutter mode, each image or frame of a video is captured, not by taking a snapshot of the entire scene at a single instant in time, but rather by rapidly scanning across the scene, either vertically or horizontally. Although this can increase sensitivity, it also results in distortions when the scene involves moving objects or the camera itself is in motion (when the camera is mounted on a robot, for example).
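A quick back-of-the-envelope calculation shows why this matters. If the sensor takes t_readout seconds to scan from the top row to the bottom row, then anything moving horizontally at v meters per second shifts by v * t_readout during the scan, skewing vertical edges. The numbers in this sketch are purely illustrative.

```python
# Back-of-the-envelope rolling-shutter skew. If the sensor needs t_readout
# seconds to scan from the top row to the bottom row, an object moving
# horizontally at v meters/second shifts by v * t_readout between the first
# and last rows, skewing vertical edges. Numbers are illustrative only.

t_readout_s = 0.02  # assumed 20 ms top-to-bottom readout
speed_m_s = 1.0     # robot (or object) moving at a walking pace

skew_m = speed_m_s * t_readout_s
print(f"A vertical edge is skewed by {skew_m * 100:.0f} cm "
      "between the top and bottom of the frame")  # 2 cm
```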

The alternative is known as global shutter mode, in which the entire frame is captured at the same instant. Since one of the deployment scenarios for the G53 is robots in motion, it employs global shutter technology to realize simultaneous exposure of all the pixels associated with both cameras, thereby reducing distortion to provide more accurate images.

Comparison of global shutter (left) and rolling shutter (right) (Image source: eCapture)

Ooh, Ooh — I just remembered two potential use models the folks from eCapture were talking about. The first that caught my attention was the fact that speech recognition can be greatly enhanced by means of machine vision-based lip-reading, especially in noisy environments as exemplified by the cocktail party effect.

The second use case involves people walking past a shop in a street or past a display in a store. Depth-sensing cameras like the G53 can be used to determine the location of each person. Based on this information, beamforming techniques can be used to transmit an audio message so precisely that it can be heard only by the intended recipient within a circle roughly 1 meter in diameter. Someone else standing two meters away wouldn’t hear a thing. Actually, that’s not strictly true, because they would be hearing a different message targeted just for them (e.g., “Your coat is looking a little bedraggled — did you know the latest fashions just arrived and we have a sale on at the moment?”).
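For the curious, the classic delay-and-sum recipe behind this sort of audio steering is surprisingly simple: delay each speaker’s feed so that all the wavefronts arrive at the target location (as reported by the depth camera) at the same instant. The array geometry in the sketch below is invented for illustration; I have no idea what hardware a real deployment would use.

```python
import math

# Delay-and-sum beam steering: given a target position (e.g., from a depth
# camera) and a line of speakers, delay each speaker's feed so all the
# wavefronts arrive at the target simultaneously. Geometry is invented.

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def steering_delays(speaker_xs: list[float],
                    target: tuple[float, float]) -> list[float]:
    """Return per-speaker delays (seconds) that focus the array on `target`.

    Speakers lie along the x-axis at y = 0; `target` is (x, y) in meters.
    The farthest speaker gets zero delay; nearer ones wait longer so that
    every wavefront arrives at the target at the same instant.
    """
    tx, ty = target
    distances = [math.hypot(tx - sx, ty) for sx in speaker_xs]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]

# Four speakers 10 cm apart, focusing on a shopper 2 m away, 0.5 m off-axis.
delays = steering_delays([0.0, 0.1, 0.2, 0.3], target=(0.5, 2.0))
print([f"{d * 1e6:.0f} us" for d in delays])  # ['0 us', '64 us', '114 us', '151 us']
```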

But we digress… An infrared (IR) illuminator emits light in the infrared spectrum. The G53 boasts two built-in active IR illuminators to facilitate visual binocular plus structured light fusion. In addition to enabling vision in completely dark environments, this technology also enhances recognition accuracy for things like white walls and untextured objects.
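Why do white walls give passive stereo such a hard time? Stereo matching works by comparing small patches between the left and right images using a cost function such as the sum of absolute differences (SAD). On a uniform surface, every candidate patch scores essentially the same, so the disparity (and hence the depth) is ambiguous; projecting an IR dot pattern supplies the missing texture. The toy sketch below demonstrates the ambiguity; it’s my own illustration, not eCapture’s algorithm.

```python
import numpy as np

# Why untextured surfaces defeat passive stereo: block matching compares a
# patch from the left image against candidate patches in the right image
# using a cost such as the sum of absolute differences (SAD). On a blank
# wall every candidate scores (near) zero, so the disparity -- and hence
# the depth -- is ambiguous. A projected IR pattern adds the missing texture.

def sad_costs(left_patch: np.ndarray, right_strip: np.ndarray) -> np.ndarray:
    """SAD cost of `left_patch` at every horizontal position in `right_strip`."""
    h, w = left_patch.shape
    return np.array([
        np.abs(left_patch - right_strip[:, x:x + w]).sum()
        for x in range(right_strip.shape[1] - w + 1)
    ])

rng = np.random.default_rng(0)

blank_wall = np.full((8, 64), 200.0)                 # uniform brightness, no texture
textured = blank_wall + rng.normal(0, 30, (8, 64))   # wall with an IR dot pattern

for name, img in [("blank wall", blank_wall), ("with IR texture", textured)]:
    costs = sad_costs(img[:, 20:28], img)
    ties = int((costs == costs.min()).sum())
    print(f"{name}: {ties} candidate position(s) share the minimum cost")
# blank wall: every position ties; with texture: a single clear winner.
```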

Using active IR to enhance recognition accuracy for white walls and untextured objects (Image source: eCapture)

Enhancing the ability of machines to perceive and understand the world around them can greatly increase their efficiency and usefulness. 3D computer vision technology is critical for AI applications, enabling autonomous functionality for software and machines, from robotic spatial awareness to scene understanding, object recognition, and better depth and distance sensing for enhanced intelligent vehicles. The small size of the G53 makes it ideal for space- and power-constrained applications, such as modestly sized household cleaning robots. Other obvious applications include automated guided vehicles (AGVs) and autonomous mobile robots (AMRs), such as those used in factories and warehouses, and also for tasks like goods-to-person (G2P) deliveries.

I believe that another key deployment opportunity for the G53 will be in cobots (collaborative robots), which are robots that are intended to be used for direct human-robot interaction within a shared space, or where humans and robots are in close proximity. As the Wikipedia says:

Cobot applications contrast with traditional industrial robot applications in which robots are isolated from human contact. Cobot safety may rely on lightweight construction materials, rounded edges, and inherent limitation of speed and force, or on sensors and software that ensure safe behavior.

Observe the “sensors and software that ensure safe behavior” portion of this, which — to me — sounds like another way of saying “cameras like the G53 coupled with AI.”

Now, I love robots as much as (probably much more than) the next person, but I think that the depth perception technology flaunted by devices like the G53 has potential far beyond robotic applications. I’ve said it before and I’ll say it again — I believe that the combination of augmented reality and artificial intelligence is going to change the way in which we interface and interact with our systems, the world, and each other. An AR headset will already have two or more outward-facing cameras feeding visual data to the main AI processor. Stereo depth data derived from this imagery could be used to enhance the user experience by providing all sorts of useful information. And the same eYs3D vision-processing chip found in the G53 camera system (or one of its descendants) could be employed to off-load the computationally intensive task of computing stereo depth data from the host processor.

I’m sorry. I know I tend to talk about AR + AI a lot, but I honestly think this is going to be a major game changer. When will this all come to pass? I have no idea, but when I pause to ponder how quickly technology has advanced in the past 5, 10, 15, and 20 years — and on the basis that technological evolution appears to be following an exponential path — I’m reasonably confident that we are going to see wild and wonderful (and wildly, wonderfully unexpected) things in the next 5, 10, 15, and 20 years.

Returning to the present, developers and systems integrators will be happy to hear that eCapture also offers software development tools for its products that ease integration into the Windows, Linux, and Android operating system (OS) environments.

Last but not least, over to you. What do you think about AI, ML, VR, AR, and machine perception in general and in the context of machine vision in particular? And are you excited or terrified by what is lurking around the next temporal turn?

