Is This the Next Level of Machine Vision for Autonomous Driving?

Earlier this year I penned a column explaining how Ambarella’s next-generation AI vision processors were targeting edge applications. The focus of that column was the CV72S System-on-Chip (SoC) device, which presents Ambarella’s state-of-the-art technology at a size and cost point that fits the IoT market in general and the security/surveillance market in particular.

To be honest, I fear I may have dropped the ball. I was so focused on Embedded Edge and IoT applications that I neglected to delve deeper into all the other stuff the folks at Ambarella are doing (I shall chastise myself soundly later). Specifically, I failed to realize just how prominent they are in the field of autonomous driving (AD). Fortunately, I was just chatting with Pier Paolo Porta, Director of Automotive Marketing at Ambarella, and I’m happy to report that I now have a much better grasp of what’s going on.

Before we plunge headfirst into any new technology announcements, let’s take a moment to see how we got to be where we are today, commencing with a company called VisLab.

The history of Ambarella’s AD Stack (Source: Ambarella)

VisLab emerged from a university laboratory in the mid-1990s. Their focus was on creating software that would allow cars to drive themselves, and their first demonstration of autonomous driving took place in 1998 when they drove 2,000 kilometers on Italian highways with the car driving itself 94% of the time (I didn’t even know anyone was doing stuff like this that far back in time).

Pier joined VisLab in time for the 2005 DARPA Grand Challenge. In 2010 the team undertook a 3-month 15,000-kilometer trip from Italy to China in autonomous mode. Just to set a point of reference, this occurred around the same time the guys and gals at Google announced the start of their self-driving program (so, while Google was thinking about it, VisLab was doing it).

In 2014, the team at VisLab established a new 100% autonomous benchmark (see the black car in the image above). Hold onto your hat because this involved 26 cameras (13 binocular pairs) and 17 PCs with a power consumption of 5 kilowatts, which is more than it takes to power a house. (Don’t worry, developments over the past ten years have resulted in industry-leading power consumption for the latest and greatest solutions we will be discussing below.)

The following year, in 2015, Ambarella acquired VisLab. This was a marriage made in heaven—Ambarella’s computer vision SoCs with VisLab’s computer vision software. Since that time, the combined team has been making increasingly powerful SoCs while—at the same time—evolving and porting the VisLab computer vision software stack onto these bodacious devices.

Actually, this might be a good time to present a high-level block diagram representing Ambarella’s CV3-AD SoC family as illustrated below.

Block diagram representing Ambarella’s CV3-AD SoC family (Source: Ambarella)

These SoCs provide optimal processing for vision and radar sensing and fusion. In particular, observe the CVflow AI engines, which minimize power consumption and the processing load on the SoC’s on-chip Arm cores.

If we return to the first image for a moment, it’s worth noting that the latest members of Ambarella’s research and development vehicle fleet include a sensing suite that consists of mono and stereo cameras, along with Ambarella’s Oculii 4D imaging radar, with all the processing being performed by the CV3-AD SoC. It’s also worth noting that these solutions are flexible and can support OEMs’ specific sensing suites, including the option for adding LiDAR.

Now, this is where things start to get exciting. A key feature of Ambarella’s software stack is that it requires only readily available, standard-definition (SD) maps, thereby eliminating the need for pre-generated high-definition (HD) maps. The software stack (running on the CV3-AD) generates HD maps in real time using live environmental data from the vehicle’s sensing suite. In contrast, the pre-generated HD maps used by other AD systems are brittle and expensive to maintain, and they require centimeter-level localization capability. They are also unreliable under dynamic conditions such as road construction or accidents. Additionally, Ambarella’s real-time HD map generation is ideal for handling difficult AD scenarios, such as downtown areas in large cities with roundabouts, narrow roads with parked vehicles, heavy traffic, construction, and a high density of pedestrians, cyclists, and other vulnerable road users.

The software stack in action (Source: Ambarella)

Real-time generation of high-definition 3D maps (Source: Ambarella)

Now, I must admit that the following graphic confused the socks off me at first (note to self: always wear elasticated socks when talking with the folks at Ambarella in the future). The way these things usually work is that a company announces an SoC with some amount of processing capability. Sometime later, they announce a new generation of that SoC with a greater amount of processing capability, and so it goes. Just for giggles and grins, the folks at Ambarella are doing things the other way round. Peruse and ponder the graphic below and we will reconvene on the other side.

Ambarella are adding the CV3-AD635 and CV3-AD655 to their portfolio (Source: Ambarella)

Initially I thought the arrows were pointing the wrong way, but I was wrong. The most powerful member of the family, the CV3-High, which boasts a whopping 10 billion transistors and supports L4 AD and automated parking, was announced in 2022. This was followed in 2023 by the CV3-AD685, which supports L3/L4 AD and automated parking, and the China-focused CV72AQ, which supports L2+ AD and automated parking.

In a couple of weeks at CES 2024, the chaps and chapesses at Ambarella will be formally debuting the CV3-AD635, which supports Mainstream L2+ and automated parking, and the CV3-AD655, which supports Advanced L2+ (L2++, if you will) and automated parking.

The arrows in the above diagram now make sense. The CV3-AD635 offers 3X the AI processing power of the CV72AQ; the CV3-AD655 offers 2X the AI processing power of the CV3-AD635; the CV3-AD685 offers 3X the AI processing power of the CV3-AD655; and the CV3-High offers 2X the AI processing power of the CV3-AD685 (phew!).

Of course, those little scamps at Ambarella could have drawn the arrows the other way round, but seeing things as being 2X and 3X more powerful provides a more positive marketing message than presenting them as being 1/2 and 1/3 as powerful LOL.
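For anyone who wants to see how those step-by-step multipliers stack up across the whole family, here is a minimal Python sketch. It takes the 2X and 3X figures quoted above at face value (they are presumably rounded marketing numbers) and treats the CV72AQ as an arbitrary baseline of 1. The names `STEPS` and `relative_ai_power` are my own inventions for illustration and do not come from Ambarella.

```python
from fractions import Fraction

# Step-to-step multipliers as quoted in the column, ordered from the
# least to the most powerful device. The CV72AQ baseline of 1 is an
# arbitrary unit assumed for illustration, not a published figure.
STEPS = [
    ("CV72AQ", None),    # baseline, no multiplier
    ("CV3-AD635", 3),    # 3X the CV72AQ
    ("CV3-AD655", 2),    # 2X the CV3-AD635
    ("CV3-AD685", 3),    # 3X the CV3-AD655
    ("CV3-High", 2),     # 2X the CV3-AD685
]


def relative_ai_power(steps):
    """Return {device: cumulative multiple of the first device}."""
    result = {}
    running = 1
    for name, factor in steps:
        if factor is not None:
            running *= factor
        result[name] = running
    return result


power = relative_ai_power(STEPS)
top = power["CV3-High"]
for name, multiple in power.items():
    # Show each device both ways: as a multiple of the smallest part
    # and as a fraction of the largest part.
    print(f"{name:10s} {multiple:3d}x CV72AQ, {Fraction(multiple, top)} of CV3-High")
```

Under those assumptions, the CV3-High works out to 36 times the CV72AQ, and the two new L2+ parts sit at 3 and 6 times the baseline, which is the same story the arrows tell whichever way you read them.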

The thing is that the folks at Ambarella have good reasons for doing things the way they are doing them. The market is currently ready for, and accepting of, the affordable L2+ and L2++ levels of autonomy provided by the CV3-AD635 and CV3-AD655 devices. Users and regulatory bodies are moving towards L3 and L4 levels of autonomy and, when they get there, the folks at Ambarella will be ready and waiting with their CV3-AD685 and CV3-High devices (or their descendants).

If you want to learn more, then I’m delighted to be able to say that the new Ambarella AD software stack will be demonstrated via fully autonomous test drives at Ambarella’s invitation-only exhibition during CES in Las Vegas (if I were attending CES, I’d be at the front of the queue). For more information or to schedule a ride, contact your local Ambarella representative or visit https://www.ambarella.com/
