We may think that today’s artificial intelligence (AI), especially Generative AI (GenAI) like ChatGPT, is terrific, but I’m informed that “We ain’t seen nothin’ yet!” I’m also informed that the “next big thing” in AI will be centered on images and video, specifically machine vision. However, today’s cameras, which were designed for humans, not computers, are still far from the point where they can capture image data for AI at a quality and quantity comparable to the data currently used to power large language models (LLMs), such as ChatGPT.
I was just chatting with Dr. Sebastian Bauer, who serves as the Chief Executive Officer (CEO) and Co-Founder of Ubicept (think “ubiquitous perception”). Founded in 2021, Ubicept is a computer vision startup spun out of research from MIT and the University of Wisconsin–Madison. The company focuses on developing photon-level perception systems capable of operating in extreme lighting conditions and capturing sharp images of high-speed motion.
There’s an old programmer’s joke (not that I have anything against old programmers) that goes: “In order to understand recursion, you must first understand recursion”. I didn’t say it was a good joke. The reason I just mentioned this is that I was going to say the next part of this column was recursive… then I decided it wasn’t… but I didn’t want to lose the joke… and now I’m just rambling (sorry).
Rather than “recursive,” it might be better to say that what I’m about to tell you is “reflexive,” involving “circular innovation,” “cross-pollination,” and “serendipitous reverse compatibility.” I only hope that clears things up.
The point is that the guys and gals at Ubicept began by creating a new type of camera sensor based on SPAD (single-photon avalanche diode) technology—think an array of SPAD-based pixels. This sensor can do things a CMOS (complementary metal-oxide-semiconductor) device can only dream of doing. As part of this, the chaps and chapesses at Ubicept developed super-sophisticated image processing algorithms that can extract delightful amounts of detail from the captured images or video streams. The reason that “serendipitous reverse compatibility” comes into the picture (no pun intended) is that they subsequently discovered that their algorithms can also be used to enhance the quality of images from standard CMOS sensors. (Phew! I wasn’t entirely sure I would be able to bring that one home.)
Lest you think I’m wandering off into the weeds, what would you say if I were to tell you there’s a good chance that SPAD-based sensors will replace their CMOS-based counterparts in the not-so-distant future? I hope that’s made you sit a bit straighter in your seat and start to pay a tad more attention.
Now, your first reaction to this might be to say, “Your an idiot; CMOS-based sensors can boast arrays containing tens or hundreds of megapixels, while the current state-of-the-art in SPAD-based arrays is only around a measly megapixel!” And my first reaction to your first reaction would be to respond, “Anyone who writes to tell me that I’m an idiot really ought to spell ‘you’re’ with an apostrophe if they want to make their point stick.”
Speaking of “points,” I have two I wish to make here. First, SPAD sensors are rapidly increasing in resolution. Second, it wasn’t all that long ago that pundits were making the same dismissive comparisons between the “new kids on the block” CMOS sensors and their predecessors in the form of CCD (charge-coupled device) sensors. I tell you, it’s like déjà vu all over again (did someone just say that?).
Let’s take a brief sortie down memory lane (cue time-travelling audio and visual effects). CCD sensors were invented in 1969 by Willard Boyle and George E. Smith at Bell Labs. They gained traction in the 1970s and 1980s for high-quality imaging applications, especially in the scientific and broadcast markets. Later, CCDs wended their way into commercial and consumer markets for applications such as camcorders, flatbed scanners, and digital cameras. The way CCDs work is that photons create charge in the pixels, and this charge is then transferred across the chip and read out at a single point. CCDs offer high image quality, low noise, and excellent sensitivity. On the downside, they are expensive, power-hungry, and relatively slow compared to modern alternatives.
CMOS technology dates back to the 1960s, but CMOS image sensors only became practical in the 1990s, thanks to advances in semiconductor fabrication techniques. Unlike CCDs, CMOS sensors use an architecture where each pixel has its own charge-to-voltage conversion and readout circuitry. This enables random access, faster readout, and greater integration. By the 2000s, CMOS began to outperform CCDs in terms of cost, speed, and power efficiency. While early CMOS sensors lagged behind CCDs in image quality, that gap has largely closed. Today’s CMOS sensors offer lower costs, low power consumption, high frame rates, and the ability to integrate on-chip processing. As a result, they have become standard in devices such as smartphones, webcams, digital SLR (DSLR) cameras, and the cameras used in machine vision systems for robotics, security, and autonomous vehicles.
And thus we come to SPAD-based sensors. These evolved from avalanche photodiode research in the 1970s and 1980s, with SPAD arrays emerging more recently (circa the 2010s). In this case, SPAD pixels can detect individual photons by triggering an avalanche current when a photon hits a reverse-biased p-n junction. SPAD-based sensors offer unique capabilities in terms of time-resolved imaging, time-of-flight measurements, and photon counting with picosecond resolution. The main issue is their larger pixel size; however, SPAD pixels are becoming smaller and smaller every day. Also, SPAD sensor arrays can be equipped with RGB filters, just like CMOS arrays, which allows them to be used for both monochrome and color imaging applications.
Now, you may be asking yourself why we should be excited about SPAD-based sensors. Well, the first thing is that they can work with extremely low levels of light. Remember that even a single photon can generate an electron-hole pair, which then triggers a self-sustaining avalanche of charge carriers.
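To give a feel for what “counting individual photons” means in practice, here’s a toy model of a single SPAD pixel. This is purely illustrative (it is not Ubicept’s actual processing pipeline): photon arrivals are modeled as a Poisson process, and each short time bin records only whether the pixel fired at least once, because a SPAD produces one avalanche per detection and must then be quenched and reset before it can fire again.

```python
import math
import random

def simulate_spad_counts(photon_rate_hz, exposure_s, num_bins, seed=42):
    """Toy SPAD pixel: photons arrive as a Poisson process, and each time
    bin records a binary outcome -- did at least one photon trigger an
    avalanche in that bin?"""
    rng = random.Random(seed)
    bin_s = exposure_s / num_bins
    # Probability that one or more photons arrive within a single bin
    p_detect = 1.0 - math.exp(-photon_rate_hz * bin_s)
    return sum(1 for _ in range(num_bins) if rng.random() < p_detect)

def estimate_rate(counts, exposure_s, num_bins):
    """Invert the binary detection statistics to recover the photon rate
    (breaks down at saturation, i.e., when every bin fired)."""
    p = counts / num_bins
    bin_s = exposure_s / num_bins
    return -math.log(1.0 - p) / bin_s

# Example: a dim scene delivering ~100,000 photons/s, observed for 1 ms
# split into 1,000 one-microsecond bins.
counts = simulate_spad_counts(1e5, 1e-3, 1000)
recovered = estimate_rate(counts, 1e-3, 1000)
```

Even though each bin yields only a yes/no answer, accumulating enough bins lets you recover the underlying light level statistically, which is the basic trick that makes SPAD imaging work at light levels where a CMOS pixel would see nothing but its own read noise.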
My knee-jerk reaction was to assume that this meant the images would be completely washed out in bright lighting conditions, such as full daylight. However, this turns out not to be the case. Due to the underlying technology, you can effectively discard unwanted photons, resulting in a sensor that can operate in all lighting conditions, from extremely low-light to extremely bright-light.
Of course, having more light means you can do more with it. For example, the dynamic range of a typical CMOS camera sensor falls between 60 and 90 dB, depending on the sensor’s design, quality, and application. Sebastian tells me that, using a SPAD sensor from one of their hardware partners, they’ve built an experimental camera that can capture 1,000 frames per second (fps) with a dynamic range of 140 dB, which is exceptionally good (one might say eye-wateringly so).
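To put those decibel figures in perspective, dynamic range for an image sensor is conventionally quoted as 20·log₁₀ of the ratio between the brightest and darkest resolvable signal levels. A quick back-of-the-envelope calculation (my own arithmetic, not a figure from Ubicept) shows just how big the jump from 90 dB to 140 dB really is:

```python
import math

def dynamic_range_db(brightest, darkest):
    """Dynamic range in decibels: 20 * log10 of the ratio between the
    brightest and darkest resolvable signal levels."""
    return 20.0 * math.log10(brightest / darkest)

# 60 dB corresponds to a 1,000:1 intensity ratio...
print(dynamic_range_db(1_000, 1))       # 60.0
# ...while 140 dB corresponds to a 10,000,000:1 ratio.
print(dynamic_range_db(10_000_000, 1))  # 140.0
```

In other words, a 140 dB camera can resolve detail across a brightness span ten thousand times wider than a 60 dB one: think reading a license plate in a shadow while the car’s headlights are pointed straight at you.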
Of course, I can waffle on about this for hours without being able to truly convey just how powerful this technology is. If only I could show you. Wait! I can! Cast your orbs over this video while trying not to squeal in excitement.
This was shot outside Ubicept’s office in Boston at night. On the left, we see the video stream captured by a state-of-the-art CMOS sensor-based security camera. On the right, we see the same scene surveilled by a 1 megapixel SPAD sensor-based camera and processed using Ubicept’s algorithms.
This is something to watch repeatedly. Details (from raindrops to writing) that are invisible and/or illegible in the CMOS image are as clear as day (you know what I mean) in the SPAD counterpart.
I used to be a doubter, but now I’m a believer. If semiconductor manufacturers can shrink SPAD pixels and sensor arrays to the same size as their CMOS counterparts—and I see no reason why this shouldn’t be possible—then I would not be surprised if all our camera and machine vision devices employ SPAD technology in the not-so-distant future.
If we were to talk to younger engineers today, I bet most of them are at least aware of CMOS sensors, even if they don’t work with them. I also bet that many of them have never even heard the term CCD. Based on this, I can easily envisage a day, say 20 years in the future, when the same situation applies to SPAD and CMOS sensors. What say you? Are you a doubter… or are you a believer?