
Look At Something, Ask a Question, Hear an Answer: Welcome to the Future

A few days ago, I was introduced to a tempting taste of the future that had me squirming in my seat in excitement and anticipation. You know what it’s like when you are ambling your way through the world without a thought in your head (at least, that’s the way I usually do it). And then something catches your eye that sparks a cascade of questions. Rather than take a picture or make notes for future follow-up, suppose you could simply articulate your questions aloud and immediately hear the answers tickling your ears. Well, that’s the type of technology I just saw!

Yes, of course this story features artificial intelligence (AI). It’s hard to come up with a story that doesn’t, these days. As an aside, before we plunge into the fray with gusto and abandon, I’m sure you’ve heard about AI hallucinations, which are also known as artificial hallucinations, artificial delusions, and artificial confabulations. All these terms refer to when an AI generates a response containing false or misleading information presented as fact. As reported on Legal Dive, for example, the hapless attorney Steven A. Schwartz of Levidow, Levidow & Oberman used an AI to generate a filing that he foolishly assumed to be factually correct prior to presenting it to a judge. Unfortunately for Steven, the judge spotted that the AI had decided to feature bogus cases in its response, boosting its arguments with bogus quotes, bogus citations, and bogus judicial decisions. Suffice it to say that this eventually led to Steven having a very bad hair day indeed.

How do we prevent AI hallucinations? Well, one solution would be to make all AIs like Goody-2, which is billed as the “World’s Most Responsible AI Chatbot.” According to an article I read recently in Wired, this self-righteous chatbot takes AI guardrails to an illogical extreme: “It refuses every request, responding with an explanation of how doing so might cause harm or breach ethical boundaries.”

But we digress…

To provide a basis for our discussions, it’s probably worth noting that we don’t actually see things as well as many people think we see them. For example, we tend to assume that everything we see is always in focus. However, our vision is foveated, which means the only high-resolution part of our optical sensing is an area called the fovea in the center of the retina. In turn, this means we have only about a 2-degree field-of-view (FOV) that’s in focus. That’s about the size of the end of your thumb when held at arm’s length. Everything outside this area quickly drops off in terms of resolution and color contrast. These outer areas are what we call our peripheral vision.
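If you fancy checking the thumb comparison for yourself, a quick back-of-the-envelope calculation does the trick. The following is just a minimal sketch in Python; the roughly 60 cm arm’s length is my own assumption, not a figure from any official source.

```python
import math

# Angular size: a region spanning theta degrees at distance d covers
# a width of 2 * d * tan(theta / 2).
arm_length_m = 0.6    # assumed arm's length of roughly 60 cm
fov_deg = 2.0         # approximate angular size of the in-focus foveal region

width_cm = 2 * arm_length_m * math.tan(math.radians(fov_deg / 2)) * 100
print(f"A {fov_deg}-degree field at {arm_length_m} m spans about {width_cm:.1f} cm")
# Prints roughly 2.1 cm -- about the size of the end of a thumb held at arm's length
```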

The reason we feel that everything is in focus is that our brain maintains a model of what it thinks it’s seeing, and our eyes dart around filling in the blanks. One thing our peripheral vision is good at is detecting movement and shapes, and one thing all this knowledge is good for is something known as foveated rendering. Suppose we are wearing a mixed reality (MR) headset, where MR encompasses any combination of virtual reality (VR), augmented reality (AR), diminished reality (DR), and augmented virtuality (AV). In this case, we can dramatically reduce the computational workload associated with the rendering algorithms by using sensors to detect where the user’s eyes are looking, rendering only that area in high resolution, and gradually diminishing the fidelity of the rendering the farther out we go.
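To make the foveated rendering idea a little more concrete, here’s a toy sketch of how a renderer might map a pixel’s angular distance from the gaze point to a resolution scale. The tiers and thresholds below are invented purely for illustration; real headsets use their own (often much smoother) falloff curves.

```python
def foveation_scale(angle_from_gaze_deg: float) -> float:
    """Return a render-resolution scale factor (1.0 = full resolution)
    based on how far a pixel lies from the current gaze direction.
    The thresholds here are illustrative, not from any real headset."""
    if angle_from_gaze_deg <= 2.0:    # foveal region: full fidelity
        return 1.0
    if angle_from_gaze_deg <= 10.0:   # near periphery: half resolution
        return 0.5
    return 0.25                       # far periphery: quarter resolution is plenty

# Example: a pixel 15 degrees away from where the eye is looking
print(foveation_scale(15.0))  # -> 0.25
```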

“What sort of sensors?” I hear you cry. I’m glad you asked, because the folks at Prophesee describe their GenX320 Metavision sensor as being “the world’s smallest and most power-efficient event-based vision sensor.” If you want to learn more about the GenX320, take a look at the column penned by my friend Steve Leibson: Prophesee’s 5th Generation Sensors Detect Motion Instead of Images for Industrial, Robotic, and Consumer Applications.

“What else can these sensors be used for?” I hear you ask. Once again, I’m glad you asked, because I was just chatting with the folks at Zinn Labs. The “Zinn” part is named after the German anatomist and botanist Johann Gottfried Zinn, who provided the first detailed and comprehensive anatomy of the human eye in the mid-1700s.

I’ve been introduced to some boffin-packed-and-stacked companies over the years, but I think Zinn Labs is the first in which every single member has a PhD. These clever little scamps develop gaze-tracking systems based on event sensors, which enable dramatically lower latency and higher frame rates. In addition, the tailored sensor data can be processed in limited-compute embedded environments at low power while maintaining high gaze-tracking accuracy.
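To give a flavor of what gaze tracking from an event stream might involve at the very lowest level, here’s a deliberately simplified sketch of my own devising (emphatically not Zinn Labs’ algorithm): take the burst of pixel-change events from the eye-facing sensor over the last few milliseconds and estimate the pupil center as their centroid. Real systems fit ellipses and eye models, but even this crude version hints at how little data an event sensor needs per update.

```python
from dataclasses import dataclass

@dataclass
class Event:
    x: int          # pixel column where a brightness change was detected
    y: int          # pixel row
    t_us: int       # timestamp in microseconds
    polarity: int   # +1 for a brightness increase, -1 for a decrease

def estimate_pupil_center(events: list[Event], window_us: int = 5_000) -> tuple[float, float]:
    """Crude gaze estimate: centroid of the events that arrived within the
    most recent time window (events are assumed to be in time order)."""
    if not events:
        raise ValueError("no events to process")
    latest = events[-1].t_us
    recent = [e for e in events if latest - e.t_us <= window_us]
    cx = sum(e.x for e in recent) / len(recent)
    cy = sum(e.y for e in recent) / len(recent)
    return cx, cy
```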

As an example of the sort of things the little rascals can do with their gaze-tracking systems, the folks at Zinn Labs recently announced an event-based gaze-tracking system for AI-enabled smart frames and MR systems.

Event-based gaze-tracking system for AI-enabled smart frames and MR systems (Source: Zinn Labs)

In addition to an outward-facing 8-megapixel camera in the center, and microphones and speakers at the sides, these frames (lens-less, in this case) boast Zinn’s event-based modules featuring Prophesee’s GenX320 sensors. These are the frames that were employed in the video demo that blew me away.

 

As we see, in this example, the user asks questions while looking at different plants. The front-facing camera captures the scene, while the sensors identify where the user is looking. A curated version of the visual information is passed to an image-detection and recognition AI in the cloud. At the same time, the question is fed through a speech-to-text AI. The combination of the identified image and the question are then fed to a generative AI like ChatGPT. The response is then fed through a text-to-speech AI before being presented to the user.
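Pieced together from that description, the plumbing looks something like the sketch below. Every *_service() function is a hypothetical placeholder for whichever cloud API actually gets called; this is simply my reading of the data flow, not Zinn Labs’ code.

```python
# Hypothetical stand-ins for the services described above; each would be a
# call to a real cloud API (image recognition, speech-to-text, an LLM, TTS).
def crop_around_gaze(image: bytes, gaze_xy: tuple[int, int]) -> bytes: ...
def image_recognition_service(image: bytes) -> str: ...
def speech_to_text_service(audio: bytes) -> str: ...
def generative_ai_service(prompt: str) -> str: ...
def text_to_speech_service(text: str) -> bytes: ...

def answer_spoken_question(audio_clip: bytes, scene_image: bytes,
                           gaze_xy: tuple[int, int]) -> bytes:
    """Gaze-directed visual question answering, end to end (illustrative only)."""
    # 1. Crop the front-camera frame around where the user is actually looking.
    region = crop_around_gaze(scene_image, gaze_xy)

    # 2. Identify what's in that region (e.g., "a peace lily").
    label = image_recognition_service(region)

    # 3. Turn the spoken question into text.
    question = speech_to_text_service(audio_clip)

    # 4. Ask a generative AI, giving it both the question and the visual context.
    prompt = f"The user is looking at: {label}. They asked: {question}"
    answer_text = generative_ai_service(prompt)

    # 5. Convert the answer back to speech for playback through the frames' speakers.
    return text_to_speech_service(answer_text)
```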

Admittedly, there is a bit of a delay between questions and responses, and the format of the replies is a little stilted (we really don’t need the “let me look that up for you” part). However, we must remind ourselves that this is just a tempting teaser for what is to come. AIs are going to get more and more sophisticated while 5G and 6G mmWave cellular communications are going to get faster and faster with lower and lower latencies.

All I can say is that my poor old noggin is jam-packed with potential use cases (personal, professional, and social). How about you? If you do have any interesting ideas, you might be interested in acquiring one of Zinn’s development kits. In the meantime, as always, I would love to hear what you think about all this.
