
Generative AI Is Coming to the Edge!

Over the past few months, I’ve waffled on (as is my wont) about various flavors (ooh, tasty) of generative artificial intelligence (GenAI). On the menu were items like GitHub Copilot, which generates code (throwing in errors and security vulnerabilities for free), and Metabob from Metabob (can you spell “Zen”?), which looks at the code generated by GitHub Copilot and takes the bugs and security vulnerabilities out again (to understand recursion, you must first understand recursion).

We’ve also discussed other GenAI-based Copilots, like SnapMagic, which can help design engineers pick parts, and Flux.ai, which can help them design and lay out their circuits.

Another flavor of GenAI is the text-to-image model, such as Stable Diffusion, which can be used to generate detailed images conditioned on text descriptions. It can also be used for related tasks, such as inpainting (where damaged, deteriorated, or missing parts of an artwork are filled in to present a complete image), outpainting (extending an existing image beyond its original canvas), and image-to-image translation.

The thing is, until now, we all pretty much took it for granted that GenAI models of this class, capacity, and caliber were destined to be run in the cloud. Can you imagine what it would be like to be able to run something like Stable Diffusion on a USB Drive that plugs into your notepad or laptop computer? I can. Actually, I don’t have to imagine this because I’ve seen it in action (I’ll show you a video later in this column).

To be honest, I have a personal interest in being able to generate images from text descriptions. I’m currently writing a small book called The Life (of a Boy Called) Clive (It Rhymes with Five). This all came about because my wife (Gina the Gorgeous) has been pushing me to write fiction. She firmly believes that since I’ve written technical tomes (search for “Clive Maxfield” on Amazon), writing fiction would be easy peasy lemon squeezy. She doesn’t seem to realize that a real fiction writer can convey the impression of something like the grandiosity of a ballroom with a few well-chosen words. By comparison, I would find this task to be stressed depressed lemon zest because I would be reduced to documenting the ballroom’s dimensions using both imperial and metric measurements.

Unwilling to give up, Gina next suggested that I write all of the stories about my formative years that she’s heard (over and over again) from my mother. Like the time when I was around 18 months old and I discovered what the bottom of a large barrel full of ice-cold water looked and felt like, thereby necessitating yet another mad rush to the hospital with my parents (suffice it to say that the doctors and nurses at our local hospital knew me by name). Or the time we went on holiday when I was two years old, and I slid in my socks along the linoleum floor in our Victorian hotel bedroom, and I hit the low windowsill and shot out of the window. Did I mention we were on the sixth floor? My mom said she was talking to my dad when he suddenly performed a ballet leap across the room that would have made Nureyev proud (dad used to be a dancer before the war) and threw himself through the open window, leaving only a hand clasped to the window frame as a reminder of his earlier presence. He then slowly pulled himself back inside, grasping me by my ankle in his other hand. Or the time when I fell off the cliff, making a surprise entrance to the family basking below. Or the time when… but I’m sure you get the drift.

The thing is, I would like to illustrate each of these stories with little minimalist pencil sketches in the style of E.H. Shepard’s illustrations for Winnie-the-Pooh. This is the sort of thing I could happily do in the evenings while ensconced in my comfy chair in our family room—if only I had a USB Drive that could run my own personal copy of Stable Diffusion.

Am I the only one who wishes to be able to run GenAI at the edge (by which I mean the edge in the form of laptop computers and edge servers, not the extreme edge in the form of IoT devices, although I’m not saying we won’t end up there before too long)? “No!” I cry, “a thousand times no!” In fact, there are many applications ranging from AI home assistants to medical devices that would benefit from GenAI on the edge. The reasons for edge deployment of GenAI include lower cost, privacy and reliability, increased accuracy (personalized models can be fine-tuned with individual and enterprise data for customization and improved accuracy), and low latency (supporting real-time application of GenAI for surveillance, video conferencing, gaming…).

Do you recall, deep in the mists of time we used to call 2020, when I wrote my column Say Hello to Deep Vision’s Polymorphic Dataflow Architecture? At that time, we discussed Deep Vision’s Ara-1 device.

Meet the Ara-2 (Source: Kinara)

Well, reminiscent of Michael Jackson’s 1991 Black or White music video (go on, you know you want to see it again), which boasted the first full photorealistic face morphing (was that really 33 years ago?), Deep Vision somehow morphed into a company called Kinara.

I was just chatting with Ravi Annavajjhala, who is CEO at Kinara, Wajahat Qadeer, who is Co-Founder and Chief Architect at Kinara, and the legendary Markus Levy, who seems to be present wherever cutting-edge machine vision and artificial intelligence appear.

Our conversation spanned too many topics to cover here (I’m sorry). Suffice it to say that, as illustrated in the above diagram, the Ara-1—which is latency-optimized for edge operations and offers 10X Capex/TCO improvement over GPUs—is currently shipping in volume and has been for several years now, which makes it proven technology. Meanwhile, the currently sampling Ara-2 offers 5-8X performance improvement over the Ara-1.

The Ara-2’s neural cores offer enhanced compute utilization, along with support for INT4, INT8, and MSFP16 data types. Each Ara-2 can support 16GB of LPDDR4 (4X that of the Ara-1), and multiple chips can be used to provide scalable performance with automatic load balancing. Furthermore, the Ara-2 supports secure boot and encrypted memory, thereby keeping our secrets safe.
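For anyone wondering what running a model in INT8 or INT4 actually involves, here is a minimal sketch of generic symmetric quantization in Python. This is a textbook illustration only, not Kinara's method; the function names and the sample weights are my own inventions, and I've left MSFP16 out entirely.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1            # 127 for INT8, 7 for INT4
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0  # avoid dividing by zero for all-zero input
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, made up purely for illustration.
weights = [0.82, -0.41, 0.05, -1.27, 0.63]

for bits in (8, 4):
    q, scale = quantize_symmetric(weights, bits)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"INT{bits}: {q} (worst-case error {worst:.3f})")
```

The narrower the integer, the less memory and bandwidth each weight needs, but the coarser the approximation, which is why having both INT8 and INT4 on offer is handy.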

Of particular interest (at least, to me) is the fact that the Ara-1 and Ara-2 are both forward and backward compatible. For example, the guys and gals at Kinara are running GenAI applications on Ara-1 “because we can.” They say that GenAI on an Ara-1 chugs along a little slower than on an Ara-2, but, as we just mentioned, you can increase performance by ganging two or more Ara-1s together. An Ara-2 is capable of running a GenAI model on its own but, once again, you can gang multiple devices together if the occasion demands.

All of which leads nicely to the image illustrating Ara-2 products below. We start with the chip itself, which you can purchase standalone to build into your own custom products.

Ara-2 products (Source: Kinara)

Alternatively, you can purchase a module in the form of a USB module or an M.2 module, both of which are available with 4GB or 16GB of memory. Or, if you really wish to beef things up, you can opt for a PCIe card with 4X Ara-2 chips and 32GB or 64GB of memory, or 8X Ara-2 devices and 128GB of memory.
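As a quick sanity check on those PCIe numbers, the following Python sketch works out the memory per chip for each card. It assumes the card's memory is split evenly across its Ara-2 chips, which is my assumption rather than anything Kinara told me; the names are made up for illustration.

```python
# (number of Ara-2 chips, total card memory in GB), as quoted in this column.
CARD_CONFIGS = [(4, 32), (4, 64), (8, 128)]

# Per-chip LPDDR4 ceiling quoted earlier in this column.
MAX_GB_PER_CHIP = 16


def per_chip_gb(chips, total_gb):
    """Memory per chip, ASSUMING an even split across chips."""
    return total_gb / chips


for chips, total_gb in CARD_CONFIGS:
    share = per_chip_gb(chips, total_gb)
    verdict = "at the ceiling" if share == MAX_GB_PER_CHIP else "below the ceiling"
    print(f"{chips} chips, {total_gb}GB total: {share:.0f}GB per chip ({verdict})")
```

Under that even-split assumption, every configuration stays within the 16GB each Ara-2 can support.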

Yes, of course, if given a choice, I’d love to slip a PCIe card with 8X Ara-2 chips and 128GB of memory into my office tower computer and take it for a spin. On the other hand, I was just watching this video.

 

 

As we see, this handy-dandy USB device plugs into your notepad or laptop computer, after which you can bask in the glow of being able to run your very own local Stable Diffusion model. I can only imagine the looks of awe and envy if I were to be doing this on a plane while on my way to speak at a conference. One of those flights that’s jam-packed with techno-nerds (my peeps) going to the same conference—the sort of people who would understand and appreciate what they were looking at (history being made).

Not surprisingly, I’m wondering if I could use one of these bodacious beauties to generate the pencil sketch images for my The Life of Clive book. If I get to lay my hands on one, I’ll let you know. What say you? Do you have any thoughts you’d care to share, preferably without asking ChatGPT to craft a cunning comment for you, although—now I come to think about it—I’d be interested to see such a comment so long as you identified its author?

4 thoughts on “Generative AI Is Coming to the Edge!”

  1. PERPLEXITY says: The phrase “Generative AI Is Coming to the Edge!” can be interpreted as a synesthetic comment, suggesting a sensory experience related to the advancement of generative AI towards edge computing. This expression combines the concept of generative AI with the idea of movement or progress, possibly evoking a sense of anticipation or excitement.
    The term “synesthetic” typically refers to the production of a sense impression relating to one sense or part of the body by stimulation of another sense or part of the body. In this context, the phrase may metaphorically evoke a sensory experience related to the technological advancement of generative AI towards edge computing, potentially conveying a sense of innovation or transformation.
    The use of the phrase in this manner is creative and metaphorical, linking the advancement of technology to a sensory experience, and it may be interpreted as an imaginative way to convey the significance of this development.
    If you have a specific interpretation or context in which this phrase is being used, please feel free to provide additional details for a more tailored response.

  2. https://www.perplexity.ai/search/Max-Maxfield-or-I3_C9zXDS8K1lWf7vfAZaA “Generative AI Is Coming to the Edge!” That’s a synesthetic comment. The combination of GenAI and a gustatory perception sounds synesthetic to me and reminds me of the book cover of “Bebop to the Boolean Boogie: An Unconventional Guide to Electronics,” which is synesthetic too. After deciphering your book cover, I searched for further articles about synesthetic perceptions written by you and found your story about the electronic engineer Jordan A. Mills, who saw schematic diagrams in color. I searched a lot on the internet, but there was no Jordan A. Mills in the world, so it must have been a pseudonym.
    Some years ago, I gave my students the project requirement “you have to build a synaesthesis tester,” and the best of them answered “we can’t build a synaesthesis tester because we are no synaesthets,” which was quite right. So I invited some synaesthets into the classroom, which was an epiphany for most of us. A student later built a synaesthesis tester that generated a special colored letter pattern which included a second hidden pattern, but my specification was wrong, as I realized later when I tested it with synaesthets. All letters had been intentionally rotated by a small angle because I supposed the synesthetic capabilities were on another visual processing layer, and therefore the synaesthets were not able to read but only to feel the hidden pattern.

    1. As far as I recall, Jordan A. Mills was just a typical engineer, so he might not have a big digital footprint. I think the term “synaesthesis tester” is a bit generic (there are many different types of synaesthesia). How did you find some synaesthets to come to your class?
