Over the past few months, I’ve waffled on (as is my wont) about various flavors (ooh, tasty) of generative artificial intelligence (GenAI). On the menu were items like GitHub Copilot, which generates code (throwing in errors and security vulnerabilities for free), and Metabob from Metabob (can you spell “Zen”), which looks at the code generated by GitHub Copilot and takes the bugs and security vulnerabilities out again (to understand recursion, you must first understand recursion).
Another type of GenAI comes in the form of text-to-image models, such as Stable Diffusion, which can be used to generate detailed images conditioned on text descriptions. Stable Diffusion can also be used for related tasks, such as inpainting (where damaged, deteriorated, or missing parts of an artwork are filled in to present a complete image), outpainting (extending an existing image beyond its original canvas), and image-to-image translations.
The thing is, until now, we all pretty much took it for granted that GenAI models of this class, capacity, and caliber were destined to be run in the cloud. Can you imagine what it would be like to be able to run something like Stable Diffusion on a USB Drive that plugs into your notepad or laptop computer? I can. Actually, I don’t have to imagine this because I’ve seen it in action (I’ll show you a video later in this column).
To be honest, I have a personal interest in being able to generate images from text descriptions. I’m currently writing a small book called The Life (of a Boy Called) Clive (It Rhymes with Five). This all came about because my wife (Gina the Gorgeous) has been pushing me to write fiction. She firmly believes that since I’ve written technical tomes (search for “Clive Maxfield” on Amazon), writing fiction would be easy peasy lemon squeezy. She doesn’t seem to realize that a real fiction writer can convey the impression of something like the grandiosity of a ballroom with a few well-chosen words. By comparison, I would find this task to be stressed depressed lemon zest because I would be reduced to documenting the ballroom’s dimensions using both imperial and metric measurements.
Unwilling to give up, Gina next suggested that I write all of the stories about my formative years that she’s heard (over and over again) from my mother. Like the time when I was around 18 months old and I discovered what the bottom of a large barrel full of ice-cold water looked and felt like, thereby necessitating yet another mad rush to the hospital with my parents (suffice it to say that the doctors and nurses at our local hospital knew me by name). Or the time we went on holiday when I was two years old, and I slid in my socks along the linoleum floor in our Victorian hotel bedroom, and I hit the low windowsill and shot out of the window. Did I mention we were on the sixth floor? My mom said she was talking to my dad when he suddenly performed a ballet leap across the room that would have made Nureyev proud (dad used to be a dancer before the war) and threw himself through the open window, leaving only a hand clasped to the window frame as a reminder of his earlier presence. He then slowly pulled himself back inside, grasping me by my ankle in his other hand. Or the time when I fell off the cliff, making a surprise entrance to the family basking below. Or the time when… but I’m sure you get the drift.
The thing is, I would like to illustrate each of these stories with little minimalist pencil sketches in the style of E.H. Shepard’s illustrations for Winnie-the-Pooh. This is the sort of thing I could happily do in the evenings while ensconced in my comfy chair in our family room—if only I had a USB Drive that could run my own personal copy of Stable Diffusion.
Am I the only one who wishes to be able to run GenAI at the edge (by which I mean the edge in the form of laptop computers and edge servers, not the extreme edge in the form of IoT devices, although I’m not saying we won’t end up there before too long)? “No!” I cry, “a thousand times no!” In fact, there are many applications ranging from AI home assistants to medical devices that would benefit from GenAI at the edge. The reasons for edge deployment of GenAI include lower cost, privacy and reliability, increased accuracy (personalized models can be fine-tuned with individual and enterprise data for customization and improved accuracy), and low latency (supporting real-time application of GenAI for surveillance, video conferencing, gaming…).
Do you recall, deep in the mists of time we used to call 2020, when I wrote my column, Say Hello to Deep Vision’s Polymorphic Dataflow Architecture? At that time, we discussed Deep Vision’s Ara-1 device.
Meet the Ara-2 (Source: Kinara)
Well, reminiscent of Michael Jackson’s 1991 Black or White music video (go on, you know you want to see it again), which boasted the first full photorealistic face morphing (was that really 33 years ago?), Deep Vision somehow morphed into a company called Kinara.
I was just chatting with Ravi Annavajjhala, who is CEO at Kinara, Wajahat Qadeer, who is Co-Founder and Chief Architect at Kinara, and the legendary Markus Levy, who seems to be present wherever cutting-edge machine vision and artificial intelligence appear.
Our conversation spanned too many topics to cover here (I’m sorry). Suffice it to say that, as illustrated in the above diagram, the Ara-1—which is latency-optimized for edge operations and offers 10X Capex/TCO improvement over GPUs—is currently shipping in volume and has been for several years now, which makes it proven technology. Meanwhile, the currently sampling Ara-2 offers 5-8X performance improvement over the Ara-1.
The Ara-2’s neural cores offer enhanced compute utilization, along with support for INT4, INT8, and MSFP16 data types. Each Ara-2 can support 16GB of LPDDR4 (4X that of the Ara-1), and multiple chips can be used to provide scalable performance with automatic load balancing. Furthermore, the Ara-2 supports secure boot and encrypted memory, thereby keeping our secrets safe.
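For those who like to see the sums, here is a back-of-the-envelope sketch in Python of why those data types matter when it comes to squeezing a model into 16GB. The only numbers taken from this column are the three data type names and the 16GB per Ara-2. Everything else is an assumption made for illustration: the function names are hypothetical, the one-billion-parameter model is made up, MSFP16 is treated as a flat 16 bits per weight, a GB is treated as 10^9 bytes, and only the weights are counted (activations and working buffers are ignored).

```python
# Back-of-the-envelope sketch: memory needed for a model's weights at each
# of the data types the Ara-2 is said to support. Illustrative only.

# Bits per weight. INT4 and INT8 are 4 and 8 bits by definition; treating
# MSFP16 as a flat 16 bits per weight is an ASSUMPTION for this sketch.
BITS_PER_WEIGHT = {"INT4": 4, "INT8": 8, "MSFP16": 16}

# 16GB of LPDDR4 per Ara-2 comes from the column; counting a GB as
# 10**9 bytes is an ASSUMPTION.
ARA2_MEMORY_BYTES = 16 * 10**9


def weight_bytes(num_params: int, dtype: str) -> int:
    """Bytes needed to store num_params weights (weights only)."""
    return num_params * BITS_PER_WEIGHT[dtype] // 8


def fits_on_one_ara2(num_params: int, dtype: str) -> bool:
    """True if the weights alone fit in a single Ara-2's memory."""
    return weight_bytes(num_params, dtype) <= ARA2_MEMORY_BYTES


if __name__ == "__main__":
    # HYPOTHETICAL parameter count, not a measurement of any real model.
    hypothetical_params = 1_000_000_000
    for dtype in BITS_PER_WEIGHT:
        gb = weight_bytes(hypothetical_params, dtype) / 10**9
        verdict = "fits" if fits_on_one_ara2(hypothetical_params, dtype) else "does not fit"
        print(f"{dtype}: {gb:.1f} GB of weights, {verdict} in 16 GB")
```

The point of the exercise is simply that halving the bits per weight halves the memory the weights occupy, which is one reason low-precision integer types are attractive on a memory-constrained edge device.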
Of particular interest (at least, to me) is the fact that the Ara-1 and Ara-2 are both forward and backward compatible. For example, the guys and gals at Kinara are running GenAI applications on Ara-1 “because we can.” They say that GenAI on an Ara-1 chugs along a little slower than on an Ara-2, but, as we just mentioned, you can increase performance by ganging two or more Ara-1s together. An Ara-2 is capable of running a GenAI model on its own but, once again, you can gang multiple devices together if the occasion demands.
All of which leads nicely to the image illustrating Ara-2 products below. We start with the chip itself, which you can purchase standalone to build into your own custom products.
Ara-2 products (Source: Kinara)
Alternatively, you can purchase a USB module or an M.2 module, both of which are available with 4GB or 16GB of memory. Or, if you really wish to beef things up, you can opt for a PCIe card with 4X Ara-2 chips and 32GB or 64GB of memory, or 8X Ara-2 chips and 128GB of memory.
Yes, of course, if given a choice, I’d love to slip a PCIe card with 8X Ara-2 chips and 128GB of memory into my office tower computer and take it for a spin. On the other hand, I was just watching this video.
As we see, this handy-dandy USB device plugs into your notepad or laptop computer, after which you can bask in the glow of being able to run your very own local Stable Diffusion model. I can only imagine the looks of awe and envy if I were to be doing this on a plane while on my way to speak at a conference. One of those flights that’s jam-packed with techno-nerds (my peeps) going to the same conference—the sort of people who would understand and appreciate what they were looking at (history being made).
Not surprisingly, I’m wondering if I could use one of these bodacious beauties to generate the pencil sketch images for my The Life of Clive book. If I get to lay my hands on one, I’ll let you know. What say you? Do you have any thoughts you’d care to share, preferably without asking ChatGPT to craft a cunning comment for you, although—now I come to think about it—I’d be interested to see such a comment so long as you identified its author?