Generative AI Compute Front-Runner d-Matrix Reaches New Milestone in Efficient AI Inference

Digital in-memory compute chiplet is a first-of-its-kind for the industry

[August 22, San Francisco, CA]: Today, d-Matrix, the leader in high-efficiency generative AI compute for data centers, announced Jayhawk II, the next generation of its highly anticipated generative AI compute platform. The new silicon features an enhanced version of the company's digital in-memory compute (DIMC) engine with chiplet interconnect. It is an industry first: a DIMC architecture coupled with the OCP Bunch of Wires (BoW) PHY interconnect standard for low-latency inference on large language models (LLMs), from data-center-scale models like those behind ChatGPT to more focused models such as Meta's Llama 2 or Falcon from the Technology Innovation Institute.

Cloud and enterprise business leaders are eager to deploy generative AI applications but are encountering major hurdles: the cost of running inference, inference latency and throughput, and the availability of chips and compute power that scale for LLMs. Jayhawk II is designed to address each of these challenges by combining a DIMC architecture with a chiplet-based interconnect. The d-Matrix silicon delivers a 40x improvement in memory bandwidth compared with state-of-the-art high-end GPUs. That higher memory bandwidth translates into higher throughput and lower latency for generative inference applications while minimizing total cost of ownership (TCO).

“With the announcement of Jayhawk II, our customers are a step closer to serving generative AI and LLM applications with much better economics and a higher quality user experience than ever before,” said Sid Sheth, CEO and co-founder of d-Matrix. “We’re working with a range of companies large and small to evaluate the Jayhawk II silicon in real-world scenarios and the results are very promising.”

The Jayhawk II silicon follows the original Jayhawk, announced earlier in 2023, which demonstrated 2 Tbps bi-directional die-to-die connectivity and outperformed competitors on bandwidth, energy efficiency and cost-effectiveness. Today’s announcement builds on the original release and addresses the main challenges of running generative AI-specific LLMs:

  • A DIMC engine that scales from 30 TOPs/W to 150 TOPs/W using a 6nm process technology
  • Support for floating-point and block floating-point numerics across a range of precisions
  • Support for compression and sparsity approaches, enabling prompt caching for generative AI models
  • 10 – 20x more generative inferences per second for LLM model sizes ranging from 3B to 40B parameters, compared to incumbent state-of-the-art GPU solutions
  • 10 – 20x better TCO for generative inference when compared to those GPU solutions
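The block floating-point numerics mentioned above share a single exponent across a block of values, so each element stores only a small integer mantissa; this is one way such formats cut memory traffic relative to full floating point. As a rough illustration of the general idea only (the block size, mantissa width and rounding below are arbitrary assumptions, not d-Matrix's implementation):

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to block floating point:
    one shared exponent plus per-element integer mantissas."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # Choose the shared exponent from the largest magnitude in the block,
    # so that value's mantissa uses the full available range.
    shared_exp = math.frexp(max_abs)[1]  # max_abs == m * 2**shared_exp, 0.5 <= m < 1
    scale = 2.0 ** (shared_exp - mantissa_bits)
    mantissas = [round(x / scale) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=8):
    """Reconstruct approximate floats from a shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]

exp_, mants = bfp_quantize([0.5, -0.25, 0.125, 0.75])
recovered = bfp_dequantize(exp_, mants)  # exact here, since inputs are powers of two
```

Small values in a block with a large peak lose precision to the shared exponent, which is why hardware implementations typically keep blocks small and tune mantissa width per precision mode.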

Jayhawk II is now available for demos and evaluation. To learn more visit d-matrix.ai.

About d-Matrix

d-Matrix is leading the data center architecture shift to Digital In-Memory Computing (DIMC) to address the growing demand for transformer and generative AI inference acceleration. d-Matrix creates flexible solutions for inference at scale using innovative circuit techniques, a chiplet-based architecture, high-bandwidth BoW interconnects and a full stack of machine learning and large language model tools and software. Founded in 2019, the company is backed by top investors and strategic partners including Playground Global, M12 (Microsoft Venture Fund), SK Hynix, Nautilus Venture Partners, Marvell Technology and Entrada Ventures.

Visit d-matrix.ai for more information and follow d-Matrix on LinkedIn for the latest updates.
