
CacheQ Unveils GPU Support for QCC Heterogeneous Compute Development Environment

Delivers Faster Performance, Reduced Development Time for GPU Compute Architectures

LOS GATOS, CALIF., –– January 18, 2023 –– CacheQ Systems, Inc. today announced GPU support for its QCC Acceleration Platform, a heterogeneous compute development environment delivering faster performance and reduced development time for computer architectures including multi-core processors, GPUs and field-programmable gate arrays (FPGAs).

“Demand for hardware acceleration using GPUs and other heterogeneous compute hardware is growing exponentially,” remarks Clay Johnson, CEO and co-founder of CacheQ Systems, developer of heterogeneous acceleration solutions. “Our goal is to simplify high-performance data center and edge-computing application development. The QCC Acceleration Platform meets that goal and will enable new solutions across a variety of applications, including life sciences, financial trading, government, oil and gas exploration and industrial IoT.”

The QCC Acceleration Platform Advantage

GPU deployment has advanced rapidly over the last five years, and the $25 billion-a-year industry is expected to continue growing at a compound annual growth rate (CAGR) of approximately 33% through 2028.

Heterogeneous compute systems, such as multi-core processors, GPUs and FPGAs attached to these processors, have relied on software tools supported by hardware vendors and the open-source community. These tools have traditionally relied on software developers to express the parallelism in their code to the compiler through hardware-specific APIs such as CUDA from NVIDIA, HIP from AMD, and oneAPI from Intel.

Other efforts attempt to support pragmas embedded in C, C++, and Fortran through OpenACC, OpenMP, and OpenCL. All require deep knowledge of the target hardware: to achieve performance and correct code behavior on parallel compute units, the developer must control memory copy and synchronization events, create teams of threads, manually remove loop-carried dependencies and race conditions, and add reductions such as summations.
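To make the pragma-based approach concrete, the following is a minimal sketch of a dot product annotated with a standard OpenMP offload directive. It is an illustration only and is not taken from CacheQ material; the function name `dot_offload` is hypothetical. The single directive spells out the items listed above: the `map` clauses control memory copies, `teams distribute parallel for` creates teams of threads, and `reduction` declares the summation so that threads do not race on `sum`.

```c
#include <stddef.h>

/* Hypothetical example: dot product with an OpenMP target directive.
 * Compiled without OpenMP support, the pragma is ignored and the loop
 * simply runs serially on the host with the same result. */
double dot_offload(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    /* map(to: ...)      copy the input arrays to the device
     * map(tofrom: sum)  copy the accumulator in and back out
     * reduction(+:sum)  combine per-thread partial sums safely */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(tofrom: sum) reduction(+:sum)
    for (long i = 0; i < (long)n; ++i) {
        sum += a[i] * b[i];
    }

    return sum;
}
```

Omitting the `reduction` clause would leave the code compilable but incorrect when run in parallel, which is the kind of hardware-aware detail the developer has to get right by hand.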

CacheQ QCC is the first compiler platform to automatically extract parallelism from standard C, C++, and Fortran code without requiring the developer to explicitly communicate parallelism to the compiler. QCC automatically accelerates applications on a variety of hardware, exceeding the performance of pragma-based approaches and approaching that of hand-coded API solutions, with minimal hardware knowledge required. A developer can write generic code and target high-performance hardware at compile time without refactoring, or with refactoring that is not specific to the target hardware and is easy to verify functionally.
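As an illustration of what "standard" code means here, the sketch below shows two plain serial C loops with no pragmas, vendor APIs, or target-specific constructs. The function names are hypothetical, and the announcement does not state how QCC handles any particular loop; the point is only that code of this form carries no explicit parallelism hints and can be verified functionally on any host compiler.

```c
#include <stddef.h>

/* Hypothetical example: y = alpha * x + y, written as ordinary serial C.
 * Each iteration is independent of the others. */
void saxpy(float alpha, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}

/* Hypothetical example: a summation written as ordinary serial C.
 * The accumulator creates a loop-carried dependency that a
 * parallelizing tool would need to recognize as a reduction. */
double sum_of_squares(const double *v, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        total += v[i] * v[i];
    }
    return total;
}
```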

Based on the proprietary CacheQ virtual machine (CQVM), the QCC Acceleration Platform is a heterogeneous compute development environment that converts serial high-level language (HLL) code into a parallel representation in less than 30 seconds for the most complex designs. It supports code profiling, utilization estimates, performance simulation, memory configuration and partitioning across a variety of compute engines, including GPUs, x86, Arm and RISC-V processors, and FPGAs, prior to generating a compute executable.

Features include a development environment with uniform drivers, protected containers and support for multiple boards from multiple vendors. Its design analysis offers profiling, performance simulation and memory activity reporting. An optimization capability adds code unrolling, user-driven memory configuration, and automatic and user-guided partitioning. 

The FPGA implementation includes a resource estimator, pre-configured shells, multiple boards and parts, and implementation tool automation. The memory implementation supports automatic integration, multi-port/multi-access and striping.

Availability and Pricing

The QCC Acceleration Platform is shipping now in limited volume, with general availability projected for late 2023. The 0.18 release supports GPUs from NVIDIA and AMD, FPGA accelerator boards from Xilinx, and CPUs from Intel, AMD, Arm, Apple, and RISC-V.

Pricing is available on request.

Visit the CacheQ website for additional information, or to request a demonstration or early access to the QCC Acceleration Platform.

About CacheQ Systems

CacheQ Systems, headquartered in Los Gatos, Calif., with a development center in Longmont, Colo., was founded in 2018 to accelerate performance and simplify development of data center and edge-computing applications executing on processors and single or multiple FPGAs. Its QCC Acceleration Platform reduces development time and increases acceleration, enabling software developers to implement heterogeneous compute solutions leveraging processors, GPUs and FPGAs with limited hardware architecture knowledge. More information can be found at the CacheQ Systems website.
