A new study by Cornell University researchers has confirmed that GSI Technology’s Associative Processing Unit (APU) can match the performance of high-end GPUs for large-scale artificial intelligence applications while consuming dramatically less energy. The findings highlight the potential of GSI’s compute-in-memory (CIM) architecture in the AI and high-performance computing markets.
The paper, published by ACM and presented at the MICRO ’25 conference, found that the company’s Gemini-I APU delivered throughput comparable to an NVIDIA A6000 GPU on retrieval-augmented generation (RAG) workloads. Key findings from the research include:
- GPU-Class Performance: The APU achieved similar throughput to the NVIDIA A6000 GPU.
- Massive Energy Savings: The APU consumed over 98% less energy than the GPU across various large datasets.
- Superior Efficiency Versus CPUs: The APU completed retrieval tasks several times faster than standard CPUs, cutting total processing time by up to 80%.
“Cornell’s independent validation confirms what we’ve long believed—compute-in-memory has the potential to disrupt the $100 billion AI inference market,” said Lee-Lean Shu, Chairman and CEO of GSI Technology. “The APU delivers GPU-class performance at a fraction of the energy cost, thanks to its highly efficient memory-centric architecture.”
Titled “Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device,” the study represents one of the first comprehensive evaluations of a commercial compute-in-memory device under realistic workloads. The research team benchmarked the GSI Gemini-I APU against established CPUs and GPUs using datasets ranging from 10GB to 200GB.
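The study does not publish its benchmark code, but the core of a RAG retrieval workload of the kind benchmarked here is a similarity search: scanning a corpus of embedding vectors for the top-k closest matches to a query. The sketch below is purely illustrative (not the researchers’ methodology) and uses a brute-force inner-product search over small random vectors; the function name `top_k` and all parameters are hypothetical.

```python
import heapq
import random
import time

def top_k(query, corpus, k=5):
    """Return indices of the k corpus vectors with the highest
    inner-product similarity to the query (brute-force scan)."""
    scores = (
        (sum(q * c for q, c in zip(query, vec)), idx)
        for idx, vec in enumerate(corpus)
    )
    # nlargest sorts by score first, so the best-matching indices come back
    # in descending order of similarity.
    return [idx for _, idx in heapq.nlargest(k, scores)]

# Toy corpus: 10,000 random 64-dimensional vectors (real RAG corpora hold
# millions of learned embeddings, which is why memory bandwidth dominates).
random.seed(0)
dim, n = 64, 10_000
corpus = [[random.random() for _ in range(dim)] for _ in range(n)]
query = [random.random() for _ in range(dim)]

start = time.perf_counter()
hits = top_k(query, corpus, k=5)
elapsed = time.perf_counter() - start
print(f"top-5 indices: {hits} ({elapsed * 1e3:.1f} ms)")
```

Because every query must touch every corpus vector, the workload is dominated by data movement rather than arithmetic, which is why a compute-in-memory design that performs the comparisons inside the memory arrays can cut energy so sharply relative to shuttling the corpus through a CPU or GPU cache hierarchy.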
The results point to significant opportunities for GSI Technology as customers increasingly prioritize performance-per-watt. Potential applications include edge AI for power-constrained robotics, drones, and IoT devices, as well as defense and aerospace systems where high performance is needed within strict energy and cooling limits.
Shu added that the company’s technology roadmap promises even greater gains. “Our recently released second-generation APU silicon, Gemini-II, can deliver roughly 10x faster throughput and even lower latency for memory-intensive AI workloads,” he said. “Looking ahead, Plato represents the next step forward, offering even greater compute capability at lower power for embedded edge applications.”
The Cornell study also introduced a new analytical framework for general-purpose compute-in-memory devices, providing optimization principles that strengthen the APU’s position as a scalable platform for developers.