Technology

The challenge with AI acceleration is not limited to the hardware, instead we believe software is the biggest challenge. Being able to run any neural network at high utilization (>80%) and low memory usage is the challenge. Feeding the 3-dimensional neural network data using a traditional instruction set makes the compiler intangible in terms of achieving utilization and there by low power and size. CortiCore architecture provides the solution via its unique instruction set that dramatically reduces the compiler complexity. The approach allows us to create a compiler that achieves >80% utilization with 16x reduced memory (compared to currently available solutions) on all neural networks – demonstrated on our FPGA platforms.

Computing’s Energy Problem and What We Can Do About It

— Mark Horowitz, PhD

As seen in this table, Mark Horowitz, PhD shows how the amount of energy needed for a 32bit SRAM read is 25x more than the energy needed for a 8bit multiply function. CortiCore architecture reduces the non IP system memory access to once per input data and never does any writes to the non-IP memory, inspite of low (256KB) internal memory.

Source: ISSCC 2014: Computing’s Energy Problem by Mark Horowitz, PhD

Scalable RTL via parameters for performance and power

Number of ALU
Number of clusters
Activation memory size per cluster (Local mem –can be 256KB, 512KB typical)
DDR or No DDR – external memory
Internal system memory
External shared memory (optional)

Technology

Scalable RTL via parameters for performance and power

No long interconnects or interconnect fabric

Designed for high speed and/or HVt cells

Synthesized and P&R complete at 400MHz, 22nm

Hardware configuration input to compiler

©2024 Roviero, Inc. | All rights reserved.