To address rising performance requirements for artificial intelligence (AI) systems-on-chip (SoCs), US company Synopsys has released a new neural processing unit (NPU) IP and toolchain that it says deliver the performance needed for the latest, most complex neural network models.
The company notes that ADAS and other emerging AI use cases that implement complex neural network models are putting greater demands on compute and memory resources, often for safety-critical functions. To meet this range of application requirements, the company’s ARC NPX6 NPU IP scales from 4K to 96K multiply-accumulate (MAC) units and, in a single instance, delivers up to 250 tera operations per second (TOPS) at 1.3 GHz on 5 nm processes in worst-case conditions.
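As a rough sanity check on those figures, peak throughput can be estimated from the MAC count and clock frequency. The sketch below is illustrative only and not taken from Synopsys: the function name `peak_tops` is hypothetical, and it assumes that each MAC counts as two operations (one multiply, one add) and that "K" means 1,024. Under those assumptions, the 96K configuration at 1.3 GHz lands in the neighborhood of the quoted 250 TOPS.

```python
def peak_tops(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak throughput in tera operations per second (TOPS).

    Assumption (not from the vendor): every MAC unit completes one
    multiply-accumulate per clock cycle, counted as `ops_per_mac` operations.
    """
    return mac_units * ops_per_mac * clock_hz / 1e12


CLOCK_HZ = 1.3e9  # 1.3 GHz, as quoted in the article

# Smallest and largest configurations, assuming "K" = 1,024.
for macs in (4 * 1024, 96 * 1024):
    print(f"{macs:>6} MACs -> {peak_tops(macs, CLOCK_HZ):.1f} TOPS")
```

This is a theoretical ceiling; sustained throughput on a real network depends on memory bandwidth, layer shapes, and utilization of the MAC array.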
It also integrates hardware and software connectivity features that enable implementation of multiple NPU instances to achieve up to 3,500 TOPS of performance on a single SoC. The NPU IP provides more than 50 times the performance of the maximum configuration of the ARC EV7x Processor IP. Finally, the NPX6 offers optional 16-bit floating point support inside the neural processing hardware, maximizing layer performance and simplifying the transition from GPUs used for AI prototyping to high-volume power- and area-optimized SoCs.
“Based on our seamless experience integrating the Synopsys DesignWare ARC EV Processor IP into our successful NU4000 multi-core SoC, we have selected the new ARC NPX6 NPU IP to further strengthen the AI processing capabilities and efficiency of our products when executing the latest neural network models,” said Dor Zepeniuk, CTO at Inuitive, a designer of 3D and vision processors for advanced robotics, drones, augmented reality/virtual reality (AR/VR) devices and other edge AI and embedded vision applications. “In addition, the easy-to-use ARC MetaWare tools help us take maximum advantage of the processor hardware resources, ultimately helping us to meet our performance and time-to-market targets.”
The NPU IP meets stringent requirements for random hardware fault detection and for a systematic functional safety development flow, supporting compliance up to ISO 26262 ASIL D. The processors come with comprehensive safety documentation, feature dedicated safety mechanisms for ISO 26262 compliance, and address the mixed-criticality and virtualization requirements of next-generation zonal architectures.
Meanwhile, the ARC MetaWare MX Development Toolkit includes compilers and a debugger, a neural network software development kit (SDK), a virtual platforms SDK, runtimes and libraries, and advanced simulation models. It offers a single toolchain to accelerate application development and automatically partitions algorithms across the MAC resources for more efficient processing. For safety-critical automotive applications, the MetaWare MX Development Toolkit for Safety includes a safety manual and a safety guide to help developers meet ISO 26262 requirements and prepare for ISO 26262 compliance testing.