This project will investigate a hybrid-grained programmable computing device for highly energy-efficient deep neural network processing. The device uses an array-based computing architecture composed of both fine-grained and coarse-grained programmable units. The fine-grained units are based on LUTs (look-up tables), which are flexible enough to implement arbitrary logic. The coarse-grained units are based on PEs (processing elements), which are highly efficient at basic neuron computation. The project will study the computing architecture, memory access interface, programming method, and synthesis tools, and will develop key techniques including a hybrid-grained programmable architecture, highly energy-efficient approximate computing units, a dynamically programmable memory interface, and HLS (high-level synthesis) for the device. The project will fabricate a prototype chip integrated with 3D DRAM as a SiP (System in Package). Using the HLS workflow, the device can be programmed to run intelligent applications based on deep neural networks, with an energy efficiency of over 10 TOPS/W. The prototype chip will be verified by implementing an object recognition system.
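This project studies a hybrid-grained programmable computing device whose main goal is energy-efficient deep neural network processing. The device adopts an array computing architecture that mixes fine-grained programmable units (LUTs) and coarse-grained programmable units (PEs). The fine-grained units are highly flexible to configure and can implement arbitrary logic. The coarse-grained units are highly energy-efficient and support the core neuron computations. Working together, the two kinds of unit provide energy-efficient, general-purpose deep neural network processing. The project covers the computing architecture, memory access interface, programming mechanism, and synthesis tools, and aims at breakthroughs in key technologies including the hybrid-grained programmable array architecture, energy-efficient approximate computing units, a dynamically configurable memory access interface, and automatic synthesis from a high-level scripting language. It will design a prototype chip and complete tape-out verification of the key component circuits. The device's computing energy efficiency will be no lower than 10 TOPS/W, it will support 2.5D integrated packaging with 3D DRAM, and, together with the toolchain, it will support general-purpose deep neural network applications programmed in a high-level language. The project will build an object recognition verification system based on the prototype chip.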
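To meet the urgent demand of intelligent computing for chips that are both flexible and energy-efficient, this project designed a programmable device and chip architecture that mixes coarse-grained and fine-grained units. The fine-grained units are highly flexible to configure and can implement arbitrary logic; the coarse-grained units are highly energy-efficient and support the core neuron computations. Through the cooperation of the two kinds of programmable unit, neural networks with a wide range of topologies and parameters can be computed efficiently and flexibly. The project studied the computing architecture, memory access interface, programming mechanism, and synthesis tools, and achieved breakthroughs in key technologies including the hybrid-grained programmable array architecture, energy-efficient approximate computing units, a dynamically configurable memory access interface, and automatic synthesis from a high-level scripting language. A prototype chip was designed, and an object recognition verification system was built on it. The chip reached a computing energy efficiency of 20 TOPS/W, supports 2.5D integrated packaging with 3D DRAM, and, together with the toolchain, supports general-purpose deep neural network applications programmed in a high-level language.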
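As a rough illustration of the two kinds of programmable unit described above, the sketch below models a fine-grained unit as a truth-table LUT and a coarse-grained unit as a multiply-accumulate PE with a ReLU activation. It is a behavioural toy in Python, not the project's design: the class names `LUT` and `PE`, the helper `lut_from_function`, the choice of ReLU, and the integer arithmetic are all assumptions made for illustration.

```python
from typing import Callable, Sequence


class LUT:
    """Hypothetical k-input look-up table: any Boolean function of k bits,
    stored as a truth table with 2**k entries (first input is the MSB)."""

    def __init__(self, k: int, truth_table: Sequence[int]) -> None:
        if len(truth_table) != 1 << k:
            raise ValueError("truth table must have 2**k entries")
        self.k = k
        self.table = [int(bool(v)) for v in truth_table]

    def __call__(self, *bits: int) -> int:
        if len(bits) != self.k:
            raise ValueError(f"expected {self.k} input bits")
        index = 0
        for b in bits:  # pack the input bits into a table index
            index = (index << 1) | (b & 1)
        return self.table[index]


def lut_from_function(k: int, fn: Callable[..., int]) -> LUT:
    """'Program' a LUT by enumerating fn over all 2**k input combinations."""
    table = []
    for i in range(1 << k):
        bits = [(i >> (k - 1 - j)) & 1 for j in range(k)]
        table.append(int(bool(fn(*bits))))
    return LUT(k, table)


class PE:
    """Hypothetical processing element: a fixed-function neuron kernel,
    here a multiply-accumulate followed by ReLU (an assumed activation)."""

    def __init__(self, weights: Sequence[int], bias: int = 0) -> None:
        self.weights = list(weights)
        self.bias = bias

    def __call__(self, inputs: Sequence[int]) -> int:
        if len(inputs) != len(self.weights):
            raise ValueError("input length must match weight length")
        acc = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return max(acc, 0)


# Fine-grained unit configured as arbitrary logic (XOR, 3-input majority).
xor = lut_from_function(2, lambda a, b: a ^ b)
majority = lut_from_function(3, lambda a, b, c: (a + b + c) >= 2)

# Coarse-grained unit configured with weights for one neuron.
neuron = PE(weights=[1, -2, 3], bias=1)
```

The contrast the sketch tries to capture is that the LUT can be reconfigured to any Boolean function at the cost of a table that grows as 2**k, while the PE does only one kind of computation but does it directly, which is the flexibility versus efficiency trade-off the abstract attributes to the two unit types.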
Data last updated: 2023-05-31