Keywords:
Voltage
In-memory computing
Random access memory
Neural networks
MOSFET
Energy efficiency
Computational modeling
SRAM
compute-in-memory (CIM)
quantized neural networks
multiply and accumulate (MAC)
zero-skipping
current-mode accumulation
Abstract:
This article presents a novel dual 7T static random-access memory (SRAM)-based compute-in-memory (CIM) macro for processing quantized neural networks. The proposed SRAM-based CIM macro decouples read/write operations and employs a zero-input/weight skipping scheme. A 65nm test chip with 528x128 integrated dual 7T bitcells demonstrated reconfigurable precision multiply and accumulate operations with 384 x binary inputs (0/1) and 384 x128 programmable multi-bit weights (3/7/15-levels). Each column comprises 384 x bitcells for a dot product, 48 x bitcells for offset calibration, and 96 x bitcells for binary-searching analog-to-digital conversion. The analog-to-digital converter (ADC) converts a voltage difference between two read bitlines (i.e., an analog dot-product result) to a 1-6b digital output code using binary searching in 1-6 conversion cycles using replica bitcells. The test chip with 66Kb embedded dual SRAM bitcells was evaluated for processing neural networks, including the MNIST image classifications using a multi-layer perceptron (MLP) model with its layer configuration of 784-256-256-256-10 The measured classification accuracies are 97.62%, 97.65%, and 97.72% for the 3, 7, and 15 level weights, respectively. The accuracy degradations are only 0.58 to 0.74% off the baseline with software simulations. For the VGG6 model using the CIFAR-10 image dataset, the accuracies are 88.59%, 88.21%, and 89.07% for the 3, 7, and 15 level weights, with degradations of only 0.6 to 1.32% off the software baseline. The measured energy efficiencies are 258.5, 67.9, and 23.9 TOPS/W for the 3, 7, and 15 level weights, respectively, measured at 0.45/0.8V supplies.