An Energy-Efficient FPGA Implementation of an LSTM Network Using Approximate Computing
The Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) is known for its capability to model temporal aspects of data and has been shown to produce promising results in sequence learning tasks such as language modeling. However, due to the large number of model parameters and compute-intensive operations, existing FPGA implementations of LSTM cells are not sufficiently energy-efficient: they require large area and exhibit high power consumption. This work describes a substantially different hardware implementation of an LSTM that includes several architectural innovations to achieve high throughput and energy efficiency. The paper includes an extensive exploration of the design trade-offs and demonstrates the advantages for one common application - language modeling. Implementation of the design on a Xilinx Zynq XC7Z030 FPGA for language modeling shows significant improvements in throughput and energy efficiency compared to state-of-the-art designs. It is worth mentioning that the proposed LSTM hardware architecture is also applicable to other applications that use an LSTM either as part of the neural network model (e.g., CNN-RNN models) or as the whole model (e.g., pure RNN models).
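To illustrate why LSTM cells are compute- and parameter-heavy, the sketch below shows one LSTM time step in plain NumPy: the four gate computations are dominated by large matrix-vector products, which is what an FPGA implementation must map to hardware. This is a generic reference formulation for illustration only, not the paper's approximate-computing design; the function and variable names are our own.

```python
import numpy as np

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch, not the paper's design).

    The fused affine transform below covers all four gates, so W has
    shape (4H, D), U has shape (4H, H), and b has shape (4H,) for
    input dimension D and hidden dimension H. These two matrix-vector
    products account for most of the parameters and compute.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # fused pre-activations, shape (4H,)
    i = 1 / (1 + np.exp(-z[:H]))        # input gate (sigmoid)
    f = 1 / (1 + np.exp(-z[H:2*H]))     # forget gate (sigmoid)
    g = np.tanh(z[2*H:3*H])             # candidate cell state
    o = 1 / (1 + np.exp(-z[3*H:]))      # output gate (sigmoid)
    c = f * c_prev + i * g              # new cell state
    h = o * np.tanh(c)                  # new hidden state
    return h, c
```

For a hidden size H and input size D, one step costs roughly 4H(D + H) multiply-accumulates plus the elementwise nonlinearities, which is why approximate arithmetic and reduced-precision datapaths can pay off substantially in area and power.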