A Ternary Neural Network with Compressed Quantized Weight Matrix for Low Power Embedded Systems

Authors

  • S. N. Truong, Faculty of Electrical and Electronics Engineering, Ho Chi Minh City University of Technology and Education, Vietnam

Abstract

In this paper, we propose a method for transforming a real-valued matrix into a ternary matrix with controllable sparsity. The sparsity of the quantized weight matrices can be controlled by adjusting a threshold during the training and quantization process. A 3-layer ternary neural network was trained on the MNIST dataset using the proposed adjustable dynamic threshold. As the sparsity of the quantized weight matrices varied from 0.1 to 0.6, the recognition rate decreased from 91% to 88%. The sparse weight matrices were stored in the compressed sparse row (CSR) format to speed up the ternary neural network, which can be deployed on low-power embedded systems such as the Raspberry Pi 3 board. With a quantized weight matrix sparsity of 0.1, the ternary neural network is 4.24 times faster than the same network without weight matrix compression, and it becomes faster as the sparsity increases. When the sparsity is as high as 0.6, the recognition rate degrades by 3%, but the network runs 9.35 times faster than without compression. Ternary neural networks with compressed sparse weight matrices are therefore feasible for low-cost, low-power embedded systems.
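The abstract combines two ingredients that the short Python sketch below illustrates with NumPy and SciPy: thresholding a real-valued weight matrix into a ternary {-1, 0, +1} matrix, and storing the result in CSR format so the forward pass reduces to a sparse matrix-vector product. This is a minimal sketch under stated assumptions, not the paper's exact procedure: a fixed threshold stands in for the adjustable dynamic threshold tuned during training, and the matrix shape, threshold value, and definition of sparsity (here, the fraction of zero entries) are illustrative choices.

    import numpy as np
    from scipy.sparse import csr_matrix

    def ternarize(W, threshold):
        """Quantize a real-valued weight matrix to {-1, 0, +1}.

        Entries with |w| <= threshold map to 0, so raising the
        threshold increases the sparsity of the ternary matrix.
        (The paper adjusts this threshold dynamically during
        training; a fixed value is used here for illustration.)
        """
        T = np.zeros_like(W, dtype=np.int8)
        T[W > threshold] = 1
        T[W < -threshold] = -1
        return T

    # Toy stand-in for a trained layer: 784 inputs (a flattened
    # 28x28 MNIST image) feeding 128 hidden units.
    rng = np.random.default_rng(0)
    W = rng.normal(size=(128, 784))

    T = ternarize(W, threshold=0.8)
    sparsity = np.mean(T == 0)  # fraction of zero weights
    print(f"sparsity = {sparsity:.2f}")

    # Store the sparse ternary matrix in CSR format so the layer's
    # forward pass skips the zeroed weights entirely.
    T_csr = csr_matrix(T)

    x = rng.normal(size=784)   # one input vector
    y_dense = T @ x            # dense matrix-vector product
    y_sparse = T_csr @ x       # CSR product over nonzeros only
    assert np.allclose(y_dense, y_sparse)

Skipping the zero entries in the CSR product is what yields the reported speedup on the Raspberry Pi; the exact gain depends on the achieved sparsity and on the NumPy/SciPy build for the target platform.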

Keywords:

quantized neural network, ternary neural network, deep learning, image recognition




How to Cite

[1] S. N. Truong, “A Ternary Neural Network with Compressed Quantized Weight Matrix for Low Power Embedded Systems”, Eng. Technol. Appl. Sci. Res., vol. 12, no. 2, pp. 8311–8315, Apr. 2022.
