A Low-cost Artificial Neural Network Model for Raspberry Pi

Abstract: In this paper, a ternary neural network with complementary binary arrays is proposed for representing the signed synaptic weights. The proposed ternary neural network is deployed on a low-cost Raspberry Pi embedded board for speech and image recognition. In conventional ternary neural networks, the signed synaptic weights of -1, 0, and 1 are represented by 8-bit integers. To reduce the amount of memory required for the signed synaptic weights, the signed values are represented by a complementary binary array. For binary inputs, the multiplication of two binary numbers is replaced by the bit-wise AND operation to speed up the neural network. For image recognition, the MNIST dataset was used for training and testing the proposed neural network, and the recognition rate was as high as 94%. The proposed ternary neural network was also applied to real-time object recognition, where the recognition rate for 10 simple objects captured from a camera was 89%. Representing the signed synaptic weights with the complementary binary array reduces the memory required for storing the model's parameters and internal parameters by 75%. The proposed ternary neural network is 4.2, 2.7, and 2.4 times faster than the conventional ternary neural network for MNIST image recognition, speech command recognition, and real-time object recognition, respectively.


I. INTRODUCTION
Artificial Neural Networks (ANNs) and deep learning have achieved impressive successes in fields such as image recognition, speech recognition, and prediction [1][2][3][4][5][6]. ANNs are computationally expensive because they involve a huge number of computational tasks and internal parameters. For this reason, ANNs are often implemented on high-performance CPUs (Central Processing Units) and GPUs (Graphics Processing Units) rather than on low-cost embedded systems [7]. Various ANN models have been proposed for low-cost embedded systems, such as binary neural networks and ternary neural networks [8][9][10][11][12][13][14]. Binary neural networks are optimized models of neural networks that constrain the synaptic weights to the binary space {-1, 1} [8][9][10][11]. In a binary neural network, the conventional 32-bit floating-point multipliers are replaced by the logical XNOR operation to speed up performance. However, their accuracy is lower than that of full-precision neural networks because only one bit is used to represent the synaptic weight and the activation function. To increase speed and accuracy, ternary neural networks that constrain the synaptic weights to the ternary space {-1, 0, 1} have been proposed [12][13][14][15]. The accuracy of ternary and binary neural networks is slightly lower than that of full-precision neural networks, but the required memory is reduced significantly. Ternary neural networks are therefore suitable for implementation on low-cost embedded systems [14]. In a ternary neural network, the signed values of -1, 0, and 1 require 8-bit wide memories and 8-bit multipliers. In this paper, a method for representing the signed ternary synaptic weights by using the binary numbers 0 and 1 is proposed. The multipliers are replaced by logical AND operations to reduce the required storage space and speed up the network.
The proposed ternary neural network is deployed on a Raspberry Pi board for a mobile robot for object recognition and speech command recognition. Figure 1 shows a conceptual diagram of a control unit for a mobile robot. The control unit is implemented on a low-cost Raspberry Pi board. The system performs pattern recognition tasks including image recognition, speech command recognition, and real-time object recognition.

Figure 1. Block diagram of the control unit for a mobile robot using the Raspberry Pi board.

A. Ternary Neural Networks
A ternary neural network is an optimized model of ANNs with the weights constrained to -1, 0, and +1 to reduce the amount of memory required for storing the model's parameters [12][13][14][15]. The synaptic weights are quantized to 2 bits. It should be noted that negative synaptic weights are necessary because synapses are either excitatory or inhibitory [16][17][18]. Figure 2 shows a conceptual ternary neural network, where the synaptic weights are -1, 0, or +1. To represent the signed values of -1, 0, or +1, an 8-bit wide memory can be used instead of the 32-bit wide memory used in full-precision neural networks, so the model parameters of a ternary neural network require less memory than those of a full-precision neural network. Although the amount of required memory is reduced, the ternary neural network still exceeds the capability of a low-cost embedded system such as the Raspberry Pi board. The huge number of computational tasks (additions and multiplications) makes ternary neural networks too complicated to be deployed on a low-cost embedded system. In this work, a complementary binary array is proposed to represent the signed synaptic weights. The multiplication is replaced by the bitwise AND operation to speed up the network.

Figure 2. The conceptual diagram of a ternary neural network, where the synaptic weights are -1, 0, or +1.
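As a point of reference for the conventional representation described above, the following minimal Python sketch computes the weighted sum of one neuron with binary inputs and signed ternary weights, using one multiplication and one addition per synapse. The function name `conventional_weighted_sum` and the example values are illustrative assumptions and are not taken from the paper; the activation function is omitted.

```python
from typing import Sequence


def conventional_weighted_sum(x: Sequence[int], w: Sequence[int]) -> int:
    """Weighted sum of binary inputs with signed ternary weights.

    x holds binary inputs (0 or 1). w holds ternary weights (-1, 0, or +1);
    in the conventional scheme each weight would occupy one 8-bit signed
    integer in memory. This helper is a hypothetical illustration.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    total = 0
    for xi, wi in zip(x, w):
        if xi not in (0, 1) or wi not in (-1, 0, 1):
            raise ValueError("inputs must be binary and weights ternary")
        total += xi * wi  # one multiplication and one addition per synapse
    return total


# Illustrative values only (not from the paper).
inputs = [1, 0, 1, 1, 0, 1]
weights = [1, -1, 0, -1, 1, 1]
baseline_sum = conventional_weighted_sum(inputs, weights)
print(baseline_sum)
```

The per-synapse multiplication in the loop is the operation that the complementary binary representation is intended to avoid.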

B. Proposed Method
A complementary binary array to represent the signed synaptic weights is proposed. For binary inputs, the output $y_j$ of the $j$-th neuron in the hidden layer can be calculated by (1) [19]:

$$y_j = f\left(\sum_{i} x_i w_{i,j}\right) \tag{1}$$

where $f$ is an activation function, $x_i$ is the $i$-th input, and $w_{i,j}$ is the synaptic weight representing the connection strength between the $i$-th input neuron and the $j$-th neuron in the hidden layer.
Equation (1) can be rewritten for the same neuron output $y_j$ as:

$$y_j = f\left(\sum_{i} x_i w^{+}_{i,j} - \sum_{i} x_i w^{-}_{i,j}\right) \tag{2}$$

In (2), $w_{i,j} = w^{+}_{i,j} - w^{-}_{i,j}$, so the signed values of -1, 0, and +1 can be represented by the binary numbers 0 and 1. For example, $w_{i,j} = -1$ is represented by $w^{+}_{i,j} = 0$ and $w^{-}_{i,j} = 1$. Similarly, $w_{i,j} = +1$ is represented by $w^{+}_{i,j} = 1$ and $w^{-}_{i,j} = 0$, and $w_{i,j} = 0$ by $w^{+}_{i,j} = 0$ and $w^{-}_{i,j} = 0$. In this way, two binary numbers are used to store a signed value instead of an 8-bit number, which reduces the amount of required memory dramatically. Furthermore, for a binary input, the multiplication of two binary numbers can be performed by the bit-wise AND operation to reduce the computational load. The concept of the proposed complementary binary array for representing the signed ternary weights is shown in Figure 3.

Figure 3(b) presents the proposed method for storing the signed parameters and performing synaptic weighting. The signed values of -1, 0, and 1 are stored in two complementary binary arrays. If the weight is 1, the values of $w^{+}$ and $w^{-}$ are 1 and 0, respectively. If the weight is -1, the values of $w^{+}$ and $w^{-}$ are 0 and 1, respectively. Only two 1-bit memory cells are therefore needed to represent a signed value of -1, 0, or +1. Because the multiplication of two binary numbers is replaced by the bit-wise AND operation, as shown in Figure 3(b), no multiplications are needed. The summation of the weighted inputs is performed by counting the "1" bits (population counting) in the result of the bit-wise AND operation. With this reduction in memory and computation, the proposed ternary neural network can be deployed effectively on low-cost embedded systems such as those used for mobile robots.
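The scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the helper names (`pack_bits`, `encode_ternary`, `popcount`, `weighted_sum_and_popcount`, `storage_bits`) are hypothetical, arbitrary-precision Python integers stand in for packed bit arrays, and the activation function is omitted. The last helper compares 8 bits per weight against two 1-bit cells per weight.

```python
from typing import Sequence, Tuple


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 values into one integer (bit i holds bits[i])."""
    word = 0
    for i, b in enumerate(bits):
        if b not in (0, 1):
            raise ValueError("only binary values can be packed")
        word |= b << i
    return word


def encode_ternary(weights: Sequence[int]) -> Tuple[int, int]:
    """Split ternary weights into two packed binary arrays (w_plus, w_minus).

    +1 -> (1, 0), -1 -> (0, 1), 0 -> (0, 0), so that w = w_plus - w_minus.
    """
    if any(w not in (-1, 0, 1) for w in weights):
        raise ValueError("weights must be -1, 0, or +1")
    w_plus = pack_bits([1 if w == 1 else 0 for w in weights])
    w_minus = pack_bits([1 if w == -1 else 0 for w in weights])
    return w_plus, w_minus


def popcount(word: int) -> int:
    """Count the '1' bits of a non-negative integer (population count)."""
    return bin(word).count("1")


def weighted_sum_and_popcount(x_bits: int, w_plus: int, w_minus: int) -> int:
    """Weighted sum using bit-wise AND and population counting only."""
    return popcount(x_bits & w_plus) - popcount(x_bits & w_minus)


def storage_bits(num_weights: int) -> Tuple[int, int]:
    """Bits needed with 8-bit weights versus two 1-bit cells per weight."""
    return 8 * num_weights, 2 * num_weights


# Illustrative values only (not from the paper).
inputs = [1, 0, 1, 1, 0, 1]
weights = [1, -1, 0, -1, 1, 1]

w_plus, w_minus = encode_ternary(weights)
proposed_sum = weighted_sum_and_popcount(pack_bits(inputs), w_plus, w_minus)

# Reference value computed with ordinary signed multiplication.
reference_sum = sum(x * w for x, w in zip(inputs, weights))

conventional_bits, complementary_bits = storage_bits(len(weights))
saving = 1 - complementary_bits / conventional_bits
print(proposed_sum, reference_sum, saving)
```

In this sketch the AND-and-popcount result matches the multiplication-based sum for the example values, and the storage ratio of 2 bits to 8 bits per weight corresponds to a 75% reduction. On real hardware the packed arrays would be fixed-width machine words, which this sketch does not model.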

III. EXPERIMENTAL RESULTS
The proposed ternary neural network with the complementary binary array representing the signed synaptic weights is deployed on the Raspberry Pi board for speech recognition, image recognition, and real-time object recognition. For image recognition, a three-layer  The coefficients of MFCCs are quantized by 8 bits. The recognition rate for speech command recognition is 91%. This is the first test of the proposed ternary neural network for image recognition and speech recognition for a mobile robot.

The ternary neural network with the proposed method for representing the ternary synaptic weights is also deployed for real-time object recognition. In this experiment, 10 simple objects, shown in Figure 4, are used to evaluate the performance of the proposed neural network. For each object, 200 images were captured and converted to grayscale for the training process. All images were captured against a white background. Then, a ternary convolutional neural network was deployed on the Raspberry Pi board. There were 64 3×3 kernels in each convolutional layer, followed by a max-pooling layer. The fully-connected part was composed of two hidden layers of 1024 hidden nodes. The output layer had 10 neurons for recognizing the 10 objects. The convolutional neural network constrains the synaptic weights to the ternary space {-1, 0, 1}. The recognition rate for the 10 simple objects shown in Figure 4 is 89%.

To compare the required memory and speed of a ternary neural network using the conventional method with one using the proposed technique for representing the signed synaptic weights, both models are deployed on a Raspberry Pi board. Table I compares the required memory of the conventional ternary neural network and of the proposed ternary neural network with a complementary binary array representing the signed synaptic weights.
In Table I, we evaluate the memory that is required for storing the model and internal parameters. For a multilayer neural network, the conventional ternary neural network requires 398.2754 KB of memory, whereas the proposed ternary neural network with a complementary binary array representing the signed synaptic weights requires 99.5688 KB, i.e. the proposed ternary neural network requires 75% less memory than the conventional ternary neural network. By using the bitwise AND operation and population counting instead of multiplication, the proposed ternary neural network is 4.2 times faster than the conventional ternary neural network for MNIST image recognition. For speech recognition and real-time object recognition, the proposed ternary neural network likewise reduces the required memory by 75% compared to the conventional ternary neural network. For speech recognition, the proposed ternary neural network is 2.7 times faster than the conventional neural network. For real-time object recognition, the ternary convolutional neural network with the proposed technique for ternary synaptic weight representation is 2.4 times faster than the ternary convolutional neural network using 8-bit ternary synaptic weights.

IV. CONCLUSION
Ternary neural networks have been proposed for reducing the required storage capacity and enhancing the speed of ANNs. However, the implementation of signed ternary weights still consumes high power and requires large computational resources. Many ternary neural network models have been deployed on high-performance processors such as GPUs for image recognition [12][13][14][15]. In this work, the signed ternary synaptic weights are represented by two complementary binary synaptic weights. In this way, the ternary neural networks are treated as binary neural networks. Power-hungry computational tasks such as multiplications are replaced by bitwise AND operations to enhance the speed of the ANNs.
The proposed technique is useful for deploying ANNs on low-cost embedded systems for mobile robots.
The proposed ternary neural network is deployed on a Raspberry Pi board suitable for a mobile robot. To reduce the amount of required memory, the signed values of the ternary synaptic weights are represented by complementary binary arrays. For image recognition, the proposed ternary neural network is tested on the MNIST dataset and achieves a recognition rate as high as 94%. For speech recognition, the proposed ternary neural network is evaluated using the Google Speech Commands dataset and its accuracy is 91%. The proposed ternary neural network was also applied to real-time object recognition for a mobile robot. The proposed technique of representing signed synaptic weights reduces the required memory by 75% when compared to the conventional method. Overall, the proposed ternary neural network is 4.2, 2.7, and 2.4 times faster than the conventional ternary neural network for image recognition, speech command recognition, and real-time object recognition, respectively.

www.etasr.com
Truong: A Low-cost Artificial Neural Network Model for Raspberry Pi