Balanced Communication-Avoiding Support Vector Machine when Detecting Epilepsy based on EEG Signals

The revolution in technology affects many fields, among them the healthcare system. Computer-based applications have been developed to help specialists detect diseases and perform some basic operations. In this paper, the focus is on attempts to detect Epilepsy Disease (ED). Several Computer-Aided Diagnosis (CAD) methods have been used to report the brain's disease status according to signals related to brain activity. These applications achieved acceptable results but still have limitations. An intelligent CAD based on the Balanced Communication-Avoiding Support Vector Machine (BCA-SVM) is proposed to detect ED using Electroencephalogram (EEG) signals. The approach is implemented on a Raspberry Pi 4 board to ensure real-time processing. The CAD based on BCA-SVM achieved an accuracy of 99.8%, and the execution time was around 3.2s, satisfying the real-time requirement.

I. INTRODUCTION
Brain disorders such as Alzheimer's and Epilepsy Disease (ED) [1] consume vast resources of the health care system. Several methods have been proposed for the automated diagnosis of brain diseases. Many Computer-Aided Diagnosis (CAD) methods use Electroencephalogram (EEG) signals [2], physiological signals [3], or wearable sensors and smartphones [4]. Most recent works employ EEG signals instead of other methods due to their accuracy. The EEG is capable of reading the brain's electrical signals, which are generated by neurons and measured through the scalp. Electroencephalography is the process of measuring the brain's neural activity as electrical voltage fluctuations along the scalp that result from the current flowing in the brain's neurons [5,6]. In typical EEG tests, the brain's electrical activity is monitored and recorded using electrodes fixed on the scalp [7]. A Brain-Computer Interface (BCI) system enables individuals to communicate through external devices by using their brain's electrical signals at the recording positions on the scalp. This can help to recognize anomalies in frequency patterns or to accomplish several tasks, and it seems to have many potential applications. This paper highlights the main issues of these methods, taking into account the proposed attempts to detect brain diseases and the fact that several CAD-based methods were used to identify ED using EEG signals. This paper contributes by highlighting the lack of a large EEG database and the fact that the techniques used may need a hardware accelerator to reduce their execution time.

II. LITERATURE REVIEW
Traditional medicine is based on visual inspection to recognize features assigned to brain diseases. This kind of inspection suffers from low accuracy and is time-consuming. In this section, related works on recognizing ED based on machine learning techniques supported by different CAD tools are presented. Authors in [2,4] attempted to analyze EEG signals in detective and predictive data analytics by applying different intelligent classifiers to identify a range of signals related to brain diseases. Authors in [7][8][9] attempted to detect epilepsy outcome features through the EEG signals with artificial neural networks. In [7], the authors proposed an enhancement of the SNN model and reached an accuracy of 92.5%. In [8], the authors followed a wavelet methodology for nonlinear features. Then, a Levenberg-Marquardt backpropagation neural network method was applied, resulting in an accuracy of 96.7%. Authors in [9] proposed the use of the multi-spiking neural network technique and reached an accuracy of about 90.7%. Authors in [10] analyzed the EEG features with a multi-spiking neural network and a new supervised learning algorithm. The accuracy was 94.8%, which was better than the result obtained in [9].
Authors in [11][12][13][14][15][16][17] applied the Gaussian Mixture Model to the EEG signals. An accuracy of 93.1% was reached in [11]. In [12][13][14], the authors used the Support Vector Machine as the main idea. The wavelet transform obtained an accuracy of 96.3%, the recurrence quantification analysis achieved an accuracy of 95.6%, and the discrete wavelet transform an accuracy of 96%. In [15,16], the authors used Fuzzy Sugeno applied to Entropy and Higher-order spectra methods. The achieved accuracy was around 98.1% and 99.7%. This method is limited by its large demand for processing time and hardware resources. In [17], a deep learning methodology was used by applying Convolutional Neural Networks (CNNs). The model, formed by a 13-layer neural network, achieved an accuracy of 88.7%. Authors in [18,19] proposed an epilepsy detection method based on the Gaussian Mixture Model with 93.1% accuracy. Authors in [20] used the SVM methodology followed by the power spectral density estimation method, with the accuracy not exceeding 93.3%. Authors in [21] combined the SVM and the tunable Q wavelet transform with 98.6% accuracy. Authors in [22] analyzed the EEG features using the K-Nearest Neighbors (KNN) method with 93.5% accuracy. Authors in [23] proposed an advanced algorithm for the recognition of seizures based on EEG features. These characteristics are extracted from the signal in the temporal domain. A Back-propagation Neural Network (BNN) was used to achieve an accuracy of about 93%. Authors in [24] found the EEG frequency by applying a wavelet transform to the spectral components. Then the Mixture of Experts (ME) was used as a classifier to achieve an accuracy of about 94.5%.
Authors in [25] achieved an accuracy of about 98.68% when using a Decision Tree (DT) as a classifier. Authors in [26] proposed the use of cross-correlation coefficients to extract statistical features. Based on these features, the SVM classifies the input EEG signal with about 96% accuracy. Authors in [27] attempted to improve the recognition accuracy by using the Linear Discriminant Analysis (LDA) classifier, but the obtained results showed an accuracy of 91.8%, which was inferior to previous attempts. Authors in [28] applied the SVM to the entropy features associated with the EEG signal with 97.25% accuracy. Authors in [29] proposed the Local Neighbor Description Pattern (LNDP) as a new technique to extract features. This technique was tested with different classifiers including SVM, DT, and ANN. The LNDP technique combined with ANN had the highest classification accuracy of 98.72%. Authors in [30] tried to detect epilepsy using the brain activities measured by the EEG signals. The signals were processed according to the position of electrodes and analyzed with the use of an adaptive mixture of the independent component. A symmetric-weighted scale-invariant local ternary pattern technique was applied to the obtained signals. The resulting features were used to train a neural network, reaching an accuracy of 99.53%. Authors in [31] attempted to overcome the problem of the variability of feature distribution by applying an Adaptive Median Feature Baseline Correction (AM-FBC). This method was followed by the SVM. Features were defined by a matrix of determinants and a successive decomposition index. The sensitivity was enhanced to achieve 96.6%. A summary of the most significant studies on brain disorder disease recognition using computer-aided diagnosis can be seen in [32]. Table I presents a review of various implemented approaches to automate classification with the use of the EEG signals. The EEG data used by the authors in Table I are collected at Bonn University, Germany [33].
These data recordings were chosen from multi-channel EEG.

TABLE I.
        Discrete wavelet transform        96%      96%   99%
[26]    Cross-correlation coefficients    96%      --
[28]    Entropy features                  97.25%   --
[21]    Tunable Q wavelet transform       98.6%    --
[29]    LNDP                              98.72%   --
[31]    Adaptive

III. OPEN CHALLENGES
This section discusses the results achieved by the above-mentioned techniques in detecting brain diseases based on the EEG signal, taking the reported drawbacks into account. In light of this brief review, the EEG signal represents a powerful criterion for the detection of brain diseases. Depending on the studied disease (seizures, epilepsy, Alzheimer's, etc.), the extraction phase derives data from the EEG signal based on features specific to that disease. This phase can be applied to an EEG signal in different domains. The next phase applies different classifiers to obtain the recognition accuracy. The shortcomings related to the recognition of brain diseases based on EEG signals are:
• Uncertain accuracy: the obtained accuracy is commonly higher than 90%. Most works use the public EEG database in [33]. It is composed of five datasets (A, B, C, D, and E) collected from only 5 participants. Therefore, the findings must be validated on a larger database to allow generalization of the results. As mentioned above, several studies achieved different accuracy values despite using the same classifier and dataset. This can be explained by three reasons: (1) differences in the use of the dataset, (2) the extracted features are not always the same, and (3) differences in the domains used (time, frequency, and time-frequency).
• Non-real-time: most previous works implement advanced algorithms for the detection of brain diseases. These algorithms were not executed in real time due to the considerable computation time they demanded. Authors in [34] proposed a hardware implementation based on an FPGA board to ensure real-time detection. The authors used the SVM algorithm and achieved an accuracy of 86%. This result shows that performance decreases when the algorithm is implemented on a board. A CAD framework that is deployed on a hardware board and maintains high accuracy is therefore needed.
• Non-predictive: none of the previous works supports disease prediction for a healthy person.
In conclusion, the detection of the epilepsy disease using CAD tools still needs improvement in both software and hardware. The recognition accuracy has to be improved and the execution time has to be decreased to ensure real-time response. These objectives are pursued by applying the BCA-SVM [35] algorithm and deploying it on a Raspberry Pi 4 board. In this paper, a system based on the BCA-SVM method as an intelligent algorithm to recognize the epilepsy disease at an earlier stage is presented.
IV. THE PROPOSED METHOD
The aim of this study is to detect signs related to the ED according to the measured EEG signals. The BCA-SVM algorithm is applied to the EEG signals. This technique is chosen in order to ensure both speed and accuracy. During the training phase, the model is built through a set of input data. This phase requires an increasing number of processors and a larger set size, which reduces the performance of the algorithm. The prediction phase is the easiest one due to the speedup and the parallel processing. The proposed BCA-SVM technique follows these steps: (1) preprocessing, (2) feature extraction, (3) classification, and (4) evaluation.

A. Preprocessing
This step aims to normalize the EEG signal. Normalization is important because the frequency bands differ in the magnitude of their frequency content. It scales the features so that the obtained values are easier to work with. The filtered/raw values are scaled. The z-score method is used for normalization [32]:
$$z = \frac{V - M}{S}$$
where V is the value of the feature to be scaled, M is the mean value of the feature, and S is the standard deviation of the feature. This method is adopted because it handles outliers.
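The z-score scaling described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `z_score`, the sample values, and the choice of the population standard deviation are assumptions made for illustration and are not taken from the original implementation.

```python
from statistics import mean, pstdev


def z_score(values):
    """Scale one feature column to zero mean and unit standard deviation."""
    m = mean(values)       # M: mean value of the feature
    s = pstdev(values)     # S: standard deviation (population form, an assumption)
    if s == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical feature values, chosen so that M = 5 and S = 2.
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = z_score(feature)
```

After scaling, every feature is expressed in units of its own standard deviation, so bands with large magnitudes no longer dominate bands with small ones.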
Secondly, the data in the dataset were augmented in order to improve the training results. The data augmentation was done using the synthetic minority over-sampling technique. This method generates new samples from the neighborhood of existing samples without re-using the available data, so the new samples belong to the same original distribution. The most common challenge encountered in ED research is the limited number of datasets and the limited data included in each dataset. This issue reduces the accuracy of CAD tools based on machine learning algorithms. The data augmentation step attempts to increase the amount of data in the dataset. Data drawn as a signal are segmented to obtain signals of equal length, each associated with the same label as the original data. In our case, the length of the EEG signal is 10s. This process increases the number of available signals in the dataset from 40 to 96.
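The neighborhood-based idea behind the over-sampling step can be sketched as follows. This is a simplified illustration in pure Python, not the authors' code and not the reference implementation of the technique: the function name `smote_like`, the parameter `k`, the fixed seed, and the toy feature vectors are all assumptions.

```python
import math
import random


def smote_like(samples, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a sample
    and one of its k nearest neighbours (needs at least two samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by Euclidean distance to the base sample.
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-dimensional feature vectors of one class.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_samples = smote_like(minority, n_new=6)
```

Because each synthetic point lies on a segment between two existing points of the same class, it stays inside the region already covered by that class and inherits its label.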

B. Feature Extraction
ED features are determined based on five frequency bands considered as signals of interest. These bands are described by the following restriction: It is necessary to apply a low-pass filter to the signals to remove high-frequency components. Through these frequency bands, the feature extraction can be defined. In the current study, two key features were considered: (1) the Fast Fourier Transform (FFT) features and (2) the Continuous Wavelet transform (CW) features. The FFT is applied to the frequency signals to define the average magnitude over all FFT coefficients of a band. In the definition of the Fast Fourier (FF) features, f_s is the starting frequency and f_e is the ending frequency associated with the i-th band, and C_k denotes the FFT coefficients.
In the definition of the CW features, t_s is the starting time, t_e is the ending time, and X_{j,k} are the wavelet coefficients.
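One plausible reading of the FF feature, the mean magnitude of the coefficients C_k between the starting and ending frequency of a band, can be sketched as below. The normalization (a plain mean over the bins in the band), the band edges, the sampling rate, and the test signal are assumptions for illustration. A naive DFT is used to keep the sketch free of dependencies; in practice a routine such as `numpy.fft.rfft` would be the usual choice.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), for illustration only."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def band_mean_magnitude(coeffs, sample_rate, f_start, f_end):
    """Average |C_k| over the bins whose frequency lies in [f_start, f_end]."""
    resolution = sample_rate / len(coeffs)      # Hz per DFT bin
    k_start = math.ceil(f_start / resolution)
    k_end = math.floor(f_end / resolution)
    magnitudes = [abs(coeffs[k]) for k in range(k_start, k_end + 1)]
    return sum(magnitudes) / len(magnitudes)


# Hypothetical test signal: a 10 Hz sine sampled at 64 Hz for one second.
sample_rate = 64
signal = [math.sin(2 * math.pi * 10 * t / sample_rate) for t in range(sample_rate)]
coeffs = dft(signal)

# Illustrative band edges in Hz (not taken from the paper).
feature_8_13 = band_mean_magnitude(coeffs, sample_rate, 8, 13)
feature_1_4 = band_mean_magnitude(coeffs, sample_rate, 1, 4)
```

The band that contains the 10 Hz component yields a clearly larger feature value than the band that does not, which is the behaviour a classifier relies on.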

C. Classification
In this paper, the focus is on the training phase in order to overcome the mentioned shortcomings. Sequential Minimal Optimization (SMO) classes are introduced to divide the data set and to build the BCA-SVM kernel. Figure 1 describes the proposed technique. The implementation proceeds with the K-means algorithm. This algorithm has some shortcomings: (1) K-means partitioning suffers from a lack of regularity and balance, and (2) K-means is unpredictable because it depends on the data. An improvement to K-means is proposed by applying the First Come First Served (FCFS) algorithm. Algorithm 1, shown in Figure 2, describes the proposed enhancement. We assume that a clustering center corresponds to a machine node. The goal is to find the closest center for every sample. The K-means is enhanced so that each center holds n/m samples. The algorithm begins with a general definition. Then the machine node centers and the balance value are initialized. The treatment part processes the data to find the center for each sample. The optimization step is then applied to improve the achieved result.
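The balancing idea can be illustrated with a short sketch. It is not a reproduction of Algorithm 1: it assumes that n is the number of samples and m the number of centers (machine nodes), that samples are served in arrival order, and that a full center passes the sample on to the next closest one. All function names and the toy data are hypothetical.

```python
import math


def balanced_assign(samples, centers):
    """Assign samples in arrival order (FCFS) to the closest center
    that still has free capacity, with a capacity of ceil(n/m)."""
    capacity = math.ceil(len(samples) / len(centers))
    counts = [0] * len(centers)
    labels = []
    for sample in samples:
        ranked = sorted(range(len(centers)),
                        key=lambda c: math.dist(sample, centers[c]))
        chosen = next(c for c in ranked if counts[c] < capacity)
        counts[chosen] += 1
        labels.append(chosen)
    return labels


def update_centers(samples, labels, centers):
    """Move each center to the mean of its assigned samples."""
    updated = []
    for c, old in enumerate(centers):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        if not members:
            updated.append(old)  # keep an empty center where it was
            continue
        updated.append([sum(m[d] for m in members) / len(members)
                        for d in range(len(old))])
    return updated


def balanced_kmeans(samples, centers, iterations=5):
    """Alternate balanced assignment and center update."""
    for _ in range(iterations):
        labels = balanced_assign(samples, centers)
        centers = update_centers(samples, labels, centers)
    return balanced_assign(samples, centers), centers


# Hypothetical data: four points near the origin, two far away.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [9, 10]]
labels, centers = balanced_kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
```

With plain K-means the first cluster would hold four of the six points. The capacity limit forces the fourth point onto the second node, so each node receives three samples, at the cost of a less compact second cluster.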

D. Evaluation
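The metrics needed to measure the prediction accurately and to evaluate the classification performance are presented in this section. A 10-fold cross-validation is used. The accuracy is defined as:
$$\text{Accuracy} = \frac{\text{True recognition}}{\text{True recognition} + \text{False recognition}}$$
where False recognition is the number of false classifications and True recognition is the number of correct classifications. Sensitivity and Specificity are defined as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}$$
where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.
The evaluation step is finished by implementing the proposed method. For that, the CAD framework was deployed on a Raspberry Pi 4 [36] connected to the Emotiv EEG device [37]. This phase is evaluated with regard to processing time, CPU usage, and memory demand.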
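The evaluation metrics and the 10-fold split can be sketched in a few lines. The formulas are the standard confusion-matrix definitions of accuracy, sensitivity, and specificity; the label vectors, the function names, and the interleaved fold layout are assumptions for illustration, and 96 is the dataset size reported after augmentation.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical labels: 1 = ED, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = metrics(*confusion_counts(y_true, y_pred))

splits = list(k_fold_indices(96, k=10))
```

In a cross-validated run, the metrics would be computed on the test indices of each fold and then averaged over the ten folds.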

V. EXPERIMENTAL RESULTS
In this section, the proposed CAD based on the BCA-SVM method and using the dataset in [33] is verified and the implementation phase is described.

A. Verification
The verification is applied to samples covering ED and healthy (H) cases. The evaluation of the BCA-SVM method is presented in Table II. For the ED cases, the proposed method reached 99.8% accuracy and 100% sensitivity and specificity. For the H cases, the results were 95.1% accuracy, 92.6% sensitivity, and 89.9% specificity. The obtained accuracy shows that the proposed BCA-SVM method gives a reliable decision on the health status of the studied cases. The accuracy covers both TP and TN results. The sensitivity shows that our method detects the ED consistently. An error of about 8% is computed when considering healthy persons. The specificity defines the rate of cases correctly detected without ED. The achieved performance metrics indicate a reliable detection of the ED. A comparison between our findings and previous works is presented in Table III. It can be seen that the BCA-SVM achieves the best results. Most attempts reach an accuracy of more than 90%. The diagnosis classifiers can be considered satisfying until the achieved accuracy reaches 100%. Rival attempts suffered from the required processing time. Therefore, the validation of the CAD framework on board is highly recommended.

B. Implementation
The verified CAD has been implemented with the use of the earlobe electrode landmark [38] and a Raspberry Pi 4 [36]. The EEG signals were recorded according to the International System metrics. Measurements were taken with the eyes closed, ensuring the same dynamic process. In the test phase, the record is defined as follows: duration of 120s, 1024Hz frequency, and sampling by 256. This implementation aims to provide an accurate device to predict ED. The device has to take real-time requirements into consideration. The Raspberry Pi 4 was connected to the EEG device [37] via Bluetooth. The BCA-SVM was developed in Python and run on the Raspbian GNU/Linux 9 operating system. The Raspberry Pi 4 was chosen because it runs Python code and provides high-level APIs such as the Scikit-Learn library [39]. The performance of the proposed device depends on the memory usage, CPU temperature, and CPU average load. The memory resources are characterized by the memory usage. The required memory varies between 17 and 24MB, which the Raspberry Pi board provides. Figures 4-6 present the results related to hardware factors. The CPU temperature curve is shown in Figure 4. The average temperature was around 66°C. The ARM processor did not need a cooling mechanism because the temperature did not exceed 85°C. The CPU average load defines the average number of processes loaded by the CPU during the studied period. Figure 5 shows the number of loaded processes for 120s. The CPU manages between one and two processes at the same time (the average is about 1.4). An intermediate buffer between the CPU and the primary memory has to be considered to ensure accurate running. The CAD tool can therefore be considered a real-time system. The CPU usage factor defines the occupation status. The CPU could shut down when the occupation rate reaches 100%. Figure 6 shows the variation of usage during the 120s period. The CPU usage did not exceed 92% and the average rate was about 83%.
All these results related to the architecture factors show that the Raspberry Pi 4 is sufficient for ED detection using the BCA-SVM method. The execution time required by the Raspberry Pi 4 board is about 3.2s, which allows the conclusion that it meets the real-time demands. Table IV summarizes the hardware performance related to the implementation of the BCA-SVM method on the Raspberry Pi 4 board. The hardware features show that the board used provides sufficient resources to ensure stable functioning of the CAD framework.
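Hardware factors of this kind can be sampled on a Linux board with a few standard-library calls. The following is a minimal sketch, not the measurement code used in this study: it assumes that the CPU temperature is exposed in millidegrees Celsius at the usual Linux sysfs path and that `os.getloadavg` is available, and all function names are hypothetical.

```python
import os
import time

# Assumption: standard Linux sysfs location, value in millidegrees Celsius.
THERMAL_PATH = "/sys/class/thermal/thermal_zone0/temp"


def parse_millidegrees(raw):
    """Convert a sysfs reading such as '66000\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temperature(path=THERMAL_PATH):
    """Return the CPU temperature in Celsius, or None if unavailable."""
    try:
        with open(path) as handle:
            return parse_millidegrees(handle.read())
    except (OSError, ValueError):
        return None


def read_load_average():
    """Return the 1-minute load average, or None where unsupported."""
    if not hasattr(os, "getloadavg"):
        return None
    try:
        return os.getloadavg()[0]
    except OSError:
        return None


def summarize(values):
    """Mean and maximum of the readings that were actually available."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return {"mean": sum(present) / len(present), "max": max(present)}


def monitor(duration_s=120, period_s=1.0):
    """Sample temperature and load for duration_s seconds."""
    temperatures, loads = [], []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        temperatures.append(read_cpu_temperature())
        loads.append(read_load_average())
        time.sleep(period_s)
    return summarize(temperatures), summarize(loads)
```

Calling `monitor(120)` while the classifier runs would cover the same 120s window as the figures; the resulting maximum temperature can then be compared against the 85°C limit mentioned above.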
VI. DISCUSSION
As mentioned above, ED detection using the EEG signal faces three major issues: (1) accuracy, (2) real-time response, and (3) prediction. In this work, we dealt with the problems of accuracy and real-time response. Despite the high accuracy of previous works, their findings were uncertain because of the small size of the considered dataset. A data augmentation method based on the synthetic minority over-sampling technique was proposed. The obtained accuracy (99.8%) is more significant and solid in comparison with other works.
Hardware performance is a critical factor in the implementation phase. The time required to make a decision is around 3.2s. Moreover, the proposed method did not need a special hardware board. A Raspberry Pi 4 board provided sufficient memory and CPU speed to run the framework. The proposed approach can be extended to recognize other brain diseases such as Alzheimer's. The CAD tool has to support prediction in order to enhance pre-emptive care.

VII. CONCLUSION
The main subject of this paper is the detection of the ED. The detection of ED is challenged by increasing reliability and real-time constraints. To obtain reliable performance, a CAD framework based on k-means and the BCA-SVM method was proposed. The case classification is evaluated according to accuracy, specificity, and sensitivity metrics computed on the dataset in [33]. The classification performance of the proposed method reached a high accuracy of 99.8%. The system was validated and tested on an EEG device and a Raspberry Pi 4 board on which the BCA-SVM algorithm was deployed. The memory usage was about 25MB and the board in our case did not suffer from overheating. The ARM processor was sufficient to ensure processing.
Early detection of the ED presents a great challenge. In future work, we will attempt to move from diagnosis to prediction by the detection of abnormal EEG signals which can lead to ED.