Performance Analysis of Hyperparameters on a Sentiment Analysis Model

This paper focuses on the performance analysis of the hyperparameters of a Sentiment Analysis (SA) model trained on a course evaluation dataset. The performance was analyzed with respect to hyperparameters for activation, optimization, and regularization. In this paper, the optimization functions used were adam, adagrad, nadam, and adamax; the activation functions were softmax, softplus, sigmoid, relu, and hard_sigmoid; and regularization was applied through dropout.


I. INTRODUCTION
In an Education Management System (EMS), assessing the performance of faculty members is becoming an important component. It is helpful not only for improving the quality of course content and teaching style but also for the annual faculty appraisal process. Course evaluations are typically collected at the end of the semester for each course, with a set of questions answered through Likert-scale, open-ended, and self-evaluation approaches. The combined response is used as a metric to measure the quality of the teaching staff. The evaluation form also provides room for open feedback, which is typically not considered in the performance appraisal due to the lack of automated methods [1][2][3]. The textual data may contain important information about subject understanding, comprehension, regularity, and presentation skills, and may also provide clear suggestions for improving the quality of teaching. This kind of information cannot be obtained from Likert-scale-based feedback [4]. Conversely, manually extracting and understanding the semantics of textual feedback is a painstaking task, and as a result, textual feedback is not properly utilized [3].
The main aim of this paper is to analyze and understand the textual feedback automatically and to develop qualitative and quantitative metrics that can estimate the performance of a teacher. This work falls under the promising and emerging area of opinion mining, which has gained eminence since the rise of the World Wide Web. A lot of relevant research has been reported recently: researchers have extracted sentiments from comments posted on websites and forums [5], movie and other review sites [6][7], social networking sites [8][9], course and teacher evaluations [3,10], and so on. The main focus of a sentiment analysis model is on extracting and determining the writer's feelings from a piece of text; the feeling might be an opinion, an emotion, or an attitude.
The most valuable step of this analysis is to classify the polarity of the given text as positive, neutral, or negative [5,11]. Similarly, the present work aims to categorize the polarity of student comments in terms of these three labels. This paper suggests suitable hyperparameters for training and testing a Sentiment Analysis (SA) model and provides a comprehensive strategy for investigating the effects of hyperparameter tuning on a deep learning LSTM model. The experiment was carried out with different tuning strategies to induce and evaluate the relevance of the hyperparameters using a student feedback dataset.

II. RELATED WORK
The literature depicts the efforts of different researchers towards applying machine and deep learning models for performing classification and opinion analysis on a variety of datasets [23]. Authors in [24] proposed a novel Convolutional Neural Network (CNN) framework for visual sentiment analysis to predict the sentiment of visual content. Transfer learning was used: the weights and biases were initialized from a pre-trained 22-layer GoogLeNet, and the network was optimized using the SGD (Stochastic Gradient Descent) algorithm. The authors of [25] developed a deep learning-based system for tweet text analysis and focused on tuning the weight parameters of the CNN. A Long Short-Term Memory model [22] has been proposed to analyze students' sentiments from the textual feedback of the 2018-2019 course evaluation. Authors in [23] utilized Multinomial Naive Bayes, Stochastic Gradient Descent, Support Vector Machine, Random Forest, and Multilayer Perceptron classifiers to analyze the sentiments expressed by students through textual feedback. Authors in [27] focused on an aspect-based opinion mining method for recognizing the sentiments of a social movie review dataset. Authors in [28] used a k-means/SVM approach for identifying social issues in SA; the adopted system analyzed sentiments from macro- and micro-blogs. The core reason for this study was to obtain user opinions and attitudes about hot topics and events by implementing a CNN, which overcomes the problem of explicit feature extraction and learns completely from the training data. To gather the data, an input URL and a crawler were implemented. One thousand micro-blog comments were collected and divided into three labels: 300 negative, 274 neutral, and 426 positive. This study was compared with previous studies which used SVM, CRF, and other methods to perform SA [26].

III. METHODOLOGY
The presented methodology classifies the students' sentiments as positive, neutral and negative. The model workflow is shown in Figure 1 and is analyzed below.

A. Data Preprocessing
The collected dataset is not well organized, and in order to extract meaning and information from the text we need strong data preprocessing techniques. Several steps are applied for the removal of spelling errors, grammatical mistakes, and URLs. The details are described below:
• Punctuation, special symbols, and numbers were removed from the text, as these symbols are useless and only create ambiguity in processing.
• Tokenization is the process of splitting a sentence into words.
• After tokenization, case conversion is performed to convert uppercase tokens into lower case (e.g. GOOD → good).
• In NLP, stop words are a set of commonly used words such as determiners, conjunctions, and prepositions. These words are worthless for sentiment analysis and classification, and they are removed before training the model.
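The four preprocessing steps above can be sketched as a small pipeline. This is a minimal illustration using only the standard library; the stop-word list is a hypothetical abbreviation of the fuller set (e.g. from NLTK) that a real pipeline would use.

```python
import re
import string

# Illustrative stop-word subset; a real pipeline would load a full list.
STOP_WORDS = {"the", "a", "an", "and", "or", "but", "is", "was", "in", "of", "to"}

def preprocess(comment: str) -> list:
    """Apply the preprocessing steps described above, in order."""
    # 1. Remove URLs, numbers, and punctuation/special symbols.
    comment = re.sub(r"https?://\S+", " ", comment)
    comment = re.sub(r"[0-9]", " ", comment)
    comment = comment.translate(str.maketrans("", "", string.punctuation))
    # 2. Tokenization: split the sentence into words.
    tokens = comment.split()
    # 3. Case conversion: GOOD -> good.
    tokens = [t.lower() for t in tokens]
    # 4. Stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The lectures were GOOD, see https://example.com!"))
# -> ['lectures', 'were', 'good', 'see']
```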

B. Word Embedding
Word embedding provides a dense representation of words and their relative significance. It can be learned from text data and reused across applications. A word embedding maintains the relations between words and captures the context and semantics of particular words in text documents. In this model we used a pre-trained Word2vec model as input to our LSTM network; it produces 300-dimensional vectors covering millions of words and is used together with the bag-of-words scheme.
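A common way to feed pre-trained Word2vec vectors into an LSTM network is to build an embedding matrix indexed by the corpus vocabulary. The sketch below uses random vectors in place of the real pre-trained ones (which in practice would be loaded, e.g., via gensim); the vocabulary and words are hypothetical.

```python
import numpy as np

EMB_DIM = 300  # dimensionality of the pre-trained Word2vec vectors

# Stand-ins for pre-trained vectors; a real pipeline would load them
# from a Word2vec file (e.g. gensim KeyedVectors).
pretrained = {
    "good":   np.random.rand(EMB_DIM),
    "course": np.random.rand(EMB_DIM),
}

# Vocabulary index built from the training corpus (0 reserved for padding).
word_index = {"good": 1, "course": 2, "boring": 3}

# Each row of the matrix becomes the input vector for one vocabulary word.
embedding_matrix = np.zeros((len(word_index) + 1, EMB_DIM))
for word, idx in word_index.items():
    if word in pretrained:
        embedding_matrix[idx] = pretrained[word]
    # words missing from Word2vec keep the all-zero row

print(embedding_matrix.shape)  # (4, 300)
```

The matrix is then passed as the initial weights of the network's embedding layer.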

C. LSTM
The representation of a sentence in sequence form was handled by the LSTM network. The first layer is the embedding layer, which uses 32-length vectors to represent each word. The next layer is the LSTM, which contains 100 memory units. The final layer is the classification stage, where the model uses a dense layer as the output layer with a single neuron; its activation function gives a value between 0 and 1 for the prediction between the two classes. The model adopted the log (cross-entropy) loss to handle the binary classification problem, and a dropout ratio with the LSTM to maintain the learning and convergence of the network.
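The architecture described above can be sketched in Keras as follows. The vocabulary size and padded comment length are not stated in the paper and are assumed here for illustration; the dropout rate of 0.2 is taken from the experiments reported later.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 5000   # assumed vocabulary size (not stated in the paper)
MAX_LEN = 100       # assumed padded comment length

model = Sequential([
    # Embedding layer: a 32-length vector represents every word.
    Embedding(VOCAB_SIZE, 32),
    # LSTM layer with 100 memory units; dropout maintains learning stability.
    LSTM(100, dropout=0.2, recurrent_dropout=0.2),
    # Single sigmoid neuron -> value between 0 and 1 for the two classes.
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# A dummy batch of 4 padded comments.
preds = model.predict(np.zeros((4, MAX_LEN), dtype="int32"), verbose=0)
print(preds.shape)  # (4, 1)
```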

D. Hyperparameters Testing
In the model, the single hidden layer has 300 nodes, which are the dimensions of a word in the form of a vector. The outputs of the neurons were shaped with the activation functions (softmax, softplus, sigmoid, and relu). These push the output results up and down in a nonlinear fashion depending on the magnitude: when the magnitude is high, signals disseminate and take part in shaping the final prediction of the network. With the use of activation functions, the overall behavior of the LSTM model is highly complex and nonlinear; therefore, the adam, adagrad, nadam, and adamax optimization functions were used for minimizing the error of the model. Besides, to avoid the risk of overfitting, a regularization (shrinkage) approach was used that drives coefficients toward zero (dropout between 0.1 and 0.4). After testing various combinations, a sigmoid activation function was deployed in the dense layer for binary sentiment classification, while in the last layer the softmax activation function was implemented for the multi-class SA problem.
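One way to enumerate the tested combinations is a simple grid over the three hyperparameter families. The paper reports 80 tested models in total, so the exact grid presumably includes additional values (e.g. hard_sigmoid or a fifth dropout rate); the grid below is an illustrative assumption built from the values named in this section.

```python
from itertools import product

# Search space assumed from the values named in the text.
optimizers  = ["adam", "adagrad", "nadam", "adamax"]
activations = ["softmax", "softplus", "sigmoid", "relu"]
dropouts    = [0.1, 0.2, 0.3, 0.4]

configs = [
    {"optimizer": o, "activation": a, "dropout": d}
    for o, a, d in product(optimizers, activations, dropouts)
]
print(len(configs))  # 4 * 4 * 4 = 64 combinations in this illustrative grid

# Each config would then be passed to a (hypothetical) model builder:
# model = build_lstm(**configs[0]); model.fit(...)
```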

IV. RESULTS AND DISCUSSION
The experiments were conducted on a course evaluation dataset containing 3000 students' comments [23]. In the dataset, each feedback record contains fields such as teacher id, course name, comment, label, and semester. The dataset is divided into three groups for training (70%), testing (20%), and validation (10%). The labels were 0 for negative, 1 for positive, and 2 for neutral. Diverse, blended parameter settings were tested, considering regularization, optimization, and activation, in order to achieve the highest accuracy of the model, as shown in Figures 2 to 4. The SA model was highly feasible and effective compared to conventional models, and the LSTM SA model does not require prior knowledge such as a sentiment lexicon or syntactic parsing. Moreover, the LSTM network keeps a long-term memory of the context of the comment, which compensates for the shortcomings of traditional SA. In a similar manner, the adopted parameters (regularization, optimization, learning rate, and decay) all play a large part in reducing overfitting. The model also integrates max pooling, dropout, and normalization to reduce overfitting. By reducing dimensionality, max pooling performs best at a size of 2. Dropout layers were assessed at different locations in the network and were found to be most helpful after max pooling and before normalization. The model implemented the cross-entropy loss function, which computes the error between the true label and the predicted label. Figure 2 shows the validation accuracy of the model with a dropout value of 0.2, the adam optimizer, and the softmax, softplus, sigmoid, and relu activation functions. The results indicate that the accuracy of the model is outstanding with softplus.

V. CONCLUSION
In this paper, the learning capability of three different hyperparameter families, namely activation, optimization, and regularization, was investigated for students' SA from textual feedback.
The course evaluation dataset used contains 3000 comments with labels (0, 1, and 2). The dataset was divided into training, testing, and validation subsets. An LSTM-based deep learning method was used in the SA model, and the unigram and bigram bag-of-words approach was used for feature extraction. In order to improve the performance of the model, preprocessing and filtering were adopted. It has been shown that, out of 80 tested models, only two performed with outstanding accuracy in terms of training, testing, and validation, as shown in Table II, and these could serve as preeminent parameters for real-time feedback SA. Future work will include multi-lingual and fine-grained analysis of students' comments at the aspect level.