An Enhanced Binary Classifier Incorporating Weighted Scores

D. Virmani, N. Jain, A. Srivastav, M. Mittal, S. Mittal

Abstract


This study proposes an approach that predicts the outcome of an observation from several parameters using the weighted score classification method. Weighted scores are used for classification by plotting data points on a graph relative to a threshold value found by the proposed algorithm. Cluster analysis is then employed to group the observational parameters and verify the approach. The algorithm requires only simple calculations to arrive at a conclusion and provides greater accuracy for large datasets. Combining the weighted score method with curve fitting and cluster analysis improves its performance. The algorithm is designed so that intermediate values can be processed for clustering at the same time. The proposed algorithm stands out for its simplicity and achieves an accuracy of 97.72%.
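The general idea of classifying by a weighted score against a threshold can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state how the weights or the threshold are derived, so the weights, the threshold, the sample observations, and the function names `weighted_score` and `classify` below are all hypothetical values chosen only for illustration.

```python
from typing import Sequence


def weighted_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of one observation's parameters."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def classify(features: Sequence[float], weights: Sequence[float], threshold: float) -> int:
    """Label an observation 1 if its weighted score reaches the threshold, else 0."""
    return 1 if weighted_score(features, weights) >= threshold else 0


# Hypothetical weights and threshold; in the paper these come from the
# proposed algorithm, whose details are not given in the abstract.
weights = [0.5, 0.3, 0.2]
threshold = 0.6

# Two made-up observations with three parameters each.
observations = [[0.9, 0.8, 0.4], [0.2, 0.1, 0.7]]
labels = [classify(obs, weights, threshold) for obs in observations]
```

In this sketch each observation is reduced to a single number, so the binary decision is a comparison against one cut-off value, which is what makes the per-observation calculation simple.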


Keywords


weighted score; classification; clustering; deviation; threshold; SVM; decision tree






eISSN: 1792-8036     pISSN: 2241-4487