An Enhanced Binary Classifier Incorporating Weighted Scores
This study proposes an approach that predicts the class of an observation from several parameters using the weighted score classification method. Weighted scores are used for classification by plotting data points against a threshold value obtained through the proposed algorithm. Cluster analysis is then applied to group the observational parameters and verify the approach. The algorithm requires few calculations to reach a conclusion and maintains high accuracy on large datasets. Combining the weighted score method with curve fitting and cluster analysis further improves its performance, and the algorithm is structured so that intermediate values can be processed for clustering at the same time. Owing to its simple approach, the proposed algorithm achieves an accuracy of 97.72%.
Keywords: weighted score, classification, clustering, deviation, threshold, SVM, decision tree
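The core idea described above, scoring each observation as a weighted sum of its parameters and classifying it against a threshold, can be sketched as follows. This is a minimal illustration only: the abstract does not give the paper's exact weighting scheme or threshold-selection procedure, so the weights and threshold here are assumed inputs, and the data are made up.

```python
import numpy as np

def weighted_score_classify(X, weights, threshold):
    """Classify each row of X by its weighted feature score.

    Hypothetical sketch: the paper derives its threshold via the
    proposed algorithm; here it is simply passed in as a parameter.
    Returns (binary labels, raw weighted scores).
    """
    scores = X @ weights                     # weighted score per observation
    labels = (scores >= threshold).astype(int)
    return labels, scores

# Toy example with made-up observations and weights
X = np.array([[1.0, 2.0],
              [3.0, 0.5],
              [0.2, 0.1]])
weights = np.array([0.6, 0.4])               # assumed parameter weights
labels, scores = weighted_score_classify(X, weights, threshold=1.0)
print(labels)   # class per observation
print(scores)   # weighted scores, reusable for clustering
```

As the abstract notes, the intermediate weighted scores are available alongside the labels, which is what allows them to be fed into a clustering step at the same time.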