An efficient technique for hybrid classification and feature extraction using normalization
https://doi.org/10.14419/ijet.v7i2.27.14534
Received date: June 22, 2018
Accepted date: July 29, 2018
Published date: August 6, 2018
Keywords: Text Mining, Text Classification, Feature Extraction, Feature Selection, Machine Learning
Abstract
Text classification is a technique for assigning a class, or label, to a document from a set of predefined class labels; examples of predefined classes are sports, business, technical, education and science. Classification is a supervised learning technique: the classes are trained with certain features, and a document is then classified by its similarity to the trained document set. Text classification is used in many applications, such as labelling documents, separating spam messages from genuine ones, text filtering and natural language processing. Feature extraction, feature selection and classification are the phases involved in assigning a label to a document. In this paper, principal component analysis (PCA) is used for feature extraction, the artificial bee colony (ABC) algorithm is used for feature selection, and a support vector machine (SVM) is used for classification. In the proposed approach, PCA is improved by applying normalization using the size of the features, which reduces redundant features to a large extent. Very few research works have implemented PCA, ABC and SVM together for complete classification. The evaluation parameters accuracy, F-measure and G-mean are calculated to check classifier efficiency. The proposed system is applied to the 20-Newsgroup dataset. Experimental analysis shows that the proposed approach improves accuracy compared with existing approaches.
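The abstract names accuracy, F-measure and G-mean as evaluation parameters but does not define them. The following is a minimal sketch of how they could be computed for a multi-class problem. It assumes a macro-averaged F-measure and a G-mean taken as the geometric mean of per-class recall, which is one common multi-class definition; the paper may use a different one. The function name `evaluate` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import prod


def evaluate(y_true, y_pred):
    """Return (accuracy, macro F-measure, G-mean) for multi-class labels.

    Assumptions (not from the paper): F-measure is macro-averaged over the
    classes present in y_true, and G-mean is the geometric mean of the
    per-class recalls over the same classes.
    """
    labels = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[t] += 1      # t was the true class but was missed

    accuracy = sum(tp.values()) / len(y_true)

    f_scores, recalls = [], []
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    f_measure = sum(f_scores) / len(labels)
    g_mean = prod(recalls) ** (1 / len(labels))
    return accuracy, f_measure, g_mean


# Toy example with three of the class names mentioned in the abstract.
y_true = ["sports", "sports", "business", "business", "science", "science"]
y_pred = ["sports", "business", "business", "business", "science", "sports"]
accuracy, f_measure, g_mean = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f} F-measure={f_measure:.3f} G-mean={g_mean:.3f}")
```

Under this definition the G-mean drops to zero as soon as one class is never recognised, which makes it a stricter check than accuracy when classes are imbalanced.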
How to Cite
Kaur, B., & Bathla, G. (2018). An efficient technique for hybrid classification and feature extraction using normalization. International Journal of Engineering and Technology, 7(2.27), 156-160. https://doi.org/10.14419/ijet.v7i2.27.14534
