Anomaly Detection System for Internet Traffic based on TF-IDF and BFR Clustering Algorithms

Suad A. Alasadi; Wesam S. Bhaya

doi:10.14419/ijet.v7i4.19.27967

Authors

Suad A. Alasadi
Wesam S. Bhaya

Received date: February 26, 2019

Accepted date: February 26, 2019

Published date: November 27, 2018

DOI:

https://doi.org/10.14419/ijet.v7i4.19.27967

Keywords:

Anomaly Detection, IDS, Network Attacks, Clustering Data Mining, TF_IDF, BFR.

Abstract

An anomaly can be defined as any deviation from normal behavior, that is, anything outside the usual range of variation. Anomalous traffic consumes network resources and leads to security problems that affect Confidentiality, Integrity, and Availability (CIA). Many researchers have designed and implemented Intrusion Detection Systems (IDS) to analyze, detect, and prevent anomalous traffic. An IDS can detect anomalies with various techniques, such as statistical and machine learning methods, and data mining can also be employed efficiently for anomaly detection. Because data mining extracts features from network traffic, it can be used to distinguish legitimate traffic from attack traffic. It can efficiently identify the data that matter to the user and predict results that can be used to detect various types of attacks.
In this paper, an anomaly detection approach using Term Frequency-Inverse Document Frequency (TF_IDF) and the Bradley, Fayyad, and Reina (BFR) clustering algorithm is presented to detect and prevent malicious traffic efficiently and with low time complexity. The proposed solution effectively detects multiple types of attacks (Flooding, Denial of Service (DoS), Backdoors, and Worms) using two modern datasets, NUST2009 and UNSW-NB2015.
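As a rough illustration of the TF_IDF weighting named above, the following minimal sketch treats each traffic flow as a "document" and each categorical feature value as a "term". The token format, the sample flows, and the function name tf_idf are assumptions made for illustration only; the abstract does not say how the authors derive terms from traffic or which TF_IDF variant they use.

```python
import math
from collections import Counter


def tf_idf(flows):
    """Return one {token: weight} dict per flow (each flow is a list of tokens).

    Uses the plain textbook variant: tf = count / flow length,
    idf = ln(number of flows / number of flows containing the token).
    """
    n_flows = len(flows)
    # Document frequency: in how many flows does each token appear?
    doc_freq = Counter(token for flow in flows for token in set(flow))
    weights = []
    for flow in flows:
        counts = Counter(flow)
        weights.append({
            token: (count / len(flow)) * math.log(n_flows / doc_freq[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical flows; the "name=value" tokens are illustrative, not from the paper.
flows = [
    ["proto=tcp", "dport=80", "flag=SYN"],
    ["proto=tcp", "dport=80", "flag=ACK"],
    ["proto=tcp", "dport=6667", "flag=SYN"],
]
weights = tf_idf(flows)
```

In this sketch, a token shared by every flow (here "proto=tcp") gets weight zero, while a token seen in only one flow (here "dport=6667") gets the largest weight, which is the property that makes rare traffic characteristics stand out before clustering.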
The experimental results show that the BFR clustering algorithm performs better than the K-means algorithm in terms of accuracy and detection rate. For the NUST2009 dataset, the overall accuracy is 99.2%, the detection rate is 100%, and the false alarm rate is 0%. For the UNSW-NB2015 dataset, the overall accuracy is 98.76%, the detection rate is 79.28%, and the false alarm rate is 0%.
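The three metrics reported above are commonly computed from a binary confusion matrix. The sketch below uses those common definitions, which are an assumption here because the abstract does not state the formulas the authors applied. The function name detection_metrics and the counts are hypothetical and are not the paper's data.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute common IDS metrics from confusion-matrix counts.

    tp: attacks flagged as attacks     fn: attacks missed
    tn: normal traffic left alone      fp: normal traffic flagged as attack
    Assumes at least one attack sample and one normal sample,
    so neither denominator is zero.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),      # also called recall
        "false_alarm_rate": fp / (fp + tn),    # false positive rate
    }


# Hypothetical counts chosen for illustration only.
metrics = detection_metrics(tp=80, tn=1900, fp=0, fn=20)
```

With these hypothetical counts, accuracy is 0.99 while the detection rate is 0.80 and the false alarm rate is 0, which shows how overall accuracy can stay high despite missed attacks when attack samples are a small share of the traffic.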

