Outlier Detection using Clustering Techniques
Keywords: Outlier Detection, Data Mining, K-Means, LOF, CLARA
An outlier is a pattern that differs markedly from the other patterns in a given dataset. In some applications it is essential to understand and identify outliers. Outlier detection is of major importance in fields such as cybersecurity, machine learning, finance, and healthcare. A clustering-based method is proposed to detect outliers using algorithms such as k-means, PAM, CLARA, DBSCAN, and LOF on data sets including breast cancer, heart disease, and multi-shaped datasets. This work aims to identify the most suitable method for detecting outliers accurately.
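The abstract does not spell out how a clustering result is turned into outlier labels, so the following is only a minimal sketch of one common way to do it with k-means, not the authors' procedure. It assumes that a point counts as an outlier when its distance to the nearest centroid exceeds the mean distance plus two standard deviations. The function names `k_means` and `flag_outliers`, the cutoff multiplier `z`, the choice of the first `k` points as initial centroids, and the toy data are all illustrative assumptions.

```python
import math
import statistics


def k_means(points, k, iters=50):
    """Plain Lloyd's k-means on tuples of floats.

    Assumption for this sketch: the first k points serve as the initial
    centroids, which keeps the run deterministic.
    """
    centroids = list(points[:k])
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids


def flag_outliers(points, centroids, z=2.0):
    """Flag points whose distance to the nearest centroid is unusually large.

    The cutoff (mean + z * standard deviation of all distances) is an
    assumed rule for illustration, not one taken from the paper.
    """
    dists = [min(math.dist(p, c) for c in centroids) for p in points]
    cutoff = statistics.mean(dists) + z * statistics.pstdev(dists)
    return [p for p, d in zip(points, dists) if d > cutoff]


# Toy data: two tight groups plus one point far away from both.
points = [
    (1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3), (0.9, 0.9),
    (8.2, 7.9), (7.9, 8.1), (8.1, 8.3), (7.8, 7.8), (4.5, 20.0),
]
centroids = k_means(points, k=2)
outliers = flag_outliers(points, centroids)
print(outliers)
```

In this toy run the far point pulls one centroid toward itself but is still flagged. With a different initialization an extreme point can capture a centroid of its own and escape detection, which is one reason density-based methods such as LOF and DBSCAN are often compared against centroid-based ones.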