A novel approach to ensemble learning in distributed data mining

  • Authors

    • Prakash Chandra Jena
    • Subhendu Kumar Pani
    • Debahuti Mishra
  • Published: 2018-06-08
  • DOI: https://doi.org/10.14419/ijet.v7i2.33.14159
  • Ensemble Learning, Meta-Learning, Classifier Ensemble, Ensemble Method, Classification Performance, Meta-Classifier.
  • Several data mining techniques have been proposed to extract hidden information from databases. Data mining and knowledge extraction become challenging when data is massive, distributed and heterogeneous. Classification is a widely applied data mining task for prediction, and a large number of machine learning techniques have been developed for it. Ensemble learning combines multiple base classifiers to improve on the performance of the individual classification algorithms, and it plays a particularly significant role in distributed data mining. Studying ensemble learning is therefore crucial for applying it to real-world data mining problems. We propose a technique for constructing an ensemble of classifiers and study its performance using popular learning techniques on a range of publicly available datasets from the biomedical domain. A generic illustration of such a classifier ensemble is sketched at the end of this page.

  • How to Cite

    Chandra Jena, P., Kumar Pani, S., & Mishra, D. (2018). A novel approach to ensemble learning in distributed data mining. International Journal of Engineering & Technology, 7(2.33), 233-238. https://doi.org/10.14419/ijet.v7i2.33.14159
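
The abstract above describes combining multiple base classifiers through a meta-classifier. For illustration only, the sketch below shows a generic stacked ensemble built with scikit-learn; it is not the method proposed in the paper, and the base learners (decision tree, naive Bayes, random forest), the meta-learner (logistic regression), and the dataset (scikit-learn's bundled breast-cancer data, a public biomedical dataset) are assumptions made for the example.

    # Minimal sketch of a classifier ensemble with a meta-classifier (stacking).
    # Illustrative only: learners and dataset are arbitrary choices, not the
    # configuration studied in the paper.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    # Heterogeneous base classifiers; their out-of-fold predictions become
    # the input features of the meta-classifier.
    base_learners = [
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ]

    # Logistic regression acts as the meta-classifier that combines the
    # base classifiers' outputs.
    ensemble = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )

    # Compare the ensemble against a single base classifier via cross-validation.
    stacked = make_pipeline(StandardScaler(), ensemble)
    single = make_pipeline(StandardScaler(),
                           DecisionTreeClassifier(max_depth=5, random_state=0))
    print("stacked ensemble accuracy:", cross_val_score(stacked, X, y, cv=5).mean())
    print("single decision tree accuracy:", cross_val_score(single, X, y, cv=5).mean())

The point of the sketch is the structure, not the numbers: the meta-classifier is trained on cross-validated predictions of the base classifiers, which is the standard way a classifier ensemble of this kind is assembled.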