A hybrid SVML method for survival of patient post breast cancer operation prediction by using SVM and logistic regression

  • Abstract

    A Support Vector Machine (SVM) is a supervised linear maximum-margin classifier that is used in many classification applications. Logistic Regression, on the other hand, is a regression model with a categorical dependent variable. Breast cancer surgery is a critical operation, and the patient's survival afterwards is uncertain. Before a patient is operated on, it is therefore important to estimate her chances of survival after the operation. In this paper, we propose a hybrid model that combines a Support Vector Machine with Logistic Regression, named SVML (Support Vector Machine-Logistic), to predict the patient's chance of survival after the operation. The model improves on the accuracy of the SVM classifier. On our dataset, the hybrid model achieved an accuracy of 85.24%, compared with 78.03% for SVM alone and 72.40% for Logistic Regression alone.
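
    The abstract does not state how the SVM and Logistic Regression are combined, so the following is only a minimal sketch of one common pairing, not the SVML procedure itself. It assumes a linear SVM whose decision score is passed to a one-variable logistic model (in the spirit of Platt scaling). The data are synthetic, and every function name and hyperparameter here is hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the
    L2-regularized hinge loss. Labels in y must be -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # shrink step (regularizer)
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def svm_score(model, x):
    """Signed decision value w.x + b of the linear SVM."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic_on_scores(scores, y, lr=1.0, iters=500):
    """Fit p(y=+1 | s) = sigmoid(a*s + c) by gradient descent on log loss."""
    a, c = 0.0, 0.0
    targets = [(yi + 1) // 2 for yi in y]  # map -1/+1 to 0/1
    n = len(scores)
    for _ in range(iters):
        errs = [sigmoid(a * s + c) - t for s, t in zip(scores, targets)]
        a -= lr * sum(e * s for e, s in zip(errs, scores)) / n
        c -= lr * sum(errs) / n
    return a, c


def make_toy_data(n=200, seed=1):
    """Two synthetic Gaussian blobs; NOT the dataset used in the paper."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
svm = train_linear_svm(X, y)
raw = [svm_score(svm, x) for x in X]
scale = max(abs(s) for s in raw)  # rescale scores to [-1, 1] for stable fitting
a, c = train_logistic_on_scores([s / scale for s in raw], y)


def predict_proba(x):
    """Estimated probability of the +1 class from the hybrid sketch."""
    return sigmoid(a * svm_score(svm, x) / scale + c)


probs = [predict_proba(x) for x in X]
accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)
print(f"training accuracy of the hybrid sketch: {accuracy:.3f}")
```

    In this arrangement the SVM supplies the decision boundary and the logistic stage turns its score into a probability. The accuracy printed here is measured on the synthetic training data only and says nothing about the figures reported in the abstract.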



  • Keywords

    Breast Cancer; Classification; Logistic Regression; Support Vector Machine (SVM).





Article ID: 15516
DOI: 10.14419/ijet.v7i2.33.15516

Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.