Least Square Regression for Prediction Problems in Machine Learning using R
DOI: https://doi.org/10.14419/ijet.v7i3.12.17612
Published: 2018-07-20
Keywords: Independent variable, Dependent variable, Least square regression

Abstract
Ordinary Least Squares regression (OLS regression) is the most commonly used prediction technique. It has been applied in many fields, including statistics, finance, medicine, psychology and economics. Yet many practitioners, data scientists in particular, apply the technique without sufficient training in it and without checking why and when it can or cannot be applied.
It is not easy to explain why least square regression [1] has faced so much criticism when it is trained and applied in practice. In this paper, we first review the fundamentals of linear regression and OLS regression and the reasons for the popularity of the least squares method; we then present our analysis of the difficulties and pitfalls that arise when the OLS method is applied; finally, we discuss some techniques for overcoming these problems.
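As a minimal illustration of the method the paper discusses, the following sketch fits an OLS regression in base R on a made-up toy data set (the variable names and data are illustrative, not taken from the paper). It also solves the normal equations directly to show that lm() and the textbook least-squares solution agree:

```r
# Toy data: one independent variable x, one dependent variable y (hypothetical).
set.seed(42)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 1)  # true intercept 3, true slope 2, plus noise

# OLS fit with base R's lm().
fit <- lm(y ~ x)
print(coef(fit))  # estimated intercept and slope

# The same estimates from the normal equations: beta = (X'X)^{-1} X'y.
X <- cbind(1, x)                       # design matrix with intercept column
beta <- solve(t(X) %*% X, t(X) %*% y)  # solve X'X beta = X'y
print(beta)
```

Both approaches return (numerically) identical coefficients; lm() is preferred in practice because it uses a more stable QR decomposition rather than forming X'X explicitly.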
References
[1] Breiman, L. (1984). Classification and Regression Trees. New York: Routledge.
[2] R. B. Darlington, A. F. Hayes, "Regression Analysis and Linear Models: Concepts, Applications, and Implementation".
[3] A. M. Bagirov, C. Clausen, M. Kohler, "Estimation of a regression function by maxima of minima of linear functions", IEEE Trans. Inf. Theory, vol. 55, no. 2, pp. 833-845, Feb. 2009.
[4] L. Györfi, M. Kohler, A. Krzyżak, H. Walk, A Distribution-Free Theory of Nonparametric Regression, New York, NY, USA: Springer-Verlag, 2002.
[5] D. W. Hosmer, S. Lemeshow, R. X. Sturdivant, Applied Logistic Regression, New York, NY, USA: Wiley, 2013.
[6] R. Boloix-Tortosa, J. J. Murillo-Fuentes, I. Santos, F. Pérez-Cruz, "Widely Linear Complex-Valued Kernel Methods for Regression", IEEE Transactions on Signal Processing, vol. 65, no. 19, Oct. 2017.
[7] Basics of R: https://www.udemy.com/r-basics/
[8] X. Ding, C. Liu, Y. Wu, "Heteroscedastic Max-Min Distance Analysis for Dimensionality Reduction".
[9] D. Buchczik, Least Median of Squares in Multivariate Calibration, 2005.
[10] K. Chen, Y. Lin, Z. Wang, Z. Ying, "Least product relative error estimation", Journal of Multivariate Analysis, vol. 144, 2016, ISSN 0047-259X.
[11] H. Best, C. Wolf, The SAGE Handbook of Regression Analysis and Causal Inference, 2014.
[12] C. C. Aggarwal, Outlier Analysis, New York, NY, USA: Springer, 2013.
[13] P. Chen, L. Jiao, F. Liu, J. Zhao, Z. Zhao, S. Liu, "Semi-supervised double sparse graphs-based discriminant analysis for dimensionality reduction", Pattern Recognit., vol. 61, pp. 361-378, Jan. 2017.
Accepted: 2018-08-16
Published: 2018-07-20