Logistic Regression and Data Analysis on Privacy Methods on Data Streams

P Chandrakanth; Anbarasi M.S

doi:10.14419/ijet.v7i3.12.16117

Received date: July 23, 2018

Accepted date: July 23, 2018

Published date: July 20, 2018

Keywords:

concept drift, logistic regression, data utility, data streams, data privacy, Privacy Preserving in Data Mining (PPDM).

Abstract

The problem of data privacy in streams has so far been viewed only narrowly by researchers. Research and experimentation are well established for static data, where the problem is mostly addressed with perturbation approaches based on random data values. Such approaches do not give adequate results on large and high-dimensional data sets. By exploiting the autocorrelation of multivariate streams and their underlying structure, suitable places to add noise can be identified so that privacy is preserved maximally and in an irreversible manner. Drift checking and ensemble classifier building are the basic requirements for privacy-preserving data streams, as the experiments make clear with the support of sensitivity analysis. In this paper we present the results of experimentation at all stages.

