A Review: Replication Strategies for Big Data in Cloud Environment

  • Authors

    • M. A. Fazlina
    • Rohaya Latip
    • Hamidah Ibrahim
    • Azizol Abdullah
    https://doi.org/10.14419/ijet.v7i4.31.23409

    Received date: December 8, 2018

    Accepted date: December 8, 2018

    Published date: December 9, 2018

  • Keywords: Big Data, Cloud Computing, Performance Metrics, Replication Strategies, Algorithms
  • Abstract

    Big Data technology is emerging around the globe to give every organization better insight and support for decision making. Because Big Data involves a wide variety and a huge volume of data together with complex computation, the cloud environment is the best choice for resolving storage issues. However, a challenge that remains in this technology is data availability, owing to the heterogeneity of Big Data systems. Data must always be accessible and available to users, regardless of time. The most essential way to meet this requirement is to provide replication strategies that can sustain business continuity without interruption. Hence, this paper offers a clearer view of data replication strategies for Big Data systems in the cloud environment. It presents a critical review of replication strategies, with key details drawn from numerous researchers. Additionally, this work contributes a thorough discussion of the advantages and gaps of each study. It also explores the algorithms and performance metrics that researchers have improved. The paper was conducted using a qualitative research approach. Ultimately, it should help future researchers understand and select the strategy that best fits their research scope and goals.

  • How to Cite

    Fazlina, M. A., Latip, R., Ibrahim, H., & Abdullah, A. (2018). A Review: Replication Strategies for Big Data in Cloud Environment. International Journal of Engineering and Technology, 7(4.31), 357-362. https://doi.org/10.14419/ijet.v7i4.31.23409