Hybrid Encryption for Fortifying HDFS Data
https://doi.org/10.14419/m46fn971
Received date: July 11, 2025
Accepted date: August 19, 2025
Published date: September 14, 2025
Keywords: Hadoop; AES; Twofish; Map-Reduce; HDFS; KMS
Abstract
In the big data era, standard encryption methods alone are poorly suited to massive, high-velocity data and degrade the performance of distributed frameworks. This paper proposes a hybrid encryption (HE) method that combines two symmetric algorithms, Twofish-256 and AES-256, within the Hadoop Map-Reduce framework (MRF) to fortify Hadoop Distributed File System (HDFS) data. The method applies dual-level encryption (Twofish -> AES) to mitigate the vulnerabilities of standalone encryption while maintaining performance. Experiments on datasets of 32-256 MB show an encryption speed improvement of 5-6% or more, an efficiency gain of over 5%, and a throughput gain of over 6% compared with the hybrid approaches CP-ABE+AES and AES+RSA and with the standalone schemes AES and Twofish. Additionally, ANOVA tests on encryption and decryption time give (F = 2.67, p = 0.07) and (F = 9.9, p = 0.0003), respectively, indicating that the differences in decryption time are highly significant (p < 0.001), while those in encryption time are significant only at the 0.10 level. The approach balances security and performance, addresses the weaknesses of individual and hybrid encryption algorithms, and remains compatible with distributed environments. Through key management and resistance to side-channel attacks, the proposed HE approach (Twofish -> AES) supports compliance with data protection regulations such as GDPR, HIPAA, and PCI-DSS. The results show feasibility in the government and healthcare sectors, where data protection and large dataset processing are critical.
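For readers unfamiliar with the ANOVA figures quoted in the abstract, the following is a minimal sketch of how a one-way ANOVA F statistic is computed from timing samples grouped by encryption scheme. The function name `one_way_anova_f`, the scheme labels, and the timing values are all hypothetical and chosen for illustration; they are not taken from the paper's experiments, and the paper's exact test configuration is not stated here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per scheme.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each measurement around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical timings in seconds (made-up values, NOT the paper's data).
hypothetical_times = {
    "scheme_a": [1.0, 2.0, 3.0],
    "scheme_b": [2.0, 3.0, 4.0],
    "scheme_c": [5.0, 6.0, 7.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(hypothetical_times.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F is approximately 13.0 here
```

A large F means the spread between scheme means is large relative to the spread within each scheme. Converting F to a p-value requires the F distribution with (df_between, df_within) degrees of freedom; in practice a library routine such as SciPy's `scipy.stats.f_oneway` returns both values.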
How to Cite
Awasthi, S., & Kohli, N. (2025). Hybrid Encryption for Fortifying HDFS Data. International Journal of Basic and Applied Sciences, 14(5), 436-454. https://doi.org/10.14419/m46fn971
