Deep Reinforcement Learning for Joint UAV Trajectory and Communication Design in Cache-Enabled Cellular Networks

  • Authors

    • Bathula Prasanna Kumar, Associate Professor, Computer Science and Engineering - Data Science, KKR & KSR Institute of Technology and Sciences, Guntur, Andhra Pradesh, India
    • U. S. B. K. Mahalaxmi, Department of Electronics and Communication Engineering, Aditya University, Surampalem, Andhra Pradesh, India
    • Vullam Nagagopiraju, Professor, Department of CSE - Data Science, Chalapathi Institute of Engineering and Technology, Guntur, Andhra Pradesh, India
    • Ashok Kumar Manda, Associate Professor & HOD, Computer Science and Engineering, Vikas College of Engineering and Technology, Nunna, Vijayawada Rural, Andhra Pradesh, India
    • Kotha Chandana, Assistant Professor, Department of Information Technology, R.V.R & J.C College of Engineering, Andhra Pradesh, India
    • Dr. Suresh Betam, Assistant Professor, Department of CSE, KL Deemed to be University, Vaddeswaram, Andhra Pradesh, India
    • Akurathi Gangadhar, Associate Professor, Department of Electronics and Communication Engineering, UCEN, JNTUK Narasaraopet, Andhra Pradesh, India
    • Dr. Sarala Patchala, Associate Professor, Department of ECE, KKR & KSR Institute of Technology and Sciences, Guntur, Andhra Pradesh, India
    https://doi.org/10.14419/djn77m90

    Received date: May 23, 2025

    Accepted date: July 4, 2025

    Published date: July 30, 2025

  • Keywords

    UAV; Cache-Enabled Cellular Networks; Deep Reinforcement Learning; Communication Design

  • Abstract

    Unmanned Aerial Vehicles (UAVs) are now widely used in communication networks to deliver data in areas of high demand. This paper studies how UAVs can work with cellular networks to improve content transmission, with the main goal of reducing the time users wait for the content they request. We propose combining edge caching with UAVs, so that popular content is stored on board before users request it, and we jointly optimize the UAV trajectories and transmission power so that the cached content is delivered efficiently, reducing delays and improving the user experience. The challenge is that user requests arrive randomly and the UAVs move dynamically, which adds uncertainty and makes the problem difficult to solve with conventional optimization methods. Instead, we use deep reinforcement learning (DRL), modeling the problem as a game in which the UAVs and a base station act as agents that observe the environment and make decisions accordingly. We introduce a new method based on Proximal Policy Optimization (PPO), called Dual-Clip PPO, which helps the UAVs explore the environment efficiently while keeping their actions near-optimal over time. A new reward scheme guides UAV movement: the base-station agent receives rewards from the environment, while each UAV receives an extra reward when it explores new areas. Simulations show that the proposed approach outperforms existing methods, reducing the time users need to receive content and surpassing standard PPO-based learning. We conclude that combining UAVs with caching and DRL improves communication networks, allowing UAVs to move sensibly, place content efficiently, and adjust transmission power.
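
    For readers unfamiliar with the term, Dual-Clip PPO is a known variant of PPO's clipped surrogate objective that additionally bounds the policy loss from below when the advantage estimate is negative, preventing a single strongly negative sample from dominating the update. The following is a minimal sketch of that general surrogate, not the authors' implementation; the thresholds eps and c, the tensor names, and the use of PyTorch are all assumptions for illustration.

        import torch

        def dual_clip_ppo_loss(log_probs, old_log_probs, advantages, eps=0.2, c=3.0):
            """Illustrative Dual-Clip PPO surrogate loss (hypothetical parameter names)."""
            # Probability ratio r_t = pi_theta(a|s) / pi_theta_old(a|s).
            ratio = torch.exp(log_probs - old_log_probs)
            # Standard PPO clipped surrogate: min(r_t * A, clip(r_t, 1-eps, 1+eps) * A).
            surrogate = torch.min(
                ratio * advantages,
                torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages,
            )
            # Dual clip: when A < 0, bound the surrogate from below by c * A (with c > 1),
            # which limits how far a large ratio can push the update on negative advantages.
            dual = torch.where(advantages < 0, torch.max(surrogate, c * advantages), surrogate)
            # PPO maximizes the surrogate, so the loss is its negated mean.
            return -dual.mean()

    In the multi-agent setting the abstract describes, each UAV agent would optimize such an objective with its environment reward augmented by the exploration bonus for visiting new areas, while the base-station agent uses the environment reward alone.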

  • How to Cite

    Kumar, B. P., Mahalaxmi, U. S. B. K., Nagagopiraju, V., Manda, A. K., Chandana, K., Betam, S., Gangadhar, A., & Patchala, S. (2025). Deep Reinforcement Learning for Joint UAV Trajectory and Communication Design in Cache-Enabled Cellular Networks. International Journal of Basic and Applied Sciences, 14(3), 418-430. https://doi.org/10.14419/djn77m90