Enhancing SQL Query Performance: A Case Study on Optimizing Enterprise Data Processing

  • Authors

    • Karthik Sirigiri Software Developer, RedMane Technology, Illinois
    https://doi.org/10.14419/x2wqqh31

    Received date: June 28, 2025

    Accepted date: August 1, 2025

    Published date: September 10, 2025

  • SQL Query Optimization, Enterprise Data Processing, Indexing Strategies, Execution Plan Analysis, Cardinality Estimation, Machine Learning in Data-bases, Materialized Views, Query Refactoring, Performance Tuning, and cloud-Native Databases.
  • Abstract

    The performance of SQL queries is what makes enterprise data systems scalable and efficient. As companies use more and more complex queries on databases that are spread out and in the cloud, problems like bad indexing, wrong cardinality estimates, and slow execution plans become critical. This paper shows a real-world case study of an enterprise application that uses SQL Server and PostgreSQL to manage terabyte-scale hybrid datasets. We look at how targeted optimization techniques like indexing strategies, query refactoring, execution plan analysis, and partitioning affect real-world query workloads. Quantitative results show that query latency has improved by up to 60%, and CPU and I/O usage have gone down by a lot. The study also includes learned models for cardinality estimation and plan selection, which show how useful machine learning-enhanced optimization can be in real-world situations. These results show how important it is to proactively tune systems and give system architects and database administrators useful tips on how to improve performance in large-scale deployments.

  • References

    1. Wu, Y., Zhou, X., Zhang, Y., & Li, G. (2024). Automatic Database Index Tuning: A Survey. IEEE Transactions on Knowledge and Data Engi-neering.
    2. Marcus, R., Negi, P., Mao, H., Tatbul, N., Alizadeh, M., & Kraska, T. (2021, June). Bao: Making learned query optimization practical. In Proceed-ings of the 2021 International Conference on Management of Data (pp. 1275-1288).
    3. Oloruntoba, O. (2025). AI-Driven autonomous database management: Self-tuning, predictive query optimization, and intelligent indexing in enter-prise it environments. World Journal of Advanced Research and Reviews, 25(2), 1558-1580.
    4. Marcus, R., Negi, P., Mao, H., Tatbul, N., Alizadeh, M., & Kraska, T. (2021, June). Bao: Making learned query optimization practical. In Proceed-ings of the 2021 International Conference on Management of Data (pp. 1275-1288).
    5. Kraska, T., Alizadeh, M., Beutel, A., Chi, E. H., Ding, J., Kristo, A., ... & Nathan, V. (2021). Sagedb: A learned database system.
    6. Yang, Z., Kamsetty, A., Luan, S., Liang, E., Duan, Y., Chen, X., & Stoica, I. (2020). NeuroCard: one cardinality estimator for all tables. arXiv pre-print arXiv:2006.08109.
    7. Neumann, T., & Freitag, M. J. (2020, January). Umbra: A Disk-Based System with In-Memory Performance. In CIDR (Vol. 20, p. 29).
    8. Amelia, S. O., & Ríos, S. Sharding Strategies and Their Impact on Distributed Database Performance.
    9. Gosain, A., & Sachdeva, K. (2020). Materialized view selection for query performance enhancement using stochastic ranking based cuckoo search algorithm. International Journal of Reliability, Quality and Safety Engineering, 27(03), 2050008.
    10. Mohseni, M., & Sohrabi, M. K. (2020). MVPP-based materialized view selection in data warehouses using simulated annealing. International Jour-nal of Cooperative Information Systems, 29(03), 2050001.
    11. Han, Y., Li, G., Yuan, H., & Sun, J. (2022). $mathtt {AutoView} $: An Autonomous Materialized View Management System With Encoder-Reducer. IEEE Transactions on Knowledge and Data Engineering, 35(6), 5626-5639.
    12. Xing, M. (2023). Power Information System Database Cache Model Based on Deep Machine Learning. Intelligent Automation & Soft Computing, 37(1).
    13. Milicevic, B., & Babovic, Z. (2024). A systematic review of deep learning applications in database query execution. Journal of Big Data, 11(1), 173.
    14. Taipalus, T. (2024). Database management system performance comparisons: A systematic literature review. Journal of Systems and Software, 208, 111872.
    15. Abbasi, M., Bernardo, M. V., Váz, P., Silva, J., & Martins, P. (2024). Adaptive and Scalable Database Management with Machine Learning Inte-gration: A PostgreSQL Case Study. Information, 15(9), 574.
    16. Brucato, M., Siddiqui, T., Wu, W., Narasayya, V., & Chaudhuri, S. (2024). Wred: Workload reduction for scalable index tuning. Proceedings of the ACM on Management of Data, 2(1), 1-26.
    17. Bai, Q., Alsudais, S., & Li, C. (2023). Querybooster: Improving SQL performance using middleware services for human-centered query rewriting. arXiv preprint arXiv:2305.08272.
    18. Ye, C., Duan, H., Zhang, H., Wu, Y., & Dai, G. (2024). Learned Query Optimization by Constraint-Based Query Plan Augmentation. Mathematics, 12(19), 3102.
    19. Chen, X., Chen, H., Liang, Z., Liu, S., Wang, J., Zeng, K., ... & Zheng, K. (2023). Leon: A new framework for ml-aided query optimization. Pro-ceedings of the VLDB Endowment, 16(9), 2261-2273.
    20. Du, Y., Cai, Z., & Ding, Z. (2024). Query Optimization in Distributed Database Based on Improved Artificial Bee Colony Algorithm. Applied Sci-ences, 14(2), 846.
    21. Liu, P., Cai, P., Zhong, K., Li, C., & Chen, H. (2025). LRP: learned robust data partitioning for efficient processing of large dynamic queries. Fron-tiers of Computer Science, 19(9), 199607.
    22. Zhou, W., Lin, C., Zhou, X., & Li, G. (2024). Breaking It Down: An In-Depth Study of Index Advisors. Proceedings of the VLDB Endowment, 17(10), 2405-2418.
    23. Gadde, H. (2022). Integrating AI into SQL Query Processing: Challenges and Opportunities. International Journal of Advanced Engineering Tech-nologies and Innovations, 1(3), 194-219.
    24. Halperin, I. (2022). Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions: by Warren B. Powell (ed.), Wiley (2022). Hardback. ISBN 9781119815051 (Vol. 22, No. 12, pp. 2151-2154). Routledge.
    25. Wen, M., Kuba, J., Lin, R., Zhang, W., Wen, Y., Wang, J., & Yang, Y. (2022). Multi-agent reinforcement learning is a sequence modeling problem. Advances in Neural Information Processing Systems, 35, 16509-16521.
    26. Liang, P. P., Zadeh, A., & Morency, L. P. (2024). Foundations & trends in multimodal machine learning: Principles, challenges, and open questions. ACM Computing Surveys, 56(10), 1-42.
    27. Boeschen, N., Ziegler, T., & Binnig, C. (2024). GOLAP: A GPU-in-Data-Path Architecture for High-Speed OLAP. Proceedings of the ACM on Management of Data, 2(6), 1-26.
    28. Freitag, M., Bandle, M., Schmidt, T., Kemper, A., & Neumann, T. (2020). Adopting worst-case optimal joins in relational database systems. Pro-ceedings of the VLDB Endowment, 13(12), 1891-1904.
    29. Yan, Z., Uotila, V., & Lu, J. (2023). Join order selection with deep reinforcement learning: fundamentals, techniques, and challenges. Proceedings of the VLDB Endowment, 16(12), 3882-3885.
    30. L’Esteve, R. (2022). Adaptive query execution. In The Azure Data Lakehouse Toolkit: Building and Scaling Data Lakehouses on Azure with Delta Lake, Apache Spark, Databricks, Synapse Analytics, and Snowflake (pp. 327-338). Berkeley, CA: Apress.
    31. Gadde, H. (2022). Integrating AI into SQL Query Processing: Challenges and Opportunities. International Journal of Advanced Engineering Tech-nologies and Innovations, 1(3), 194-219.
    32. Butrovich, M., Lim, W. S., Ma, L., Rollinson, J., Zhang, W., Xia, Y., & Pavlo, A. (2022, June). Tastes great! less filling! high performance and ac-curate training data collection for self-driving database management systems. In Proceedings of the 2022 International Conference on Management of Data (pp. 617-630).
    33. Zhu, R., Wu, Z., Han, Y., Zeng, K., Pfadler, A., Qian, Z., ... & Cui, B. (2020). FLAT: fast, lightweight and accurate method for cardinality estima-tion. arXiv preprint arXiv:2011.09022.
    34. Helskyaho, H., Yu, J., Yu, K., Helskyaho, H., Yu, J., & Yu, K. (2021). Oracle Autonomous Database for Machine Learning. Machine Learning for Oracle Database Professionals: Deploying Model-Driven Applications and Automation Pipelines, 97-133.
    35. Edifecs. (2023). XEngine Server Technical Documentation.
    36. Ronanki, R. (2024). Revolutionizing Health Care with AI: A New Era of Efficiency, Trust, and Care Excellence. NEJM AI Sponsored
  • Downloads

  • How to Cite

    Sirigiri, K. . (2025). Enhancing SQL Query Performance: A Case Study on Optimizing Enterprise Data Processing. International Journal of Basic and Applied Sciences, 14(5), 353-360. https://doi.org/10.14419/x2wqqh31