Gesture Language Recognition Through Computer Vision and a Spatial-Temporal Mathematical Model

  • Authors

    • S. Manikandan, Department of Information Technology, E.G.S. Pillay Engineering College, Nagapattinam, India
    • T. Suma, Department of Computer Science and Engineering, Sri Venkateshwara College of Engineering, Bengaluru, Karnataka, India
    • P. Dhanalakshmi, Department of Artificial Intelligence & Machine Learning, Mohan Babu University, Tirupati, India
    • K. C. Rajheshwari, Department of Computer Science and Engineering, Sona College of Technology, Salem, India
    • T. Tamilselvi, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai, India
    • V. Subedha, Department of Computer Science and Engineering, Panimalar Engineering College, Chennai, India
    https://doi.org/10.14419/mwb8ve64

    Received date: June 17, 2025

    Accepted date: July 14, 2025

    Published date: July 20, 2025

  • Keywords

    Sign Language Recognition; Gesture Recognition; Continuous Sign Language Recognition; Deep Learning; Artificial Intelligence; Spatial-Temporal Learning; Assistive Technology.
  • Abstract

    For people with speech and hearing impairments, sign language is a vital form of communication that allows them to engage with others. Nonetheless, the general public's limited comprehension of sign language remains a major obstacle. For deaf and mute communities, this communication gap frequently results in challenges with social inclusion, education, and career prospects. To address this problem, researchers are increasingly using deep learning and artificial intelligence (AI) techniques to create automatic sign language recognition (SLR) systems that can instantly translate sign motions into speech or text. This paper presents a hybrid method that combines continuous sign language recognition (CSLR) and isolated sign language recognition in a single deep learning framework. The system uses a Spatial-Temporal Network (STNet) to identify dynamic sign sequences in CSLR and a Convolutional Neural Network (CNN) for isolated sign identification. An ensemble learning technique is included to increase model robustness, and an optimized Inception-based architecture is used for isolated sign classification to boost performance. Additionally, a novel Spatial Resonance Module (SRM) refines frame-to-frame feature extraction, and a Multi-Temporal Perception Module (MTPM) strengthens long-range dependency recognition in sign sequences. These advancements contribute to higher accuracy and efficiency in sign language interpretation. Experimental validation of the proposed system on benchmark datasets demonstrated superior performance compared to existing state-of-the-art techniques: the model achieved an accuracy of 98.46% in isolated sign recognition and a 2.9% improvement on CSLR tasks. The ability to accurately recognize and translate sign language in both isolated and continuous contexts makes the system well suited to real-time applications, including assistive communication devices, virtual interpreters, and educational tools. The proposed research has the potential to significantly improve accessibility and inclusivity for individuals with speech and hearing impairments. By integrating deep learning with real-time processing, the system enhances human-computer interaction and fosters seamless communication between sign language users and the broader community. Future research can explore the integration of additional modalities, such as facial expressions and hand movement trajectories, to further refine sign language recognition models and ensure even greater accuracy and adaptability.
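
    The architecture sketched in the abstract maps naturally onto a small amount of code. The PyTorch sketch below is a minimal illustration of the hybrid pipeline, assuming plausible internals for the modules the paper names: the SRM is rendered as motion-gated frame refinement, the MTPM as parallel dilated temporal convolutions, and a tiny CNN stands in for the Inception-based backbone. Every layer choice, tensor shape, and class name here (`SpatialResonanceModule`, `MultiTemporalPerceptionModule`, `HybridSLRNet`) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch of the hybrid SLR/CSLR pipeline described in the abstract.
# All module internals and shapes are illustrative assumptions.
import torch
import torch.nn as nn


class SpatialResonanceModule(nn.Module):
    """Hypothetical SRM: refines per-frame features with a gate driven by
    frame-to-frame differences, emphasizing motion over static background."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels, height, width)
        diff = torch.cat(
            [torch.zeros_like(x[:, :1]), x[:, 1:] - x[:, :-1]], dim=1
        )                                            # temporal difference
        b, t, c, h, w = x.shape
        g = self.gate(diff.reshape(b * t, c, h, w)).reshape(b, t, c, h, w)
        return x * g                                 # motion-weighted features


class MultiTemporalPerceptionModule(nn.Module):
    """Hypothetical MTPM: parallel dilated 1-D convolutions capture short- and
    long-range dependencies across the sign sequence."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); each branch preserves sequence length
        return sum(branch(x) for branch in self.branches) / len(self.branches)


class HybridSLRNet(nn.Module):
    """Toy hybrid model: a small per-frame CNN backbone (standing in for the
    Inception-based classifier), SRM refinement, MTPM temporal modelling,
    and a per-frame gloss classifier for the continuous (CSLR) branch."""

    def __init__(self, num_glosses: int, channels: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        self.srm = SpatialResonanceModule(channels)
        self.mtpm = MultiTemporalPerceptionModule(channels)
        self.head = nn.Linear(channels, num_glosses)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, height, width)
        b, t = clip.shape[:2]
        f = self.backbone(clip.flatten(0, 1))        # (b*t, c, 8, 8)
        f = f.reshape(b, t, *f.shape[1:])
        f = self.srm(f)                              # frame-to-frame refinement
        f = f.mean(dim=(-1, -2))                     # (b, t, c) spatial pool
        f = self.mtpm(f.transpose(1, 2)).transpose(1, 2)
        return self.head(f)                          # (b, t, num_glosses)


if __name__ == "__main__":
    model = HybridSLRNet(num_glosses=100)
    logits = model(torch.randn(2, 16, 3, 64, 64))    # 2 clips of 16 frames
    print(logits.shape)                              # torch.Size([2, 16, 100])
```

    Per-frame gloss logits like these are commonly trained with a CTC loss in CSLR work; the abstract does not state the authors' training objective, so that choice, too, would be an assumption.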

  • How to Cite

    Manikandan, S., Suma, T., Dhanalakshmi, P., Rajheshwari, K. C., Tamilselvi, T., & Subedha, V. (2025). Gesture Language Recognition Through Computer Vision and a Spatial-Temporal Mathematical Model. International Journal of Basic and Applied Sciences, 14(3), 156-162. https://doi.org/10.14419/mwb8ve64