Multimodal contrastive learning for optimized integration of ECG signals and clinical reports for arrhythmia classification

  • Authors

    • Ragavan Veerarajan, Research Scholar, Sri Ramachandra Faculty of Engineering and Technology, Sri Ramachandra Institute of Higher Education and Research, Chennai, India
    • Vanitha V, Associate Professor, Sri Ramachandra Faculty of Engineering and Technology, Sri Ramachandra Institute of Higher Education and Research, Chennai, India
    https://doi.org/10.14419/x1f6be61

    Received date: May 1, 2025

    Accepted date: May 21, 2025

    Published date: May 28, 2025

  • Keywords

    Contrastive Learning; Electrocardiogram; Multimodal Learning; Arrhythmia
  • Abstract

    Early and accurate detection of cardiac arrhythmias is essential for timely diagnosis and effective treatment. Traditional ECG-based classification models rely primarily on signal processing techniques and often neglect the contextual information contained in clinical notes. This study introduces a novel multimodal contrastive learning framework that integrates ECG signals and clinical notes for arrhythmia classification, an approach that has not previously been explored in ECG-based diagnostics. Unlike existing methods that process ECG and clinical text separately, our framework employs contrastive learning for early feature alignment, ensuring a meaningful representation of both modalities. Additionally, we propose an alternative alignment strategy that enables ECG-text integration without requiring direct patient-level mapping, addressing a major challenge in real-world healthcare applications. A neural network extracts temporal features from ECG signals, while a BERT-based model processes textual clinical data. Contrastive learning aligns the multimodal representations, improving the model's ability to differentiate between normal and abnormal rhythms. Experimental results demonstrate 97.3% accuracy and an AUC-ROC of 0.98, significantly outperforming unimodal approaches. The proposed method enhances model generalization, robustness, and interpretability, making it suitable for real-world clinical applications. This study contributes to the development of AI-driven diagnostic tools that can integrate multimodal patient data, improving the reliability of automated arrhythmia detection in healthcare settings.
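
    The abstract does not state which contrastive objective is used, so the following is only a minimal sketch of one common choice: a symmetric InfoNCE-style loss over a batch of paired ECG and text embeddings. The function names, the temperature value of 0.1, and the toy embeddings are all hypothetical and are not taken from the paper. The sketch also assumes that row i of each batch forms a matched pair, which is a simplification of the paper's alternative alignment strategy without direct patient-level mapping.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(ecg_emb, text_emb, temperature=0.1):
    """InfoNCE-style loss averaged over ECG-to-text and text-to-ECG directions.

    Assumption (not from the paper): row i of ecg_emb and row i of text_emb
    are a matched pair; all other rows in the batch act as negatives.
    """
    ecg = [l2_normalize(v) for v in ecg_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(e, t)) / temperature for t in txt]
        for e in ecg
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-ECG direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy 3-dimensional embeddings standing in for encoder outputs (hypothetical values).
ecg_batch = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1], [0.3, 0.2, 1.0]]
text_aligned = [[0.9, 0.1, 0.1], [0.1, 0.8, 0.0], [0.2, 0.1, 0.9]]
text_shuffled = [text_aligned[1], text_aligned[2], text_aligned[0]]

loss_aligned = symmetric_contrastive_loss(ecg_batch, text_aligned)
loss_shuffled = symmetric_contrastive_loss(ecg_batch, text_shuffled)
print(loss_aligned < loss_shuffled)
```

    In this toy setup the loss is lower when each ECG embedding sits next to its own text embedding than when the pairs are shuffled, which is the behaviour a contrastive alignment objective is meant to reward. In practice the embeddings would come from the ECG network and the BERT-based text encoder rather than from hand-written lists.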

  • How to Cite

    Veerarajan, R., & V, V. (2025). Multimodal contrastive learning for optimized integration of ECG signals and clinical reports for arrhythmia classification. International Journal of Basic and Applied Sciences, 14(1), 453-463. https://doi.org/10.14419/x1f6be61