Multimodal contrastive learning for optimized integration of ECG signals and clinical reports for arrhythmia classification
https://doi.org/10.14419/x1f6be61
Received date: May 1, 2025
Accepted date: May 21, 2025
Published date: May 28, 2025
Keywords: Contrastive Learning; Electrocardiogram; Multimodal Learning; Arrhythmia
Abstract
Early and accurate detection of cardiac arrhythmias is essential for timely diagnosis and effective treatment. Traditional ECG-based classification models rely primarily on signal processing techniques and often neglect the contextual information contained in clinical notes. This study introduces a multimodal contrastive learning framework that integrates ECG signals and clinical notes for arrhythmia classification, an approach that has not previously been explored in ECG-based diagnostics. Unlike existing methods that process ECG and clinical text separately, our framework uses contrastive learning for early feature alignment, so that both modalities are represented in a shared, meaningful feature space. We also propose an alternative alignment strategy that enables ECG-text integration without requiring direct patient-level mapping, addressing a major challenge in real-world healthcare applications. A neural network extracts temporal features from the ECG signals, while a BERT-based model processes the textual clinical data. Aligning the two representations contrastively improves the model’s ability to distinguish normal rhythms from arrhythmias. Experimental results show 97.3% accuracy and an AUC-ROC of 0.98, significantly outperforming unimodal approaches. The proposed method improves model generalization, robustness, and interpretability, making it suitable for real-world clinical applications. This study contributes to the development of AI-driven diagnostic tools that integrate multimodal patient data and improve the reliability of automated arrhythmia detection in healthcare settings.
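The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which contrastive objective is used, so the example below assumes a symmetric InfoNCE loss over a batch of paired embeddings; this is an illustrative choice, not necessarily the authors' loss. The function names, the temperature value, and the random arrays standing in for ECG-encoder and BERT outputs are all hypothetical, and the alternative alignment strategy without patient-level mapping is not covered.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(ecg_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE loss (assumed objective, not taken from the paper).

    Row i of ecg_emb and row i of text_emb are treated as a matched pair;
    every other row in the batch acts as a negative.
    """
    ecg = l2_normalize(ecg_emb)
    text = l2_normalize(text_emb)
    logits = ecg @ text.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    # ECG -> text: each row should assign the most mass to its own column.
    loss_ecg_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text -> ECG: each column should assign the most mass to its own row.
    loss_text_to_ecg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ecg_to_text + loss_text_to_ecg)


# Hypothetical stand-ins for encoder outputs (batch of 8, 32-dim embeddings).
rng = np.random.default_rng(0)
ecg_features = rng.normal(size=(8, 32))
aligned_text = ecg_features + 0.05 * rng.normal(size=(8, 32))  # nearly matched
random_text = rng.normal(size=(8, 32))                         # unrelated

loss_aligned = symmetric_info_nce(ecg_features, aligned_text)
loss_random = symmetric_info_nce(ecg_features, random_text)
print(f"aligned pairs: {loss_aligned:.3f}, unrelated pairs: {loss_random:.3f}")
```

Under these assumptions the loss is small when each ECG embedding lies close to its paired text embedding and larger when the pairing carries no information, which is the signal a training loop would minimize to pull the two modalities into a shared space.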
How to Cite
Veerarajan, R., & V, V. (2025). Multimodal contrastive learning for optimized integration of ECG signals and clinical reports for arrhythmia classification. International Journal of Basic and Applied Sciences, 14(1), 453-463. https://doi.org/10.14419/x1f6be61
