Diagnostic accuracy of artificial intelligence models in detecting congenital heart disease in the second-trimester fetus through prenatal cardiac screening: a systematic review and meta-analysis.

Liastuti LD, Nursakina Y. Front Cardiovasc Med. 2025 Feb 24;12:1473544. PMID: 40066351 PMCID: PMC11891181 DOI: 10.3389/fcvm.2025.1473544

Take home points:

  • Artificial intelligence is a promising modality that could serve as a cost-effective tool for CHD screening in low- and middle-income countries.
  • Deep learning AI models that combine gradient-weighted class activation mapping (Grad-CAM) with guided backpropagation (Guided-BP) can enhance the accuracy of screening.
  • Using AI for CHD screening involves a steep learning curve: models will need large amounts of data and substantial computing power to learn and adapt to patient variation across geographic locations and ethnicities.

Commentary from Dr. Vimal Jayswal (Indiana, USA), editor of Pediatric & Fetal Cardiology Journal Watch:

Congenital heart disease (CHD) is the most common congenital abnormality, affecting approximately 1% of live births worldwide. CHD contributes significantly to morbidity and infant mortality and imposes a heavy burden on global healthcare costs, one that is particularly pronounced in low- and middle-income countries (LMICs), especially those with high fertility rates. An optimal screening tool that detects CHD early with high sensitivity and specificity is therefore highly desirable. Integrating artificial intelligence (AI) is a promising way to bridge the gap between the high demand for prenatal CHD screening and limited resources. Deep learning (DL), a subset of machine learning (ML), is an AI technique that processes data through multiple layers, enabling largely autonomous feature learning, aiding decision-making, and revealing findings that may otherwise elude human detection.

The integration of AI with fetal ultrasound has been shown to significantly improve clinical efficiency, reduce subjective variability due to operator expertise differences, standardize plane acquisition, and provide potential solutions for areas with scarce medical resources. 

This systematic review and meta-analysis aims to summarize recent research findings on AI’s diagnostic performance in CHD diagnosis during the second trimester of pregnancy.

Methods: The review used a standard search strategy and selection criteria, adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations, and is registered with PROSPERO (number CRD42023461738). Seven databases (Embase, PubMed, MEDLINE, Cochrane, Global Health, IEEE Xplore, and Scopus) were systematically searched up to 30 September 2023.

Study eligibility

Eligible studies examined fetuses in the second trimester (13–26 weeks of gestation), the gold standard period for fetal organ screening, especially prenatal cardiac screening, regardless of geographical location. The studies had to use prenatal ultrasound or echocardiography screening augmented with AI, including but not limited to machine learning and deep learning techniques. The performance parameters assessed were sensitivity, specificity, negative predictive value, positive predictive value (precision), F1 score, receiver operating characteristic (ROC) curve, area under the curve (AUC), and Dice coefficient.
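To make these performance parameters concrete, the following minimal Python sketch computes the count-based ones from a 2×2 confusion matrix. The class and function names (`ConfusionCounts`, `screening_metrics`) and the example counts are hypothetical illustrations, not data or code from the reviewed studies. ROC and AUC are omitted because they require continuous model scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical 2x2 screening result (CHD = positive class)."""
    tp: int  # CHD fetuses correctly flagged by the model
    fp: int  # normal fetuses incorrectly flagged
    tn: int  # normal fetuses correctly cleared
    fn: int  # CHD fetuses missed by the model


def _ratio(numerator, denominator):
    # Return None when the metric is undefined (empty denominator).
    return numerator / denominator if denominator else None


def screening_metrics(c):
    """Return the count-based diagnostic accuracy metrics as a dict."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),          # precision
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        # F1 is the harmonic mean of precision and sensitivity. The Dice
        # coefficient has the same form when the counts are taken per pixel
        # in a segmentation mask instead of per fetus.
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = ConfusionCounts(tp=45, fp=10, tn=90, fn=5)
    for name, value in screening_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Positive and negative predictive values depend on how common CHD is in the tested sample, so values from an enriched study dataset may not carry over to population screening, where prevalence is about 1%.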

Discussion

According to this study, AI models demonstrate very high performance in detecting CHD compared to conventional methods (i.e., clinician’s diagnosis of CHD). The DenseNet 201 model, tested on an intra-patient dataset in a study by Qiao et al., achieved 100% sensitivity and specificity and thus 100% accuracy. This could be achieved by combining gradient-weighted class activation mapping (Grad-CAM) with guided backpropagation (Guided-BP): abnormal pixels in ultrasound images are highlighted and visualized, which improves interpretability for expert fetal cardiologists.
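As a rough illustration of how Grad-CAM and guided backpropagation can be combined to highlight pixels, the sketch below follows the general Grad-CAM formulation: each feature-map channel is weighted by the spatial average of its gradient, the weighted channels are summed, and negative values are clipped. It is a conceptual sketch in plain Python, not the pipeline used by Qiao et al. The function names and the tiny input arrays are hypothetical, and the code assumes the activations, gradients, and guided-backpropagation map were already extracted from a trained network and share the same resolution (in practice the Grad-CAM map is usually upsampled to the input image size first).

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations, gradients: nested lists shaped [channels][height][width],
    holding a convolutional layer's feature maps and the gradients of the
    class score with respect to those feature maps.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act[i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(value, 0.0) for value in row] for row in cam]


def guided_grad_cam(cam, guided_bp):
    """Elementwise product of a Grad-CAM map and a guided-backprop map."""
    return [[c * g for c, g in zip(cam_row, gbp_row)]
            for cam_row, gbp_row in zip(cam, guided_bp)]


if __name__ == "__main__":
    # Tiny made-up tensors: 2 channels on a 2x2 grid.
    acts = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]]
    grads = [[[1.0, 1.0], [1.0, 1.0]], [[-2.0, -2.0], [-2.0, -2.0]]]
    heatmap = grad_cam(acts, grads)
    print(heatmap)
    print(guided_grad_cam(heatmap, [[0.5, 0.5], [0.5, 2.0]]))
```

The coarse Grad-CAM map indicates where the model looked, while the guided-backpropagation map supplies pixel-level detail; multiplying the two gives a sharper overlay that a clinician can compare against the ultrasound image.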

In summary, artificial intelligence models, especially deep learning techniques, have shown effective results in detecting CHD. However, it is important to carefully consider various factors such as the data acquisition process, characteristics of the data, characteristics of the population being analyzed, weight reduction of the algorithm, working principle, and interpretability of the model to develop a practical medical AI model that can be applied in real-world scenarios.

Conclusion

While there are some obstacles to using AI models in clinical practice, there is potential for AI to improve CHD diagnosis. However, more extensive studies are necessary to compare AI algorithms with conventional methods and to include a broader range of patients. Once these studies are completed and AI algorithms are validated, they may be helpful in clinical practice, especially in low- and middle-income countries.