CT Angiography Clot Burden Score from Data Mining of Structured Reports for Pulmonary Embolism


Background Many studies emphasize the role of structured reports (SRs) because they are readily accessible for further automated analyses. However, the use of SR data obtained in clinical routine for research purposes is not yet well represented in the literature.
Purpose To compare the performance of the Qanadli scoring system with a clot burden score mined from structured pulmonary embolism (PE) reports from CT angiography.
Materials and Methods In this retrospective study, a rule-based text mining pipeline was developed to extract descriptors of PE and right heart strain from SRs of patients with suspected PE between March 2017 and February 2020. From standardized PE reporting, a pulmonary artery obstruction index (PAOI) clot burden score (PAOICBS) was derived and compared with the Qanadli score (PAOIQ). Scoring time and confidence from two independent readings were compared. Interobserver and interscore agreement were tested by using the intraclass correlation coefficient (ICC) and Bland-Altman analysis. To assess conformity and diagnostic performance of both scores, areas under the receiver operating characteristic curve (AUCs) were calculated to predict right heart strain incidence, as were optimal cutoff values for maximum sensitivity and specificity.
Results SR content authored by 67 residents and signed off by 32 consultants from 1248 patients (mean age, 63 years ± 17 [standard deviation]; 639 men) was extracted accurately and allowed for PAOICBS calculation in 304 of 357 (85.2%) PE-positive reports. The PAOICBS strongly correlated with the PAOIQ (r = 0.94; P < .001). Compared with PAOIQ, use of PAOICBS yielded overall time savings (1.3 minutes ± 0.5 vs 3.0 minutes ± 1.7), higher confidence levels (4.2 ± 0.6 vs 3.6 ± 1.0), and a higher ICC (0.99 vs 0.95) (each, P < .001). AUCs were similar for PAOICBS (AUC, 0.75; 95% CI: 0.70, 0.81) and PAOIQ (AUC, 0.77; 95% CI: 0.72, 0.83; P = .68), with cutoff values of 27.5% for both scores.
Conclusion Data mining of structured reports enabled the development of a CT angiography scoring system that simplified the Qanadli score as a semiquantitative estimate of thrombus burden in patients with pulmonary embolism. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Hunsaker in this issue.
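
The two agreement and cutoff analyses named in Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `bland_altman` and `youden_cutoff` are hypothetical, the 1.96 × SD limits of agreement follow the conventional Bland-Altman definition, and the use of Youden's J (sensitivity + specificity − 1) to select the cutoff is an assumption, because the abstract states only that cutoffs were chosen for maximum sensitivity and specificity.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Bias is the mean of the paired differences a - b; the limits of
    agreement are bias +/- 1.96 times the sample SD of the differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A case is called positive when score >= cutoff. Labels are 1 for the
    outcome (e.g. right heart strain present) and 0 otherwise. Ties in J
    are resolved in favour of the lowest cutoff.
    Assumption: Youden's J stands in for the unspecified criterion.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes must be present")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]
```

Calling `bland_altman(paoi_cbs, paoi_q)` on paired per-patient scores would give the mean interscore difference and its limits of agreement, and `youden_cutoff(paoi_cbs, rhs_present)` would give a candidate threshold for predicting right heart strain. The variable names in these calls are placeholders for data that the abstract does not provide.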