Original Article
Development and validation of a PET/CT radiomics and dual-task learning model for the prediction of pathological subtypes and EGFR mutation in non-small cell lung cancer
Abstract
Background: Accurate pathological subtyping and epidermal growth factor receptor (EGFR) mutation profiling are critical for personalized management of non-small cell lung cancer (NSCLC). However, conventional invasive biopsy is poorly suited to dynamic monitoring and samples only part of a heterogeneous tumor. Dual-modal positron emission tomography/computed tomography (PET/CT) provides non-invasive phenotypic information, but deep learning models that fuse the two modalities for simultaneous prediction while remaining clinically interpretable are scarce. This study therefore proposes an integrated dual-modal PET/CT radiomics framework for the simultaneous prediction of pathological subtype and EGFR mutation status in NSCLC.
Methods: This retrospective study included 384 patients with NSCLC who had PET/CT images, drawn from three independent cohorts. Sub-regional radiomic features were extracted from the CT images, and spatial metabolic heterogeneity descriptors were derived from the PET images. Building on these features, a Dual-Modal Dual-task Prediction (DMDP) model was developed. The model uses a multi-scale cross-attention mechanism to fuse PET and CT information and a dual-task learning strategy to predict EGFR mutation status and pathological subtype jointly. The contribution of each model component was assessed through ablation studies, and decision interpretability was assessed using gradient-weighted class activation mapping (Grad-CAM) heatmaps.
Results: Significant differences were identified in PET metabolic parameters and imaging heterogeneity across pathological subtypes and EGFR mutation states (P<0.05). The DMDP model outperformed single-task and traditional machine learning approaches. For EGFR mutation prediction, the model achieved an area under the curve (AUC) of 0.93 (95% CI: 0.81–1.00), with an accuracy of 0.88 (95% CI: 0.83–0.98), sensitivity of 0.86 (95% CI: 0.74–0.95), and specificity of 0.88 (95% CI: 0.75–0.94). For pathological subtyping, the model achieved an AUC of 0.88 (95% CI: 0.73–0.98), sensitivity of 0.85 (95% CI: 0.73–0.95), and specificity of 0.88 (95% CI: 0.77–0.96), demonstrating balanced diagnostic performance compared with traditional models. Integrating multimodal heterogeneity features enhanced predictive performance (P<0.001). Grad-CAM analysis suggested that the model focused on tumor margins and hypermetabolic regions.
Conclusions: The DMDP framework integrated structural and metabolic information and showed potential for non-invasive prediction of pathological subtypes and EGFR mutation status in NSCLC, providing a possible basis for imaging-based risk stratification in selected clinical settings.
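Illustrative sketch: The Methods describe cross-attention fusion of PET and CT features feeding two prediction heads. The following is a minimal sketch of that general idea, not the DMDP implementation. It assumes single-scale, single-head scaled dot-product attention in which CT feature vectors query PET feature vectors, mean pooling, two logistic heads, and a weighted sum of binary cross-entropy losses. All function names, feature values, head weights, and the weighting parameter alpha are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query attends over keys_values (keys double as values here)."""
    scale = math.sqrt(len(queries[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        fused.append([sum(w * v[i] for w, v in zip(weights, keys_values))
                      for i in range(len(q))])
    return fused

def mean_pool(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    eps = 1e-12  # avoid log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def dual_task_forward(ct_tokens, pet_tokens, w_egfr, w_subtype):
    # Shared representation: CT tokens query PET tokens, then mean pooling.
    shared = mean_pool(cross_attention(ct_tokens, pet_tokens))
    # Two task heads read the same shared representation.
    return sigmoid(dot(shared, w_egfr)), sigmoid(dot(shared, w_subtype))

def joint_loss(p_egfr, p_subtype, y_egfr, y_subtype, alpha=0.5):
    # alpha is a hypothetical task-balancing weight.
    return alpha * bce(p_egfr, y_egfr) + (1 - alpha) * bce(p_subtype, y_subtype)

# Toy feature vectors and head weights (made up for illustration).
ct_tokens = [[0.2, 0.1, 0.4], [0.5, 0.3, 0.1]]
pet_tokens = [[0.9, 0.2, 0.3], [0.1, 0.8, 0.6], [0.4, 0.4, 0.4]]
w_egfr = [0.5, -0.2, 0.1]
w_subtype = [-0.3, 0.4, 0.2]

p_egfr, p_subtype = dual_task_forward(ct_tokens, pet_tokens, w_egfr, w_subtype)
loss = joint_loss(p_egfr, p_subtype, y_egfr=1, y_subtype=0)
print(p_egfr, p_subtype, loss)
```

Because both heads read one fused representation, gradients from either task would update the shared features during training, which is the usual motivation for dual-task learning; a multi-scale variant would repeat the attention step at several feature resolutions.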

