Spectral dual-layer detector CT-based radiomics-deep learning for predicting pathological aggressiveness of stage I lung adenocarcinoma: discrimination of precursor glandular lesions and invasive adenocarcinomas
Original Article

Tong Wang1, Zheng Fan2, Yong Yue1, Xiaomei Lu3, Xiaoxu Deng4, Yang Hou1

1Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, China; 2Department of Orthopedics, Shengjing Hospital of China Medical University, Shenyang, China; 3CT Clinical Science, Philips Healthcare, Shenyang, China; 4Department of Pathology, Shengjing Hospital of China Medical University, Shenyang, China

Contributions: (I) Conception and design: T Wang, Z Fan, Y Hou; (II) Administrative support: Y Hou; (III) Provision of study materials or patients: T Wang, Y Yue, X Deng; (IV) Collection and assembly of data: T Wang, Z Fan, Y Yue; (V) Data analysis and interpretation: T Wang, Z Fan, X Lu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Yang Hou, MD, PhD. Department of Radiology, Shengjing Hospital of China Medical University, No. 36 Sanhao Street, Heping District, Shenyang 110004, China. Email: houyang_sj@126.com.

Background: Accurate diagnosis of early-stage lung adenocarcinoma (LA) subtypes is crucial for optimal patient management. Radiomics extracts features from medical images that reflect underlying biological information, while the effective atomic number (Zeff) from new-generation spectral dual-layer detector computed tomography (SDCT) reflects tissue composition. This study evaluated the utility of SDCT-Zeff-based radiomics, deep learning (DL), and clinical features to differentiate between ground-glass nodule (GGN)-featured precursor glandular lesions (PGLs) and adenocarcinomas.

Methods: Patients diagnosed with GGN who underwent preoperative contrast-enhanced SDCT at two medical centers were prospectively enrolled between January 2022 and April 2024. Center 1 (Shengjing Hospital of China Medical University; n=582) served as the training cohort, while Center 2 (Shengjing Hospital, Huaxiang Branch; n=210) served as the external validation cohort. The region of interest (ROI) for radiomics feature extraction was delineated on SDCT-Zeff images. A pre-trained ResNet50 model was used for DL feature extraction. Features were fused, screened, and integrated with various machine learning algorithms and clinical features to construct a clinical-based DL radiomics (DLR) signature nomogram, which was externally validated. Model performance was assessed in terms of discrimination, calibration, and clinical utility.

Results: A total of 792 GGNs were analyzed, classified as glandular precursor lesions (n=296) and adenocarcinomas (n=496). Zeff was inversely correlated with invasiveness. Three feature sets were obtained: clinical, radiomics, and DL. LightGBM was identified as the best-performing model. The areas under the curve (AUCs) of DLR in the training and test sets were 0.974 [95% confidence interval (CI): 0.963–0.983] and 0.827 (95% CI: 0.770–0.884), outperforming radiomics (AUC =0.897 and 0.765) and DL (AUC =0.929 and 0.758). The nomogram coupling DLR with clinical features [Zeff_a, electron density (ED)_a, and tumor abnormal protein (TAP)] showed the best predictive ability, with AUCs of 0.983 (95% CI: 0.974–0.990) and 0.833 (95% CI: 0.779–0.885) in the training and test sets. The calibration curve indicated strong agreement between predicted and observed outcomes in both cohorts. Decision curve analysis (DCA) revealed that this nomogram offers significant clinical benefits, with a threshold probability range exceeding that of the other models.

Conclusions: The coupled nomogram integrating SDCT-Zeff DLR with clinical features demonstrated improved predictive performance and was particularly effective in distinguishing GGN-featured glandular precursor lesions from adenocarcinomas. It provides a foundation for managing GGNs and offers valuable insights for preoperative evaluation.

Keywords: Lung adenocarcinoma (LA); spectral computed tomography (spectral CT); deep learning (DL); radiomics; tumor abnormal protein (TAP)


Submitted Aug 19, 2024. Accepted for publication Jan 09, 2025. Published online Feb 27, 2025.

doi: 10.21037/tlcr-24-726


Highlight box

Key findings

• Our findings showed that a clinical-based deep learning (DL) radiomics nomogram demonstrated superior performance in differentiating precursor glandular lesions (PGLs) and adenocarcinomas compared to traditional methods.

What is known and what is new?

• The 2021 World Health Organization classification identifies PGL and invasive adenocarcinomas. Accurate diagnosis of early-stage lung adenocarcinoma, especially those manifesting as ground-glass nodules (GGNs), is crucial for treatment planning. Spectral dual-layer detector computed tomography (SDCT) and radiomics, including DL techniques, offer improved diagnostic capabilities. Existing methods, including traditional computed tomography-based radiomics, had limitations in differentiating benign and malignant lesions.

• SDCT and radiomics, including DL techniques, offer improved diagnostic capabilities. The study’s novelty lies in integrating SDCT-effective atomic number-based radiomics, DL, and clinical features into a comprehensive model for differentiating PGLs and adenocarcinomas. This approach represents a significant advancement over previous studies that relied solely on radiomics or clinical factors.

What is the implication, and what should change now?

• In conclusion, the adoption of this coupled nomogram holds the potential to revolutionize the diagnosis and management of GGNs, particularly in differentiating between PGLs and adenocarcinomas. The next steps involve further validation, integration into clinical practice, and continuous refinement to maximize its utility in personalized patient care.


Introduction

Non-small cell lung cancer (NSCLC) ranks as a primary contributor to cancer-related morbidity and mortality globally. Among the various histological subtypes of NSCLC, lung adenocarcinoma (LA) is the most prevalent, representing around 55–60% of all cases (1,2). The 2021 World Health Organization (WHO) classification of LA introduced precursor glandular lesions (PGLs), which are divided into atypical adenomatous hyperplasia (AAH) and adenocarcinoma in situ (AIS), with invasive lesions including minimally invasive adenocarcinoma (MIA) and invasive adenocarcinoma (IAC) (3). PGLs are biologically inert and generally have a good prognosis (4).

Recently, a rising incidence of LAs manifesting as ground-glass nodules (GGNs) has been identified. Nevertheless, the early detection rate of lung cancer in China continues to be comparatively low. Therefore, reliable diagnosis of the pathological subtypes of early-stage LA is critical for determining surgical timing, resection range, and prognosis, and for avoiding overtreatment.

Spectral dual-layer detector computed tomography (SDCT) is a promising tumor identification technique. It collects high- and low-energy data in a traditional scan and provides in-phase, simultaneous, and homologous information, which enhances data acquisition accuracy. SDCT utilizes anti-correlated noise suppression to maintain consistently low noise levels (5). Recent studies indicate that multiple sets and multimodal quantitative parameters derived from SDCT, such as the effective atomic number (Zeff) and electron density (ED), can significantly improve the discriminatory ability of GGNs (6). Zeff can be obtained by analyzing X-rays’ energy spectrum and calculating each element’s mass fraction and corresponding linear attenuation coefficient in each voxel. This enables distinguishing and quantitatively analyzing different tissue types, such as tumor differentiation, vascular imaging, and fat deposition. SDCT offers improvements in imaging, material decomposition, and feature quantification compared to traditional computed tomography (CT) (7).

Radiomics extracts many imaging features from CT images, including morphological and textural features, to analyze the correlation between these features and pathological changes. Deep learning (DL) automatically derives quantitative information from images. DL based on convolutional neural networks (CNNs) is an emerging method for identifying the invasiveness of GGNs (8). While traditional CT-based radiomics studies have revealed good potential for distinguishing between benign and malignant lung nodules (9,10), research on SDCT radiomics for lung tumors is limited. The energy-dependent variations in attenuation across different tissues can provide rich, multimodal, and additional quantitative information, potentially improving the performance of prediction models. A retrospective study by Xu et al. (11) of 242 isolated solitary pulmonary solid nodules (SPSNs) found that a Rad-score based on 65 keV performed best in distinguishing benign and malignant SPSNs.

Tumor development is associated with abnormal glycosylation. When normal cells undergo carcinogenesis, the sugar chain structure of their surface glycoproteins undergoes abnormal changes, producing various tumor abnormal proteins (TAPs). TAP is a tumor marker that is predominantly elevated in precancerous lesions and malignant tumors. Thymidine kinase 1 (TK1), a crucial enzyme in the synthesis of DNA precursors, is a quantitative marker of cell proliferation detectable via serology. TAP and TK1 contribute to early warning and screening for mixed GGN (mGGN)-featured LA (12,13).

Radiomics remains the primary method for high-throughput imaging analysis because of its interpretability advantage over DL (14). Combining DL and radiomics features may enhance the identification of invasive GGNs. To our knowledge, there is no research on combining SDCT parameters, DL radiomics (DLR) signature, TAP, and TK1 in GGNs. If these indicators can be successfully integrated into a comprehensive diagnostic model, it could improve the identification of high-risk patients. Therefore, we aimed to evaluate the utility of SDCT-Zeff-based radiomics, DL, and clinical features to differentiate between GGN-featured PGLs and adenocarcinomas. We present this article in accordance with the TRIPOD reporting checklist (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-24-726/rc).


Methods

Data acquisition

This prospective study consecutively recruited patients with GGNs who underwent three-phase enhanced SDCT within 1 month before surgery at two centers between January 2022 and April 2024. The training cohort at Center 1 (Shengjing Hospital of China Medical University) included 582 cases, while the external validation cohort at Center 2 (Shengjing Hospital, Huaxiang Branch) included 210. Data on classic clinical risk factors and biomarkers were collected for this study. The inclusion criteria were: (I) a single GGN [either pure GGN (pGGN) or mGGN], clinical stage 0–I, with a maximum diameter ≤40 mm (on lung window imaging), and no evidence of lymph node involvement or distant metastasis; (II) preoperative three-phase SDCT imaging and assessment of TK1 and TAP levels; and (III) diagnosis confirmed via surgical resection or biopsy. The exclusion criteria were: (I) poor image quality, incomplete clinical data, or absence of surgical or pathological confirmation; and (II) tumor treatment before surgery.

The study design and flowchart are illustrated in Figure 1.

Figure 1 Study design and flowchart. SDCT, spectral dual-layer detector computed tomography; Zeff, effective atomic number.

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Medical Ethics Committee of Shengjing Hospital of China Medical University (No. 2022PS1055K), and informed consent was obtained from all patients.

SDCT protocol, image analysis, and clinical features

All patients underwent a three-phase contrast-enhanced chest CT scan using a spectral CT (IQon Spectral CT; Philips Healthcare, Best, The Netherlands). Iodixanol (270 mg/mL, 50–80 mL; GE Healthcare, Cork, Ireland) was administered intravenously via the antecubital vein, followed by a 20–30 mL saline bolus at a flow rate of 2.0–3.0 mL/s. Arterial and venous phase images were obtained 25 and 60 seconds after contrast injection, respectively, during breath-hold.

The scanning and reconstruction parameters are detailed in Appendix 1.

Images were further analyzed using the IntelliSpace Portal Version 6.5 (Philips Healthcare). A region of interest (ROI) in the lung window was semiautomatically delineated (i.e., automatic recognition-supported manual modification) on the transverse section with the largest diameter. This was followed by synchronization to CT40 keV, CT100 keV, iodine density (ID), and Zeff. The ROIs were made as large as possible, covering over 80% of the lesion area while avoiding large bronchi, vessels, and air-filled cavities within the lesions. The copy-paste functions ensured that the size and position of the ROI remained identical in the arterial and venous phases. Blinded measurements were independently performed by two senior radiologists with 8 and 10 years of experience in thoracic radiological diagnosis, and the average values were calculated. SDCT parameters are detailed in Appendix 1.

Pathological diagnosis

Pathological diagnostic criteria

All pathological diagnoses were independently assessed by two experienced pathologists, each with over 10 years of expertise, who were blinded to imaging and clinical information. The diagnoses were based on the WHO 2021 classification of lung cancer, ensuring consistency and accuracy. All specimens were obtained from surgically resected samples, and multiple tissue blocks were taken from each lesion to minimize sampling errors.

Tissue sample processing

The resected pathological tissue samples were initially fixed in a 4% formaldehyde solution, followed by routine dehydration, paraffin embedding, and sectioning into five consecutive slices (4 µm thick). Hematoxylin and eosin (HE) staining was performed on all sections. Two pathologists independently examined and evaluated the slides under a microscope. In cases of disagreement, a joint review was conducted, with a third senior pathologist consulted for final arbitration if necessary.

Pathological diagnosis classification

Samples were classified as either AAH, AIS, MIA, or IAC.

SDCT-Zeff DLR feature extraction, selection, and model construction

Zeff images were imported into ITK-SNAP software (version 3.8; http://www.itksnap.org). A senior radiologist, blinded to the clinical and histopathological data, manually delineated the three-dimensional (3D) ROI on the continuous layers of the lesion. To ensure the stability and robustness of the radiomics characteristics, 50 patients were randomly selected using a blinded method. Another senior radiologist repeated the image segmentation and feature extraction process, and the intraclass correlation coefficient (ICC) was calculated to evaluate inter-observer agreement. Features with an ICC value >0.75 were considered to demonstrate good agreement and were retained for subsequent analysis.

Radiomics and DL features extraction

Radiomics features extraction

All handcrafted features were extracted using PyRadiomics (http://pyradiomics.readthedocs.io). The software processed both the original images and the ROI masks, categorizing the features into three groups: (I) geometric shape; (II) intensity; and (III) texture. Geometric shape features characterize the three-dimensional (3D) form of the ROI. Intensity features depict the first-order statistical distribution of voxel intensities within the ROI. Texture features describe the spatial distribution of patterns or intensities at the second and higher orders. Several techniques were used to extract textural features, including the gray-level co-occurrence matrix (GLCM), gray-level dependence matrix (GLDM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM).

DL features extraction

We employed ResNet50 as the CNN framework to extract DL features due to its proven performance in handling complex image analysis tasks. ResNet50, a 50-layer deep network, is renowned for effectively managing vanishing gradient issues through residual connections, making it an ideal choice for feature extraction in medical imaging.

The model was initially pre-trained on the ImageNet database, which contains over a million images across a thousand classes. This pre-training allowed ResNet50 to learn generalizable low-level features, such as edges, textures, and patterns, and higher-level representations that are transferable to other image classification tasks. Leveraging this pre-trained model significantly reduced computational time and training data requirements for our study.

In our approach, these ROIs, representing the anatomical regions under investigation, were resized and preprocessed to match the input size required by the ResNet50 framework.

Once prepared, the ROIs were input into ResNet50, and we utilized the penultimate average pooling layer of the network for feature extraction. This specific layer, situated just before the final fully connected classification layer, aggregates spatial information across the feature maps, providing a compact and high-level representation of the input image. We effectively captured discriminative features relevant to our classification task by leveraging this layer.

This transfer learning approach enabled us to extract robust, high-dimensional feature vectors from the ROIs, which were subsequently used in downstream analysis and classification, showcasing the potential of DL in radiomics studies.
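The average-pooling step described above is what turns the network's final stack of feature maps into a single feature vector. The sketch below illustrates only that pooling operation in plain Python on toy values; it is not the authors' pipeline, the function name `global_average_pool` is hypothetical, and a real ResNet50 would supply 2,048 channels rather than the 3 used here.

```python
def global_average_pool(feature_maps):
    """Collapse each channel's H x W activation map to its mean.

    feature_maps: list of channels, each a list of rows of floats.
    Returns one value per channel, i.e. a vector of length C.
    """
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Toy input (hypothetical): 3 channels of 2 x 2 activations.
toy_maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
feature_vector = global_average_pool(toy_maps)  # one number per channel
```

Because every channel contributes exactly one number, the length of the DL feature vector equals the channel count of the last convolutional stage, regardless of the ROI's spatial size after resizing.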

Feature selection and fusion

We first screened all features with the Mann-Whitney U test, retaining those with P<0.05. For features with high repeatability, Spearman’s correlation coefficient was then calculated, and for any pair of features with a correlation coefficient greater than 0.9, only one was retained. To preserve as much descriptive power as possible, we employed a greedy recursive deletion strategy for feature filtering, removing the feature with the greatest redundancy from the current set at each step. The least absolute shrinkage and selection operator (LASSO) regression model was used for signature construction. LASSO shrinks all regression coefficients towards zero and sets the coefficients of many irrelevant features exactly to zero, depending on a regularization weight λ. To determine the optimal λ, we used 10-fold cross-validation with the minimum criterion, selecting the λ value that gave the lowest cross-validation error. The retained features with nonzero coefficients were then used for regression model fitting and were combined to create a radiomics signature. Subsequently, we obtained a radiomics score for each patient as a linear combination of these retained features, weighted by their model coefficients. LASSO regression modeling was performed with the Python scikit-learn package.
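The redundancy filter in this paragraph can be sketched in plain Python. This is an illustrative reading, not the authors' implementation: "greatest redundancy" is assumed here to mean the feature with the most partners whose absolute Spearman correlation exceeds 0.9, and the function names (`average_ranks`, `spearman`, `drop_redundant`) and toy values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_redundant(features, cutoff=0.9):
    """Greedy filter: at each step, drop the feature that has the most
    partners with |rho| > cutoff; stop when no such pair remains."""
    kept = dict(features)
    while kept:
        names = list(kept)
        counts = {name: 0 for name in names}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if abs(spearman(kept[a], kept[b])) > cutoff:
                    counts[a] += 1
                    counts[b] += 1
        worst = max(names, key=lambda name: counts[name])
        if counts[worst] == 0:
            return names
        del kept[worst]
    return []


# Toy features (hypothetical): "a" and "b" are monotonically related.
toy_features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 1, 4, 2, 3],
}
selected = drop_redundant(toy_features)
```

In the toy data one of the perfectly rank-correlated pair is removed and the unrelated feature `c` survives. The surviving set would then go on to LASSO.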

Owing to the high dimensionality of the DL features (2,048 dimensions), we employed principal component analysis (PCA) to reduce the dimensions to 32. This step balanced the feature set, enhanced model generalization, and reduced the risk of overfitting. Next, the radiomics features extracted from the ROI were combined with the selected DL features. We standardized these features using the Z-score method, calculating the mean and standard deviation for each feature. Each feature column was then rescaled to zero mean and unit variance by subtracting the mean and dividing by the standard deviation, resulting in the construction of the DLR features. Feature selection followed the same process used for the radiomics features to identify the optimal subset of fusion features.

Screening of clinical features began with baseline statistical analysis, employing univariate and multivariate logistic regression (LR) to identify significantly different variables and extract features with P<0.05. The selection process mirrored that used in radiomics. We plotted receiver operating characteristic (ROC) curves to select the optimal clinical features. In the final model, we included the top three features with the highest area under the curve (AUC) values. To clarify, the feature selection, PCA, and feature normalization steps were performed exclusively on the training set. After these processes were applied to the training data, the models were “frozen”, meaning that the learned parameters, such as the selected features, principal components, and normalization parameters, were then fixed and not re-trained or modified. These frozen models were subsequently applied to the test set, ensuring that no information from the test set influenced the feature engineering process or model training.
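The "frozen" preprocessing described above can be illustrated with Z-score standardization (subtract the mean, divide by the standard deviation). The sketch below is a minimal example under stated assumptions, not the study's code: the helper names `fit_scaler` and `apply_scaler` and the numbers are hypothetical, and the population standard deviation is assumed.

```python
import statistics


def fit_scaler(train_column):
    """Learn the mean and (population) standard deviation from training data only."""
    return statistics.fmean(train_column), statistics.pstdev(train_column)


def apply_scaler(column, params):
    """Standardize a column with previously learned, fixed parameters."""
    mean, std = params
    return [(value - mean) / std for value in column]


# Toy values (hypothetical) for one feature in each cohort.
train_values = [9.0, 10.0, 11.0, 12.0, 13.0]
test_values = [8.0, 11.0]

params = fit_scaler(train_values)                 # parameters are frozen here
train_z = apply_scaler(train_values, params)
test_z = apply_scaler(test_values, params)        # test data never updates params
```

The same pattern (fit on the training cohort, then apply unchanged to the external cohort) would extend to the PCA projection and to the list of selected features.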

Clinical-based DLR nomogram (DLRN) construction and validation

After feature screening and fusion, we obtained four sets of features: clinical, radiomics, DL, and DLR. These features were input into various machine learning models, including LR, naive Bayes, support vector machine (SVM), k-nearest neighbor (KNN), gradient boosting machine (GradientBoosting), adaptive boosting (AdaBoost), multi-layer perceptron (MLP), and light gradient boosting machine (LightGBM), to construct the risk model. We compared the performances of these models and used five-fold cross-validation to limit overfitting, which resulted in the final DLR model.
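As a rough illustration of how a five-fold split can preserve the class ratio of the training cohort, the sketch below assigns samples to folds round-robin within each class. It is an assumption-laden toy, not the authors' procedure: the text does not say whether the folds were stratified or shuffled, and `stratified_folds` is a hypothetical helper.

```python
def stratified_folds(labels, k=5):
    """Assign each sample a fold index 0..k-1, cycling within each class
    so every fold keeps roughly the overall class ratio.
    No shuffling is done here; in practice samples are usually shuffled first."""
    seen_per_class = {}
    folds = []
    for label in labels:
        count = seen_per_class.get(label, 0)
        folds.append(count % k)
        seen_per_class[label] = count + 1
    return folds


# Class sizes taken from the training cohort: 372 adenocarcinomas (1), 210 PGLs (0).
labels = [1] * 372 + [0] * 210
folds = stratified_folds(labels, k=5)

# In each round, one fold is held out for validation and the rest are used for fitting.
validation_indices = [i for i, f in enumerate(folds) if f == 0]
training_indices = [i for i, f in enumerate(folds) if f != 0]
```

With these class sizes every fold holds 42 PGLs and 74 or 75 adenocarcinomas, so each validation fold resembles the full cohort.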

The optimal Rad-score was selected and fused with clinical features based on the model’s performance to construct a nomogram. We assessed the model’s efficacy by plotting the ROC curves and calculating the AUC, precision, sensitivity, and specificity. The DeLong test was applied to compare the performance of different models. Decision curve analysis (DCA) was utilized to assess the clinical benefit of the nomogram.
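The discrimination metrics named above can be computed from predicted probabilities as in the following minimal sketch. It uses the rank-based (Mann-Whitney) definition of the AUC and a fixed probability cut-off for the confusion-matrix metrics; the function names and toy scores are hypothetical, and the DeLong comparison is not implemented here.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def confusion_metrics(labels, scores, threshold):
    """Sensitivity, specificity, PPV (precision), NPV, and accuracy at a cut-off."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(labels),
    }


# Toy example (hypothetical): 1 = adenocarcinoma, 0 = PGL.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = confusion_metrics(toy_labels, toy_scores, threshold=0.5)
```

The AUC is threshold-free, whereas sensitivity, specificity, PPV, and NPV all depend on the chosen cut-off, which is why a threshold is reported alongside them.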

Statistical analysis

Statistical analyses were performed using SPSS (version 26.0; SPSS Inc., Armonk, NY, USA), R (version 4.2.0), and GraphPad Prism (version 8.2; GraphPad Software Inc., San Diego, CA, USA). Non-normally distributed data were analyzed with the Mann-Whitney U and Kruskal-Wallis tests, while normally distributed data were assessed using the t-test. The Chi-squared test (χ2) or Fisher’s exact test was applied to evaluate categorical variables. Univariate and multivariate LR analyses were employed to identify significant clinical and SDCT parameters. ROC curves were generated to assess predictive accuracy.

A nomogram combining radiomics and clinical features was developed, and calibration curves and the Hosmer-Lemeshow goodness-of-fit test were applied to assess calibration. Statistical significance was defined as P<0.05.


Results

Clinical baseline characteristics

A total of 792 lung GGN cases were analyzed, with 582 in the internal training set and 210 in the external test set. Based on postoperative pathological results, they were divided into 296 glandular precursor lesions [210+86] and 496 adenocarcinomas [372+124]. Table 1 shows the clinical characteristics and SDCT parameters of the study cohort. Among the quantitative parameters, Daverage, Dsolid, CT_value, CT40 keV, CT100 keV, Zeff_a, ED_a, TAP, and TK1 showed significant differences between the two groups. ED_a, TAP, and TK1 showed a positive correlation with pathological invasiveness (all P<0.05), whereas Zeff_a was negatively correlated with invasiveness (P<0.001). Except for the vacuole sign (P=0.058), all qualitative parameters showed statistical differences between the two groups. Following univariate and multivariate LR analyses (Table 2), three features with high AUCs (Appendix 2) were selected (Zeff_a, ED_a, and TAP).

Table 1

Clinical characteristics and SDCT multi-parameters of the study population

Features Train Test
All (n=582) PGL (n=210) Adenocarcinomas (n=372) P All (n=210) PGL (n=86) Adenocarcinomas (n=124) P
TAP 115.46±25.35 105.92±25.84 120.80±23.46 <0.001 113.88±23.06 107.25±21.96 118.49±22.76 <0.001
TK1 1.02±1.26 0.46±0.54 1.34±1.42 <0.001 0.91±1.14 0.51±0.76 1.19±1.27 <0.001
Age (years) 59.29±10.12 56.96±10.18 60.59±9.87 <0.001 59.21±10.21 55.13±10.39 62.04±9.09 <0.001
Daverage 14.97±6.50 11.63±4.99 16.84±6.51 <0.001 15.33±6.98 10.51±3.65 18.67±6.78 <0.001
Dsolid 5.07±6.51 1.84±0.82 6.88±6.09 <0.001 4.41±5.63 1.64±2.82 6.33±6.26 <0.001
CT value −398.63±199.82 −518.72±155.12 −331.34±190.44 <0.001 −390.91±202.00 −527.03±137.94 −296.51±185.05 <0.001
CT40 keV_a −313.64±214.76 −431.60±177.17 −247.54±205.63 <0.001 −287.36±218.13 −429.95±156.57 −188.47±199.59 <0.001
CT100 keV_a −428.37±210.14 −553.63±158.89 −358.19±202.73 <0.001 −409.67±207.65 −551.27±135.74 −311.47±191.81 <0.001
Slope_a 1.91±1.05 2.02±1.21 1.85±0.95 0.24 2.04±0.81 2.02±0.79 2.05±0.83 0.85
ID_a 1.72±0.66 1.68±0.68 1.74±0.66 0.32 1.78±0.65 1.69±0.56 1.84±0.70 0.35
ID_aorta_a 11.71±20.88 11.04±6.66 12.08±25.61 0.12 10.65±1.87 10.76±1.70 10.57±1.98 0.47
NID_a 0.17±0.07 0.16±0.07 0.17±0.07 0.067 0.17±0.06 0.16±0.05 0.17±0.06 0.12
Zeff_a 10.46±3.6 9.41±0.77 11.05±4.5 <0.001 8.78±0.76 9.28±0.72 8.42±0.57 <0.001
ED_a 50.97±19.52 39.91±15.13 57.17±18.97 <0.001 52.40±18.90 41.19±14.02 60.18±17.95 <0.001
CT40 keV_v −344.92±203.74 −451.78±173.91 −285.04±194.70 <0.001 −330.83±191.13 −443.80±151.54 −252.48±176.34 <0.001
CT100 keV_v −452.67±192.40 −559.87±158.66 −392.61±183.48 <0.001 −446.76±184.07 −555.37±138.32 −371.43±174.31 <0.001
Slope_v 1.81±0.83 1.80±0.94 1.82±0.76 0.37 1.95±0.75 1.86±0.74 2.02±0.75 0.16
ID_v 1.54±0.57 1.47±0.57 1.58±0.57 0.03 1.69±0.70 1.53±0.54 1.81±0.78 0.01
ID_aorta_v 4.75±1.16 4.71±1.13 4.77±1.18 0.41 4.76±1.14 4.79±1.19 4.73±1.12 0.53
NID_v 0.32±0.17 0.30±0.17 0.33±0.16 0.01 0.29±0.16 0.28±0.18 0.29±0.16 0.49
Zeff_v 8.88±0.68 8.91±0.74 8.86±0.65 0.61 8.75±0.59 8.82±0.62 8.70±0.57 0.24
ED_v 49.81±18.80 47.40±19.43 51.16±18.32 0.004 55.58±20.09 54.42±21.05 56.38±19.45 0.48
AEF 1.19±0.42 1.17±0.45 1.20±0.40 0.10 1.20±0.83 1.23±0.96 1.18±0.73 0.98
Sex 0.23 0.19
   Male 201 (34.54) 65 (31.10) 136 (36.46) 63 (30.00) 21 (24.42) 42 (33.87)
   Female 381 (65.46) 144 (68.90) 237 (63.54) 147 (70.00) 65 (75.58) 82 (66.13)
Location 0.86 0.82
   LLL 89 (15.29) 30 (14.35) 59 (15.82) 29 (13.81) 14 (16.28) 15 (12.10)
   LUL 142 (24.40) 53 (25.36) 89 (23.86) 56 (26.67) 23 (26.74) 33 (26.61)
   RLL 87 (14.95) 29 (13.88) 58 (15.55) 41 (19.52) 17 (19.77) 24 (19.35)
   RML 36 (6.19) 11 (5.26) 25 (6.70) 8 (3.81) 2 (2.33) 6 (4.84)
   RUL 228 (39.18) 86 (41.15) 142 (38.07) 76 (36.19) 30 (34.88) 46 (37.10)
GGN character <0.001 <0.001
   pGGN 253 (43.47) 147 (70.33) 106 (28.42) 93 (44.29) 60 (69.77) 33 (26.61)
   mGGN 329 (56.53) 62 (29.67) 267 (71.58) 117 (55.71) 26 (30.23) 91 (73.39)
Margin <0.001 <0.001
   Negative 217 (37.29) 120 (57.42) 97 (26.01) 88 (41.90) 63 (73.26) 25 (20.16)
   Positive 365 (62.71) 89 (42.58) 276 (73.99) 122 (58.10) 23 (26.74) 99 (79.84)
Internal vascular morphology <0.001 <0.001
   Negative 296 (50.86) 145 (69.38) 151 (40.48) 140 (66.67) 77 (89.53) 63 (50.81)
   Positive 286 (49.14) 64 (30.62) 222 (59.52) 70 (33.33) 9 (10.47) 61 (49.19)
Internal bronchial morphology <0.001 <0.001
   Negative 382 (65.64) 175 (83.73) 207 (55.50) 162 (77.14) 83 (96.51) 79 (63.71)
   Positive 200 (34.36) 34 (16.27) 166 (44.50) 48 (22.86) 3 (3.49) 45 (36.29)
Pleural indentation <0.001 <0.001
   Negative 325 (55.84) 155 (74.16) 170 (45.58) 144 (68.57) 77 (89.53) 67 (54.03)
   Positive 257 (44.16) 54 (25.84) 203 (54.42) 66 (31.43) 9 (10.47) 57 (45.97)
Vacuole sign 0.058 0.25
   Negative 520 (89.35) 194 (92.82) 326 (87.40) 188 (89.52) 80 (93.02) 108 (87.10)
   Positive 62 (10.65) 15 (7.18) 47 (12.60) 22 (10.48) 6 (6.98) 16 (12.90)

Data are presented as mean ± standard deviation or number (percentage). SDCT, spectral dual-layer detector computed tomography; PGL, precursor glandular lesion; TAP, tumor abnormal protein; TK1, thymidine kinase 1; CT, computed tomography; a, arterial phase; slope, spectral curve slope; ID, iodine density; NID, normalized iodine density; Zeff, effective atomic number; ED, electron density; v, venous phase; AEF, arterial enhancement fraction; LLL, left lower lobe; LUL, left upper lobe; RLL, right lower lobe; RML, right middle lobe; RUL, right upper lobe; GGN, ground-glass nodule; pGGN, pure ground-glass nodule; mGGN, mixed ground-glass nodule.

Table 2

Univariate and multivariate analysis of clinical features and quantitative and qualitative parameters derived from SDCT

Feature name Univariate Multivariate
OR 95% CI P OR 95% CI P
Sex 1.646 1.383–1.958 <0.001
Age 1.011 1.008–1.013 <0.001
TAP 1.006 1.005–1.007 <0.001 1.012 1.004–1.02 0.01
TK1 2.539 2.138–3.016 <0.001 1.347 0.465–2.549 0.050
Location 1.181 1.122–1.244 <0.001
GGN character 4.306 3.414–5.43 <0.001
Daverage 1.057 1.046–1.067 <0.001
Dsolid 1.189 1.153–1.226 <0.001
Margin 3.101 2.537–3.789 <0.001
Internal vascular morphology 3.469 2.746–4.38 <0.001 0.49 0.279–0.862 0.04
Internal bronchial morphology 4.882 3.582–6.653 <0.001 2.024 1.203–3.404 0.03
Pleural indentation 3.759 2.921–4.836 <0.001
Vacuole sign 3.133 1.923–5.104 <0.001
CT value 1 0.999–1 0.16
CT40 keV_a 1 1–1 0.82
CT100 keV_a 1 0.999–1 0.12
Slope_a 1.223 1.145–1.307 <0.001
ID_a 1.368 1.264–1.48 <0.001
NID_a 1.045 1.031–1.06 <0.001
Zeff_a 1.066 1.049–1.083 <0.001 0.661 0.516–0.846 0.006
ED_a 1.017 1.014–1.019 <0.001 1.029 1.007–1.051 0.03
CT40 keV_v 1 0.999–1 0.22
CT100 keV_v 1 0.999–1 0.01
Slope_v 1.318 1.224–1.418 <0.001
ID_v 1.456 1.332–1.592 <0.001
NID_v 5.438 3.604–8.207 <0.001
Zeff_v 1.057 1.041–1.074 <0.001
ED_v 1.012 1.009–1.014 <0.001
AEF 1.587 1.413–1.782 <0.001 1.715 1.105–2.662 0.043

SDCT, spectral dual-layer detector computed tomography; OR, odds ratio; CI, confidence interval; TAP, tumor abnormal protein; TK1, thymidine kinase 1; GGN, ground-glass nodule; CT, computed tomography; a, arterial phase; slope, spectral curve slope; ID, iodine density; NID, normalized iodine density; Zeff, effective atomic number; ED, electron density; v, venous phase; AEF, arterial enhancement fraction.

Features selection

A total of 107 radiomics features were initially extracted from SDCT-Zeff. After LASSO screening with 10-fold cross-validation, a penalty coefficient (λ=0.0047) was determined, yielding 33 radiomics features. Using the ResNet50 model architecture, which was pre-trained on the ImageNet database and then fed with the maximum-level ROIs, we extracted DL features with 2,048 dimensions. PCA reduced the dimensions to 32, and following LASSO with 10-fold cross-validation, a penalty coefficient (λ=0.0146) was determined, resulting in 11 DL features.

DLR was obtained by fusing these features after LASSO and a 10-fold cross-validation test, with a penalty coefficient λ of 0.0110, yielding 33 features (Figures 2,3 and Appendix 3).

Figure 2 Feature selection and cross-validation performance in LASSO regression. (A) Coefficient distribution and (B) MSE of 10-fold cross-validation in the LASSO feature selection, also showing the optimal penalty coefficient λ of 0.011 for DLR. MSE, mean squared error; LASSO, least absolute shrinkage and selection operator; DLR, deep learning radiomics.
Figure 3 Visualization of DL models. (A) Local entropy image; (B) ROI; (C) cluster results showing that the model had a good fusion effect. Color intensity indicates the activation value at each spatial location. DL, deep learning; ROI, region of interest.

Predictive performance of the models

Multiple features were incorporated into various machine learning models. The three best-performing radiomics models were GradientBoosting (AUC =0.908), LightGBM (AUC =0.897), and SVM (AUC =0.893). For DL, the top-performing models were LightGBM (AUC =0.929), GradientBoosting (AUC =0.858), and SVM (AUC =0.858). The three best DLR models were LightGBM (AUC =0.974), SVM (AUC =0.917), and MLP (AUC =0.903). The three best-performing clinical models were LightGBM (AUC =0.904), KNN (AUC =0.848), and GradientBoosting (AUC =0.832) (Appendices 4-7).

Because LightGBM ranked among the three best-performing algorithms for every feature set, we adopted it as the common framework for model construction, which avoids bias introduced by comparing models built with different algorithms. Among the single-signature models, DLR exhibited the best performance (Table 3 and Figure 4), with AUCs of 0.974 [95% confidence interval (CI): 0.963–0.983] in the training set and 0.827 (95% CI: 0.770–0.884) in the test set. The nomogram established by combining DLR with the clinical features (Zeff_a, ED_a, and TAP) achieved AUCs of 0.983 (95% CI: 0.974–0.990) in the training set and 0.833 (95% CI: 0.779–0.885) in the test set. The DeLong test indicated that the nomogram outperformed the other models in terms of diagnostic performance (all P<0.05) (Figure 5). The calibration curve demonstrated strong concordance between the predicted and observed values in both the training and test cohorts (Figure 6), and the Hosmer-Lemeshow test (P>0.05) indicated a good fit of the nomogram. Each model was evaluated using DCA and compared with the scenarios in which no predictive model is used (i.e., treating all patients or treating none). The nomogram provided a net clinical benefit, and it did so over a larger threshold probability range than the clinical, radiomics, DL, and DLR models (Figure 7).
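The Hosmer-Lemeshow statistic mentioned above can be sketched as follows. This is an illustrative implementation with made-up inputs, not the software used in the study: cases are sorted by predicted probability and split into G groups, and the statistic sums (O − E)² / [E(1 − E/n)] over groups, where O and E are the observed and expected numbers of positives and n is the group size. Under the null hypothesis of good calibration, the statistic is conventionally compared with a chi-square distribution with G − 2 degrees of freedom; the P value is omitted here to keep the sketch dependency-free.

```python
def hosmer_lemeshow_statistic(probs, labels, groups=10):
    """Sum over probability-sorted groups of (O - E)^2 / (E * (1 - E / n_g))."""
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)  # observed positives in the group
        expected = sum(p for p, _ in chunk)  # expected positives in the group
        # Assumes 0 < expected < n_g; otherwise the term is undefined.
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return stat


# Hypothetical example: two risk groups with predicted risks of 0.2 and 0.8
probs = [0.2] * 5 + [0.8] * 5
well_calibrated = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
print(hosmer_lemeshow_statistic(probs, well_calibrated, groups=2))
```

A small statistic (large P value) means that observed and predicted event counts agree within each risk group.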

Table 3

Diagnostic performance of various models based on LightGBM

Cohort  Signature  Accuracy  AUC (95% CI)         Sensitivity  Specificity  PPV    NPV    Threshold
Train   Clinic     0.823     0.904 (0.879–0.927)  0.858        0.761        0.865  0.750  0.581
Train   Radiomics  0.833     0.897 (0.869–0.923)  0.826        0.847        0.906  0.731  0.657
Train   DL         0.840     0.929 (0.910–0.948)  0.788        0.933        0.955  0.712  0.684
Train   DLR        0.909     0.974 (0.963–0.983)  0.877        0.967        0.979  0.815  0.659
Train   Nomogram   0.923     0.983 (0.974–0.990)  0.890        0.981        0.988  0.833  0.659
Test    Clinic     0.805     0.822 (0.761–0.882)  0.855        0.733        0.822  0.778  0.613
Test    Radiomics  0.729     0.765 (0.699–0.829)  0.790        0.640        0.760  0.679  0.631
Test    DL         0.748     0.758 (0.691–0.825)  0.790        0.686        0.784  0.694  0.667
Test    DLR        0.762     0.827 (0.770–0.884)  0.758        0.767        0.825  0.687  0.700
Test    Nomogram   0.724     0.833 (0.779–0.885)  0.713        0.942        0.934  0.604  0.756

LightGBM, light gradient boosting machine; AUC, area under the curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; clinic, clinical signature; DL, deep learning; DLR, deep learning radiomics.
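The columns of Table 3 follow from a confusion matrix obtained by dichotomizing predicted probabilities at the listed threshold. The sketch below shows these standard definitions; the probabilities and labels are invented for illustration, and only the threshold value 0.659 is borrowed from the table.

```python
def confusion_counts(probs, labels, threshold):
    """Count TP, FP, TN, FN when probabilities >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical predictions (not study data)
probs = [0.9, 0.7, 0.66, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
counts = confusion_counts(probs, labels, threshold=0.659)
print(counts, diagnostic_metrics(*counts))
```

Raising the threshold generally trades sensitivity for specificity, which is consistent with the nomogram's test-set row (higher threshold, higher specificity, lower sensitivity).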

Figure 4 ROC comparison of various models based on LightGBM. (A) Training and (B) test cohorts; the AUC of the nomogram was higher than that of any other model. AUC, area under the curve; clinic, clinical signature; Rad, radiomics; DL, deep learning; DLR, deep learning radiomics; ROC, receiver operating characteristic; LightGBM, light gradient boosting machine.
Figure 5 DeLong tests for the training cohorts revealed that the nomogram outperformed models constructed with other features (all P<0.05). Clinic, clinical signature; Rad, radiomics; DL, deep learning; DLR, deep learning radiomics.
Figure 6 The calibration curves of different models in the (A) training and (B) test cohorts demonstrated a close agreement between model predictions and actual observations, with a P>0.05 obtained from the Hosmer-Lemeshow test. The horizontal axis represents the predicted probability, and the vertical axis represents the actual probability. The diagonal dotted line in the graph signifies perfect alignment between predicted and actual probabilities under ideal conditions. Clinic, clinical signature; Rad, radiomics; DL, deep learning; DLR, deep learning radiomics.
Figure 7 DCA in the (A) training and (B) test cohorts. The X-axis represents the threshold probability, and the Y-axis represents the net benefit. The black line represents the assumption that all patients are positive, and the dashed line represents the assumption that all patients are negative. DCA, decision curve analysis; clinic, clinical signature; Rad, radiomics; DL, deep learning; DLR, deep learning radiomics.
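The net benefit plotted in a decision curve can be sketched as follows, using the standard definition: at threshold probability pt, net benefit = TP/N − (FP/N) × pt/(1 − pt). The inputs are invented for illustration and the function names are hypothetical.

```python
def net_benefit(probs, labels, pt):
    """Net benefit of calling cases with probability >= pt positive (0 < pt < 1)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 0)
    return tp / n - (fp / n) * pt / (1 - pt)


def net_benefit_treat_all(labels, pt):
    """Reference curve: every case is treated as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


# Hypothetical predictions (not study data); treat-none has net benefit 0 by definition
probs = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
for pt in (0.2, 0.5, 0.8):
    print(pt, round(net_benefit(probs, labels, pt), 3),
          round(net_benefit_treat_all(labels, pt), 3))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all and the treat-none (zero) reference values.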

Nomogram construction

The three selected clinical features (Zeff_a, ED_a, and TAP) and DLR were combined to construct a clinical-based DLRN, which presents the model intuitively and facilitates individualized risk prediction and clinical adoption (Figure 8).

Figure 8 Clinical-based DLRN. To use the nomogram, a vertical line is drawn from the value of each feature to the points axis to obtain the points for that feature. The points of all features are added together to obtain the patient’s total points, and a vertical line is drawn downward from the total points axis to obtain the predicted probability. TAP, tumor abnormal protein; Zeff, effective atomic number; ED, electron density; DLR, deep learning radiomics; DLRN, deep learning radiomics nomogram.
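The way a nomogram converts feature values into points and a predicted probability can be sketched for a logistic model. Everything numeric below is an assumption made for illustration: the coefficients, intercept, and value ranges are invented and are not those of the published DLRN, and only three of its four predictors are shown. The predictor with the largest effect over its range is scaled to 100 points.

```python
import math

# Hypothetical logistic-regression parameters (NOT the fitted model from this study)
INTERCEPT = 5.0
COEFS = {"DLR": 4.0, "TAP": 0.01, "Zeff_a": -1.0}
RANGES = {"DLR": (0.0, 1.0), "TAP": (0.0, 200.0), "Zeff_a": (7.0, 9.0)}


def points_per_unit(coefs, ranges):
    """Scale so that the predictor with the widest effect spans 0 to 100 points."""
    spans = [abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs]
    return 100.0 / max(spans)


def zero_point_value(name, coefs, ranges):
    """Feature value that earns 0 points (lowest-risk end of its range)."""
    low, high = ranges[name]
    return low if coefs[name] >= 0 else high


def feature_points(name, value, coefs, ranges):
    scale = points_per_unit(coefs, ranges)
    return coefs[name] * (value - zero_point_value(name, coefs, ranges)) * scale


def probability_from_points(total_points, intercept, coefs, ranges):
    """Map total points back to the linear predictor, then to a probability."""
    scale = points_per_unit(coefs, ranges)
    base = intercept + sum(c * zero_point_value(k, coefs, ranges) for k, c in coefs.items())
    linear_predictor = base + total_points / scale
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"DLR": 0.5, "TAP": 100.0, "Zeff_a": 8.0}  # hypothetical patient
total = sum(feature_points(k, v, COEFS, RANGES) for k, v in patient.items())
print(total, probability_from_points(total, INTERCEPT, COEFS, RANGES))
```

Summing points and reading off the probability gives the same result as evaluating the logistic model directly, which is why a nomogram can be used without a computer.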

Discussion

This study used SDCT-Zeff images to delineate the ROI and extract traditional radiomics features, and used the ResNet50 model architecture, pre-trained on the ImageNet database, to extract DL features, which were then reduced in dimensionality. After these features were fused, LASSO with 10-fold cross-validation was used to screen them, and multiple machine-learning algorithms were used for modeling. The optimal model was selected, and external testing was conducted to verify the performance of the resulting clinical-based DLRN. This nomogram showed excellent predictive performance and good calibration for stage I LA invasiveness. Its predictive performance exceeded that of the single-signature models, suggesting that combining the methods improved the discrimination of PGLs from adenocarcinomas and offered enhanced clinical benefits.

In a previous study of 160 cases of LA, Fan et al. (15) found that radiomics could distinguish IAC from noninvasive lesions, outperforming CT morphology or CT value. Zhu et al. (10) selected 16 radiological features to build a model that performed significantly better (AUC =0.828) than the clinical semantic model (AUC =0.746). However, these studies relied on conventional CT, and the diagnostic performance of their prediction models was somewhat lower than that in our study. By combining clinical, radiomic, and DL features, our nomogram provides a comprehensive, multi-dimensional approach that can assist clinicians in identifying high-risk patients who may require more intensive monitoring or treatment. The integrated nomogram also enhances the model’s clinical applicability by providing both a risk score and a visual representation of the patient’s clinical profile, which is important for making complex machine learning models more accessible and actionable in real-world clinical settings.

Radiomics extracts quantitative features that serve as new imaging biomarkers reflecting tumor cell proliferation and activity. Multiple studies have shown that traditional CT radiomics effectively distinguishes benign from malignant pulmonary nodules (15,16). Continuously evolving DL technology can automatically learn low-dimensional features from the original high-dimensional images in a data-driven manner, achieving end-to-end learning. Radiomics requires manual ROI delineation, which introduces inevitable instability into the results and affects the performance and generalization ability of the model to a certain extent. Conversely, DL reduces the cost of manual annotation and improves the efficiency and objectivity of data analysis. DL has been shown to perform well in the diagnosis of LA (17,18), and a CNN model based on residual learning improved the classification of IAC and non-IAC nodules (19).

SDCT helps identify different histological types of lung cancer and uses a variety of quantitative parameters to evaluate the degree of infiltration of early-stage LA (20,21). Wu et al. (22) found that the radiomics model of spectral CT could predict the expression of vascular endothelial growth factor (VEGF) and epidermal growth factor receptor (EGFR) in peripheral lung cancer with AUC values of 0.867 and 0.950, respectively. Li et al. (23) developed a DL model for predicting malignant lymph nodes in thyroid cancer, which demonstrated superior diagnostic accuracy and sensitivity compared to spectral parameters. Nevertheless, few studies have combined multiple quantitative parameters of SDCT with radiomics-DL for comprehensive analysis. In this context, this study further explored the value of SDCT-Zeff-based DLR and clinical features, such as TK1 and TAP, in identifying the invasiveness of early-stage LA. We found that DLR had better predictive performance than the single models, and that the DLRN constructed by adding Zeff_a, ED_a, and TAP performed better still.

A previous study showed that arterial-phase Zeff plays a crucial role in differentiating PGL from adenocarcinoma (20). Zeff assigns material composition data to each pixel, generating a color image that visualizes the boundary around the tumor. Therefore, Zeff images were chosen for radiomics feature extraction, and the optimal model was obtained by comparing the performance of different machine learning classifiers. Both radiomics and DL showed good diagnostic performance and fit. Studies have found that some radiomics features have clinical value in identifying invasive LA. For instance, the model proposed by Lin et al. (16) demonstrated high accuracy in distinguishing between benign and malignant lung nodules in the Lung Nodule Analysis 2016 (LUNA16) dataset, achieving an accuracy of 92.8%. Additionally, it performed well in classifying lung nodules into various pathological subtypes, with an F1-score of 75.5%.

This study also found that radiomics and DL models based on SDCT-Zeff could help predict aggressive early-stage LA, with accuracies of 83.3% and 84.0%, respectively, in the training set and 72.9% and 74.8% in the test set (Table 3). Zheng et al. (24) extracted radiomics features from virtual monoenergetic images (VMIs) (50 and 150 keV). Their results confirmed that dual-energy CT (DECT)-based radiomics was better than the clinical model in distinguishing between IAC and MIA, showing a higher AUC (0.957 vs. 0.929 in the training set) and greater net benefit. In this study, the diagnostic performance of radiomics was slightly lower than that of the clinical model (P=0.65). In contrast, the predictive performance of the clinical-based DLRN was higher than that of any single method. The differences between these studies may be due to differences in target inclusion and spectral parameter selection.

Integrating radiomics with clinical data, including serum markers and demographics, has been reported to enhance overall diagnostic accuracy by detecting both macroscopic and microscopic changes in GGNs more effectively and providing a comprehensive view of tumor heterogeneity (25,26). By contrast, Balagurunathan et al. (27) found that adding clinical features did not improve model performance, highlighting the importance of radiomics. Several studies have attempted to compare clinical features with the added value of radiomics, and the contribution of each can be judged by how much it improves the performance of machine learning algorithms. This study developed a new nomogram based on the SDCT-Zeff DLR and clinical parameters (Zeff_a, ED_a, and TAP) and compared different machine learning classifiers. An optimal model was obtained, external testing was conducted, and LightGBM was finally chosen to build the nomogram.

LightGBM is a machine learning algorithm built on the gradient boosting decision tree framework, which has demonstrated notable advantages in lung cancer classification tasks. First, LightGBM is computationally efficient and can handle large-scale datasets and high-dimensional sparse data, accelerating model training and prediction. Second, LightGBM adopts a histogram-based decision tree splitting algorithm that can handle nonlinear relationships and feature interactions, improving the expressive ability and accuracy of the model. Meng et al. (28) proposed a LightGBM ensemble learning method that constructs a prognostic model based on immune-related genes (IRGs) and clinical data. The accuracy of this model in predicting the survival rates of the three LA groups was 96%, 98%, and 96%, respectively. Hamed et al. (29) found that by combining CNN models with LightGBM, histopathological images of lung tissue can be quickly identified and classified. On the other hand, we note that the GradientBoosting model demonstrated an AUC of 0.833 in the test set, and the slight difference in AUC values suggests that both models contribute effectively to the differentiation task. We suggest that the choice of model could depend on factors such as clinical context, interpretability, and computational resources.
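The idea behind histogram-based splitting can be illustrated with a simplified sketch. It is not LightGBM's implementation: LightGBM accumulates gradient statistics per bin, whereas this toy version bins one feature into equal-width bins and accumulates the raw targets, which corresponds to a squared-error split criterion. The function name and data are hypothetical.

```python
def best_histogram_split(values, targets, n_bins=4):
    """Find the bin boundary that maximizes squared-error gain for one feature."""
    low, high = min(values), max(values)
    if high == low:
        return None, 0.0  # a constant feature cannot be split
    width = (high - low) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    # One pass over the data builds the histogram
    for v, t in zip(values, targets):
        b = min(int((v - low) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += t
    total_n, total_s = sum(counts), sum(sums)
    best_threshold, best_gain = None, 0.0
    left_n, left_s = 0, 0.0
    # Candidate splits are scanned over bin boundaries, not over raw values
    for b in range(n_bins - 1):
        left_n += counts[b]
        left_s += sums[b]
        right_n, right_s = total_n - left_n, total_s - left_s
        if left_n == 0 or right_n == 0:
            continue
        gain = left_s ** 2 / left_n + right_s ** 2 / right_n - total_s ** 2 / total_n
        if gain > best_gain:
            best_threshold, best_gain = low + (b + 1) * width, gain
    return best_threshold, best_gain


# Hypothetical feature values and binary targets
values = [1, 2, 3, 4, 10, 11, 12, 13]
targets = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_histogram_split(values, targets))
```

Because candidate thresholds are limited to bin boundaries, the scan costs a number of steps proportional to the number of bins rather than the number of samples, which is the source of the speed advantage described above.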

This study used the ResNet50 model architecture, pre-trained on the ImageNet database, to extract DL features. ImageNet is a natural image database with good versatility. For instance, Zhou et al. (30) used a ResNet50 pre-trained on ImageNet to construct a DLRN for predicting the histological risk of thymic epithelial tumors (TETs), with AUC =0.965 in the training set and AUC =0.786 in the test set. Li et al. (31) implemented and integrated the ResNet50 model, which performed well in distinguishing benign from malignant small pulmonary nodules (<20 mm) across various locations, achieving an accuracy of 0.943 in the test set, with sensitivity and specificity of 0.964 and 0.911, respectively.

This study found that higher TAP was related to invasiveness and may be a potential biomarker for predicting malignant GGNs. After TAP was added to the model, the overall prediction performance improved. TAP arises from abnormal glycosylation in cancer cells, which release complex abnormal glycoproteins during metabolism. TAP is related to the early stages of cancer development and reflects the number and extent of cancer cells (32). Tong et al. (13) demonstrated that combining TAP with tumor markers could further enhance the sensitivity of lung cancer diagnosis, making it well-suited for early detection screening. The diagnostic value of TAP is better than that of TK1 or classic lung cancer markers (such as carcinoembryonic antigen, cytokeratin 19 fragment, and neuron-specific enolase), supporting TAP as a reliable marker for predicting early-stage LA.

This study has certain limitations. Firstly, the patients included were candidates for surgery, which introduces selection bias, and the number of cases was limited; therefore, further large-sample, multicenter investigations are needed. Secondly, the study population was relatively homogeneous, comprising patients with early-stage LA; the results may not be generalizable to other populations, such as patients with more advanced disease or different histological subtypes. Studies that include multiple nomogram types may improve the utility of nomograms. Moreover, the imbalance between the two classes (approximately one-to-three ratio) may significantly affect the model’s performance and generalizability. We focused on metrics such as AUC and sensitivity rather than relying solely on overall accuracy to assess the model’s performance in a more balanced manner. While this modality offers valuable insights, we acknowledge that its use may be limited by the availability of SDCT scanners and that its applicability to other imaging modalities, such as conventional CT, remains to be established. Possible future research directions include expanding the size and diversity of the dataset, optimizing the feature selection methods, and enhancing the interpretability of the model.
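The concern about class imbalance can be made concrete with a small worked example. Assuming a hypothetical cohort with the one-to-three class ratio mentioned above, a trivial classifier that labels every case as positive reaches 75% accuracy while having no discriminative value, which is why accuracy alone is an inadequate summary. The data are invented.

```python
def accuracy(preds, labels):
    return sum(1 for p, y in zip(preds, labels) if p == y) / len(labels)


def sensitivity(preds, labels):
    positives = [p for p, y in zip(preds, labels) if y == 1]
    return sum(positives) / len(positives)


def specificity(preds, labels):
    negatives = [p for p, y in zip(preds, labels) if y == 0]
    return sum(1 for p in negatives if p == 0) / len(negatives)


def balanced_accuracy(preds, labels):
    """Mean of sensitivity and specificity; 0.5 for an uninformative classifier."""
    return (sensitivity(preds, labels) + specificity(preds, labels)) / 2


# Hypothetical cohort: 25 negatives and 75 positives (1:3 ratio)
labels = [0] * 25 + [1] * 75
always_positive = [1] * 100
print(accuracy(always_positive, labels))           # 0.75
print(sensitivity(always_positive, labels))        # 1.0
print(specificity(always_positive, labels))        # 0.0
print(balanced_accuracy(always_positive, labels))  # 0.5
```

Reporting sensitivity and specificity together, or a threshold-independent measure such as AUC, exposes this failure mode where accuracy does not.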


Conclusions

The coupled nomogram integrating SDCT-Zeff DLR with clinical features demonstrated improved predictive performance and was particularly effective in differentiating GGN-featured precursor glandular lesions from adenocarcinomas. It provides a foundation for managing GGNs and offers valuable insights for preoperative evaluation. Multiple machine-learning algorithms could be used for modeling, each performing well and offering unique advantages.


Acknowledgments

The authors thank all the participants and survey staff for their participation.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-24-726/rc

Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-24-726/dss

Peer Review File: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-24-726/prf

Funding: This study was supported by the National Natural Science Foundation of China (No. 82071920).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-24-726/coif). All authors report that this study was supported by the National Natural Science Foundation of China (No. 82071920). X.L. is employed by Philips Healthcare China, Inc., CT Clinical Science. The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Medical Ethics Committee of Shengjing Hospital of China Medical University (No. 2022PS1055K), and informed consent was obtained from all the patients.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Zheng RS, Chen R, Han BF, et al. Cancer incidence and mortality in China, 2022. Zhonghua Zhong Liu Za Zhi 2024;46:221-31.
  2. Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024. CA Cancer J Clin 2024;74:12-49. Erratum in: CA Cancer J Clin 2024;74:203.
  3. Nicholson AG, Tsao MS, Beasley MB, et al. The 2021 WHO Classification of Lung Tumors: Impact of Advances Since 2015. J Thorac Oncol 2022;17:362-87.
  4. Oncology Society of Chinese Medical Association. Chinese Medical Association guideline for clinical diagnosis and treatment of lung cancer (2023 edition). Zhonghua Yi Xue Za Zhi 2023;103:2037-74.
  5. Doerner J, Hauger M, Hickethier T, et al. Image quality evaluation of dual-layer spectral detector CT of the chest and comparison with conventional CT imaging. Eur J Radiol 2017;93:52-8.
  6. Wang Y, Chen H, Chen Y, et al. A semiautomated radiomics model based on multimodal dual-layer spectral CT for preoperative discrimination of the invasiveness of pulmonary ground-glass nodules. J Thorac Dis 2023;15:2505-16.
  7. Bousse A, Kandarpa VSS, Rit S, et al. Systematic Review on Learning-based Spectral CT. IEEE Trans Radiat Plasma Med Sci 2024;8:113-37.
  8. Huang S, Yang J, Shen N, et al. Artificial intelligence in lung cancer diagnosis and prognosis: Current application and future perspective. Semin Cancer Biol 2023;89:30-7.
  9. Prosper AE, Kammer MN, Maldonado F, et al. Expanding Role of Advanced Image Analysis in CT-detected Indeterminate Pulmonary Nodules and Early Lung Cancer Characterization. Radiology 2023;309:e222904.
  10. Zhu M, Yang Z, Wang M, et al. A computerized tomography-based radiomic model for assessing the invasiveness of lung adenocarcinoma manifesting as ground-glass opacity nodules. Respir Res 2022;23:96.
  11. Xu H, Zhu N, Yue Y, et al. Spectral CT-based radiomics signature for distinguishing malignant pulmonary nodules from benign. BMC Cancer 2023;23:91.
  12. Jiang ZF, Wang M, Xu JL. Thymidine kinase 1 combined with CEA, CYFRA21-1 and NSE improved its diagnostic value for lung cancer. Life Sci 2018;194:1-6.
  13. Tong H, Dan B, Dai H, et al. Clinical application of serum tumor abnormal protein combined with tumor markers in lung cancer patients. Future Oncol 2022;18:1357-69.
  14. Gu W, Chen Y, Zhu H, et al. Development and validation of CT-based radiomics deep learning signatures to predict lymph node metastasis in non-functional pancreatic neuroendocrine tumors: a multicohort study. EClinicalMedicine 2023;65:102269.
  15. Fan L, Fang M, Li Z, et al. Radiomics signature: a biomarker for the preoperative discrimination of lung invasive adenocarcinoma manifesting as a ground-glass nodule. Eur Radiol 2019;29:889-97.
  16. Lin CY, Guo SM, Lien JJ, et al. Combined model integrating deep learning, radiomics, and clinical data to classify lung nodules at chest CT. Radiol Med 2024;129:56-69.
  17. Perez-Johnston R, Araujo-Filho JA, Connolly JG, et al. CT-based Radiogenomic Analysis of Clinical Stage I Lung Adenocarcinoma with Histopathologic Features and Oncologic Outcomes. Radiology 2022;303:664-72.
  18. Hu X, Gong J, Zhou W, et al. Computer-aided diagnosis of ground glass pulmonary nodule by fusing deep learning and radiomics features. Phys Med Biol 2021;66:065015.
  19. Gong J, Liu J, Hao W, et al. A deep residual learning network for predicting lung adenocarcinoma manifesting as ground-glass nodule on CT images. Eur Radiol 2020;30:1847-55.
  20. Wang T, Yue Y, Fan Z, et al. Spectral Dual-Layer Computed Tomography Can Predict the Invasiveness of Ground-Glass Nodules: A Diagnostic Model Combined with Thymidine Kinase-1. J Clin Med 2023;12:1107.
  21. Mu R, Meng Z, Guo Z, et al. Dual-layer spectral detector computed tomography parameters can improve diagnostic efficiency of lung adenocarcinoma grading. Quant Imaging Med Surg 2022;12:4601-11.
  22. Wu L, Li J, Ruan X, et al. Prediction of VEGF and EGFR Expression in Peripheral Lung Cancer Based on the Radiomics Model of Spectral CT Enhanced Images. Int J Gen Med 2022;15:6725-38.
  23. Li S, Wei X, Wang L, et al. Dual-source dual-energy CT and deep learning for equivocal lymph nodes on CT images for thyroid cancer. Eur Radiol 2024;34:7567-79.
  24. Zheng Y, Han X, Jia X, et al. Dual-energy CT-based radiomics for predicting invasiveness of lung adenocarcinoma appearing as ground-glass nodules. Front Oncol 2023;13:1208758.
  25. Liang G, Yu W, Liu SQ, et al. The value of radiomics based on dual-energy CT for differentiating benign from malignant solitary pulmonary nodules. BMC Med Imaging 2022;22:95.
  26. Liu A, Wang Z, Yang Y, et al. Preoperative diagnosis of malignant pulmonary nodules in lung cancer screening with a radiomics nomogram. Cancer Commun (Lond) 2020;40:16-24.
  27. Balagurunathan Y, Schabath MB, Wang H, et al. Quantitative Imaging features Improve Discrimination of Malignancy in Pulmonary nodules. Sci Rep 2019;9:8528.
  28. Meng X, Tian Y, Zhang X. Screening of immune related gene and survival prediction of lung adenocarcinoma patients based on LightGBM model. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2024;41:70-9.
  29. Hamed EA, Salem MA, Badr NL, et al. An Efficient Combination of Convolutional Neural Network and LightGBM Algorithm for Lung Cancer Histopathology Classification. Diagnostics (Basel) 2023;13:2469.
  30. Zhou H, Bai HX, Jiao Z, et al. Deep learning-based radiomic nomogram to predict risk categorization of thymic epithelial tumors: A multicenter study. Eur J Radiol 2023;168:111136.
  31. Li W, Yu S, Yang R, et al. Machine Learning Model of ResNet50-Ensemble Voting for Malignant-Benign Small Pulmonary Nodule Classification on Computed Tomography Images. Cancers (Basel) 2023;15:5417.
  32. Li LX, Zhang B, Gong RZ. Insights into the role of tumor abnormal protein in early diagnosis of cancer: A prospective cohort study. Medicine (Baltimore) 2020;99:e19382.
Cite this article as: Wang T, Fan Z, Yue Y, Lu X, Deng X, Hou Y. Spectral dual-layer detector CT-based radiomics-deep learning for predicting pathological aggressiveness of stage I lung adenocarcinoma: discrimination of precursor glandular lesions and invasive adenocarcinomas. Transl Lung Cancer Res 2025;14(2):431-448. doi: 10.21037/tlcr-24-726
