Integrating deep learning and radiomics in the differentiation of major histological subtypes of invasive non-mucinous lung adenocarcinoma using positron emission tomography and computed tomography

Dong Wang; Huan Liu; Zhuo Cao; Weiqian Huang; Wen Fu; Li Shao; Ji Zhang; Wanyu Su; Xianwen Yu; Ce Han; Yao Ai; Congying Xie; Xiance Jin

doi:10.21037/tlcr-2025-333

Original Article

Dong Wang¹#, Huan Liu¹#, Zhuo Cao²#, Weiqian Huang¹, Wen Fu¹, Li Shao¹, Ji Zhang¹, Wanyu Su¹,³, Xianwen Yu¹,³, Ce Han¹, Yao Ai¹, Congying Xie¹,⁴, Xiance Jin¹,⁵

¹Department of Radiation and Medical Oncology, the First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China; ²Department of Respiratory, Lishui People’s Hospital, Lishui, China; ³Cixi Biomedical Research Institute, Wenzhou Medical University, Wenzhou, China; ⁴Department of Radiation Oncology, the Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China; ⁵School of Basic Medical Science, Wenzhou Medical University, Wenzhou, China

Contributions: (I) Conception and design: D Wang, H Liu, Y Ai; (II) Administrative support: C Xie, X Jin; (III) Provision of study materials or patients: W Huang, W Fu, Z Cao, W Su, C Han; (IV) Collection and assembly of data: L Shao, D Wang; (V) Data analysis and interpretation: D Wang, J Zhang, H Liu, X Yu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Congying Xie, PhD. Department of Radiation and Medical Oncology, the First Affiliated Hospital of Wenzhou Medical University, Nanbaixiang Street, Wenzhou 325000, China; Department of Radiation Oncology, the Second Affiliated Hospital of Wenzhou Medical University, 1111 Wenzhou Avenue, Wenzhou 325000, China. Email: wzxiecongying@163.com; Xiance Jin, PhD. Department of Radiation and Medical Oncology, the First Affiliated Hospital of Wenzhou Medical University, Nanbaixiang Street, Wenzhou 325000, China; School of Basic Medical Science, Wenzhou Medical University, Chashan Street, Wenzhou 325000, China. Email: jinxc1979@hotmail.com.

Background: Accurate identification of the main subtypes of invasive non-mucinous adenocarcinoma (INMA) is essential for individualized treatment. Given the limitations of preoperative biopsy pathological diagnosis, this study aimed to evaluate the feasibility of using positron emission tomography and computed tomography (PET/CT) radiomics features, deep learning (DL) features, and their combined models for non-invasive identification of the four main INMA subtypes prior to surgery.

Methods: A total of 386 patients from institution I and 32 patients from institution II with preoperative PET/CT images were retrospectively enrolled. The former formed the training and internal validation cohorts, and the latter formed the external validation cohort. Radiomics features were extracted from CT, PET and PET/CT images to build radiomics models using various machine learning (ML) classifiers. DL models were trained using a ResNet34 network, and DL features were extracted from the best-performing DL model. Different fusion techniques were then employed to integrate the radiomic and DL features to build a final fusion model for subtype differentiation.

Results: The radiomics model using a support vector machine (SVM) classifier achieved an area under the curve (AUC) of 0.79, accuracy of 0.70, and precision of 0.70 for differentiating the four INMA subtypes. The DL model demonstrated the best performance in the internal validation cohort with an AUC and accuracy of 0.89 and 0.74, respectively. The combined model integrating radiomic and DL features from fused PET/CT images achieved the highest performance in the external validation cohort with an AUC, accuracy and precision of 0.85, 0.81, and 0.85, respectively.

Conclusions: Integrating DL and radiomic features derived from PET/CT images is a feasible and accurate approach for differentiating the four main subtypes of INMA preoperatively.

Keywords: Invasive non-mucinous lung adenocarcinoma; pathological subtypes; deep learning (DL); radiomics

Submitted Mar 24, 2025. Accepted for publication Jul 03, 2025. Published online Sep 22, 2025.

Highlight box

Key findings

• We developed a model integrating deep learning (DL) and radiomic features derived from positron emission tomography and computed tomography (PET/CT) images to differentiate the four main subtypes of invasive non-mucinous adenocarcinoma (INMA) and further assist in improving the accuracy of preoperative pathological examination.

What is known and what is new?

• Existing studies have focused on predicting certain subtype components of invasive non-mucinous lung adenocarcinoma, whereas multi-classification studies aimed at predicting the main subtypes remain limited.

• We integrated radiomic and DL features derived from PET/CT to build a multi-classification model with higher prediction accuracy for differentiating the main subtypes of INMA.

What is the implication, and what should change now?

• Due to limited cases, the micropapillary predominant adenocarcinoma was excluded from modeling. To enhance the model's predictive power, future work should build a more robust model using numerous cases from multiple centers. INMA typically shows a complex structure with co-existing subtypes, yet this study only classified the main subtypes. It is worth trying to identify and predict the comprehensive components preoperatively.

Introduction

Lung cancer remains one of the most common malignancies and a leading cause of cancer-related death worldwide (1). Its incidence is still rising, and it ranks highest in both incidence and mortality among all malignant tumors in China (2). Invasive non-mucinous adenocarcinoma (INMA), a major subtype of lung adenocarcinoma, is classified into five subtypes: acinar predominant adenocarcinoma (APA), solid predominant adenocarcinoma (SPA), papillary predominant adenocarcinoma (PPA), lepidic predominant adenocarcinoma (LPA), and micropapillary predominant adenocarcinoma (MPA) (3). Accurate differentiation of these subtypes is clinically critical, as they exhibit distinct biological behaviors and prognostic outcomes: LPA demonstrates an indolent growth pattern with a favorable prognosis; APA presents intermediate invasiveness and is common in early-stage disease; SPA and MPA are highly aggressive variants associated with poor prognosis, lymphovascular invasion and early metastasis (4-7). Consequently, precise preoperative subtyping of INMA is vital for diagnosis, treatment planning (including surgical approach, resection extent, and adjuvant therapy), and prognosis prediction (8-10).

Clinically, preoperative core needle biopsy and intraoperative frozen section examination are widely used for histological subtyping (11,12). However, the accuracy of preoperative puncture biopsy and intraoperative frozen section examination is limited due to factors such as sample bias, inadequate sample size, and physician biopsy experience, all of which can be attributed to tumor heterogeneity (13). Reported concordance rates between biopsy and final surgical pathology in determining the predominant histological subtypes range only from 64% to 73.1% (14,15).

¹⁸F-fluorodeoxyglucose (¹⁸F-FDG) positron emission tomography and computed tomography (PET/CT) is essential for lung cancer diagnosis and staging (16). Nevertheless, differentiating INMA subtypes using conventional PET/CT metabolic markers alone shows limited accuracy and low interobserver agreement (17,18). With the emergence of radiomics, studies have demonstrated that combining radiomic features with PET parameters helps to improve risk stratification and treatment planning (19-22). However, most studies focused on the differentiation between two or three subtypes (23,24). With the development of computer science, deep learning (DL) can learn various high-level semantic features directly from raw image data by using convolutional neural networks (CNNs), which can improve histopathological diagnostics and reduce the workload of pathologists (25-27). Preliminary studies indicate that combining DL with radiomics is promising for binary classification tasks (28,29).

The integration of DL and radiomics is an active area of exploration (30,31). Radiomics provides rich information for the diagnosis and prediction of diseases by extracting and analyzing high-throughput features in medical images (32), while DL excels at discovering complex, latent features (33). Feature fusion methods mainly include early fusion and late fusion. Early fusion combines information before model training, for example by fusing CT and PET images prior to feature extraction, by combining features from different imaging modalities, or by combining radiomics features with DL features. Late fusion first trains a separate model on each data source to obtain prediction results, and then combines the outputs of the multiple models through a decision rule or ensemble method, such as stacked generalization or fusion tree integration (34,35). In lung tumor classification, evidence suggests CT-based DL-radiomics fusion improves accuracy (36).

The purpose of this study is to investigate the feasibility and accuracy of using PET/CT-derived radiomic features, DL features, as well as their combination for the preoperative differentiation of the four most prevalent INMA histologic subtypes in clinical practice: APA, SPA, PPA and LPA. We present this article in accordance with the TRIPOD reporting checklist (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-333/rc).

Methods

Study design

The study comprised three main steps. First, radiomic features were extracted from CT, PET and PET/CT images to build radiomics (R) models for differentiating the four INMA subtypes. Various machine learning (ML) classifiers were evaluated to obtain the optimal radiomics model. Subsequently, DL models were trained on CT, PET and PET/CT images using a ResNet34 network with transfer learning for subtype differentiation. DL features were then extracted from the best-performing DL model. Finally, different fusion techniques were utilized to integrate the optimal radiomic features with the extracted DL features to build the final fusion model for subtype differentiation. The workflow of the study design is shown in Figure 1.

Figure 1 The experimental design of this study, illustrating the overall workflow. D, deep learning; E, early fusion; L, late fusion; PET/CT, positron emission tomography and computed tomography; R, radiomics.

Patients and images

INMA patients treated in institution I (the First Affiliated Hospital of Wenzhou Medical University) from January 2017 to May 2023 were retrospectively reviewed. Patients with preoperative PET/CT images within two months and with confirmed postoperative histopathological assessments were enrolled. Those with incomplete clinical records, multiple pulmonary tumor foci, or poor image quality due to artifacts or negligible PET intensification were excluded. Additional patients treated in institution II (the Second Affiliated Hospital of Wenzhou Medical University) from 2023 to 2024 who met identical inclusion and exclusion criteria were enrolled as an external validation cohort. The flowchart of patient enrollment is shown in Figure 2. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee in Clinical Research (ECCR) of the First Affiliated Hospital of Wenzhou Medical University (No. 2019059) and individual consent for this retrospective analysis was waived. The Second Affiliated Hospital of Wenzhou Medical University was also informed of and agreed to the study.

Figure 2 The flow chart of patient enrollment. Institution I: The First Affiliated Hospital of Wenzhou Medical University; Institution II: The Second Affiliated Hospital of Wenzhou Medical University. APA, acinar predominant adenocarcinoma; LPA, lepidic predominant adenocarcinoma; PET, positron emission tomography; PPA, papillary predominant adenocarcinoma; SPA, solid predominant adenocarcinoma.

PET/CT images were acquired using a GEMINI TF 64-slice PET/CT machine (Philips, United States) and a UMI 780 scanner (United Imaging Healthcare, Shanghai, China) with ¹⁸F-FDG injected intravenously at a dose of 3.7 MBq/kg. PET images had a slice thickness of 4 mm and a matrix of 144×144. CT images were acquired at a voltage of 120 kV, a tube current of 300 mA, a slice thickness of 2.5 mm, and a matrix of 512×512. 3D Slicer software (version 5.0.3) was applied to fuse PET and CT images at a ratio of 1:1 to create a unified image displaying information from both modalities.
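The 1:1 fusion step can be illustrated with a minimal sketch. It assumes that the PET and CT slices have already been resampled onto the same pixel grid and that fusion amounts to a weighted average of min-max normalized intensities. The exact blending performed by 3D Slicer is not specified in the text, and the function names `min_max_normalize` and `fuse_slices` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one 2D slice as rows of pixel intensities


def min_max_normalize(image: Image) -> Image:
    """Rescale intensities to [0, 1]; a constant image maps to all zeros."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / span for v in row] for row in image]


def fuse_slices(pet: Image, ct: Image, pet_weight: float = 0.5) -> Image:
    """Blend two co-registered slices; pet_weight=0.5 is a 1:1 ratio (assumed)."""
    if len(pet) != len(ct) or any(len(p) != len(c) for p, c in zip(pet, ct)):
        raise ValueError("PET and CT slices must share the same grid")
    pet_n, ct_n = min_max_normalize(pet), min_max_normalize(ct)
    return [
        [pet_weight * p + (1.0 - pet_weight) * c for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(pet_n, ct_n)
    ]


# Toy 2x2 example (illustrative values, not patient data).
pet_slice = [[0.0, 2.0], [4.0, 8.0]]        # e.g. uptake values
ct_slice = [[-1000.0, 0.0], [0.0, 1000.0]]  # e.g. Hounsfield units
fused = fuse_slices(pet_slice, ct_slice)
```

Normalizing each modality before blending keeps the very different PET and CT intensity scales from letting one modality dominate the fused image, which is one plausible reading of a 1:1 ratio.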

Radiomics features extraction and modeling

The target volumes were manually delineated as regions of interest (ROI) by a radiologist with 5 years of diagnostic experience using 3D Slicer software, and independently reviewed by a senior radiologist with over 15 years of experience. In cases of disagreement, consensus was reached through joint consultation, comparing PET metabolic activity and CT density characteristics to minimize subjective bias (37). Representative examples are shown in Figure S1. Radiomics features were extracted from PET, CT and fused PET/CT images using the Pyradiomics package (pyradiomics, https://www.python.org) (38). Extracted features included shape features, first-order histogram statistics, gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and neighborhood gray-tone difference matrix (NGTDM) (39).

Patients from institution I were randomly divided into training (70%) and internal validation (30%) cohorts. Radiomics features were standardized to the 0–1 range using min–max scaling. Features with near-zero variance (variance <0.01) were removed. Analysis of variance (ANOVA) was then applied to select features showing significant differences (P<0.05) across the four subtypes. Subsequently, optimal features were further screened using least absolute shrinkage and selection operator (LASSO) regression in conjunction with the GridSearchCV function, in which each candidate hyperparameter value was evaluated via 10-fold cross-validation, to reduce dimensionality and overfitting risk. Five ML methods, namely support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost), were trained on the screened radiomics features. The model with the best performance was selected for further combination modeling.
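The first three screening steps (min-max scaling, near-zero-variance removal, and one-way ANOVA) can be sketched in plain Python. This is a minimal illustration rather than the authors' pipeline: it assumes the scaler is fitted on training data only and that population variance is used for the 0.01 threshold, neither of which is stated in the text. The helper names `fit_min_max` and `anova_f` are hypothetical.

```python
from statistics import mean, pvariance
from typing import Callable, Dict, List, Sequence


def fit_min_max(column: Sequence[float]) -> Callable[[float], float]:
    """Return a scaler mapping the training range of one feature to [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo

    def scale(value: float) -> float:
        return 0.0 if span == 0 else (value - lo) / span

    return scale


def anova_f(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic for one feature across subtype groups.

    Assumes at least two groups and non-zero within-group variation.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy training data: one feature value per patient with a subtype label.
values = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
labels = ["APA"] * 3 + ["SPA"] * 3 + ["LPA"] * 3

scale = fit_min_max(values)
scaled = [scale(v) for v in values]

# Near-zero-variance filter with the 0.01 threshold quoted in the text.
keep_feature = pvariance(scaled) >= 0.01

# Group the scaled values by subtype and compute the F statistic.
by_subtype: Dict[str, List[float]] = {}
for v, lab in zip(scaled, labels):
    by_subtype.setdefault(lab, []).append(v)
f_stat = anova_f(list(by_subtype.values()))
```

Converting the F statistic into a P value requires the F distribution, which is what `scipy.stats.f_oneway` returns alongside the statistic. The F statistic itself is unchanged by min-max scaling, so the order of scaling and ANOVA does not affect which features pass the P<0.05 filter.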

DL modeling

During preprocessing, axial slices encompassing the tumor center were selected as the training dataset. Data augmentation included panning, flipping, and tilting operations. Images were resized to 256×256 pixels, center-cropped to 224×224 pixels and normalized. A ResNet-34 architecture, pre-trained on ImageNet, was used for transfer learning (40). The final fully connected layer was replaced for the 4-class classification task. The model was trained using the cross-entropy loss function and optimized with the Adam algorithm (41). A learning rate of 0.00001 and a batch size of 32 were applied for 200 epochs during DL training. After identifying the best-performing DL model, the last fully connected layer was removed. The activations from the preceding layer (global average pooling layer output) were extracted as DL features. These DL features underwent the same standardization, ANOVA filtering, and LASSO selection process described for the radiomics features to obtain the optimal DL feature set.
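Two steps of this pipeline are simple enough to sketch without a DL framework: the center crop from 256×256 to 224×224, and global average pooling, which in a standard ResNet-34 reduces the final 512-channel feature map to a 512-dimensional vector. The sketch below uses toy sizes and hypothetical helper names (`center_crop`, `global_average_pool`); it illustrates the operations only and is not the training code used in the study.

```python
from typing import List

Matrix = List[List[float]]


def center_crop(image: Matrix, size: int) -> Matrix:
    """Keep the central size x size window of a 2D image."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def global_average_pool(feature_maps: List[Matrix]) -> List[float]:
    """Average each channel's spatial map into one number per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


# A 256x256 image cropped to 224x224 loses a 16-pixel border on each side.
image = [[float(r * 256 + c) for c in range(256)] for r in range(256)]
cropped = center_crop(image, 224)

# Two toy 2x2 channels stand in for the 512 channels of ResNet-34.
deep_features = global_average_pool([
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
])
```

Because pooling yields one value per channel, the length of the extracted DL feature vector depends only on the channel count of the last convolutional stage, not on the input image size.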

Integrating models

Radiomic features from the best R model and DL features from the best DL model were combined using SVM to construct an integrated model for subtype differentiation with two different fusion methods. In early fusion, radiomic and DL features were concatenated into a single vector, which was then used to train the predictive model (SVM_E). In late fusion, the results of the R and DL models were integrated through stacked generalization (SVM_L) (42): the predicted scores of the R and DL models for the four categories were used as input features for the meta-model, which was then trained against the true labels to obtain the SVM_L model. For both fusion SVMs (SVM_E and SVM_L), hyperparameter tuning was performed using 10-fold cross-validation on the training set to select parameters yielding the highest average accuracy. SVM regularization parameters were optimized to mitigate overfitting. The “one-vs-one” strategy was employed for multi-class classification. Model generalization was assessed using the independent external validation cohort. The receiver operating characteristic (ROC) curve and area under the curve (AUC) values were employed to assess the performance of each model. The Kruskal-Wallis test was used to assess whether there were significant differences among the models, while the Mann-Whitney U test was used for pairwise comparisons between models.
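The difference between the two fusion inputs, and the one-vs-one voting rule, can be sketched as follows. This is a minimal illustration with made-up numbers and hypothetical function names. It shows only how the SVM inputs are assembled and how pairwise decisions are combined, not the SVM training itself, and the tie-breaking rule is an assumption.

```python
from itertools import combinations
from typing import List, Sequence

SUBTYPES = ["APA", "SPA", "PPA", "LPA"]


def early_fusion(radiomic: Sequence[float], deep: Sequence[float]) -> List[float]:
    """Early fusion: concatenate the two selected feature vectors."""
    return list(radiomic) + list(deep)


def late_fusion_inputs(r_scores: Sequence[float],
                       dl_scores: Sequence[float]) -> List[float]:
    """Late fusion (stacking): the per-class scores of both base models
    become the input features of the meta-model."""
    if len(r_scores) != len(SUBTYPES) or len(dl_scores) != len(SUBTYPES):
        raise ValueError("expected one score per subtype from each model")
    return list(r_scores) + list(dl_scores)


def one_vs_one_vote(pairwise_winners: Sequence[str]) -> str:
    """Majority vote over pairwise decisions; ties go to the subtype
    listed first in SUBTYPES (an assumed tie-breaking rule)."""
    return max(SUBTYPES, key=lambda s: pairwise_winners.count(s))


# Early fusion: 2 radiomic + 3 DL features give a 5-dimensional SVM_E input.
x_early = early_fusion([0.12, 0.80], [0.33, 0.05, 0.91])

# Late fusion: 4 + 4 class scores give an 8-dimensional SVM_L input.
x_late = late_fusion_inputs([0.1, 0.6, 0.2, 0.1], [0.2, 0.5, 0.2, 0.1])

# Four subtypes give six pairwise classifiers under one-vs-one.
pairs = list(combinations(SUBTYPES, 2))
winners = ["SPA", "APA", "APA", "SPA", "SPA", "PPA"]  # one winner per pair
predicted = one_vs_one_vote(winners)
```

In stacking, the base-model scores fed to the meta-model are commonly produced out-of-fold to limit information leakage; whether that was done here is not stated in the text.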

Statistical analysis

Statistical analysis was conducted using SPSS (version 26). One-way ANOVA was used to assess the association between radiomic features and subtype classification during feature selection (P<0.05). The Kruskal-Wallis test and Mann-Whitney U test were conducted using the scipy.stats package, and test results with P<0.05 were considered statistically significant. PET and CT image fusion and tumor region delineation were conducted using 3D Slicer (version 5.0.3). The modeling process was performed using Jupyter Notebook software (version 6.4.5) with Python version 3.9.7. The scikit-learn package was used for constructing models. DL models were constructed using PyTorch (version 1.12.1; https://pytorch.org). Model performance was assessed by accuracy, precision, recall, F1-score, and AUC.
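For a four-class problem, precision, recall and F1-score have to be averaged across classes. The sketch below computes accuracy and macro-averaged versions of the other three metrics from predicted labels. Macro averaging is an assumption, since the averaging scheme is not stated in the text, and the labels are toy values rather than study data.

```python
from typing import Dict, List, Sequence


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1-score."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions: List[float] = []
    recalls: List[float] = []
    f1_scores: List[float] = []
    for c in classes:
        # Per-class counts from a one-vs-rest view of the predictions.
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(classes)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy labels: two patients per subtype, two misclassifications.
y_true = ["APA", "APA", "SPA", "SPA", "PPA", "PPA", "LPA", "LPA"]
y_pred = ["APA", "SPA", "SPA", "SPA", "PPA", "LPA", "LPA", "LPA"]
metrics = macro_metrics(y_true, y_pred)
```

Macro averaging weights each subtype equally regardless of its sample count, so with unevenly sized classes the macro precision can differ noticeably from overall accuracy.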

Results

Patients

A total of 386 patients from institution I were enrolled for training and internal validation with a mean age of 66.54 years (range, 34–83 years). Due to limited MPA cases, only the four predominant subtypes APA, SPA, PPA, and LPA were collected and analyzed, comprising 80 cases of APA, 104 cases of SPA, 95 cases of PPA, and 107 cases of LPA. An additional 32 cases from institution II were included as the external validation dataset with a mean age of 65.77 years (range, 42–82 years), comprising 7 cases of APA, 7 cases of SPA, 9 cases of PPA, and 9 cases of LPA. Detailed characteristics of these patients are presented in Table 1.

Table 1

Clinicopathological characteristics of enrolled patients

Variables	Internal cohort				External validation cohort
Variables	APA	SPA	PPA	LPA	APA	SPA	PPA	LPA
Gender
Male	37 (46.3)	66 (63.5)	50 (52.6)	53 (49.5)	6 (85.7)	4 (57.1)	5 (55.6)	2 (22.2)
Female	43 (53.7)	38 (36.5)	45 (47.3)	54 (50.5)	1 (14.3)	3 (42.9)	4 (44.4)	7 (77.8)
Age (years)
Range	37–81	36–83	35–83	34–82	63–72	42–79	48–80	52–82
Mean	64.73	63.08	62.59	63.57	65.86	66.00	64.67	66.65
Smoking history
Smoking	13	26	14	11	5	3	5	3
Nonsmoking	67	78	81	96	2	4	4	6
Tumor size (mm)
Mean	23.95	27.58	24.25	17.65	26.34	22.60	32.78	25.63
Maximum size	190.00	112.00	96.00	60.00	60.00	60.00	53.80	44.60
Minimum size	5.50	5.00	3.00	6.00	5.50	5.50	9.40	7.00
Tumor location
Right upper lobe	27	25	25	37	0	1	3	3
Right middle lobe	5	12	10	9	1	1	3	1
Right lower lobe	13	23	16	18	1	0	1	5
Left upper lobe	22	26	26	30	3	2	1	0
Left lower lobe	13	18	18	13	2	3	1	0
T stage
T1	51 (63.7)	50 (48.1)	66 (69.5)	79 (73.8)	6 (85.7)	6 (85.7)	9 (100.0)	7 (77.8)
T2	25 (31.3)	39 (37.5)	22 (23.2)	27 (25.2)	1 (14.3)	1 (14.3)	0	2 (22.2)
T3	4 (5.0)	11 (10.6)	5 (5.3)	1 (0.9)	0	0	0	0
T4	0	4 (3.8)	2 (2.1)	0	0	0	0	0
N stage
N0	59 (73.8)	60 (57.7)	77 (81.1)	92 (86.0)	5 (71.4)	6 (85.7)	7 (77.8)	7 (77.8)
N1	12 (15.0)	10 (9.6)	4 (4.2)	5 (4.7)	1 (14.3)	1 (14.3)	2 (22.2)	1 (11.1)
N2	6 (7.5)	20 (19.2)	11 (11.6)	5 (4.7)	1 (14.3)	0	0	1 (11.1)
N3	3 (3.8)	14 (13.5)	3 (3.2)	5 (4.7)	0	0	0	0
M stage
M0	74 (92.5)	93 (89.4)	91 (95.8)	105 (98.1)	6 (85.7)	6 (85.7)	8 (88.9)	9 (100.0)
M1	6 (7.5)	11 (10.6)	4 (4.2)	2 (1.9)	1 (14.3)	1 (14.3)	1 (11.1)	0
Mixed subtypes
Yes	23 (28.7)	47 (45.2)	56 (58.9)	47 (43.9)	1 (14.3)	4 (57.1)	4 (44.4)	9 (100.0)
No	57 (71.3)	57 (54.8)	39 (41.1)	60 (56.1)	6 (85.7)	3 (42.9)	5 (55.6)	0
Tumor grading information
1	0	0	0	106 (99.0)	1 (14.3)	0	0	9 (100.0)
2	77 (96.3)	0	84 (87.5)	0	5 (71.4)	1 (14.3)	8 (88.9)	0
3	3 (3.7)	104 (100.0)	11 (12.5)	1 (1.0)	1 (14.3)	6 (85.7)	1 (11.1)	0

Data are presented as n (%) unless otherwise specified. Tumor size was defined as the maximum diameter of the tumor. APA, acinar predominant adenocarcinoma; LPA, lepidic predominant adenocarcinoma; M, metastasis; N, lymph node; PPA, papillary predominant adenocarcinoma; SPA, solid predominant adenocarcinoma; T, tumor (topography).

Radiomics models

A total of 108 radiomics features were extracted from CT, PET, and PET/CT fusion images. Subsequently, 10, 10, and 6 optimal features were selected from the CT, PET, and PET/CT fusion images, respectively. The specific features and their corresponding coefficients are presented in Table S1 and Figure S2. The AUCs of SVM, RF, MLP, LightGBM and CatBoost in the internal validation cohort ranged from 0.70 to 0.73, 0.74 to 0.76, and 0.71 to 0.79 with radiomics features from CT, PET and PET/CT images, respectively. The performance of these ML models using different image modalities was further evaluated and compared using the Kruskal-Wallis test and the Mann-Whitney U test. The SVM model using PET/CT features showed significantly better performance than the other models (P=0.005), as shown in Table S2. With PET/CT radiomic features, the SVM model achieved an AUC, accuracy, and precision of 0.79, 0.70, and 0.70, respectively, in differentiating the four INMA subtypes. Performance metrics for all radiomics models are summarized in Figure 3 and Table 2.

Figure 3 The ROC curves of different machine learning models with radiomics features from CT, PET and PET/CT. AUC, area under the curve; CatBoost, Categorical Boosting; LightGBM, Light Gradient Boosting Machine; MLP, multilayer perceptron; PET/CT, positron emission tomography and computed tomography; R, radiomics; RF, random forest; ROC, receiver operating characteristic; SVM, support vector machine.

Table 2

The performance of the R models for the training and internal validation cohorts

Image type	Model name	Training cohort					Internal validation cohort
Image type	Model name	Precision	Recall	F1-score	ACC	AUC	Precision	Recall	F1-score	ACC	AUC
CT	SVM_R_CT	0.65	0.63	0.63	0.63	0.73	0.62	0.62	0.62	0.63	0.73
	RF_R_CT	0.74	0.71	0.71	0.71	0.85	0.59	0.54	0.54	0.55	0.70
	MLP_R_CT	0.60	0.58	0.58	0.58	0.76	0.58	0.58	0.59	0.59	0.72
	CatBoost_R_CT	0.78	0.76	0.77	0.76	0.91	0.50	0.49	0.48	0.51	0.71
	LightGBM_R_CT	0.80	0.79	0.79	0.79	0.93	0.57	0.56	0.56	0.55	0.70
PET	SVM_R_PET	0.72	0.67	0.69	0.68	0.78	0.70	0.65	0.65	0.66	0.76
	RF_R_PET	0.76	0.72	0.73	0.73	0.85	0.68	0.64	0.65	0.65	0.76
	MLP_R_PET	0.72	0.68	0.68	0.69	0.80	0.61	0.61	0.61	0.61	0.74
	CatBoost_R_PET	0.79	0.77	0.78	0.77	0.89	0.63	0.62	0.62	0.64	0.75
	LightGBM_R_PET	0.73	0.69	0.70	0.70	0.82	0.63	0.62	0.62	0.64	0.76
PET/CT	SVM_R_PET/CT	0.77	0.74	0.74	0.74	0.79	0.70	0.71	0.70	0.70	0.79
	RF_R_PET/CT	0.80	0.75	0.77	0.76	0.84	0.62	0.61	0.60	0.62	0.75
	MLP_R_PET/CT	0.76	0.69	0.71	0.70	0.77	0.69	0.66	0.66	0.66	0.74
	CatBoost_R_PET/CT	0.78	0.74	0.75	0.74	0.87	0.58	0.55	0.55	0.57	0.75
	LightGBM_R_PET/CT	0.74	0.69	0.70	0.70	0.82	0.57	0.53	0.52	0.55	0.71

ACC, accuracy; AUC, area under the curve; CatBoost, Categorical Boosting; LightGBM, Light Gradient Boosting Machine; MLP, multilayer perceptron; PET/CT, positron emission tomography and computed tomography; R, radiomics; RF, random forest; SVM, support vector machine.

DL models

The AUCs of the DL models in the internal validation cohort using CT (D_CT), PET (D_PET) and PET/CT (D_PET/CT) images for differentiating the four subtypes of INMA were 0.86, 0.84, and 0.89, respectively. The loss curve during training and the accuracy of the model on the training and validation sets are shown in Figure 4. Detailed performance of these models is shown in Figure 5A,5B and Table 3.

Figure 4 Loss value changes and ACC curves during training. ACC, accuracy; PET/CT, positron emission tomography and computed tomography.

Figure 5 ROC curves of deep learning models and machine learning models based on deep learning features. (A,B) The ROC curves of deep learning models using CT, PET, and PET/CT in the internal training and internal validation cohorts, respectively. (C,D) The ROC curves of different machine learning models with deep learning features extracted from PET/CT in the internal training and internal validation cohorts, respectively. AUC, area under the curve; CatBoost, Categorical Boosting; D, deep learning; LightGBM, Light Gradient Boosting Machine; MLP, multilayer perceptron; PET/CT, positron emission tomography and computed tomography; RF, random forest; ROC, receiver operating characteristic; SVM, support vector machine.

Table 3

The performance of the DL models for the training and internal validation cohorts

Image type	Model name	Training cohort					Internal validation cohort
Image type	Model name	Precision	Recall	F1-score	ACC	AUC	Precision	Recall	F1-score	ACC	AUC
CT	D_CT	0.76	0.75	0.75	0.76	0.92	0.67	0.66	0.66	0.67	0.86
PET	D_PET	0.74	0.73	0.73	0.74	0.91	0.66	0.66	0.66	0.66	0.84
PET/CT	D_PET/CT	0.79	0.79	0.79	0.79	0.92	0.74	0.74	0.74	0.74	0.89
	SVM_D_PET/CT	0.89	0.90	0.90	0.89	0.97	0.79	0.79	0.79	0.78	0.91
	RF_D_PET/CT	0.85	0.85	0.85	0.85	0.96	0.67	0.66	0.66	0.66	0.83
	MLP_D_PET/CT	0.92	0.91	0.91	0.91	0.98	0.76	0.73	0.72	0.72	0.89
	CatBoost_D_PET/CT	0.91	0.90	0.90	0.90	0.98	0.53	0.53	0.53	0.53	0.73
	LightGBM_D_PET/CT	0.99	0.99	0.99	0.99	0.99	0.67	0.68	0.67	0.67	0.86

ACC, accuracy; AUC, area under the curve; CatBoost, Categorical Boosting; DL, deep learning; LightGBM, Light Gradient Boosting Machine; MLP, multilayer perceptron; PET/CT, positron emission tomography and computed tomography; RF, random forest; SVM, support vector machine.

Features extracted from the top-performing D_PET/CT model were used to train classifiers. The detailed performance of the models trained under different classifiers is shown in Figure 5C,5D and Table 3.

Integrated models

The AUCs of the integrated models SVM_E and SVM_L were 0.95 and 0.96 in the internal validation cohort and 0.76 and 0.85 in the external validation cohort, respectively. Detailed performance of these two fused models is shown in Figure 6 and Table 4.

Figure 6 ROC curves of different integrated models in various cohorts. (A-C) The ROC curves of different models in the training, internal validation and external validation cohorts, respectively. AUC, area under the curve; E, early fusion; L, late fusion; ROC, receiver operating characteristic; SVM, support vector machine.

Table 4

The performance of the integrated models for the training, internal validation and external validation cohorts

Model name	Training cohort					Internal validation cohort					External validation cohort
Model name	Precision	Recall	F1-score	ACC	AUC	Precision	Recall	F1-score	ACC	AUC	Precision	Recall	F1-score	ACC	AUC
SVM_E	0.92	0.91	0.92	0.89	0.97	0.86	0.86	0.86	0.85	0.95	0.79	0.79	0.77	0.78	0.76
SVM_L	0.90	0.89	0.89	0.89	0.98	0.88	0.87	0.87	0.86	0.96	0.85	0.83	0.82	0.81	0.85

ACC, accuracy; AUC, area under the curve; E, early fusion; L, late fusion; SVM, support vector machine.

Discussion

In this study, radiomic and DL features from PET/CT images were extracted and combined to differentiate the four main histologic subtypes of INMA (APA, SPA, PPA, LPA). Using stacked generalization for late fusion of the radiomic and DL features of PET/CT images, an AUC, accuracy and precision of 0.85, 0.81 and 0.85, respectively, were achieved in the external validation cohort.

Tumor heterogeneity critically limits preoperative pathological accuracy. PET/CT multimodal imaging addresses this by quantifying structural and metabolic heterogeneity (Table S3). Statistical analysis of radiomic features and tumor grading reveals that the first-order statistical features of CT images (10Percentile, 90Percentile, RootMeanSquared, P<0.001) reflect tumor density heterogeneity through gray-level distribution, while texture features (such as GLCM SumAverage, P<0.002) quantify structural complexity. In PET images, shape features (maximum 2D diameter column, sphericity, P<0.02) are associated with tumor invasive growth and metabolic heterogeneity (e.g., GLCM difference mean, P<0.001). PET/CT fused images exhibit significant advantages: their shape feature elongation (P=0.041), texture features (e.g., GLCM difference entropy, P<0.001), and first-order maximum value (P<0.001) can integrate multidimensional heterogeneity information. Cross-modal robust features complemented modality-specific features, enhancing subtype classification accuracy.

Due to its non-invasive nature and low cost, CT-based radiomics has been widely applied to characterize lung lesions and discriminate invasive from indolent adenocarcinomas (43-46). To the best of our knowledge, this is the first study to use radiomics models to differentiate the four main subtypes of lung adenocarcinoma. CT-based radiomics achieved AUCs of 0.73, 0.70, 0.72, 0.70, and 0.71 using SVM, RF, MLP, LightGBM, and CatBoost, respectively, in differentiating APA, SPA, PPA, and LPA. Although the CT-based model in this study had a lower AUC (0.73 for the SVM model) than some studies targeting specific subtypes, the result remains of reference value, given that this study distinguishes more subtypes and therefore faces a more difficult classification task.

PET is an important metabolic imaging modality that provides information about the metabolic activity and function of tissues and organs, and it has been widely used for histologic type classification in non-small cell lung cancer (NSCLC) (47). Early fusion of CT and PET images, together with clinical features, is widely used in the field of lung cancer classification. For example, Zhang et al. (48) constructed a fusion model based on PET/CT radiomic features combined with clinical data to distinguish between adenocarcinoma and squamous cell carcinoma. The fusion model (AUC 0.870) had stronger predictive ability than the clinical model (AUC 0.848) and the radiomics model (AUC 0.774). Ren et al. (21) extracted radiomics features from PET and CT images and achieved an AUC greater than 0.75. In this study, radiomics models with features extracted from PET alone achieved AUCs of 0.74 to 0.76 in the differentiation of the four subtypes of INMA. However, although the SVM model using PET/CT achieved the best performance with an AUC of 0.79, combining PET and CT radiomics yielded only a modest improvement in this study.

DL algorithms efficiently detect and grade tumors (27). Ding et al. (49) applied LeNet and DenseNet using CT images to differentiate adenocarcinoma in situ, minimally invasive adenocarcinoma, invasive adenocarcinoma and invasive adenocarcinoma with micropapillary components and achieved an AUC of 0.88 and 0.86, respectively. In this study, the DL model based on fused images achieved an AUC of 0.89 in the internal validation set. Fusing DL and radiomic features has been investigated with improved performance (31). Combining DL with radiomics is therefore a promising approach for accurate INMA subtype differentiation. Lin et al. (29) developed an integrated model with DL, radiomics, and clinical data for benign versus malignant pulmonary classification. In this study, the early fusion model (SVM_E) and late fusion model (SVM_L) both achieved an ACC of at least 0.89 in the training cohort and at least 0.85 in the internal validation cohort. Despite a small external cohort (n=32), SVM_L achieved AUC 0.85 with per-subtype accuracy ≥0.81, demonstrating cross-institutional feasibility. Cohort similarity (external vs. internal, APA: 22% vs. 21%; SPA: 22% vs. 27%) mitigated selection bias.
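The subtype proportions quoted above can be recomputed from the cohort counts reported in the Patients subsection. The short sketch below is only an arithmetic check; the helper name `proportions` is hypothetical.

```python
from typing import Dict

# Subtype counts as reported for the two cohorts.
internal = {"APA": 80, "SPA": 104, "PPA": 95, "LPA": 107}   # n = 386
external = {"APA": 7, "SPA": 7, "PPA": 9, "LPA": 9}         # n = 32


def proportions(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each subtype within a cohort, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


internal_pct = proportions(internal)
external_pct = proportions(external)

for name in internal:
    print(f"{name}: external {external_pct[name]}% vs. internal {internal_pct[name]}%")
```

On these counts, APA accounts for 21.9% of the external cohort and 20.7% of the internal cohort, and SPA for 21.9% and 26.9%, matching the rounded figures in the text. PPA (28.1% vs. 24.6%) and LPA (28.1% vs. 27.7%) are similarly close.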

Non-invasive whole-tumor imaging analysis circumvents the sampling bias inherent in preoperative biopsies caused by tumor heterogeneity (50). In the external validation cohort, our fused PET/CT model achieved an AUC of 0.85 and an accuracy of 0.81. By integrating complementary radiomic features, this approach overcomes the subjectivity limitations of conventional visual PET/CT interpretation. This model can serve as a non-invasive supplementary tool for optimizing surgical plans, and it is particularly suitable for high-risk biopsy patients, technically challenging lesions, and time-sensitive preoperative assessments. It may both reduce complications associated with invasive procedures and enhance diagnostic efficiency.

One limitation of this study is that the micropapillary predominant subtype (MPA) was excluded from modeling due to insufficient cases. Given its aggressive behavior and prognostic significance, future studies must prioritize including MPA. Additionally, the small sample size (n=32) of the external cohort restricts the statistical power available for broad generalizability claims. Despite these constraints, this study provides preliminary evidence of cross-institutional feasibility using diverse PET/CT scanners. To address these limitations, future studies should integrate larger, prospective multi-center datasets to enhance model robustness, include MPA, and rigorously validate generalizability. It would also be worthwhile to develop methods that preoperatively quantify coexisting subtype components (e.g., lepidic vs. solid fractions), not just predominant patterns, to better reflect tumor heterogeneity and guide personalized resection strategies.
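To make the effect of the small external cohort concrete, the following minimal sketch computes a Wilson score interval for a proportion observed in 32 cases. It is an illustration under stated assumptions rather than an analysis from the study: the count of 26 correct cases is assumed because it is the integer closest to the reported accuracy of 0.81 with n=32, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Assumption: 26 of 32 correct, the count closest to the reported accuracy of 0.81.
low, high = wilson_interval(26, 32)
print(f"{low:.2f} to {high:.2f}")
```

With these assumed counts the interval spans roughly 0.65 to 0.91, which shows how wide the uncertainty around an accuracy estimate remains at this sample size.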

Conclusions

Integrating DL and radiomic features from PET/CT images provides a feasible and accurate approach for noninvasively differentiating the four main subtypes of INMA. This technique may optimize surgical planning for high-risk patients while reducing the burden of invasive diagnostics. Future work should incorporate the micropapillary (MPA) subtype and validate generalizability across multi-center cohorts.

Acknowledgments

We would like to thank Zhejiang Engineering Research Center for innovation and application of Intelligent Radiotherapy Technology, the Summit Advancement Discipline of Zhejiang Province (Wenzhou Medical University-Pharmaceutics), Zhejiang-Hong Kong Precision Theranostics of Thoracic Tumors Joint Laboratory, Wenzhou Key Laboratory of Basic Science and Translational Research of Radiation Oncology, Zhejiang Key Laboratory of Intelligent Cancer Biomarker Discovery and Translation, and Discipline Cluster of Oncology, Wenzhou Medical University for their support in conducting this study.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-333/rc

Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-333/dss

Peer Review File: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-333/prf

Funding: This study was supported by the National Natural Science Foundation (Nos. 12475352 and 82273570), Key Project of Zhejiang Natural Science Foundation (No. LZ24A050008), Key Project of Zhejiang Provincial Health Science and Technology Program (No. WKJ-ZJ-2437), and Major Project of Wenzhou Science and Technology Bureau (Nos. ZY2022016 and ZY2020011).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-333/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee in Clinical Research (ECCR) of the First Affiliated Hospital of Wenzhou Medical University (No. 2019059), and individual consent for this retrospective analysis was waived. The Second Affiliated Hospital of Wenzhou Medical University was also informed of and agreed to the study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Zheng RS, Zhang SW, Sun KX, et al. Cancer statistics in China, 2016. Zhonghua Zhong Liu Za Zhi 2023;45:212-20.
2. Feng R, Su Q, Huang X, et al. Cancer situation in China: what does the China cancer map indicate from the first national death survey to the latest cancer registration? Cancer Commun (Lond) 2023;43:75-86.
3. Tsao MS, Nicholson AG, Maleszewski JJ, et al. Introduction to 2021 WHO Classification of Thoracic Tumors. J Thorac Oncol 2022;17:e1-4.
4. Park S, Lee SM, Kim S, et al. Volume Doubling Times of Lung Adenocarcinomas: Correlation with Predominant Histologic Subtypes and Prognosis. Radiology 2020;295:703-12.
5. Zhao Z, Gan H, Fu BJ, et al. A Comprehensive Study of Part-Solid Lung Adenocarcinoma with Lymph Node Metastasis: Clinical, Pathological, and Radiological Perspectives. Cancer Manag Res 2025;17:1015-27.
6. Li Y, Chen D, Xu Y, et al. Prognostic implications, genomic and immune characteristics of lung adenocarcinoma with lepidic growth pattern. J Clin Pathol 2025;78:277-84.
7. Huang H, Yan Z, Li B, et al. LungPath: artificial intelligence-driven histologic pattern recognition for improved diagnosis of early-stage invasive lung adenocarcinoma. Transl Lung Cancer Res 2024;13:1816-27.
8. Ettinger DS, Wood DE, Aisner DL, et al. NCCN Guidelines® Insights: Non-Small Cell Lung Cancer, Version 2.2023. J Natl Compr Canc Netw 2023;21:340-50.
9. Motono N, Mizoguchi T, Ishikawa M, et al. Predictive Value of Recurrence of Solid and Micropapillary Subtypes in Lung Adenocarcinoma. Oncology 2024;102:366-73.
10. Kitagawa S, Zenke Y, Taki T, et al. Prognostic value of predominant subtype in pathological stage II-III lung adenocarcinoma with epidermal growth factor receptor mutation. Lung Cancer 2024;188:107453.
11. Yang W, Sun W, Li Q, et al. Diagnostic Accuracy of CT-Guided Transthoracic Needle Biopsy for Solitary Pulmonary Nodules. PLoS One 2015;10:e0131373.
12. Wei Z, Yang X, Feng Y, et al. Could concurrent biopsy and microwave ablation be reliable? Concordance between frozen section examination and final pathology in CT-guided biopsy of lung cancer. Int J Hyperthermia 2021;38:1031-6.
13. Matsuzawa R, Kirita K, Kuwata T, et al. Factors influencing the concordance of histological subtype diagnosis from biopsy and resected specimens of lung adenocarcinoma. Lung Cancer 2016;94:1-6.
14. Tsai PC, Yeh YC, Hsu PK, et al. CT-Guided Core Biopsy for Peripheral Sub-solid Pulmonary Nodules to Predict Predominant Histological and Aggressive Subtypes of Lung Adenocarcinoma. Ann Surg Oncol 2020;27:4405-12.
15. Liu D, Chen L, Wang X, et al. Use of Computed Tomography-Guided Percutaneous Biopsy of Invasive Non-Mucinous Lung Adenocarcinoma to Predict the Degree of Histological Differentiation. Clin Med Insights Oncol 2022;16:11795549221102752.
16. Takeuchi S, Khiewvan B, Fox PS, et al. Impact of initial PET/CT staging in terms of clinical stage, management plan, and prognosis in 592 patients with non-small-cell lung cancer. Eur J Nucl Med Mol Imaging 2014;41:906-14.
17. Yang B, Ji H, Ge Y, et al. Correlation Study of (18)F-Fluorodeoxyglucose Positron Emission Tomography/Computed Tomography in Pathological Subtypes of Invasive Lung Adenocarcinoma and Prognosis. Front Oncol 2019;9:908.
18. Guo Y, Yao ZM, Chen M, et al. The correlation between metabolic parameters in (18)F-FDG PET-CT and solid and micropapillary histological subtypes in lung adenocarcinoma. Zhonghua Zhong Liu Za Zhi 2022;44:555-61.
19. Shao X, Niu R, Shao X, et al. Value of (18)F-FDG PET/CT-based radiomics model to distinguish the growth patterns of early invasive lung adenocarcinoma manifesting as ground-glass opacity nodules. EJNMMI Res 2020;10:80.
20. Bianconi F, Palumbo I, Fravolini ML, et al. Texture Analysis on [18F]FDG PET/CT in Non-Small-Cell Lung Cancer: Correlations Between PET Features, CT Features, and Histological Types. Mol Imaging Biol 2019;21:1200-9.
21. Ren C, Zhang J, Qi M, et al. Machine learning based on clinico-biological features integrated 18F-FDG PET/CT radiomics for distinguishing squamous cell carcinoma from adenocarcinoma of lung. Eur J Nucl Med Mol Imaging 2021;48:1538-49.
22. Khodabakhshi Z, Amini M, Hajianfar G, et al. Dual-Centre Harmonised Multimodal Positron Emission Tomography/Computed Tomography Image Radiomic Features and Machine Learning Algorithms for Non-small Cell Lung Cancer Histopathological Subtype Phenotype Decoding. Clin Oncol (R Coll Radiol) 2023;35:713-25.
23. Dong H, Yin LK, Qiu YG, et al. Prediction of high-grade patterns of stage IA lung invasive adenocarcinoma based on high-resolution CT features: a bicentric study. Eur Radiol 2023;33:3931-40.
24. Xing X, Li L, Sun M, et al. A combination of radiomic features, clinic characteristics, and serum tumor biomarkers to predict the possibility of the micropapillary/solid component of lung adenocarcinoma. Ther Adv Respir Dis 2024;18:17534666241249168.
25. Rajpurkar P, Chen E, Banerjee O, et al. AI in health and medicine. Nat Med 2022;28:31-8.
26. Niazi MKK, Parwani AV, Gurcan MN. Digital pathology and artificial intelligence. Lancet Oncol 2019;20:e253-61.
27. van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med 2021;27:775-84.
28. Wang F, Wang CL, Yi YQ, et al. Comparison and fusion prediction model for lung adenocarcinoma with micropapillary and solid pattern using clinicoradiographic, radiomics and deep learning features. Sci Rep 2023;13:9302.
29. Lin CY, Guo SM, Lien JJ, et al. Combined model integrating deep learning, radiomics, and clinical data to classify lung nodules at chest CT. Radiol Med 2024;129:56-69.
30. Ye G, Wu G, Li K, et al. Development and Validation of a Deep Learning Radiomics Model to Predict High-Risk Pathologic Pulmonary Nodules Using Preoperative Computed Tomography. Acad Radiol 2024;31:1686-97.
31. Dunn B, Pierobon M, Wei Q. Automated Classification of Lung Cancer Subtypes Using Deep Learning and CT-Scan Based Radiomic Analysis. Bioengineering (Basel) 2023;10:690.
32. Giger ML. Machine Learning in Medical Imaging. J Am Coll Radiol 2018;15:512-20.
33. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436-44.
34. Azam MA, Khan KB, Salahuddin S, et al. A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics. Comput Biol Med 2022;144:105253.
35. Atrey PK, Hossain MA, El Saddik A, et al. Multimodal fusion for multimedia analysis: a survey. Multimedia Systems 2010;16:345-79.
36. Li L, Zhou X, Cui W, et al. Combining radiomics and deep learning features of intra-tumoral and peri-tumoral regions for the classification of breast cancer lung metastasis and primary lung cancer with low-dose CT. J Cancer Res Clin Oncol 2023;149:15469-78.
37. Fedorov A, Beichel R, Kalpathy-Cramer J, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30:1323-41.
38. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 2017;77:e104-7.
39. Zwanenburg A, Vallières M, Abdalah MA, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology 2020;295:328-38.
40. Park YJ, Choi D, Choi JY, et al. Performance Evaluation of a Deep Learning System for Differential Diagnosis of Lung Cancer With Conventional CT and FDG PET/CT Using Transfer Learning and Metadata. Clin Nucl Med 2021;46:635-40.
41. Connor R, Dearle A, Claydon B, et al. Correlations of Cross-Entropy Loss in Machine Learning. Entropy (Basel) 2024;26:491.
42. Kablan R, Miller HA, Suliman S, et al. Evaluation of stacked ensemble model performance to predict clinical outcomes: A COVID-19 study. Int J Med Inform 2023;175:105090.
43. Chen J, Zeng X, Li F, et al. The value of non-enhanced CT 3D visualization in differentiating stage I invasive lung adenocarcinoma between LPA and non-LPA. Eur J Radiol Open 2024;13:100600.
44. Liu M, Duan R, Xu Z, et al. CT-based radiomics combined with clinical features for invasiveness prediction and pathological subtypes classification of subsolid pulmonary nodules. Eur J Radiol Open 2024;13:100584.
45. He B, Song Y, Wang L, et al. A machine learning-based prediction of the micropapillary/solid growth pattern in invasive lung adenocarcinoma with radiomics. Transl Lung Cancer Res 2021;10:955-64.
46. Ninomiya K, Yanagawa M, Tsubamoto M, et al. Prediction of solid and micropapillary components in lung invasive adenocarcinoma: radiomics analysis from high-spatial-resolution CT data with 1024 matrix. Jpn J Radiol 2024;42:590-8.
47. Aydos U, Ünal ER, Özçelik M, et al. Texture features of primary tumor on 18F-FDG PET images in non-small cell lung cancer: The relationship between imaging and histopathological parameters. Rev Esp Med Nucl Imagen Mol (Engl Ed) 2021;40:343-50.
48. Zhang Y, Liu H, Chang C, et al. Machine learning for differentiating lung squamous cell cancer from adenocarcinoma using Clinical-Metabolic characteristics and 18F-FDG PET/CT radiomics. PLoS One 2024;19:e0300170.
49. Ding H, Xia W, Zhang L, et al. CT-Based Deep Learning Model for Invasiveness Classification and Micropapillary Pattern Prediction Within Lung Adenocarcinoma. Front Oncol 2020;10:1186.
50. Balasubramanian P, Abia-Trujillo D, Barrios-Ruiz A, et al. Diagnostic yield and safety of diagnostic techniques for pulmonary lesions: systematic review, meta-analysis and network meta-analysis. Eur Respir Rev 2024;33:240046.

Cite this article as: Wang D, Liu H, Cao Z, Huang W, Fu W, Shao L, Zhang J, Su W, Yu X, Han C, Ai Y, Xie C, Jin X. Integrating deep learning and radiomics in the differentiation of major histological subtypes of invasive non-mucinous lung adenocarcinoma using positron emission tomography and computed tomography. Transl Lung Cancer Res 2025;14(9):3323-3336. doi: 10.21037/tlcr-2025-333
