Development of a deep learning model based on computed tomography automatic segmentation to assist in selecting optimal time-to-surgery and dissected lymph node count for non-small cell lung cancer patients undergoing neoadjuvant immunotherapy and chemotherapy: a multicenter study

Junfeng Zhao; Ying Li; Jiaxuan Chen; Chuankun Han; Yifei Yan; Chen Zhou; Yingrui Bai; Yali Xu; Yintao Li

doi:10.21037/tlcr-2025-1-1495

Original Article

Junfeng Zhao¹,²#, Ying Li³,⁴#, Jiaxuan Chen⁵, Chuankun Han⁵, Yifei Yan⁵, Chen Zhou⁵, Yingrui Bai⁶, Yali Xu⁷#, Yintao Li³

¹Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China; ²Department of Oncology, Cancer Center, West China Hospital, Sichuan University, Chengdu, China; ³Department of Respiratory Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China; ⁴Division of Thoracic Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, Chengdu, China; ⁵Shandong First Medical University, and Shandong Academy of Medical Sciences, Jinan, China; ⁶School of Clinical Medicine, Shandong Second Medical University, Weifang, China; ⁷Department of Pathology, Shandong Provincial Hospital Affiliated with Shandong First Medical University, Jinan, China

Contributions: (I) Conception and design: J Zhao, Ying Li; (II) Administrative support: J Zhao, Ying Li, Yintao Li; (III) Provision of study materials or patients: J Chen, C Han, Y Xu; (IV) Collection and assembly of data: J Chen, C Han, Y Xu; (V) Data analysis and interpretation: Y Yan, C Zhou, Y Bai; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Professor Yintao Li, MD. Department of Respiratory Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, 440 Jiyan Road, Huaiyin District, Jinan 250000, China. Email: yintaoli@fudan.edu.cn; Professor Yali Xu, MD. Department of Pathology, Shandong Provincial Hospital Affiliated with Shandong First Medical University, 324 Jingwu-Weiqi Road, Huaiyin District, Jinan 250000, China. Email: skyelia@sina.com.

Background: Many studies have confirmed the efficacy of neoadjuvant immunotherapy combined with chemotherapy (NICT) in treating patients with non-small cell lung cancer (NSCLC). However, the optimal time-to-surgery (TTS) and the appropriate dissected lymph node (DLN) count remain unclear. Therefore, this study aims to determine the optimal TTS and DLN count for NSCLC following NICT by establishing a deep learning model based on computed tomography (CT) automatic segmentation to predict the efficacy of neoadjuvant therapy.

Methods: We retrospectively analyzed patients with NSCLC who underwent NICT and surgical treatment at two centers between January 2019 and June 2024. VISTA3D, a three-dimensional (3D) segmentation architecture with a flexible interaction mode and previously demonstrated precision and generalizability, was applied to the CT images to identify tumors automatically via automated segmentation combined with point prompts. ResNet18 (an 18-layer residual network pre-trained on ImageNet) served as the feature extraction backbone. In the final configuration, the shallow stages of the model were frozen, and only the two deeper stages were unfrozen to participate in gradient updates. BCEWithLogitsLoss was used as the loss function for the binary classification task. The primary evaluation metric was the area under the receiver operating characteristic curve (AUC). Based on the model, patients were divided into two groups: responders and non-responders. Optimal TTS and DLN counts were determined for each group.

Results: A total of 330 patients were included, with 270 patients in the training set and 60 patients in the external test set. The area under the curve of the deep learning model was 0.854 (95% confidence interval: 0.745–0.976). We assigned patients to responder and non-responder groups based on the deep learning model score. In the responder group, prolonged TTS was associated with better prognosis, with no significant difference in postoperative complications. In the non-responder group, earlier surgery was associated with better prognosis, with no significant difference in postoperative complications. In the responder group, a DLN count of ≤21 was associated with better prognosis.

Conclusions: In patients undergoing NICT, our model-based stratification suggests that for predicted responders, appropriately delaying surgery may be associated with better outcomes, whereas for predicted non-responders, earlier surgery is associated with better outcomes. Furthermore, among predicted responders, an observed DLN count of ≤21 was associated with improved survival, highlighting the potential importance of balancing oncologic resection with immune preservation. These findings require prospective validation.

Keywords: Non-small cell lung cancer (NSCLC); neoadjuvant; immunotherapy; time-to-surgery (TTS); dissected lymph node (DLN)

Submitted Dec 30, 2025. Accepted for publication Jan 28, 2026. Published online Feb 27, 2026.

Highlight box

Key findings

• For responder patients, the optimal time-to-surgery (TTS) was >31 days. For non-responder patients, the optimal TTS was ≤35 days. For responder patients, the optimal dissected lymph node (DLN) count was ≤21.

What is known and what is new?

• Numerous studies have confirmed that neoadjuvant immunotherapy combined with chemotherapy (NICT) is an effective treatment for patients with non-small cell lung cancer (NSCLC). However, in clinical practice, the optimal TTS and the appropriate DLN count following this treatment remain unclear and controversial.

• This study introduces a deep learning model based on automated 3D computed tomography segmentation (VISTA3D) to predict neoadjuvant therapy efficacy and stratify patients. We demonstrate that predicted “responders” benefit from a delayed surgery (>31 days) and a more conservative lymph node dissection (DLN ≤21), whereas “non-responders” achieve better outcomes with earlier surgery (≤35 days).

What is the implication, and what should change now?

• In patients undergoing NICT, it is recommended to delay surgery appropriately for responder patients, whereas for non-responder patients it is advisable to proceed with surgery as early as possible. For responder patients, it is suggested to aim for a DLN count of ≤21.

Introduction

Recently, the incidence and mortality rates of cancer have increased worldwide. As one of the leading causes of cancer-related deaths, lung cancer’s treatment and prognosis have garnered significant attention (1). Surgical resection is the preferred treatment for non-small cell lung cancer (NSCLC). However, with higher stages, only 25–30% of patients initially present with resectable disease (2). Therefore, systemic neoadjuvant therapy should be administered to patients with locally advanced NSCLC. Neoadjuvant therapy administered before radical surgical resection reduces staging, increases resection rates, and treats subclinical micrometastases more promptly than adjuvant therapy (3). CheckMate-816 was the first phase III clinical trial of neoadjuvant immunotherapy. Following its positive results, in March 2022, the U.S. Food and Drug Administration approved nivolumab in combination with platinum-based chemotherapy for the neoadjuvant treatment of resectable NSCLC (4). Subsequently, several randomized controlled studies on “sandwich” treatment regimens of preoperative neoadjuvant immunotherapy and chemotherapy (NICT), and postoperative adjuvant immunotherapy have emerged, including KEYNOTE-671 (5), Neotorch (6), AEGEAN (7), and NADIM II (8). These clinical findings suggest that after preoperative NICT, the continuation of adjuvant immunotherapy in the postoperative period is beneficial for consolidating the efficacy of the treatment, contributing to long-term survival benefits, and reducing the risk of distant tumor recurrence.

However, the use of perioperative immunotherapy remains controversial. The optimal time-to-surgery (TTS) after NICT remains unclear. TTS refers to the interval between the last cycle of NICT and curative lung cancer resection, typically set in clinical practice as 4–6 or 6–8 weeks (3). Liu et al. found that in a mouse model of spontaneous metastatic cancer, both delayed and early surgery following NICT resulted in poor prognosis (9). Delayed surgery may lead to tumor progression, while early surgery may result in severe surgical complications (10). Li et al. in their study on esophageal cancer found that prolonging TTS was associated with improved overall survival (OS) for patients sensitive to neoadjuvant treatment, as it enhances tumor resectability and leads to improved long-term survival outcomes (11). Similarly, Xiao et al. demonstrated that for patients with locally advanced esophageal cancer who responded well to neoadjuvant treatment, TTS could be extended to at least 10 weeks. Conversely, delayed surgery was associated with poorer OS in patients with poor responses to neoadjuvant treatment (12). Therefore, determining the optimal TTS after NICT is crucial. Furthermore, there is controversy surrounding the dissected lymph node (DLN) count after NICT.

Systematic lymph node dissection is an integral component of surgical treatment for NSCLC (13). However, new findings from a clinical trial by Rahim et al. suggest that immunotherapy can activate anti-tumor T cells in nearby lymph nodes. Overly aggressive dissection of nearby lymph nodes during surgery may remove crucial sites for T-cell survival and activation (14). Tumor-draining lymph nodes (TdLNs) are the primary sites for tumor antigen exposure and subsequent immune activation. Excessive dissection may lead to suboptimal efficacy of subsequent immunotherapies (15,16). While it is important to remove lymph nodes containing metastatic cancer cells, for patients receiving ongoing immunotherapy, the impact of the DLN count on subsequent immunotherapy should be considered. A precise strategy rather than extensive lymph node dissection is recommended (17).

Previous studies have found that patients achieving pathological complete response (pCR) or major pathological response (MPR) may appropriately delay surgery (18). This indicates that the efficacy of neoadjuvant therapy influences the selection of surgical approaches. Therefore, this study aims to determine the optimal TTS and DLN count for NSCLC following NICT by establishing a deep learning (DL) model based on computed tomography (CT) automatic segmentation to predict the efficacy of neoadjuvant therapy. We present this article in accordance with the TRIPOD reporting checklist (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-1-1495/rc).

Methods

Patient selection

Data from 330 patients with resectable NSCLC who underwent NICT combined with surgery between January 2019 and June 2024 at Cancer Hospital Affiliated with Shandong First Medical University (Shandong Cancer Hospital and Institute) and Shandong Provincial Hospital Affiliated with Shandong First Medical University were retrospectively analyzed. Data from one center were used to train the model, and data from the other center were used for external testing. All enrolled patients had pathologically confirmed NSCLC (squamous cell carcinoma or adenocarcinoma) of clinical stage II or III, received only NICT before surgical treatment, and had an Eastern Cooperative Oncology Group (ECOG) score of 0–1. Patients with unresectable tumors or metastases identified during exploratory surgery, EGFR/ALK/ROS1 gene mutations, or missing pre- and post-NICT CT imaging data, and those who declined follow-up, were excluded. The CT scans analyzed were those performed within one month prior to NICT and those performed 4–8 weeks after NICT (the interval between the last NICT and surgery). Patients were categorized into responders and non-responders based on pre- and post-NICT CT images and postoperative pathological results. Figure 1 illustrates the workflow of the study. Detailed treatment protocols for all patients are provided in Appendix 1. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Cancer Hospital Affiliated with Shandong First Medical University (Shandong Cancer Hospital and Institute) (No. SDTHEC202509019), and individual consent for this retrospective analysis was waived. Shandong Provincial Hospital Affiliated with Shandong First Medical University was also informed of and agreed to the study.

Figure 1 Workflow of this study. 3D, three-dimensional; AUC, area under the receiver operating characteristic curve; CV, cross validation; DLN, dissected lymph node; MIL, multiple instance learning; ROC, receiver operating characteristic; TTS, time-to-surgery.

Definition of treatment response

Patients were categorized as “Responders” or “Non-responders” based on a composite assessment integrating radiological and pathological criteria, which served as the ground truth for DL model training and subsequent clinical analyses. Responders were defined as patients who achieved either MPR or pCR (18). Non-responders were defined as patients who did not meet the MPR/pCR criteria, i.e., those with >10% viable tumor cells in the pathological specimen. This binary classification (responder vs. non-responder) was used as the primary endpoint for the development of the DL predictive model. Pathological assessment was the gold standard: imaging findings from pre- and post-NICT CT scans (evaluated according to RECIST 1.1 guidelines) were used alongside it during the model training phase to provide a comprehensive feature set, but the final response label was anchored to the pathological outcome.

Data preprocessing

Two radiation oncologists, each with 10 years of experience, used ITK-SNAP (https://www.itksnap.org/) to manually delineate the regions of interest (ROIs) of the main tumor on the CT scans. Because scanners and acquisition techniques differ, medical volumes frequently have different voxel spacing, that is, the physical distance between the centers of adjacent voxels. Spatial normalization is commonly used to lessen the impact of these differences. We therefore applied fixed-resolution resampling: all images were resampled to a voxel size of 1 mm × 1 mm × 1 mm. Finally, the intensities were normalized using z-score standardization (zero mean, unit variance).
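The two preprocessing steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names are hypothetical, the volume is a plain nested list indexed [z][y][x], and nearest-neighbour interpolation is assumed purely for simplicity (the interpolation method actually used is not stated).

```python
from statistics import fmean, pstdev

def resampled_shape(shape, spacing, target=(1.0, 1.0, 1.0)):
    """Voxels per axis after resampling from `spacing` (mm) to `target` (mm)."""
    return tuple(max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target))

def resample_nearest(volume, spacing, target=(1.0, 1.0, 1.0)):
    """Nearest-neighbour resampling (an assumption) of a [z][y][x] nested list."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    new_shape = resampled_shape(shape, spacing, target)

    def src(i, axis):
        # Map the centre of output voxel i back to the nearest input index.
        pos = (i + 0.5) * target[axis] / spacing[axis] - 0.5
        return min(shape[axis] - 1, max(0, round(pos)))

    return [[[volume[src(z, 0)][src(y, 1)][src(x, 2)]
              for x in range(new_shape[2])]
             for y in range(new_shape[1])]
            for z in range(new_shape[0])]

def zscore(volume):
    """Zero-mean, unit-variance normalization over all voxels of the volume."""
    flat = [v for plane in volume for row in plane for v in row]
    mu = fmean(flat)
    sigma = pstdev(flat) or 1.0  # guard against a constant volume
    return [[[(v - mu) / sigma for v in row] for row in plane] for plane in volume]

# Toy example: a 1 x 1 x 2 volume with 2 mm spacing along x.
toy = [[[0, 10]]]
iso = resample_nearest(toy, spacing=(1.0, 1.0, 2.0))
normalized = zscore(iso)
```

In this toy case the two voxels along x become four 1 mm voxels, and the normalized intensities are centred on zero.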

VISTA3D segmentation model

The core segmentation model adopted in this study is VISTA3D (19)—a unified foundational model for 3D medical image segmentation. Built on the SegResNet backbone with sliding window inference, VISTA3D integrates 3D supervoxel-distilled Segment Anything Model (SAM) knowledge and a four-stage training paradigm. This model enables automatic segmentation of 127 anatomical structures, 3D interactive correction, and zero-shot segmentation. The reliability analysis in this study focuses on the results of the three core segmentation methods supported by VISTA3D, which are compared against manually annotated gold standards on an external test dataset. Segmentation methods include automatic segmentation, point-prompt segmentation, and automatic segmentation combined with point-prompt segmentation. Four widely recognized quantitative metrics in medical image segmentation were selected: Dice similarity coefficient, 95% Hausdorff distance, mean surface distance, and volume similarity. These metrics were used to calculate the consistency between automated segmentation results and manually segmented gold standard labels.
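Two of the four agreement metrics named above have simple closed forms, sketched here for illustration. Masks are represented as sets of voxel coordinates, and the function names are hypothetical. Volume similarity is taken as 1 − | |A| − |B| | / (|A| + |B|), a commonly used definition that is assumed here because the article does not give its formula. The distance-based metrics (95% Hausdorff distance and mean surface distance) are omitted.

```python
def dice(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = set(pred), set(truth)
    denom = len(pred) + len(truth)
    return 1.0 if denom == 0 else 2 * len(pred & truth) / denom

def volume_similarity(pred, truth):
    """Volume similarity: 1 - ||A| - |B|| / (|A| + |B|); ignores overlap location."""
    denom = len(pred) + len(truth)
    return 1.0 if denom == 0 else 1 - abs(len(pred) - len(truth)) / denom

# Toy masks given as (z, y, x) voxel coordinates.
auto_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual_mask = {(0, 0, 1), (0, 1, 0), (1, 1, 1), (1, 0, 0)}

dsc = dice(auto_mask, manual_mask)               # 2*2 / (3+4)
vs = volume_similarity(auto_mask, manual_mask)   # 1 - 1/7
```

Because volume similarity compares only the sizes of the two masks, it can be high even when overlap is modest, which is why it is reported alongside the Dice coefficient rather than instead of it.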

Multiple instance learning (MIL) based on ImageNet

Figure 1 shows the design roadmap for this study. ResNet18 (an 18-layer residual network pre-trained on ImageNet) was employed as the feature extraction backbone (20). To balance the retention of pre-trained knowledge with adaptation to CT image features, a hierarchical freezing strategy was adopted. In the final configuration, the shallow stages of the model were frozen, and only the two deeper stages were unfrozen to participate in gradient updates. We implemented an aggregator based on AttentionMIL. This module consists of a two-layer multi-layer perceptron (MLP) and a Softmax activation function, and it generates a more information-dense and discriminative patient-level representation from the instance-level features. A key finding of our exploratory experiments was that complex prediction heads were highly prone to overfitting on our dataset. Consequently, a simplified MLP with strong regularization was adopted as the prediction head.
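The attention-based aggregation step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the study's implementation: the tanh hidden activation, the weight names `W1` and `w2`, and the toy dimensions are all hypothetical, and a real model would learn these weights with PyTorch.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, W1, w2):
    """Attention-MIL pooling.

    instances: list of feature vectors (one per slice or patch)
    W1: hidden x dim weight matrix of the first MLP layer (tanh assumed)
    w2: hidden-length weight vector of the second MLP layer
    Returns the patient-level feature vector and the attention weights.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W1]
        scores.append(sum(a * b for a, b in zip(w2, hidden)))
    weights = softmax(scores)
    dim = len(instances[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, instances)) for d in range(dim)]
    return pooled, weights

# With all-zero attention weights every instance scores equally,
# so the pooled vector reduces to the plain mean of the instances.
bag = [[1.0, 0.0], [3.0, 4.0]]
pooled, weights = attention_pool(bag, W1=[[0.0, 0.0]], w2=[1.0])
```

The attention weights sum to one, so the pooled vector is a convex combination of the instance features; instances judged more informative contribute more to the patient-level representation.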

Model evaluation

BCEWithLogitsLoss was used as the loss function for the binary classification task. Model training and evaluation were executed on a single NVIDIA RTX A6000 GPU using PyTorch (a Python-based open-source machine learning library) version 2.0.0+cu118 and CUDA version 11.8. To ensure reproducibility and consistency of experimental results, the random seed was set to 0. Five-fold cross-validation was used to evaluate model performance. The primary evaluation metric was the area under the receiver operating characteristic curve (AUC).
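For readers less familiar with these two quantities, the sketch below computes them in plain Python. It mirrors the standard definitions (binary cross-entropy evaluated directly on logits in a numerically stable form, and AUC as the probability that a responder is scored above a non-responder); the function names are hypothetical and the numbers are toy values, not study data.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed from raw logits.

    Uses max(z, 0) - z*y + log(1 + exp(-|z|)), which avoids overflow
    for large |z| and is algebraically equal to
    -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))].
    """
    losses = [max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
              for z, y in zip(logits, targets)]
    return sum(losses) / len(losses)

def roc_auc(scores, labels):
    """AUC via pairwise comparison (Mann-Whitney U); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

loss_at_zero = bce_with_logits([0.0], [1])                   # equals ln 2
toy_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])        # 3 of 4 pairs ordered correctly
```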

TTS and DLN count

TTS was defined as the time interval from the end of the last NICT cycle to the day of surgery. The optimal TTS cutoff was calculated from TTS and disease-free survival (DFS) separately in responder and non-responder patients, and the patients in each group were then divided into two subgroups according to TTS length. Similarly, the optimal DLN count cutoff was calculated from the DLN count and DFS separately in responder and non-responder patients, and the patients were then categorized into two groups based on the cutoff value.

Study endpoints

The study’s endpoints were DFS and OS. DFS was defined as the interval from curative lung cancer resection to the first recorded recurrence, death from any cause, or last follow-up. OS was defined as the interval from the start of the first cycle of neoadjuvant therapy to death from any cause or last follow-up.
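The TTS, DFS and OS definitions given in the Methods translate directly into date arithmetic. The sketch below is illustrative only: the `Patient` record and its field names are hypothetical, and intervals are returned in days together with an event flag (True for an observed event, False for censoring at last follow-up).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Patient:
    nict_start: date                 # start of the first NICT cycle
    nict_last_end: date              # end of the last NICT cycle
    surgery: date                    # curative resection
    last_follow_up: date
    recurrence: Optional[date] = None
    death: Optional[date] = None

def tts_days(p: Patient) -> int:
    """Time-to-surgery: end of last NICT cycle to day of surgery."""
    return (p.surgery - p.nict_last_end).days

def dfs(p: Patient):
    """DFS from resection to first recurrence or death, else censored."""
    events = [d for d in (p.recurrence, p.death) if d is not None]
    end = min(events) if events else p.last_follow_up
    return (end - p.surgery).days, bool(events)

def overall_survival(p: Patient):
    """OS from start of neoadjuvant therapy to death, else censored."""
    end = p.death or p.last_follow_up
    return (end - p.nict_start).days, p.death is not None

example = Patient(
    nict_start=date(2021, 1, 4),
    nict_last_end=date(2021, 3, 10),
    surgery=date(2021, 4, 14),
    last_follow_up=date(2023, 1, 1),
    recurrence=date(2022, 4, 14),
)
```

Note that DFS and OS start from different time origins (surgery versus the first neoadjuvant cycle), so the two intervals are not directly comparable for the same patient.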

Statistical analysis

Categorical variables were compared using the Chi-squared test or Fisher’s exact test, and continuous variables were compared using the rank-sum test or the independent samples t-test. The Kaplan-Meier method was employed to evaluate DFS and OS, with results assessed using a log-rank test. Statistical significance was set at P<0.05 in two-sided analysis. Optimal cutoff values for continuous variables (TTS and DLN count) were initially identified using maximally selected rank statistics. To account for multiple testing and overfitting, the robustness of these cutoffs was evaluated using bootstrap resampling (1,000 iterations). Bootstrap-derived 95% confidence intervals (CIs) for each cutoff are reported. Python statsmodels (version 0.13.2) was used for data processing and visualization.
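The cutoff search and its bootstrap check can be sketched as follows. This is a simplified stand-in, not the software used in the study: it scans candidate cutoffs and keeps the one with the largest two-group log-rank chi-square statistic, without the P value adjustment that a full maximally selected rank statistics procedure applies. The function names, the `min_frac` constraint and the toy data are assumptions for illustration.

```python
import random

def logrank_chi2(times, events, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    o_minus_e, var = 0.0, 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                      # at risk overall
        n1 = sum(1 for ti, g in zip(times, in_group) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, in_group)
                 if ti == t and e and g)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0

def best_cutoff(values, times, events, min_frac=0.1):
    """Cutoff c (groups: value <= c vs > c) with the largest log-rank statistic."""
    best, best_stat, n = None, -1.0, len(values)
    for c in sorted(set(values))[:-1]:
        low = [v <= c for v in values]
        k = sum(low)
        if k < min_frac * n or n - k < min_frac * n:
            continue  # skip very unbalanced splits
        stat = logrank_chi2(times, events, low)
        if stat > best_stat:
            best, best_stat = c, stat
    return best, best_stat

def bootstrap_cutoff_ci(values, times, events, n_boot=1000, seed=0):
    """Percentile 95% interval of the cutoff over bootstrap resamples."""
    rng, n, cuts = random.Random(seed), len(values), []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c, _stat = best_cutoff([values[i] for i in idx],
                               [times[i] for i in idx],
                               [events[i] for i in idx])
        if c is not None:
            cuts.append(c)
    if not cuts:
        raise ValueError("no valid cutoff found in any resample")
    cuts.sort()
    return cuts[int(0.025 * (len(cuts) - 1))], cuts[int(0.975 * (len(cuts) - 1))]

# Toy data (not study data): TTS in days, DFS in months, event flags.
tts = [20, 25, 28, 30, 33, 35, 40, 45]
dfs_months = [8, 7, 6, 5, 30, 32, 35, 40]
event = [1, 1, 1, 1, 0, 0, 1, 0]
cutoff, stat = best_cutoff(tts, dfs_months, event)
ci_low, ci_high = bootstrap_cutoff_ci(tts, dfs_months, event, n_boot=200)
```

Because the cutoff is chosen to maximize the test statistic, the unadjusted log-rank P value at the selected cutoff is optimistic; the bootstrap interval gives a sense of how stable the selected value is under resampling.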

Results

Patients’ baseline characteristics

From January 2019 to June 2024, 330 eligible patients with resectable NSCLC underwent NICT combined with surgical treatment at two centers, with 270 patients in the training set and 60 patients in the external test set. The two groups comprised 169 (62.59%) and 40 (66.67%) patients aged ≥60 years, 211 (78.15%) and 41 (68.33%) patients with squamous cell carcinoma, and 145 (53.70%) and 28 (46.67%) patients with N2 stage (Table 1). Follow-up was completed by August 2025, with a median follow-up duration of 32.5 months [interquartile range (IQR): 21.8–45.1 months] for all patients.

Table 1

Baseline characteristics

Variables	Total (n=330)	Training (n=270)	Test (n=60)	P
Age				0.55
<60 years	121 (36.67)	101 (37.41)	20 (33.33)
≥60 years	209 (63.33)	169 (62.59)	40 (66.67)
Sex				0.19
Female	48 (14.55)	36 (13.33)	12 (20.00)
Male	282 (85.45)	234 (86.67)	48 (80.00)
Smoking				0.50
No	98 (29.70)	78 (28.89)	20 (33.33)
Yes	232 (70.30)	192 (71.11)	40 (66.67)
Drinking				0.33
No	196 (59.39)	157 (58.15)	39 (65.00)
Yes	134 (40.61)	113 (41.85)	21 (35.00)
Family				0.88
No	288 (87.27)	236 (87.41)	52 (86.67)
Yes	42 (12.73)	34 (12.59)	8 (13.33)
Location				0.30
Right lung	152 (46.06)	128 (47.41)	24 (40.00)
Left lung	178 (53.94)	142 (52.59)	36 (60.00)
T				0.81
T2	185 (56.06)	150 (55.56)	35 (58.33)
T3	65 (19.70)	55 (20.37)	10 (16.67)
T4	80 (24.24)	65 (24.07)	15 (25.00)
N				0.32
N0–1	157 (47.58)	125 (46.30)	32 (53.33)
N2	173 (52.42)	145 (53.70)	28 (46.67)
Stage				0.74
II	78 (24.07)	65 (24.44)	13 (22.41)
III	246 (75.93)	201 (75.56)	45 (77.59)
Pathology				0.11
Adenocarcinoma	78 (23.64)	59 (21.85)	19 (31.67)
Squamous carcinoma	252 (76.36)	211 (78.15)	41 (68.33)

Data are presented as n (%). N, node; T, tumor.

Performance of the 3D automated segmentation and DL model

The VISTA3D-based automatic segmentation model developed at our institute showed good agreement with manual segmentation (Figure 2A). Automatic segmentation combined with point-prompt segmentation achieved a Dice similarity coefficient of 0.82±0.18, a 95% Hausdorff distance of 8.33±9.42 mm, a mean surface distance of 2.01±2.96 mm, and a volume similarity of 0.88±0.19 (Figure 2B). The AUC of the DL model was 0.854 (95% CI: 0.745–0.976) (Figure 2C). The calibration curve of the DL model exhibited a distinct trend (Figure 2D). The model’s discriminatory capability and clinical applicability were evaluated through decision curve analysis (Figure 2E). Based on the DL model, all 330 patients were classified into responders (n=180) and non-responders (n=150).

Figure 2 Performance of segmentation models and deep learning models. (A) Automatic segmentation of tumor lesions before and after treatment. (B) Segmentation performance of segmentation models. (C) Radar chart showing the predictive performance of deep learning models. (D) Calibration curve of model prediction performance. (E) Decision curves for model prediction performance. ACC, accuracy; NICT, neoadjuvant immunotherapy combined with chemotherapy; NPV, negative predictive value; PPV, positive predictive value; pr, probability; SEN, sensitivity; SPE, specificity.

Exploring an appropriate TTS

Among patients in the responder group, the cut-off value of TTS calculated from TTS and DFS was 31 days (Figure 3A). Bootstrap validation yielded a 95% CI for this cutoff of 28 to 40 days. Patients were categorized into two groups: TTS >31 days and TTS ≤31 days. Compared with patients in the TTS >31 days group, those in the TTS ≤31 days group had worse DFS [hazard ratio (HR): 3.75, 95% CI: 1.919–4.93, P<0.001]. Regarding OS, we observed a significant difference (HR: 2.69, 95% CI: 1.35–4.36, P=0.001) (Figure 3B,3C). In the multivariate Cox regression analysis, TTS was identified as an independent risk factor for both DFS and OS (Figure 3D,3E). The postoperative complication rates were 28.21% and 20.63% in the TTS >31 days and TTS ≤31 days groups, respectively, showing no statistically significant difference (P=0.197) (Table 2).

Figure 3 The optimal time-to-surgery for responder patients. (A) The optimal cutoff value for determining the time-to-surgery. (B) Kaplan-Meier survival analysis of disease-free survival grouped by time-to-surgery. (C) Kaplan-Meier survival analysis of overall survival grouped by time-to-surgery. (D) Multivariate Cox regression analysis of DFS. (E) Multivariate Cox regression analysis of OS. CI, confidence interval; DFS, disease-free survival; HR, hazard ratio; MPR, major pathological response; N, node; NA, not available; OS, overall survival; T, tumor; TTS, time-to-surgery.

Table 2

Postoperative complications of responder patients

Variables	Total (n=180)	TTS >31 days (n=117)	TTS ≤31 days (n=63)	P
Postoperative complications				0.20
No	134 (74.44)	84 (71.79)	50 (79.37)
Chylothorax	4 (2.22)	4 (3.42)	0
Hydropneumothorax	17 (9.44)	11 (9.40)	6 (9.52)
Pleural effusion	7 (3.89)	6 (5.13)	1 (1.59)
Postoperative bleeding	5 (2.78)	5 (4.27)	0
Pulmonary complications	7 (3.89)	5 (4.27)	2 (3.17)
Subcutaneous emphysema	6 (3.33)	2 (1.71)	4 (6.35)

Data are presented as n (%). TTS, time-to-surgery.

Among the patients in the non-responder group, the cut-off value of TTS, calculated on the basis of TTS and DFS, was 35 days (Figure 4A). Bootstrap validation yielded a 95% CI for this cutoff of 32 to 50 days. Compared with patients in the TTS >35 days group, those in the TTS ≤35 days group demonstrated a significant advantage in DFS (HR: 0.29, 95% CI: 0.19–0.47, P<0.001), with a similar difference in OS (HR: 0.32, 95% CI: 0.16–0.52, P<0.001) (Figure 4B,4C). In the multivariate Cox regression analysis, TTS was identified as an independent risk factor for both DFS and OS (Figure 4D,4E). The postoperative complication rates were 34.85% and 25.00% in the TTS >35 days and TTS ≤35 days groups, respectively, showing no statistically significant difference (P=0.25) (Table 3).

Figure 4 The optimal time-to-surgery for non-responder patients. (A) The optimal cutoff value for determining the time-to-surgery. (B) Kaplan-Meier survival analysis of disease-free survival grouped by time-to-surgery. (C) Kaplan-Meier survival analysis of overall survival grouped by time-to-surgery. (D) Multivariate Cox regression analysis of DFS. (E) Multivariate Cox regression analysis of OS. CI, confidence interval; DFS, disease-free survival; HR, hazard ratio; MPR, major pathological response; N, node; NA, not available; OS, overall survival; T, tumor; TTS, time-to-surgery.

Table 3

Postoperative complications of non-responder patients

Variables	Total (n=150)	TTS >35 days (n=66)	TTS ≤35 days (n=84)	P
Postoperative complications				0.25
No	106 (70.67)	43 (65.15)	63 (75.00)
Arrhythmia	3 (2.00)	3 (4.55)	0
Chylothorax	1 (0.67)	1 (1.52)	0
Hydropneumothorax	21 (14.00)	10 (15.15)	11 (13.10)
Pleural effusion	7 (4.67)	2 (3.03)	5 (5.95)
Pulmonary complications	12 (8.00)	7 (10.61)	5 (5.95)

Data are presented as n (%). TTS, time-to-surgery.

Exploring an appropriate DLN count

Among the patients in the responder group, the cut-off value of the DLN count, calculated based on DLN count and DFS, was 21 (Figure 5A). The bootstrap 95% CI ranged from 18 to 25 nodes. Compared with patients in the DLN count >21 group, patients in the DLN count ≤21 group had a significant advantage in DFS (HR: 0.23, 95% CI: 0.14–0.39, P<0.001). OS also differed significantly (HR: 0.30, 95% CI: 0.15–0.57, P<0.001) (Figure 5B,5C). In the non-responder group, however, no DLN count cut-off value yielded a statistically significant difference.

Figure 5 The optimal DLN count for responder patients. (A) The optimal cutoff value for determining the DLN count. (B) Kaplan-Meier survival analysis of DFS grouped by DLN count. (C) Kaplan-Meier survival analysis of overall survival grouped by DLN count. (D) Forest plot of DFS. (E) Forest plot of overall survival. CI, confidence interval; DFS, disease-free survival; DLN, dissected lymph node; HR, hazard ratio; N, node; NA, not available; OS, overall survival; T, tumor.

Considering the impact of node (N) staging on DLN count, we divided the responder group into two subgroups, N0–1 and N2, based on pre-NICT N staging, and validated the DLN count cutoff of 21 separately within each subgroup. Subgroup analysis based on DFS and OS revealed that, for both N0–1 and N2 patients classified as responders by the DL model, a DLN count ≤21 was associated with a significant advantage (Figure 5D,5E). In the multivariate Cox regression analysis, DLN count was identified as an independent risk factor for both DFS and OS (Figure 6A,6B). The decision path based on this study is detailed in Figure 7.

Figure 6 Multivariate Cox regression analysis for DLN. (A) Multivariate Cox regression analysis of DFS. (B) Multivariate Cox regression analysis of OS. CI, confidence interval; DFS, disease-free survival; DLN, dissected lymph node; HR, hazard ratio; MPR, major pathological response; OS, overall survival.

Figure 7 Clinical decision pathway. For responder patients, the optimal time-to-surgery is >31 days. For non-responder patients, the optimal time-to-surgery is ≤35 days. For responder patients, the optimal dissected lymph node count is ≤21. NICT, neoadjuvant immunotherapy combined with chemotherapy; NSCLC, non-small cell lung cancer.

Discussion

This study aimed to determine the optimal TTS and DLN count for NSCLC following NICT by establishing a DL model based on CT automatic segmentation to predict the efficacy of neoadjuvant therapy. To the best of our knowledge, this is the first study to explore these metrics in patients with locally advanced NSCLC undergoing NICT. Our study demonstrated that among all patients undergoing NICT, responder patients with appropriately delayed surgery had a better prognosis; conversely, non-responder patients should be operated on as early as possible. Additionally, in the responder group, a DLN count of ≤21 was associated with better prognosis.

Currently, TTS after neoadjuvant therapy in clinical trials is generally set at 6 weeks (4-8), whereas in clinical practice, TTS usually ranges from 4 to 8 weeks. This wide window has been a controversial issue in clinical practice (3). In our study, patients were categorized into responder and non-responder groups based on NICT efficacy. For responder patients, extending TTS beyond 31 days was associated with a better prognosis, suggesting that delaying surgery can be beneficial. This result aligns with the findings of Li et al. in patients with esophageal cancer (11), where delaying surgery for those with significant tumor regression after neoadjuvant therapy led to a better prognosis. Prolonged TTS may facilitate tumor regression by promoting apoptosis and necrosis, attenuating treatment-induced acute inflammation, and enhancing tumor resectability and histological response, ultimately leading to improved long-term survival.

For non-responder patients, this study suggests that surgery should be performed as early as possible. These patients are not sensitive to NICT, and delaying surgery may lead to tumor progression, adversely affecting prognosis. This finding is consistent with the observations of Xiao et al. in patients with esophageal cancer (12), where delayed surgery was associated with low survival rates in patients who did not achieve complete clinical remission after neoadjuvant chemoradiotherapy. They recommended that surgery should be performed within 10 weeks of neoadjuvant chemoradiotherapy.

In terms of safety, postoperative complications did not differ significantly between early and delayed surgery in both responder and non-responder patients. This suggests that there are no safety concerns, and that the appropriate TTS can be selected based on the patient’s response to NICT. Currently, there is a lack of high-level clinical evidence for the optimal TTS after NICT in patients with locally advanced NSCLC, and much of the evidence comes from other cancer types. Therefore, subsequent large prospective studies are needed to validate these findings. The CT-based gross tumor volume (GTV) outlining currently used in most hospitals is time-consuming and highly dependent on the clinician. Studies have demonstrated differences in the GTVs outlined by different radiation oncologists (21). The DL model based on automatic segmentation established in this study not only significantly enhances the feasibility and stability of clinical applications but also demonstrates satisfactory predictive performance. It can assist clinicians in developing more personalized treatment plans for patients in clinical settings.

Additionally, the DLN count during surgery has been a topic of clinical controversy (17). Among the enrolled patients, DLN counts ranged from 7 to 52, and the optimal count remains unclear. This study found that responder patients with a DLN count ≤21 had a better prognosis. Liang et al. described the relationship between DLN count and long-term survival based on an analysis of the Surveillance, Epidemiology, and End Results (SEER) database and a Chinese multicenter database, recommending 16 cleared lymph nodes as the cut-off point. They found that survival improves with clearance of up to 16 lymph nodes, but worsens when more than 16 lymph nodes are cleared (22). While removing lymph nodes with metastatic cancer cells is important, excessive removal of nearby lymph nodes before postoperative adjuvant immunotherapy can negatively affect patient prognosis. This suggests that adequate lymph node clearance can prevent understaging and reduce residual disease; however, further increases in lymph node clearance counts can adversely affect patient survival.

It is crucial to interpret this cut-off value (≤21) within its context. The number of retrievable lymph nodes varies anatomically between individuals and is influenced by surgical technique and pathological processing. Our finding should not be construed as recommending a reduction of lymph node dissection below established oncological standards for complete staging and local control. Instead, it provides clinical evidence supporting the hypothesis that in patients who respond well to neoadjuvant immunotherapy, an excessively extensive lymphadenectomy that removes a large volume of potential immune-responsive tissue might be counterproductive. The value of 21 may represent a point where adequate oncologic resection and the preservation of immunologically relevant lymphoid structures are balanced in our cohort. This reinforces the emerging paradigm of “precision” or “immunotherapy-aware” lymphadenectomy, where the extent of dissection could be one element of personalized surgical planning.

In a study by Deng et al., a DLN count >16 was found to be associated with worse immunotherapy efficacy, whereas a DLN count ≤16 was associated with better efficacy and longer progression-free survival (17). These two studies yielded different results from ours, possibly because these studies included patients with early-stage lung cancer, whereas our study focused on patients with locally advanced NSCLC, who typically have higher DLN counts. In our study, using 21 lymph nodes as a cut-off value, we found that responder patients had a better prognosis with a DLN count ≤21. A clinical trial conducted by Rahim et al. on patients with head and neck tumors found that keeping the lymph nodes intact until the end of immunotherapy improves efficacy against solid tumors. The study demonstrated significant immune activation within the tumors and throughout the body after effective immunotherapy, with an increase in CD4⁺ T-cells in the peripheral blood post-treatment (14).

The purpose of immunotherapy is to initiate an immune response. Using a mouse model, Hiam-Galvez et al. reported that CD8⁺ T cells in TdLNs are integral to immunotherapy (23). TdLNs, the first lymphoid organs to encounter tumor antigens, are crucial to anti-tumor immunity as they serve as the hub of the body’s immune surveillance. Fear et al. showed that TdLNs are the main reservoir of anti-tumor T cells, and their removal may affect the subsequent immunotherapeutic response; mice with complete TdLN resection had significantly shorter survival than those with intact TdLNs (24).

This study suggests that the intraoperative DLN count should be ≤21 in patients with NSCLC with sensitive responses to NICT, to preserve the immune potential of the lymph nodes, leading to better immunotherapeutic outcomes and prognosis. Considering that the clinical N stages of the patients in this study included N0–2 and that the number and location of lymph node metastases varied among different N stages, we validated the cut-off value of 21 lymph nodes in two subgroups: N0–1 and N2. Both subgroups yielded satisfactory results. For responder patients, only 9.1% had residual lesions in the lymph nodes after surgery, with fewer than five positive lymph nodes indicating high sensitivity to neoadjuvant immunotherapy. This group showed significant regression of primary lesions as well as a sensitive lymph node response, with 90.9% of patients having negative lymph nodes after NICT, regardless of the initial N staging. This indicates that the cut-off value of 21 lymph nodes can be applied to N0–2 lymph node staging. This clinical finding is supported by a recent study by Aoki et al., which demonstrated that thorough lymphadenectomy impairs the efficacy of immunotherapy administered for postoperative recurrences, underscoring the active role of lymph nodes in sustaining anti-tumor immune responses in the clinical setting (25). Previous studies have consistently found that an increased number of positive postoperative lymph nodes is strongly associated with poor survival outcomes (26-29). This suggests that postoperative lymph node positivity is more closely associated with prognosis in patients undergoing NICT than preoperative lymph node staging. For non-responder patients, no statistically significant results were found, likely due to the small sample size. Therefore, subsequent clinical studies with larger sample sizes are needed to determine the most appropriate DLN count for this group of patients.

It is important to acknowledge that our current DL model was trained exclusively on features extracted from the primary tumor. In stage II–III NSCLC, where nodal disease is prevalent, the TdLNs are recognized as critical hubs for initiating and sustaining anti-tumor immunity, particularly in the context of immunotherapy (14,15,23). While the primary tumor’s regression is a strong indicator of treatment response and a validated surrogate for survival (18), the response within lymph nodes may not always be synchronous. Our model’s ability to predict “Responder” status is thus fundamentally based on the primary lesion’s imaging phenotype. This approach may not fully capture patients whose primary tumor shows limited regression but who experience a significant immune-mediated response within the lymph node basin, potentially leading to their misclassification. Conversely, the model might also identify patients with primary tumor regression who still harbor residual treatment-resistant disease in lymph nodes. Future iterations of this framework would significantly benefit from integrating both primary tumor and nodal (or total tumor) burden features, either through direct segmentation of radiographically identifiable lymph nodes or by incorporating metabolic information from positron emission tomography (PET)-CT, to provide a more holistic assessment of neoadjuvant therapy efficacy.

This study has some limitations. First, this DL model was developed based on the automated segmentation and analysis of the primary tumor only. As discussed, this primary-tumor-centric approach may not fully capture the heterogeneous response to NICT that can occur between the primary lesion and metastatic lymph nodes. While our model demonstrated significant predictive and stratification value, incorporating nodal information in future models could enhance their accuracy and comprehensiveness. Second, owing to the limited follow-up time, we did not observe differences in OS across the TTS in responder patients. In addition, because of the small sample size, no statistically significant differences in appropriate DLN counts were found in non-responder patients. Therefore, prospective, multicenter, large-sample studies are necessary to confirm our results.

Conclusions

In patients undergoing NICT, our model-based stratification suggests that for predicted responders, appropriately delaying surgery may be associated with better outcomes, whereas for predicted non-responders, earlier surgery is associated with better outcomes. Furthermore, among predicted responders, an observed DLN count of ≤21 was associated with improved survival, highlighting the potential importance of balancing oncologic resection with immune preservation. These findings require prospective validation.
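The stratification summarized above can be written as a simple rule set. The sketch below is illustrative only and is not the authors' implementation or a clinical decision tool: the class `Patient`, the function `describe`, and the output keys are hypothetical names, and only the two cut-offs (TTS beyond 31 days and DLN count ≤21, both reported for responders) are taken from the text. Because this section gives no numerical TTS cut-off for non-responders and reports no significant DLN finding for them, the sketch returns `None` for that group rather than assuming values.

```python
from dataclasses import dataclass
from typing import Dict, Optional

TTS_CUTOFF_DAYS = 31  # from the text: responders did better with TTS beyond 31 days
DLN_CUTOFF = 21       # from the text: responders did better with a DLN count <= 21


@dataclass
class Patient:
    """Hypothetical record; field names are assumptions for illustration."""
    predicted_responder: bool  # output of the DL model
    tts_days: int              # time-to-surgery in days
    dln_count: int             # dissected lymph node count


def describe(p: Patient) -> Dict[str, Optional[bool]]:
    """Flag whether TTS and DLN count fall in the ranges associated with
    better outcomes for predicted responders; None where no cut-off applies."""
    if not p.predicted_responder:
        # Earlier surgery was favoured, but no numerical cut-off is stated
        # in this section, and no significant DLN finding was reported.
        return {"tts_in_favourable_range": None, "dln_in_favourable_range": None}
    return {
        "tts_in_favourable_range": p.tts_days > TTS_CUTOFF_DAYS,
        "dln_in_favourable_range": p.dln_count <= DLN_CUTOFF,
    }


example = describe(Patient(predicted_responder=True, tts_days=40, dln_count=18))
print(example)
```

For a predicted responder operated on at 40 days with 18 dissected nodes, both flags are true. Any real use would need the prospective validation called for above.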

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-1-1495/rc

Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-1-1495/dss

Peer Review File: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-1-1495/prf

Funding: This work was supported by the National Natural Science Foundation of China (No. 82373044), Natural Science Foundation of Shandong Province (No. ZR2023LSW023) and Noncommunicable Chronic Diseases-National Science and Technology Major Project (Nos. 2023ZD0501900, 2023ZD0501904 & 2023ZD0501905).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-1-1495/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Cancer Hospital Affiliated with Shandong First Medical University (Shandong Cancer Hospital and Institute) (No. SDTHEC202509019) and individual consent for this retrospective analysis was waived. Shandong Provincial Hospital Affiliated with Shandong First Medical University was also informed of and agreed to the study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Siegel RL, Miller KD, Fuchs HE, et al. Cancer statistics, 2022. CA Cancer J Clin 2022;72:7-33.
2. Chansky K, Detterbeck FC, Nicholson AG, et al. The IASLC Lung Cancer Staging Project: External Validation of the Revision of the TNM Stage Groupings in the Eighth Edition of the TNM Classification of Lung Cancer. J Thorac Oncol 2017;12:1109-21.
3. Liang W, Cai K, Chen C, et al. Expert consensus on neoadjuvant immunotherapy for non-small cell lung cancer. Transl Lung Cancer Res 2020;9:2696-715.
4. Forde PM, Spicer J, Lu S, et al. Neoadjuvant Nivolumab plus Chemotherapy in Resectable Lung Cancer. N Engl J Med 2022;386:1973-85.
5. Wakelee H, Liberman M, Kato T, et al. Perioperative Pembrolizumab for Early-Stage Non-Small-Cell Lung Cancer. N Engl J Med 2023;389:491-503.
6. Lu S, Zhang W, Wu L, et al. Perioperative Toripalimab Plus Chemotherapy for Patients With Resectable Non-Small Cell Lung Cancer: The Neotorch Randomized Clinical Trial. JAMA 2024;331:201-11.
7. Heymach JV, Harpole D, Mitsudomi T, et al. Perioperative Durvalumab for Resectable Non-Small-Cell Lung Cancer. N Engl J Med 2023;389:1672-84.
8. Provencio M, Nadal E, González-Larriba JL, et al. Perioperative Nivolumab and Chemotherapy in Stage III Non-Small-Cell Lung Cancer. N Engl J Med 2023;389:504-13.
9. Liu J, O'Donnell JS, Yan J, et al. Timing of neoadjuvant immunotherapy in relation to surgery is crucial for outcome. Oncoimmunology 2019;8:e1581530.
10. Peng Y, Li Z, Fu Y, et al. Progress and perspectives of perioperative immunotherapy in non-small cell lung cancer. Front Oncol 2023;13:1011810.
11. Li J, Zhou X, Liu Y, et al. Optimal Time-to-Surgery Recommendations Based on Primary Tumor Volume Regression for Patients with Resectable Esophageal Cancer after Neoadjuvant Chemoradiotherapy: A Retrospective Study. Ann Surg Oncol 2024;31:3803-12.
12. Xiao X, Cheng C, Cheng L, et al. Longer Time Interval from Neoadjuvant Chemoradiation to Surgery is Associated with Poor Survival for Patients Without Clinical Complete Response in Oesophageal Cancer. Ann Surg Oncol 2023;30:886-96.
13. Zhang Y, Deng C, Zheng Q, et al. Selective Mediastinal Lymph Node Dissection Strategy for Clinical T1N0 Invasive Lung Cancer: A Prospective, Multicenter, Clinical Trial. J Thorac Oncol 2023;18:931-9.
14. Rahim MK, Okholm TLH, Jones KB, et al. Dynamic CD8(+) T cell responses to cancer immunotherapy in human regional lymph nodes are disrupted in metastatic lymph nodes. Cell 2023;186:1127-1143.e18.
15. Munn DH, Mellor AL. The tumor-draining lymph node as an immune-privileged site. Immunol Rev 2006;213:146-58.
16. Dammeijer F, van Gulijk M, Mulder EE, et al. The PD-1/PD-L1-Checkpoint Restrains T cell Immunity in Tumor-Draining Lymph Nodes. Cancer Cell 2020;38:685-700.e8.
17. Deng H, Zhou J, Chen H, et al. Impact of lymphadenectomy extent on immunotherapy efficacy in postresectional recurred non-small cell lung cancer: a multi-institutional retrospective cohort study. Int J Surg 2024;110:238-52.
18. Deutsch JS, Cimino-Mathews A, Thompson E, et al. Association between pathologic response and survival after neoadjuvant therapy in lung cancer. Nat Med 2024;30:218-28.
19. He Y, Guo P, Tang Y, et al. VISTA3D: A unified segmentation foundation model for 3D medical imaging. Proceedings of the Computer Vision and Pattern Recognition Conference. 2025:20863-73.
20. He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 27-30 June 2016; Las Vegas, NV, USA. IEEE; 2016:770-8.
21. Berthon B, Evans M, Marshall C, et al. Head and neck target delineation using a novel PET automatic segmentation algorithm. Radiother Oncol 2017;122:242-7.
22. Liang W, He J, Shen Y, et al. Impact of Examined Lymph Node Count on Precise Staging and Long-Term Survival of Resected Non-Small-Cell Lung Cancer: A Population Study of the US SEER Database and a Chinese Multi-Institutional Registry. J Clin Oncol 2017;35:1162-70.
23. Hiam-Galvez KJ, Allen BM, Spitzer MH. Systemic immunity in cancer. Nat Rev Cancer 2021;21:345-59.
24. Fear VS, Forbes CA, Neeve SA, et al. Tumour draining lymph node-generated CD8 T cells play a role in controlling lung metastases after a primary tumour is removed but not when adjuvant immunotherapy is used. Cancer Immunol Immunother 2021;70:3249-58.
25. Aoki M, Kamimura GO, Tsuneyoshi Y, et al. Thorough Lymphadenectomy Impairs Immunotherapy Outcomes for Postoperative Intrathoracic Recurrence of Non-small Cell Lung Cancer. Anticancer Res 2025;45:3971-81.
26. Tong LL, Gao P, Wang ZN, et al. Can lymph node ratio take the place of pN categories in the UICC/AJCC TNM classification system for colorectal cancer? Ann Surg Oncol 2011;18:2453-60.
27. Xu ZY, Hao XY, Wu D, et al. Prognostic value of 11-factor modified frailty index in postoperative adverse outcomes of elderly gastric cancer patients in China. World J Gastrointest Surg 2023;15:1093-103.
28. Samson P, Puri V, Broderick S, et al. Extent of Lymphadenectomy Is Associated With Improved Overall Survival After Esophagectomy With or Without Induction Therapy. Ann Thorac Surg 2017;103:406-15.
29. Samson P, Puri V, Lockhart AC, et al. Adjuvant chemotherapy for patients with pathologic node-positive esophageal cancer after induction chemotherapy is associated with improved survival. J Thorac Cardiovasc Surg 2018;156:1725-35.

Cite this article as: Zhao J, Li Y, Chen J, Han C, Yan Y, Zhou C, Bai Y, Xu Y, Li Y. Development of a deep learning model based on computed tomography automatic segmentation to assist in selecting optimal time-to-surgery and dissected lymph node count for non-small cell lung cancer patients undergoing neoadjuvant immunotherapy and chemotherapy: a multicenter study. Transl Lung Cancer Res 2026;15(3):58. doi: 10.21037/tlcr-2025-1-1495
