Development and validation of a dynamic survival nomogram for metastatic non-small cell lung cancer based on the SEER database and an external validation cohort
Introduction
In recent years, the incidence and mortality of lung cancer have declined sharply and survival has improved significantly, especially for patients with non-small cell lung cancer (NSCLC). The 2-year relative survival of NSCLC is estimated to have improved from 34% (patients diagnosed from 2009 through 2010) to 42% (patients diagnosed from 2015 through 2016), including an absolute improvement of 5–6% for patients at every stage (1). However, lung cancer is at present the second most common cancer and the leading cause of cancer death, with an estimated 2.2 million newly diagnosed cases and 1.8 million deaths worldwide in 2020 (2). According to the American Joint Committee on Cancer (AJCC) 8th edition tumor-node-metastasis (TNM) staging system, metastatic lung cancer is defined as M1 disease (stage IV), including M1a (separate tumor nodule/s in a contralateral lobe; pleural nodules, or malignant pleural or pericardial effusion), M1b [single extrathoracic metastasis or involvement of a single distant (non-regional) node], and M1c (multiple extrathoracic metastases in 1 or multiple organs) (3). Patients with stage IV disease account for about 35% of all patients, and their 2- and 5-year survival rates have been reported at only 23% and 10% for stage IVA, and 10% and 0% for stage IVB, respectively (4). Although novel molecular-targeted therapies and immunotherapies have been developed, stage IV patients still have a very poor prognosis (5).
A predictive model helps clinicians estimate disease progression and predict patients’ survival from their baseline characteristics and clinical data. A nomogram based on a Cox proportional hazards regression model is a widely applied tool for predicting the survival of patients with malignant tumors (6). Many studies have reported nomograms for lung cancer. Liang et al. developed a nomogram based on a Chinese multi-institutional registry of 6,111 patients with resected NSCLC and validated it in a separate cohort of 2,148 patients from the International Association for the Study of Lung Cancer (IASLC) database. The nomogram included 6 independent prognostic factors and reached a C-index of 0.71, higher than that of the TNM staging system for predicting overall survival (OS) (7).
The TNM staging system is the most widely used tool for guiding clinical treatment and predicting prognosis (8). Wankhede evaluated the 8th edition AJCC TNM staging system for NSCLC by meta-analysis and reported C-indexes of 0.690 and 0.688 for the 8th and 7th editions, respectively (9). For convenience and ease of use, the TNM staging system includes only three key factors and lacks other information that is essential for survival analysis, such as age, gender, histology, and treatment. Therefore, numerous nomograms have been developed to predict the prognosis of lung cancer. One study published a nomogram for stage IB NSCLC that incorporated age, gender, histology, differentiation grade, the extent of surgery, and the number of lymph nodes resected. The authors found that the nomogram demonstrated good prognostic applicability and clinical accuracy, with C-index values of 0.637 (95% CI: 0.634–0.641) for the training cohort and 0.667 (95% CI: 0.656–0.678) for the external validation cohort (10). A nomogram can thus predict prognosis better than staging alone, at the cost of requiring many more input factors. However, there has been no previous report of a nomogram for patients with metastatic NSCLC. In this study, we developed a nomogram for patients with metastatic NSCLC based on the Surveillance, Epidemiology, and End Results (SEER) database. We validated the nomogram with an internal validation cohort from the SEER database and an external validation cohort from a single center. We present the following article in accordance with the TRIPOD reporting checklist (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-22-544/rc).
Methods
Patient selection
We selected patients from 18 population-based cancer registries (with additional treatment fields) of the SEER database (http://seer.cancer.gov/). The SEER*Stat program (v 8.3.9; seer.cancer.gov/seerstat) was used to extract the information of patients with lung cancer. The extraction conditions were as follows: “the location of the disease: Lung and Bronchus” and “diagnosis year: 2004–2016”.
The following variables were extracted: “Age recode with <1 year old”, “Race recode (White, Black, Other)”, “Sex”, “Marital status at diagnosis”, “Primary Site – labeled”, “Histologic Type ICD-O-3”, “Grade”, “Laterality”, “Derived AJCC Stage Group, 7th ed (2010–2015)”, “Derived AJCC T, 7th ed (2010–2015)”, “Derived AJCC N, 7th ed (2010–2015)”, “Derived AJCC M, 7th ed (2010–2015)”, “Derived AJCC Stage Group, 6th ed (2004–2015)”, “Derived AJCC T, 6th ed (2004–2015)”, “Derived AJCC N, 6th ed (2004–2015)”, “Derived AJCC M, 6th ed (2004–2015)”, “RX Summ--Surg Prim Site (1998+)”, “RX Summ--Scope Reg LN Sur (2003+)”, “RX Summ--Surg Oth Reg/Dis (2003+)”, “Chemotherapy recode”, “Radiation recode”, “SEER Combined Mets at DX-bone (2010+)”, “SEER Combined Mets at DX-brain (2010+)”, “SEER Combined Mets at DX-liver (2010+)”, “SEER Combined Mets at DX-lung (2010+)”, “Survival months”, “Vital status recode”, “First malignant primary indicator”, “Total number of in situ/malignant tumors for patient”. We screened the selected patients according to the following exclusion criteria: (I) patients diagnosed with small cell lung cancer (SCLC); (II) patients with M0, MX, or unknown M stage; (III) age <18 years; (IV) patients in whom lung cancer was not the first primary tumor; (V) patients with more than 1 malignant tumor; (VI) patients without information about survival months; (VII) patients with unknown race, marital status, tumor site, grade, T stage, N stage, or metastatic sites. The patients’ T and N stages were converted to the AJCC 8th edition TNM staging system, whereas the M stage was left unchanged. In the 7th edition, M1b denotes distant metastasis, which the 8th edition divides into M1b (single extrathoracic metastasis) and M1c (multiple extrathoracic metastases).
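The exclusion criteria above can be read as a simple record filter. The following Python sketch is illustrative only: the field names and category labels are hypothetical (they are not SEER*Stat column names), and criterion (IV) is interpreted as excluding patients whose lung cancer was not the first primary tumor.

```python
# Hypothetical field names; SEER*Stat exports use different column labels.
REQUIRED_KNOWN = ("race", "marital_status", "site", "grade",
                  "t_stage", "n_stage", "mets")


def is_eligible(rec: dict) -> bool:
    """Return True when a record passes exclusion criteria (I)-(VII)."""
    if rec["histology"] == "SCLC":                      # (I) small cell lung cancer
        return False
    if rec["m_stage"] not in ("M1a", "M1b", "M1NOS"):   # (II) M0, MX, or unknown
        return False
    if rec["age"] < 18:                                 # (III) minors
        return False
    if not rec["first_primary"]:                        # (IV) as interpreted above
        return False
    if rec["n_tumors"] > 1:                             # (V) more than 1 malignancy
        return False
    if rec["survival_months"] is None:                  # (VI) no survival time
        return False
    # (VII) any key covariate recorded as unknown
    return all(rec.get(k) not in (None, "Unknown") for k in REQUIRED_KNOWN)


example = {
    "histology": "Adenocarcinoma", "m_stage": "M1b", "age": 67,
    "first_primary": True, "n_tumors": 1, "survival_months": 5,
    "race": "White", "marital_status": "Married", "site": "Upper lobe",
    "grade": "Grade III", "t_stage": "T2", "n_stage": "N2", "mets": "Bone",
}
print(is_eligible(example))                        # True
print(is_eligible({**example, "m_stage": "M0"}))   # False
```

Writing each criterion as its own early return keeps the filter easy to audit against the numbered list.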
The selected patients from the SEER database were randomly assigned to the training and internal validation cohorts with a bootstrapping technique at a ratio of 7:3. For the external validation cohort, we selected patients with metastatic NSCLC diagnosed at Renji Hospital from 2015 to 2020. A total of 242 patients with complete baseline characteristics and follow-up data were finally enrolled. Clinical and pathological data were retrieved retrospectively from the hospital database, and follow-up information was collected by telephone interview. Patients lacking follow-up data or other essential clinical data were excluded.
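A 7:3 random assignment can be sketched as follows. This is a minimal illustration, not the authors' procedure: it uses a plain shuffle without replacement rather than bootstrapping, the seed and the function name `split_cohort` are hypothetical, and the cohort size of 18,343 is taken from the Results.

```python
import random


def split_cohort(ids, train_fraction=0.7, seed=2022):
    """Shuffle patient identifiers and cut them into two disjoint cohorts."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, internal_ids = split_cohort(range(18343))
print(len(train_ids), len(internal_ids))  # 12840 5503
```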
Ethical statement
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by The Ethics Committee of Renji Hospital, Shanghai Jiao Tong University School of Medicine (Shanghai, China) (No. RA-2020-572), and informed consent was taken from all the patients.
Nomogram development
We calculated the hazard ratios (HRs) and 95% confidence intervals (CIs) of the risk factors for OS in the training cohort using univariate Cox proportional hazards regression. Risk factors with a P value less than 0.05 were then included in the multivariate regression model. The independent risk factors (P<0.05 in the multivariate Cox proportional hazards regression analysis) were integrated into the nomogram model. The nomogram estimates the probability of surviving less than 3, 6, and 12 months.
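For readers less familiar with how an HR, its 95% CI, and a Wald P value follow from a fitted Cox coefficient, the sketch below shows the arithmetic and the P<0.05 screening step. The coefficients and standard errors are hypothetical illustrations, not estimates from this study, and the helper name `wald_summary` is our own.

```python
import math


def wald_summary(beta, se, z_crit=1.959964):
    """Convert a log-hazard coefficient and its SE to HR, 95% CI, and P value."""
    hr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
    return hr, lower, upper, p_value


# Hypothetical univariate estimates: name -> (log-HR, standard error)
univariate = {
    "sex_male": (0.252, 0.019),
    "lung_metastasis": (-0.017, 0.020),
}

# Keep only the variables that pass the univariate screen
selected = [name for name, (beta, se) in univariate.items()
            if wald_summary(beta, se)[3] < 0.05]
print(selected)  # ['sex_male']
```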
Nomogram validation
The training, internal validation, and external validation cohorts were used to assess the discriminative ability and calibration of the nomogram. Harrell’s C-statistic (C-index) was adopted as the primary indicator of discriminative power. The C-index ranges from 0.5 to 1, corresponding to discrimination ranging from none to perfect. A prediction model with a C-index higher than 0.7 is usually considered useful and predictive. Time-dependent receiver operating characteristic (ROC) curves with the area under the curve (AUC) at 3, 6, and 12 months were also used to demonstrate discriminative power. We used calibration plots to examine the agreement between observed outcomes and predicted probabilities. The ideal calibration curve is a straight line through the origin with a slope of 1; the closer the prediction line lies to this 45-degree diagonal, the more accurate the model. Finally, we applied decision curve analysis (DCA) to compare the performance of the nomogram and the TNM staging system.
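As an illustration of what the C-index measures, the sketch below counts concordant pairs directly. It is a simplified O(n²) version for small toy data under stated assumptions: pairs with tied survival times are skipped, ties in risk score count as one half, and the data are invented. It is not the routine used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of usable pairs in which the higher-risk patient dies first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i has an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival months, event indicator (1 = death), predicted risk
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c_index(times, events, risk))  # 0.8
```

Here 5 pairs are usable and 4 are ordered correctly, giving 0.8; a model that assigned risks at random would score about 0.5.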
Statistical analysis
We used R software (version 4.0.2; The R Foundation for Statistical Computing, Vienna, Austria) to construct the nomogram. All tests were two-sided, and a P value less than 0.05 was considered statistically significant. We presented categorical variables as proportions and compared them with the Chi-square test or Fisher’s exact test. Following a previous report (11), we also calculated the sum score of each patient based on the Cox proportional hazards regression model. We divided the patients into low-risk and high-risk groups using a cut-off point for risk stratification, which was calculated with the “surv_cutpoint” function of the “survminer” R package. Survival in the low-risk and high-risk groups was compared with Kaplan–Meier survival curves and the log-rank test.
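The risk stratification and Kaplan–Meier steps can be sketched in a few lines. This is an illustration on invented data with a caller-supplied cut-off; it does not reproduce how “surv_cutpoint” chooses the cut-off (which relies on maximally selected rank statistics), and it omits the log-rank test.

```python
def stratify(scores, cutoff):
    """Label each patient high- or low-risk relative to a given cut-off."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving          # deaths and censored both leave the risk set
    return curve


# Toy follow-up in months; event 1 = death, 0 = censored
curve = kaplan_meier([1, 2, 2, 3, 5], [1, 1, 0, 1, 0])
groups = stratify([0.8, 1.3, 1.05, 2.0], cutoff=1.05)
print([t for t, _ in curve])  # [1, 2, 3]
print(groups)                 # ['low', 'high', 'low', 'high']
```

In practice, one curve would be computed per risk group and the two curves compared.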
Results
Demographic and clinicopathological characteristics
All analyzed variables of the included patients from the SEER database are listed in Table 1. A total of 18,343 patients in the SEER database were randomly divided into the training cohort (n=12,840) and the internal validation cohort (n=5,503). There was no statistically significant difference between the training cohort and the internal validation cohort in any analyzed variable, including age, race, gender, marital status, primary site, histology, grade, laterality, T stage, N stage, M stage, surgery, chemotherapy, radiation therapy, bone metastasis, liver metastasis, brain metastasis, and lung metastasis.
Table 1
Variables | Training set (N=12,840) | Test set (N=5,503) | P value |
---|---|---|---|
Age | | | 0.065 |
20–54 years | 1,745 (13.6%) | 818 (14.9%) | |
55–64 years | 3,565 (27.8%) | 1,543 (28.0%) | |
65–74 years | 4,308 (33.6%) | 1,745 (31.7%) | |
75–84 years | 2,654 (20.7%) | 1,158 (21.0%) | |
85+ years | 568 (4.4%) | 239 (4.3%) | |
Race | | | 0.431 |
White | 9,899 (77.1%) | 4,267 (77.5%) | |
Black | 1,794 (14.0%) | 731 (13.3%) | |
Other | 1,147 (8.9%) | 505 (9.2%) | |
Gender | | | 0.427 |
Female | 5,713 (44.5%) | 2,484 (45.1%) | |
Male | 7,127 (55.5%) | 3,019 (54.9%) | |
Marital status | | | 0.340 |
Married | 6,954 (54.2%) | 3,023 (54.9%) | |
Others | 5,886 (45.8%) | 2,480 (45.1%) | |
Primary site | | | 0.799 |
Main bronchus | 627 (4.9%) | 256 (4.7%) | |
Upper lobe | 7,695 (59.9%) | 3,341 (60.7%) | |
Middle lobe | 579 (4.5%) | 246 (4.5%) | |
Lower lobe | 3,784 (29.5%) | 1,588 (28.9%) | |
Overlapping lesion of lung | 155 (1.2%) | 72 (1.3%) | |
Histology | | | 0.466 |
Adenocarcinoma | 6,758 (52.6%) | 2,950 (53.6%) | |
Squamous | 3,152 (24.5%) | 1,315 (23.9%) | |
Others | 2,930 (22.8%) | 1,238 (22.5%) | |
Grade | | | 0.541 |
Grade I | 701 (5.5%) | 289 (5.3%) | |
Grade II | 3,595 (28.0%) | 1,504 (27.3%) | |
Grade III | 8,127 (63.3%) | 3,543 (64.4%) | |
Grade IV | 417 (3.2%) | 167 (3.0%) | |
Laterality | | | 0.905 |
Bilateral | 46 (0.4%) | 21 (0.4%) | |
Left | 5,331 (41.5%) | 2,269 (41.2%) | |
Right | 7,463 (58.1%) | 3,213 (58.4%) | |
T stage | | | 0.989 |
T0 | 4 (0.0%) | 2 (0.0%) | |
T1 | 1,206 (9.4%) | 519 (9.4%) | |
T2 | 2,101 (16.4%) | 903 (16.4%) | |
T3 | 1,137 (8.9%) | 498 (9.0%) | |
T4 | 8,392 (65.4%) | 3,581 (65.1%) | |
N stage | | | 0.394 |
N0 | 3,093 (24.1%) | 1,289 (23.4%) | |
N1 | 1,158 (9.0%) | 483 (8.8%) | |
N2 | 5,958 (46.4%) | 2,546 (46.3%) | |
N3 | 2,631 (20.5%) | 1,185 (21.5%) | |
M stage | | | 0.268 |
M1a | 2,551 (19.9%) | 1,109 (20.2%) | |
M1b | 10,091 (78.6%) | 4,326 (78.6%) | |
M1NOS | 198 (1.5%) | 68 (1.2%) | |
Surgery | | | 0.898 |
No | 11,967 (93.2%) | 5,132 (93.3%) | |
Yes | 873 (6.8%) | 371 (6.7%) | |
Chemotherapy | | | 0.081 |
No/unknown | 4,873 (38.0%) | 2,013 (36.6%) | |
Yes | 7,967 (62.0%) | 3,490 (63.4%) | |
Radiation | | | 0.341 |
No | 11,495 (89.5%) | 4,953 (90.0%) | |
Yes | 1,345 (10.5%) | 550 (10.0%) | |
Bone metastasis | | | 0.227 |
No | 7,946 (61.9%) | 3,353 (60.9%) | |
Yes | 4,894 (38.1%) | 2,150 (39.1%) | |
Liver metastasis | | | 0.365 |
No | 10,790 (84.0%) | 4,654 (84.6%) | |
Yes | 2,050 (16.0%) | 849 (15.4%) | |
Lung metastasis | | | 0.932 |
No | 8,535 (66.5%) | 3,654 (66.4%) | |
Yes | 4,305 (33.5%) | 1,849 (33.6%) | |
Brain metastasis | | | 0.297 |
No | 8,858 (69.0%) | 3,753 (68.2%) | |
Yes | 3,982 (31.0%) | 1,750 (31.8%) | |
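The between-cohort comparisons in Table 1 rest on tests such as Pearson's chi-square. As a hedged illustration, the sketch below computes the uncorrected statistic for the gender rows of the table. It handles 2×2 tables only (1 degree of freedom) and applies no continuity correction, so its P value need not match the tabulated one exactly; the function name `chi_square_2x2` is our own.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: training set, test set; columns: female, male (counts from Table 1)
gender = [[5713, 7127], [2484, 3019]]
chi2, p_value = chi_square_2x2(gender)
print(round(chi2, 2), round(p_value, 2))
```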
In addition, we enrolled a total of 242 patients with metastatic NSCLC at Renji Hospital as the external validation cohort. We compared the demographic and clinicopathological characteristics of the SEER cohort and the external validation cohort (Table 2). All patients in the external validation cohort were Chinese, corresponding to ‘Other’ in the SEER database, and the external validation cohort was older than the SEER cohort (P<0.001). The external validation cohort had significantly higher proportions of T4 stage, M1a stage, surgery, and liver metastasis, and a significantly lower proportion of chemotherapy, than the SEER cohort (all P<0.05). However, there was no statistically significant difference between the 2 cohorts regarding gender, marital status, primary site, histology, grade, laterality, N stage, radiation therapy, bone metastasis, brain metastasis, and lung metastasis (all P>0.05). The significant differences between the 2 cohorts help to demonstrate the robustness of the external validation. Figure 1 shows the flowchart of the study design.
Table 2
Variables | External cohort (N=242) | SEER cohort (N=18,343) | P value |
---|---|---|---|
Age | | | <0.001 |
20–54 years | 26 (10.7%) | 2,563 (14.0%) | |
55–64 years | 60 (24.8%) | 5,108 (27.8%) | |
65–74 years | 66 (27.3%) | 6,053 (33.0%) | |
75–84 years | 53 (21.9%) | 3,812 (20.8%) | |
85+ years | 37 (15.3%) | 807 (4.4%) | |
Race | | | <0.001 |
White | 0 (0%) | 14,166 (77.2%) | |
Black | 0 (0%) | 2,525 (13.8%) | |
Other | 242 (100%) | 1,652 (9.0%) | |
Gender | | | 0.912 |
Female | 109 (45.0%) | 8,197 (44.7%) | |
Male | 133 (55.0%) | 10,146 (55.3%) | |
Marital status | | | 0.061 |
Married | 117 (48.3%) | 9,977 (54.4%) | |
Others | 125 (51.7%) | 8,366 (45.6%) | |
Primary site | | | 0.359 |
Main bronchus | 8 (3.3%) | 883 (4.8%) | |
Upper lobe | 138 (57.0%) | 11,036 (60.2%) | |
Middle lobe | 15 (6.2%) | 825 (4.5%) | |
Lower lobe | 76 (31.4%) | 5,372 (29.3%) | |
Overlapping lesion of lung | 5 (2.1%) | 227 (1.2%) | |
Histology | | | 0.160 |
Adenocarcinoma | 120 (49.6%) | 9,708 (52.9%) | |
Squamous | 72 (29.8%) | 4,467 (24.4%) | |
Others | 50 (20.7%) | 4,168 (22.7%) | |
Grade | | | 0.329 |
Grade I | 8 (3.3%) | 990 (5.4%) | |
Grade II | 66 (27.3%) | 5,099 (27.8%) | |
Grade III | 163 (67.4%) | 11,670 (63.6%) | |
Grade IV | 5 (2.1%) | 584 (3.2%) | |
Laterality | | | 0.504 |
Bilateral | 0 (0%) | 67 (0.4%) | |
Left | 106 (43.8%) | 7,600 (41.4%) | |
Right | 136 (56.2%) | 10,676 (58.2%) | |
T stage | | | 0.012 |
T0 + T1 | 19 (7.9%) | 1,731 (9.4%) | |
T2 | 22 (9.1%) | 3,004 (16.4%) | |
T3 | 25 (10.3%) | 1,635 (8.9%) | |
T4 | 176 (72.7%) | 11,973 (65.3%) | |
N stage | | | 0.177 |
N0 | 51 (21.1%) | 4,382 (23.9%) | |
N1 | 17 (7.0%) | 1,641 (8.9%) | |
N2 | 111 (45.9%) | 8,504 (46.4%) | |
N3 | 63 (26.0%) | 3,816 (20.8%) | |
M stage | | | 0.002 |
M1a | 68 (28.1%) | 3,660 (20.0%) | |
M1b | 168 (69.4%) | 14,417 (78.6%) | |
M1NOS | 6 (2.5%) | 266 (1.5%) | |
Surgery | | | 0.007 |
No | 215 (88.8%) | 17,099 (93.2%) | |
Yes | 27 (11.2%) | 1,244 (6.8%) | |
Chemotherapy | | | 0.024 |
No/unknown | 108 (44.6%) | 6,886 (37.5%) | |
Yes | 134 (55.4%) | 11,457 (62.5%) | |
Radiation | | | 0.093 |
No | 225 (93.0%) | 16,448 (89.7%) | |
Yes | 17 (7.0%) | 1,895 (10.3%) | |
Bone metastasis | | | 0.290 |
No | 141 (58.3%) | 11,299 (61.6%) | |
Yes | 101 (41.7%) | 7,044 (38.4%) | |
Liver metastasis | | | 0.040 |
No | 192 (79.3%) | 15,444 (84.2%) | |
Yes | 50 (20.7%) | 2,899 (15.8%) | |
Lung metastasis | | | 0.110 |
No | 149 (61.6%) | 12,189 (66.5%) | |
Yes | 93 (38.4%) | 6,154 (33.5%) | |
Brain metastasis | | | 0.459 |
No | 161 (66.5%) | 12,611 (68.8%) | |
Yes | 81 (33.5%) | 5,732 (31.2%) | |
SEER, Surveillance, Epidemiology, and End Results.
Univariate and multivariate analysis in the training cohort
We conducted univariate and multivariate analyses using the Cox proportional hazards regression model in the training cohort (n=12,840), in which a total of 11,446 events were recorded (Table 3). In terms of OS, the univariate analysis showed that the vast majority of the variables, including age, race, gender, marital status, primary site, histology, grade, T stage, N stage, M stage, surgery, chemotherapy, radiation therapy, bone metastasis, liver metastasis, and brain metastasis, were significantly associated with the OS of the patients (all P<0.05), with the exceptions of laterality and lung metastasis (P>0.05). When incorporated into the multivariate model, all included variables remained statistically significant after stepwise regression (all P<0.05).
Table 3
Variables | Univariate HR (95% CI) | Univariate P value | Multivariate HR (95% CI) | Multivariate P value |
---|---|---|---|---|
Age | | | | |
20–54 years | Reference | | Reference | |
55–64 years | 1.182 (1.111–1.258) | <0.001 | 1.150 (1.080–1.224) | <0.001 |
65–74 years | 1.356 (1.277–1.441) | <0.001 | 1.301 (1.224–1.383) | <0.001 |
75–84 years | 1.540 (1.443–1.643) | <0.001 | 1.401 (1.309–1.498) | <0.001 |
85+ years | 1.736 (1.573–1.917) | <0.001 | 1.351 (1.219–1.497) | <0.001 |
Race | | | | |
Black | Reference | | Reference | |
White | 0.954 (0.905–1.005) | 0.081 | 1.035 (0.981–1.092) | 0.205 |
Other | 0.684 (0.631–0.741) | <0.001 | 0.757 (0.698–0.822) | <0.001 |
Gender | | | | |
Female | Reference | | Reference | |
Male | 1.287 (1.240–1.335) | <0.001 | 1.237 (1.191–1.285) | <0.001 |
Marital status | | | | |
Married | Reference | | Reference | |
Others | 1.211 (1.167–1.256) | <0.001 | 1.151 (1.108–1.196) | <0.001 |
Primary site | | | | |
Main bronchus | Reference | | Reference | |
Upper lobe | 0.761 (0.700–0.828) | <0.001 | 0.843 (0.774–0.918) | <0.001 |
Middle lobe | 0.699 (0.620–0.788) | <0.001 | 0.794 (0.704–0.896) | <0.001 |
Lower lobe | 0.789 (0.723–0.861) | <0.001 | 0.877 (0.803–0.958) | 0.003 |
Overlapping lesion of lung | 0.714 (0.591–0.863) | <0.001 | 0.815 (0.674–0.984) | 0.034 |
Histology | | | | |
Adenocarcinoma | Reference | | Reference | |
Squamous | 1.341 (1.282–1.402) | <0.001 | 1.198 (1.143–1.255) | <0.001 |
Others | 1.229 (1.173–1.286) | <0.001 | 1.167 (1.113–1.224) | <0.001 |
Grade | | | | |
Grade I | Reference | | Reference | |
Grade II | 1.288 (1.178–1.409) | <0.001 | 1.185 (1.082–1.298) | <0.001 |
Grade III | 1.668 (1.531–1.817) | <0.001 | 1.448 (1.327–1.580) | <0.001 |
Grade IV | 1.961 (1.724–2.232) | <0.001 | 1.643 (1.440–1.875) | <0.001 |
Laterality | | | | |
Bilateral | Reference | | NA | |
Left | 0.914 (0.672–1.244) | 0.570 | NA | |
Right | 0.926 (0.681–1.260) | 0.627 | NA | |
T stage | | | | |
T0 + T1 | Reference | | Reference | |
T2 | 1.200 (1.112–1.296) | <0.001 | 1.218 (1.128–1.315) | <0.001 |
T3 | 1.408 (1.292–1.535) | <0.001 | 1.300 (1.191–1.419) | <0.001 |
T4 | 1.379 (1.291–1.472) | <0.001 | 1.427 (1.335–1.526) | <0.001 |
N stage | | | | |
N0 | Reference | | Reference | |
N1 | 1.171 (1.089–1.259) | <0.001 | 1.173 (1.090–1.261) | <0.001 |
N2 | 1.318 (1.258–1.381) | <0.001 | 1.299 (1.238–1.364) | <0.001 |
N3 | 1.328 (1.256–1.404) | <0.001 | 1.409 (1.329–1.493) | <0.001 |
M stage | | | | |
M1a | Reference | | Reference | |
M1b | 1.403 (1.338–1.471) | <0.001 | 1.228 (1.161–1.300) | <0.001 |
M1NOS | 1.296 (1.112–1.510) | <0.001 | 1.124 (0.961–1.313) | 0.142 |
Surgery | | | | |
No | Reference | | Reference | |
Yes | 0.444 (0.409–0.481) | <0.001 | 0.535 (0.490–0.584) | <0.001 |
Chemotherapy | | | | |
No/unknown | Reference | | Reference | |
Yes | 0.462 (0.445–0.480) | <0.001 | 0.426 (0.410–0.444) | <0.001 |
Radiation | | | | |
No | Reference | | Reference | |
Yes | 0.710 (0.667–0.755) | <0.001 | 0.892 (0.835–0.954) | <0.001 |
Bone metastasis | | | | |
No | Reference | | Reference | |
Yes | 1.252 (1.206–1.300) | <0.001 | 1.248 (1.197–1.301) | <0.001 |
Liver metastasis | | | | |
No | Reference | | Reference | |
Yes | 1.434 (1.365–1.506) | <0.001 | 1.299 (1.234–1.366) | <0.001 |
Lung metastasis | | | | |
No | Reference | | NA | |
Yes | 0.984 (0.946–1.022) | 0.401 | NA | |
Brain metastasis | | | | |
No | Reference | | Reference | |
Yes | 1.171 (1.126–1.218) | <0.001 | 1.332 (1.274–1.393) | <0.001 |
OS, overall survival; CI, confidence interval; HR, hazard ratio; NA, not available.
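Under a Cox proportional hazards model, the multivariate HRs in Table 3 combine multiplicatively. The sketch below illustrates this for a hypothetical patient profile (male, with bone metastasis, treated with chemotherapy, and otherwise at the reference levels). It yields a hazard relative to the all-reference patient, not a survival probability, because the baseline hazard is not reported; the dictionary keys and function name are our own.

```python
import math

# Multivariate HRs taken from Table 3 for three non-reference levels
multivariable_hr = {
    "male": 1.237,
    "bone_metastasis": 1.248,
    "chemotherapy": 0.426,
}


def relative_hazard(hazard_ratios):
    """Combine HRs by summing log-hazards, as in the Cox linear predictor."""
    return math.exp(sum(math.log(hr) for hr in hazard_ratios))


rh = relative_hazard(multivariable_hr.values())
print(round(rh, 3))  # 0.658
```

A nomogram performs the same addition graphically: each factor level is converted to points proportional to its log-hazard, and the point total is mapped to a predicted probability.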
Development of the nomogram
We built the nomogram from the multivariate model described above (Figure 2). A total of 16 risk factors were included: age, race, gender, marital status, primary site, histology, grade, T stage, N stage, M stage, surgery, chemotherapy, radiation therapy, bone metastasis, liver metastasis, and brain metastasis. Since most patients survived less than 1 year, we built the nomogram to predict survival probability at 3, 6, and 12 months. The C-index of the nomogram was 0.702 (95% CI: 0.684–0.720). As an example, consider a 40-year-old white male patient who was divorced and had been diagnosed with a grade III adenocarcinoma in the left upper lobe, staged as T1N1M1b (bone metastasis), stage IV, who had received no surgery, no chemotherapy, and no radiation. This patient would score 1,100 points according to the nomogram, corresponding to probabilities of surviving less than 3, 6, and 12 months of 0.386, 0.600, and 0.813, respectively. The nomogram was published online at https://pillawang.shinyapps.io/dynnomapp/.
Validation of the nomogram
We validated the nomogram with an internal validation cohort from the SEER database (n=5,503) and an external validation cohort (n=242). The nomogram also exhibited good prognostic value in the internal validation cohort (C-index =0.699, 95% CI: 0.673–0.725) and the external validation cohort (C-index =0.695, 95% CI: 0.653–0.737). We also drew calibration plots of the nomogram in the training cohort, internal validation cohort, and external validation cohort (Figure 3) using 1,000 bootstrap resamples. The calibration plots showed good concordance between the predicted and observed 3-, 6-, and 12-month OS probabilities in the internal and external validations. However, we noticed that the 12-month OS rate of the external validation cohort was higher than that of the training cohort and the internal validation cohort.
The ROC analysis showed that the nomogram had a high discriminative ability in all cohorts (Figure 4). The training cohort’s 3-, 6-, and 12-month AUCs were 0.781, 0.762, and 0.754, respectively. The internal validation cohort’s 3-, 6-, and 12-month AUCs were 0.777, 0.754, and 0.747, respectively. The external validation cohort’s 3-, 6-, and 12-month AUCs were 0.793, 0.753, and 0.759, respectively.
Survival and DCA analysis
The cut-off point of the Cox proportional hazards regression model was set at 1.05, dividing the patients into high- and low-risk groups. We compared survival between the high- and low-risk groups using Kaplan–Meier survival curves (Figure 5), which showed a significant difference between the two groups in the training, internal validation, and external validation cohorts (all P<0.001). We also performed DCA to compare the prediction performance of the nomogram and the TNM staging system (Figure 6). The results demonstrated that the nomogram was better than the TNM staging system in predicting 3-, 6-, and 12-month OS. The C-index of the TNM staging system was 0.563 (95% CI: 0.560–0.565).
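Decision curve analysis compares models by their net benefit across threshold probabilities. The sketch below shows the standard net benefit formula for a binary outcome at a fixed time horizon, on invented data. It is a simplification: it ignores censoring, which a survival DCA must handle, and it is not the code used to produce Figure 6.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit = TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(outcomes)
    flagged = [(p >= threshold, y) for p, y in zip(pred_probs, outcomes)]
    true_pos = sum(1 for is_flagged, y in flagged if is_flagged and y == 1)
    false_pos = sum(1 for is_flagged, y in flagged if is_flagged and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Toy data: predicted probability of death by the horizon, observed outcome
probs = [0.9, 0.8, 0.6, 0.4, 0.2]
died = [1, 1, 0, 1, 0]
nb = net_benefit(probs, died, threshold=0.5)
print(nb)  # 0.2
```

Plotting this quantity over a range of thresholds for each model gives the decision curves; the model with the higher curve offers more net benefit at that threshold.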
Discussion
According to the latest reports in the US, the incidence of NSCLC per 100,000 has dropped from 46.4 in 2010 to 40.9 in 2017 overall, and that of stage IV at diagnosis has decreased slightly from 21.7 to 19.6 (12). Nevertheless, the 5-year survival probability decreases sharply according to the stages, from 50–65% for stage I to 2–3% for stage IV (13). In this study, we attempted to build a nomogram for stage IV patients based on the SEER database and then to validate the nomogram with internal and external validation cohorts.
A total of 16 independent risk factors were identified in this study, more than in previous reports. The included risk factors fall into 4 categories. Firstly, the demographic characteristics of age, gender, marital status, and race were chosen for the nomogram. The earlier nomogram for stage IB NSCLC contained only age and gender, without marital status and race: the authors did not enter race into the univariate analysis, and because their sample size was far smaller than ours, marital status was not statistically significant (10). Secondly, tumor information, including primary site, histology, grade, T stage, N stage, and M stage, was entered into the nomogram. Zheng et al. investigated the incidence, survival, and prognostic factors of lung cancer with bone metastasis and developed a nomogram (14). Their model included age, gender, the total number of sites, histological type, grade, tumor size, and treatment, which was quite different from our study. The total number of sites was limited to 1 in our study, and tumor size was represented by the T stage in our model. Wang et al. compared the prognostic roles of different N descriptors, namely the number of positive lymph nodes (NPLN), the log odds of positive lymph nodes (LODDS), and the lymph node ratio (LNR), in lung adenocarcinoma. They found that LODDS + LNR demonstrated the highest prediction accuracy and developed a nomogram based on the findings (11). None of the nomograms above included the M stage, since those studies were limited to M0 disease or to bone metastasis. Thirdly, the treatment modalities of surgery, chemotherapy, and radiation were vital in the model. All treatments were important protective factors for OS, with significantly lowered hazard ratios (HRs) of 0.535 (95% CI: 0.490–0.584) for surgery, 0.426 (95% CI: 0.410–0.444) for chemotherapy, and 0.892 (95% CI: 0.835–0.954) for radiation.
Surprisingly, surgery was associated with significantly improved OS. Chao et al. compared the OS of patients with stage IV extrathoracic metastatic NSCLC who did or did not receive surgery. They demonstrated that surgery could improve the survival of patients with single-organ metastasis, whereas it showed no significant survival benefit in patients with multiple-organ metastases (15). Lastly, the metastatic sites were included in the multivariate model. We converted the 7th edition AJCC TNM staging to the 8th edition, although the M stage was left unchanged because the number of metastatic sites was unknown in the SEER database.
A nomogram with a C-index higher than 0.70 is usually considered accurate and useful. Liang’s nomogram had a C-index higher than that of the 7th AJCC TNM staging system in both the primary cohort (0.71 vs. 0.68, respectively; P<0.01) and the IASLC cohort (0.67 vs. 0.64, respectively; P=0.06). We also calculated the C-index of the TNM staging system in metastatic NSCLC patients, which was lower than that of the nomogram (0.563 vs. 0.702, P<0.001). The DCA also demonstrated that the nomogram performed better than the TNM staging system. The calibration plots and Kaplan–Meier survival curves constructed in the internal and external validation cohorts indicated that the nomogram was as accurate and discriminative in the external validation cohort as in the internal validation cohort. We noticed that the OS of the external cohort was better than that of the SEER cohort. A likely explanation is that the external validation cohort was diagnosed in 2015–2020, when novel therapies had improved the OS of stage IV patients.
To the best of our knowledge, this is the first nomogram for predicting the survival of patients with metastatic NSCLC that is based on an extensive database with long-term follow-up and validated in a single-center retrospective cohort. We have also provided an online version of the nomogram for prognosis prediction. However, several limitations of this study must be noted. Firstly, our nomogram is more complex than the TNM classification, as 16 items must be considered and analyzed. In addition, accurate pathological grading is difficult: because the subjects are metastatic NSCLC cases, the pathological specimens were likely to be biopsy specimens, and the entire tumors were not evaluated. Secondly, although we converted the T and N stages from the 7th to the 8th edition of the AJCC TNM staging system, the M stage could not be converted because the SEER database lacks information about the number of metastatic sites. In our nomogram, both the M1b and M1c stages of the 8th AJCC TNM edition must therefore be assigned to the M1b stage. Thirdly, molecular and genetic information, which is becoming an important determinant of prognosis, was absent from the nomogram and should be considered in future models. Lastly, only traditional treatments were included in the model, without novel therapies such as targeted therapy and immunotherapy.
Conclusions
We have developed a novel dynamic nomogram for predicting the survival of metastatic NSCLC patients. The internal and external cohort validations demonstrated that the nomogram had good accuracy and discriminative ability. The nomogram provides clinicians with a practical tool for predicting the prognosis of patients with stage IV NSCLC.
Acknowledgments
The authors would like to thank all patients and staff who have participated in the SEER program. The authors also appreciate the academic support from the AME Thoracic Surgery Collaborative Group.
Funding: This study was funded by the 2021 “Clinical+” Excellence Program (Grant No. 2021ZYA001), and Three-year Action Plan Project to Promote Clinical Skills and Clinical Innovation Capability of Municipal Hospitals (No. SHDC2020CR5001), Shanghai Shenkang Hospital Development Center.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-22-544/rc
Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-22-544/dss
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-22-544/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by The Ethics Committee of Renji Hospital, Shanghai Jiao Tong University School of Medicine (Shanghai, China) (No. RA-2020-572), and informed consent was taken from all the patients.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Siegel RL, Miller KD, Fuchs HE, et al. Cancer Statistics, 2021. CA Cancer J Clin 2021;71:7-33.
2. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
3. Lim W, Ridge CA, Nicholson AG, et al. The 8th lung cancer TNM classification and clinical staging system: review of the changes and clinical implications. Quant Imaging Med Surg 2018;8:709-18.
4. Goldstraw P, Chansky K, Crowley J, et al. The IASLC Lung Cancer Staging Project: Proposals for Revision of the TNM Stage Groupings in the Forthcoming (Eighth) Edition of the TNM Classification for Lung Cancer. J Thorac Oncol 2016;11:39-51.
5. Planchard D, Popat S, Kerr K, et al. Metastatic non-small cell lung cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2018;29:iv192-237.
6. Iasonos A, Schrag D, Raj GV, et al. How to build and interpret a nomogram for cancer prognosis. J Clin Oncol 2008;26:1364-70.
7. Liang W, Zhang L, Jiang G, et al. Development and validation of a nomogram for predicting survival in patients with resected non-small-cell lung cancer. J Clin Oncol 2015;33:861-9.
8. Nicholson AG, Chansky K, Crowley J, et al. The International Association for the Study of Lung Cancer Lung Cancer Staging Project: Proposals for the Revision of the Clinical and Pathologic Staging of Small Cell Lung Cancer in the Forthcoming Eighth Edition of the TNM Classification for Lung Cancer. J Thorac Oncol 2016;11:300-11.
9. Wankhede D. Evaluation of Eighth AJCC TNM Sage for Lung Cancer NSCLC: A Meta-analysis. Ann Surg Oncol 2021;28:142-7.
10. Zuo Z, Zhang G, Song P, et al. Survival Nomogram for Stage IB Non-Small-Cell Lung Cancer Patients, Based on the SEER Database and an External Validation Cohort. Ann Surg Oncol 2021;28:3941-50.
11. Wang S, Yu Y, Xu W, et al. Dynamic nomograms combining N classification with ratio-based nodal classifications to predict long-term survival for patients with lung adenocarcinoma after surgery: a SEER population-based study. BMC Cancer 2021;21:653.
12. Ganti AK, Klein AB, Cotarla I, et al. Update of Incidence, Prevalence, Survival, and Initial Treatment in Patients With Non-Small Cell Lung Cancer in the US. JAMA Oncol 2021;7:1824-32.
13. Mar J, Arrospide A, Iruretagoiena ML, et al. Changes in lung cancer survival by TNM stage in the Basque country from 2003 to 2014 according to period of diagnosis. Cancer Epidemiol 2020;65:101668.
14. Zheng XQ, Huang JF, Lin JL, et al. Incidence, prognostic factors, and a nomogram of lung cancer with bone metastasis at initial diagnosis: a population-based study. Transl Lung Cancer Res 2019;8:367-79.
15. Chao C, Qian Y, Li X, et al. Surgical Survival Benefits With Different Metastatic Patterns for Stage IV Extrathoracic Metastatic Non-Small Cell Lung Cancer: A SEER-Based Study. Technol Cancer Res Treat 2021;20:15330338211033064.