Original Article
A quantitative method for assessing smoke associated molecular damage in lung cancers
Abstract
Background: While tobacco exposure is the cause of the vast majority of lung cancers, an important percentage arise in lifetime never smokers. Documenting the precise extent of tobacco induced molecular changes may be of importance. Also, the contribution of environmental tobacco smoke (ETS) is difficult to assess.
Methods: We developed and validated a quantitative method to assess the extent of tobacco related molecular damage by combining the most characteristic changes associated with tobacco smoke, the tumor mutation burden (TMB) and type of molecular changes present in lung cancers. Using maximum entropy (MaxEnt) as a classifier, we developed an F score. F score values >0 were considered to show evidence of tobacco related molecular damage, while values ≤0 were considered to lack evidence of tobacco related molecular damage. Compared to the stated patient tobacco exposure histories, the F scores had sensitivity, specificity and accuracy values of 85–87%. Using this method, we analyzed public data sets of lung adenocarcinoma (LUAD), lung squamous cell (LUSC) and small cell lung cancer (SCLC).
Results: Fewer than 10% of LUSCs and SCLCs had negative F scores, while 27% to 35% of LUADs had positive scores. The F score showed a highly significant downward trend when LUADs were subdivided into the following categories: ever, reformed ≤15 years, reformed >15 years and never smokers. Most of the examined bronchial carcinoids (a lung cancer type not associated with smoke exposure) had negative F scores. In addition, most LUADs with EGFR mutations had negative F scores, while almost all with KRAS mutations had positive scores.
Conclusions: We have established and validated a quantitative assay that will be of use in assessing the presence and degree of smoke associated molecular damage in lung cancers arising in ever and never smokers.
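For illustration only, the decision rule described in the Methods (F score >0 versus ≤0) and its comparison against stated smoking histories can be sketched as follows. This is not the MaxEnt model itself: the function names `classify` and `evaluate` are hypothetical, and the scores and histories are made-up values chosen only to show how sensitivity, specificity and accuracy would be computed.

```python
from typing import Dict, Sequence


def classify(f_score: float) -> bool:
    """Return True when the F score is >0 (evidence of tobacco related damage)."""
    return f_score > 0


def evaluate(f_scores: Sequence[float], ever_smoker: Sequence[bool]) -> Dict[str, float]:
    """Compare F score calls against stated smoking histories."""
    tp = fp = tn = fn = 0
    for score, smoker in zip(f_scores, ever_smoker):
        predicted = classify(score)
        if predicted and smoker:
            tp += 1          # positive score, ever smoker
        elif predicted and not smoker:
            fp += 1          # positive score, never smoker
        elif not predicted and smoker:
            fn += 1          # non-positive score, ever smoker
        else:
            tn += 1          # non-positive score, never smoker
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one ever smoker and one never smoker")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Made-up example values (assumption, not data from the study).
scores = [2.1, 0.4, -0.3, 1.7, -1.2, 0.0, -0.8, 0.9]
history = [True, True, True, True, False, False, False, False]
metrics = evaluate(scores, history)
print(metrics)
```

Note that a score of exactly 0 falls in the "lacks evidence" group, matching the ≤0 threshold stated above.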