Real-time breath metabolomics as catalyst for personalized lung cancer diagnostics: prospective matched case-control trial (LUCAbreath)
Highlight box
Key findings
• This study demonstrates that secondary electrospray ionization-high resolution mass spectrometry (SESI-HRMS) could address longstanding barriers in lung cancer breath analysis. Real-time breath profiling revealed distinct metabolic signatures of lung cancer and of its histological subtypes. Notably, metabolites mainly from de novo fatty acid metabolism could enable prediction of lung adenocarcinoma (LUAD) with an area under the curve of 0.84, underscoring the diagnostic potential of breath-based metabolic phenotyping.
What is known and what is new?
• Lung cancer breath diagnostics continue to face major challenges, including low biomarker concentrations in exhaled breath, variability in collection methodologies, limited molecular coverage, and persistent difficulties in compound identification. Despite decades of research, these limitations have hindered clinical translation.
• This study introduces, for the first time, the application of cutting-edge SESI-HRMS in distinguishing the lung cancer breath metabolome from matched controls in a clinical setting. By enabling high-resolution, real-time analysis, SESI-HRMS overcomes critical technical constraints and uncovers metabolic pathways reflective of lung cancer heterogeneity.
What is the implication, and what should change now?
• The findings support SESI-HRMS as a transformative modality for non-invasive translational lung cancer research. Its integration into clinical workflows opens the path toward routine breath-based metabolomics and supports a broader shift toward multi-omics diagnostic strategies in lung cancer. Future diagnostic frameworks should consider combining SESI-HRMS with low-dose computed tomography and genomic profiling to enhance early detection, refine risk stratification, and personalize treatment pathways. The growing body of evidence calls for coordinated validation studies and standardisation efforts to accelerate regulatory acceptance and clinical implementation.
Introduction
The transition from generalised to personalised medicine in lung cancer, the world’s leading cause of cancer-related deaths (1), has been gradual and is still ongoing. This shift has been primarily driven by the genomic revolution, marked by several key milestones: Boveri’s gene theory of cancer in 1914, the elucidation of the DNA double helix by Watson and Crick in 1953 (2), and the rise of oncogenes and tumor suppressor genes beginning in the 1980s. The completion of the Human Genome Project in 2003 represents a pivotal milestone in the movement toward personalized medicine (3). Consequently, the focus is currently on genetic targets and biomarkers such as oncogenes and tumor suppressor genes, which play a central role in personalised lung cancer research, targeted therapy, and diagnostics. However, biomarker identification for lung cancer remains a critical area of research, as genomics-focused personalised approaches alone appear to be insufficient.
It is therefore essential to recognize that the metabolic characterization of cancer, which dates back as far as these genetic milestones, is equally significant. For example, Warburg described aerobic glycolysis in cancer cells in the 1920s (4), a phenomenon now known as the Warburg effect, which underpins modern positron emission tomography-computed tomography (PET-CT) imaging. Thus, the next critical milestone in this evolving personalised paradigm will involve understanding individual human traits at the molecular metabolic level, including the real-time characterization of individual metabolic phenotypes. Alongside the human genome, a deeper understanding of the human metabolome is vital for personalising lung cancer management and for a holistic understanding of individual biological characteristics along the omics cascade.
To address this need, metabolomics studies have emerged, although they have primarily focused on invasive measurements of blood and tissue specimens to identify and map metabolites in lung cancer (5,6). Exhaled breath analysis offers significant advantages by allowing non-invasive collection of biological information from lung cancer patients, but it has struggled since the 1980s to generate reproducible results suitable for clinical implementation (7-10). The main challenges include the low concentrations of breath metabolites, difficulties in breath collection, standardization issues, limited molecular coverage, and challenges in compound identification and understanding of compound origins.
Real-time exhaled breath analysis using secondary electrospray ionization high-resolution mass spectrometry (SESI-HRMS) is a promising tool for integrating exhaled breath analysis into clinical lung cancer workflows, providing non-invasive insights into the metabolome (Exhalomics). Its strengths are real-time measurement of the breath gas phase, high sensitivity and low detection limits, broad coverage of volatile and semi-volatile organic compounds (VOC and SVOC), and a capacity for high-throughput extraction of metabolic information (11).
Given these promising aspects, this prospective explorative study was designed to evaluate the performance of SESI-HRMS in distinguishing lung cancer patients from matched controls. The objective of this case-control study was to identify a disease-specific “lung cancer type breath pattern” using SESI-HRMS, to reveal the metabolic origin of such patterns, and to assess their predictive performance. We present this article in accordance with the STARD and STROBE reporting checklists (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-aw-1187/rc).
Methods
Study design and setting
For this prospective observational study, a 1:1 matched case-control design was employed. Treatment-naive patients with suspected lung cancer (LC) were recruited consecutively between 2020 and 2023 at the University Hospital Zurich, and their exhaled breath was compared with that of 1:1 matched controls. Real-time breath measurements were conducted cross-sectionally during routine diagnostic visits at the hospital.
Participants
Patients aged 18 to 85 years with suspected abnormal pulmonary lesions were included in the study. Exclusion criteria were: (I) the presence of another active secondary malignant disease (e.g., breast cancer or colon cancer); (II) an acute common cold within the past four weeks; (III) any other acute lung disease; (IV) acute or chronic hepatic disease; and (V) renal failure or renal replacement therapy (glomerular filtration rate <15 mL/min). Patients with a confirmed diagnosis of any LC type, established through cytological or histological tissue analysis, were designated as cases. Patients without confirmation of LC were assigned to the control group (Figure S1). Lung cancer was excluded in all control participants through completion of the routine diagnostic workup, including radiological assessment (e.g., chest CT and/or PET-CT where clinically indicated) and, when required, histological or cytological evaluation. Furthermore, potential controls were screened in other departments of the hospital and matched to cases on age (±5 years), sex, and smoking status (±10 pack years).
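The matching tolerances stated above (same sex, age within ±5 years, smoking exposure within ±10 pack years) can be illustrated with a short sketch. The greedy nearest-neighbour strategy, the `Participant` class, and the example identifiers are assumptions made for illustration only; the matching procedure actually used in the study is not described in this detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    pid: str          # hypothetical participant identifier
    age: float        # years
    sex: str          # "m" or "f"
    pack_years: float


def is_compatible(case, control, age_tol=5.0, pack_tol=10.0):
    """True if the control satisfies the stated matching tolerances."""
    return (
        case.sex == control.sex
        and abs(case.age - control.age) <= age_tol
        and abs(case.pack_years - control.pack_years) <= pack_tol
    )


def greedy_match(cases, controls, age_tol=5.0, pack_tol=10.0):
    """Assign each case the closest still-available compatible control.

    Returns {case_pid: control_pid}; each control is used at most once,
    and cases without a compatible control remain unmatched.
    """
    available = list(controls)
    pairs = {}
    for case in cases:
        candidates = [c for c in available
                      if is_compatible(case, c, age_tol, pack_tol)]
        if not candidates:
            continue
        # Closeness = summed absolute difference in age and pack years
        # (an arbitrary illustrative distance).
        best = min(candidates,
                   key=lambda c: abs(case.age - c.age)
                   + abs(case.pack_years - c.pack_years))
        pairs[case.pid] = best.pid
        available.remove(best)
    return pairs


cases = [Participant("LC01", 67, "m", 35), Participant("LC02", 60, "f", 20)]
controls = [
    Participant("C01", 70, "m", 40),
    Participant("C02", 58, "f", 25),
    Participant("C03", 66, "m", 36),
]
pairs = greedy_match(cases, controls)
print(pairs)
```

A greedy pass depends on the order in which cases are processed; an optimal assignment method would avoid this but is beyond the scope of this sketch.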
Variables, data sources and measurement analysis
Outcome
The primary outcome of interest was the detection of a LC-specific exhaled breath profile. VOCs and SVOCs were measured using SESI-HRMS and were assigned to tentative molecular formulas (12) based on functional metabolic pathway mapping and accurate mass assignment. Secondary outcomes included the characterisation of LC histology-specific metabolome characteristics and the performance of designed prediction models.
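As an illustration of accurate mass assignment, the sketch below computes the theoretical m/z of a protonated or deprotonated ion from a candidate molecular formula and the deviation of a measured m/z in parts per million. Restricting the adducts to [M+H]+ and [M-H]- is an assumption (the adducts actually used are listed in Table S1), and the formula C4H8O2 and the measured value are purely hypothetical examples, not study results.

```python
# Monoisotopic masses in Da for the most common elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}
PROTON = 1.007276466  # mass of a proton in Da


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule, e.g. {"C": 4, "H": 8, "O": 2}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def adduct_mz(formula, mode):
    """Theoretical m/z of [M+H]+ (positive mode) or [M-H]- (negative mode)."""
    mass = neutral_mass(formula)
    if mode == "positive":
        return mass + PROTON
    if mode == "negative":
        return mass - PROTON
    raise ValueError("mode must be 'positive' or 'negative'")


def ppm_error(measured, theoretical):
    """Signed mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Hypothetical example: candidate formula C4H8O2 observed in negative mode.
theoretical = adduct_mz({"C": 4, "H": 8, "O": 2}, "negative")
error_ppm = ppm_error(87.0453, theoretical)
print(round(theoretical, 5), round(error_ppm, 2))
```

A candidate formula would typically be retained only if the absolute deviation stays below a chosen tolerance; the tolerance applied in the study is not stated here.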
SESI-HRMS breath measurements
The analytical platform for real-time breath analysis consisted of a SESI source and electrospray emitter (SUPER-SESI & Sharp Singularity Emitter, Fossil Ion Technology, Spain) coupled to a HRMS (Q Exactive Plus, Thermo Fisher Scientific, Germany). The design, settings, and further details are reported elsewhere (13-16). Additionally, a capnograph (Exhalion, Fossil Ion Technology, Spain) was used to identify patients’ alveolar breath via CO2 concentration, exhaled breath volume and flow. Breath samples were collected in real time using a standardized protocol. Participants were asked to refrain from eating or drinking for at least one hour prior to measurement and were seated comfortably during the procedure. Participants performed repeated tidal breathing through a sterile filter (MicroGard II, Vyaire, Germany). The exhaled breath was directed through a heated stainless-steel sampling tube (50 cm length, 4 mm inner diameter) coated with SilcoNert to minimise analyte adsorption, which was connected to the auxiliary (AUX) gas inlet of the Orbitrap high resolution mass spectrometer. Alveolar breath was identified and continuously monitored using integrated capnography, allowing selective analysis of the end-tidal exhalation phase. Each measurement lasted approximately 5–10 minutes, during which multiple breath cycles were recorded. In total, 12 breath cycles per participant were acquired, comprising six cycles in positive ionisation mode and six in negative ionisation mode. No forced breathing maneuvers, breath holds, or sample storage were required, enabling direct integration of the measurement into routine clinical workflows. The convenient setup is presented in Figure 1.
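The selection of the end-tidal phase from a capnography trace can be sketched as follows. The simple rule used here (keep samples whose CO2 value reaches a given fraction of the trace maximum), the function name, and the example trace are assumptions for illustration; the algorithm implemented in the capnograph software is not described in this article.

```python
def end_tidal_indices(co2_trace, fraction=0.9):
    """Return indices of samples belonging to the end-tidal plateau.

    A sample is kept when its CO2 value is at least `fraction` of the
    maximum of the trace (an illustrative threshold rule).
    """
    if not co2_trace:
        return []
    threshold = fraction * max(co2_trace)
    return [i for i, value in enumerate(co2_trace) if value >= threshold]


# Hypothetical CO2 readings (in %) over a single exhalation.
trace = [0.1, 0.5, 2.0, 4.0, 4.8, 5.0, 4.9, 1.0, 0.2]
plateau = end_tidal_indices(trace)
print(plateau)
```

Mass spectra acquired at the returned time points would then be the ones attributed to alveolar breath.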
Addressing bias and confounding factors
This proof-of-concept matched case-control study focused on patients with pulmonary abnormalities to minimize selection bias, thereby ensuring that the source population is representative of the target population. 1:1 matching was performed to minimize the influence of confounding factors. Although controls had no confirmed diagnosis of lung cancer during the diagnostic workup, the possibility of false-negative findings cannot be entirely excluded. However, longitudinal follow-up through medical record review did not reveal any subsequent lung cancer diagnoses among controls, reducing the likelihood of clinically relevant misclassification. Breath measurements were standardised and performed by only three experts to reduce observer bias. The repeatability of the SESI-HRMS system using a gas standard is reported with a coefficient of variation (CV) of 2.9% (17). Participants were asked to refrain from eating and drinking for one hour before the measurement. Additionally, the study team inquired about other influencing or confounding factors, such as comorbidities, medications, and dietary habits. The index test results were not available to the study team during the assessment.
Data pre-processing
The dataset utilized in this analysis consisted of clinical metadata and raw mass spectrometry breath data obtained from 178 patients (89 cancer patients and 89 matched controls), yielding a total of 356 breath measurements. Given the untargeted and exploratory nature of SESI-HRMS-based metabolomics, a conventional a priori sample size calculation was not performed, as the relevant features, effect sizes, and variance structure are unknown before data acquisition and differ substantially across metabolites. Instead, the study protocol predefined a target recruitment of at least 100 lung cancer cases to ensure sufficient coverage of disease heterogeneity and enable robust multivariate modelling. To allow for 1:1 matching, potential dropouts, and predefined exclusions, an additional 10% of participants were consecutively enrolled (Figure S2). Each patient data set included a mass spectrometry file in both positive and negative mode, as well as the corresponding Exhalion measurement files. The mass spectra were retrieved directly from the RAW files using in-house software based on RawFileReader from Thermo Fisher Scientific (Bremen, Germany) (18). Subsequently, the measurement data were preprocessed using a patented pipeline from Deep Breath Intelligence AG (DBI AG, Rotkreuz, Switzerland), resulting in a data matrix comprising 10,529 breath features. MATLAB (version 2019b, MathWorks Inc., USA) was used for data pre-processing.
Data post-processing and statistical analysis
After removing sparse features from the pre-processed matrix of 10,529 features, 3,750 features were retained for further analysis. Sparse features were defined as breath features detected in fewer than 50% of samples. Differences between groups were analysed by paired t-test. A volcano plot was generated to visualize statistical significance (−log10 P values from paired t-tests) against the magnitude of effect [log2 fold change (log2FC)] for breath features. P values were adjusted for multiple comparisons and q-values were calculated according to Storey’s approach (19).
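The two filtering steps described above, removal of sparse features and conversion of P values to q-values, can be sketched as below. This is a minimal illustration under stated assumptions: a feature is treated as detected when its intensity is non-missing and greater than zero, it is kept when detected in at least 50% of samples, and the proportion of true null hypotheses is estimated at a single fixed tuning value `lam` rather than by the smoothing procedure available in Storey's method. The function names and toy data are hypothetical.

```python
def keep_non_sparse(matrix, min_presence=0.5):
    """Return column indices of features detected in enough samples.

    matrix: list of rows (samples), each a list of feature intensities;
    None or 0 marks a feature that was not detected in that sample.
    """
    n_samples = len(matrix)
    kept = []
    for j in range(len(matrix[0])):
        present = sum(1 for row in matrix if row[j] is not None and row[j] > 0)
        if present / n_samples >= min_presence:
            kept.append(j)
    return kept


def storey_qvalues(pvalues, lam=0.5):
    """q-values following Storey's approach with a fixed tuning parameter."""
    m = len(pvalues)
    # Estimated proportion of true null hypotheses, capped at 1.
    pi0 = min(1.0, sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam)))
    if pi0 <= 0.0:
        pi0 = 1.0  # conservative fallback (equivalent to Benjamini-Hochberg)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        qvalues[i] = running_min
    return qvalues


toy_matrix = [
    [1.0, None, 3.0],
    [2.0, None, 0.0],
    [0.5, 4.0, 0.0],
    [1.5, None, 2.0],
]
kept = keep_non_sparse(toy_matrix)

toy_p = [0.001, 0.01, 0.02, 0.2, 0.6, 0.8, 0.9, 0.7]
toy_q = storey_qvalues(toy_p)
print(kept, [round(q, 4) for q in toy_q])
```

Features whose q-value falls below the chosen threshold (0.05 in this study) would then be reported as significant after correction for multiple testing.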
To assess the biological relevance of the identified features, pathway enrichment analysis was performed using the Mummichog algorithm. This analysis mapped the selected features (m/z values) to known metabolic networks, identifying significantly altered metabolic pathways. Feature adducts listed in Table S1 were used.
Partial least squares discriminant analysis (PLS-DA) was used for dimensionality reduction. Cross-validation was employed to determine the optimal number of components, optimizing performance metrics. Features with a variable importance in projection (VIP) score greater than 1 were selected for further analysis.
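For orientation, a commonly used definition of the VIP score is given below. Whether the software used in this study implements exactly this variant is an assumption. Here $p$ denotes the number of features, $A$ the number of retained components, $w_{ja}$ the weight of feature $j$ on component $a$, $\mathbf{w}_a$ the full weight vector of component $a$, and $\mathrm{SSY}_a$ the sum of squares of the response explained by component $a$.

```latex
\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} \mathrm{SSY}_a \left( \dfrac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}{\sum_{a=1}^{A} \mathrm{SSY}_a}}
```

Under this definition the squared VIP scores sum to $p$ across all features, so their mean is 1, which is the usual rationale for selecting features with a VIP score greater than 1.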
To enhance the discriminative power, an extreme gradient boosting (XGBoost) model was applied using a stratified 100-fold cross-validation approach. The performance of the PLS-DA model was evaluated using accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). For internal validation, 80% of the data were used for training and 20% for testing for each prediction model.
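The performance metrics named above can be computed from true labels and predicted scores as in the following sketch. The AUC is calculated as the probability that a randomly chosen case receives a higher score than a randomly chosen control, which is equivalent to the area under the receiver operating characteristic curve. The decision threshold of 0.5, the function names, and the toy data are assumptions for illustration and do not reproduce study results.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of cases (1) and controls (0); ties count 0.5."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for case_score in case_scores:
        for control_score in control_scores:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def classification_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data: 1 = lung cancer, 0 = matched control.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(toy_labels, toy_scores)
metrics = classification_metrics(toy_labels, toy_scores)
print(auc, metrics)
```

In a cross-validation setting these functions would be applied to the held-out predictions of each fold and the results summarised across folds.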
All statistical analyses were conducted in R (v4.4.1, R Foundation for Statistical Computing, Austria).
Statement of ethics and AI
The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study protocol was approved by the regional ethics committee of Zurich (KEK-ZH 2016-00384). Written informed consent was obtained from all participants before their inclusion in the study. The clinical trial was registered at ClinicalTrials.gov (NCT02781857). Language editing was done via ChatGPT (GPT-4o, Prompt: “Improve the text in grammar, reader flow and clarity within a British scientific style.”). After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.
Results
Patient characteristics
Figure S2 illustrates the study flow, from the screening of potential participants to the final assessment. A total of 220 patients were enrolled between 2020 and 2023, with 178 providing analysable final data. Table 1 presents the characteristics of the participants, while Table 2 details LC-related measures. The cohort comprised nearly equal numbers of men and women, with a median age of 67.0 [60.0, 73.0] years and a median of 35.0 [20.0, 50.0] pack-years.
Table 1
| Variables | Controls (N=89) | Lung cancer (N=89) |
|---|---|---|
| Age, years | 67.0 [60.0, 73.0] | 67.0 [60.0, 72.0] |
| Sex, male | 47 (52.8) | 47 (52.8) |
| BMI, kg/m2 | 26.1 [22.8, 30.6] | 25.1 [21.8, 28.4] |
| Smoking history | | |
| Pack years | 35.0 [20.0, 50.0] | 35.0 [20.0, 50.0] |
| Non-smoker | 11 (12.4) | 11 (12.4) |
| Ex-smoker | 55 (61.8) | 44 (49.4) |
| Smoker | 23 (25.8) | 34 (38.2) |
| FEV1* % pred. | 61.0 [47.0, 93.0] | 83.0 [62.0, 94.0] |
| Medication, count | 5.0 [3.0, 9.0] | 4.0 [2.0, 7.0] |
| Co-morbidities, count | 5.0 [3.0, 7.0] | 5.0 [2.0, 7.0] |
| COPD | 42 (47.2) | 28 (31.5) |
| OSA | 26 (29.2) | 11 (12.4) |
Participants were 1:1 matched according to age, sex and pack years. Results remained stable after all features were corrected for potential confounding covariates (smoking status, FEV1 % pred., COPD and OSA) using a linear effects model. Values are presented as median [IQR] or n (%). *, data from N=114 (Control: N=67, Lung cancer: N=77). BMI, body mass index; COPD, chronic obstructive pulmonary disease; FEV1, forced expiratory volume in the first second; IQR, interquartile range; OSA, obstructive sleep apnoea.
Table 2
| Variables | N (%) |
|---|---|
| Lung cancer type | |
| Adenocarcinoma | 57 (64.0) |
| Carcinoid | 3 (3.4) |
| NUT carcinoma | 1 (1.1) |
| Squamous cell carcinoma | 17 (19.1) |
| Small-cell carcinoma | 11 (12.4) |
| Localization* | |
| Right middle lobe | 4 (4.5) |
| Left upper lobe | 26 (29.2) |
| Right upper lobe | 37 (41.6) |
| Left lower lobe | 10 (11.2) |
| Right lower lobe | 12 (13.5) |
| UICC classification (8th Edition) | |
| Adeno in situ | 2 (2.2) |
| I | 33 (37.1) |
| II | 11 (12.4) |
| III | 19 (21.4) |
| IV | 24 (26.9) |
*, second tumor location N=13. UICC, Union Internationale Contre le Cancer.
Consistent with epidemiological trends, adenocarcinoma was the most prevalent histological type in our study population. Other histological types ranged from 1.1% to 19.1%. All Union Internationale Contre le Cancer (UICC 8th Edition)-classified stages were represented, and 70.8% of the participants had tumors located in one of the upper lobes.
Differences between groups
Group comparison between the entire LC cohort and the control group identified 608 breath features with P<0.05 and 18 breath features with an adjusted q-value <0.05 (Table S2). Sub-group analyses of LC subjects with lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), and small cell lung cancer (SCLC) versus their matched controls revealed 549, 187 and 97 altered breath features with P<0.05, respectively. The breath features (m/z), P values, log2FC and q-values of these subgroup comparisons are presented in Table S3. Tentative molecular formulas assigned on the basis of high mass accuracy are listed as well. Additionally, the group comparisons are visualized as volcano plots in Figure 2.
Metabolic functional enrichment analysis
LC is a highly heterogeneous disease with differing histological, clinical and neuroendocrine characteristics. The functional enrichment analysis was therefore performed separately for each histological LC subgroup and revealed characteristic metabolic enrichments for the different subtypes. In the analysis of 57 LUAD patients versus matched controls, four metabolic pathways were identified as predominantly altered (P-gamma <0.05). De novo fatty acid biosynthesis showed the lowest P-gamma (0.040). The other enriched pathways were C21-steroid hormone biosynthesis and metabolism, xenobiotic metabolism and linoleate metabolism. In contrast, the analysis of 17 LUSC cases versus matched controls highlighted eight predominantly altered pathways. Butanoate metabolism and glycine/serine/alanine/threonine metabolism showed the lowest P-gamma (0.033). The additional enriched pathways were alanine/aspartate metabolism, glyoxylate/dicarboxylate metabolism, valine/leucine/isoleucine degradation, glutamate metabolism, pyrimidine metabolism and squalene/cholesterol biosynthesis. Lastly, the analysis of exhaled breath from 11 SCLC patients versus their matched controls identified seven predominantly altered pathways. Arginine and proline metabolism showed the lowest P-gamma (0.033). The additional enriched pathways were histidine metabolism, ascorbate/aldarate metabolism, urea cycle/amino group metabolism, aspartate/asparagine metabolism, glutamate metabolism and glutathione metabolism. Tables S4-S6 provide an overview of the functional pathway enrichment analysis. Selected compounds from the most prominent differentiating pathways are displayed in Figure 3. A summary of the metabolic differences between LC subtypes is presented in Figure 4 and its legend.
Real-time breath analysis assisted lung cancer prediction
Figure 5A-5H presents the PLS-DA results of the prediction models used to differentiate LC patients from matched controls via real-time SESI-HRMS. The prediction model for distinguishing LC from controls (Figure 5B) yielded a mean accuracy of 0.75 [95% confidence interval (CI): 0.69–0.81], an AUC of 0.82 (95% CI: 0.75–0.88), a sensitivity of 0.80 (95% CI: 0.72–0.88), and a specificity of 0.71 (95% CI: 0.61–0.80). Focusing on a histologically homogeneous group, such as LUAD (Figure 5D), improved the mean accuracy of the model to 0.78 (95% CI: 0.71–0.85), the AUC to 0.84 (95% CI: 0.78–0.90), the sensitivity to 0.82 (95% CI: 0.72–0.91), and the specificity to 0.73 (95% CI: 0.62–0.84). The prediction model for LUSC (Figure 5F) yielded an accuracy of 0.79 (95% CI: 0.65–0.91), an AUC of 0.81 (95% CI: 0.63–0.95), a sensitivity of 0.76 (95% CI: 0.53–0.94), and a specificity of 0.82 (95% CI: 0.61–1.00), and the SCLC model (Figure 5H) yielded an accuracy of 0.68 (95% CI: 0.45–0.86), an AUC of 0.75 (95% CI: 0.51–0.93), a sensitivity of 0.64 (95% CI: 0.36–0.91), and a specificity of 0.72 (95% CI: 0.44–1.00). Additional performance parameters, such as the PLS-DA classification error rate and the different PLS-DA components, are presented in Figures S3-S7 in the supplementary material.
Discussion
This prospective observational study achieved its primary goal by identifying a unique breath pattern in lung cancer patients compared to matched controls, using cutting-edge real-time SESI-HRMS technology. Among the findings, 18 significant breath features (q-value <0.05) emerged after rigorous multiple testing correction, underscoring the potential of this approach. While these features show promising distinctions, reliably discriminating between the two groups remained challenging. To address this, we explored distinct histological subgroups, which demonstrated metabolic discrimination patterns across all analyzed LC types. Prediction models showed a moderate-to-high performance in untargeted SESI-HRMS breath metabolome analysis. Overall, real-time SESI-HRMS analyses of exhaled breath enabled rapid, non-invasive assignment of potential compound identities and their biological origins, advancing the quest for clinically relevant breath biomarkers for lung cancer.
This study presents extensive m/z feature lists that underscore the biological information content of real-time exhaled breath analysis. Through functional pathway enrichment analysis, these m/z features can be interpreted in a metabolic context. The heterogeneity of the histological subtypes of lung carcinoma was notably reflected. Remarkably, the subgroup analysis of exhaled breath data from LUAD patients and matched controls showed that de novo fatty acid (FA) biosynthesis was the most prominently affected pathway in this group comparison. Increased de novo FA biosynthesis has been recognised as a hallmark of cancer, although its significance in cancer pathogenesis has long been underestimated (20,21). Currently, several drug candidates that target FA metabolic enzymes are under evaluation for cancer (22). The renewed interest in altered FA metabolism in cancer is substantial, as FAs not only serve as structural components of the membrane matrix but also function as important secondary messengers and fuel sources for energy production (21). Previous studies have reported alterations in de novo FA (23) and glycerophospholipid metabolism (24) in LUAD breath samples, highlighting the specific metabolic profile of this histological subtype. Additionally, glycerophospholipid changes identified in blood (24) and risk models based on FA metabolism have been proposed to personalize treatment and improve prognosis in LUAD (25). In contrast, LUSC is characterized by a more active energy metabolism, especially in glucose-associated pathways (26). SESI-HRMS exhaled breath analysis revealed differences between controls and LUSC patients in pathways associated with glucose turnover, such as glycine/serine/alanine/threonine metabolism and glyoxylate/dicarboxylate metabolism. The higher expression of glucose transporter type 1 (GLUT1) is a hallmark of LUSC (24), explaining its vulnerability to glycolytic inhibition, whereas LUAD is mainly glucose-independent (27).
The finding of differences in butanoate metabolism in LUSC is difficult to interpret. On the one hand, this pathway is associated with microbial intestinal fermentation; on the other hand, many of its molecules are used in the production of ketone bodies, the creation of short-chain lipids, or as precursors to the citrate cycle, glycolysis or glutamate synthesis (28). In addition to the differences between these two forms of non-small cell lung cancer (NSCLC), the presented exhaled breath data provide insights into the proliferative metabolism of the clinically aggressive SCLC. The literature reports alterations in amino acid and polyamine synthesis in SCLC (29,30), which our data confirmed, further highlighting differences in breath profiles linked to amino acid metabolism, such as arginine and proline metabolism.
The goal is to utilize these metabolic differences for clinical decision-making. Low-dose computed tomography (LDCT), the current gold standard for screening, shows sensitivity between 59% and 100% and specificity between 26.4% and 99.7%, with most studies reporting sensitivity above 80% and specificity above 75% (31). The sample size of this study was sufficient to demonstrate a clinically relevant benefit of our exhaled breath analysis models, as supported by the narrow confidence intervals and consistent performance across lung cancer subtypes. The overall LC model achieved an AUC of 0.82 (95% CI: 0.75–0.88), sensitivity of 0.80 (95% CI: 0.72–0.88), and specificity of 0.71 (95% CI: 0.61–0.80). Even at the lower confidence bounds, the model achieves at least 72% sensitivity and 61% specificity, which lies within the range reported for LDCT. Stratification into homogeneous subgroups such as LUAD and LUSC further improved classification performance, with lower CI bounds remaining comparable to the LDCT benchmark for sensitivity and, in some cases, specificity. These findings indicate that our dataset provided adequate statistical power to detect meaningful differences, and that the performance of our models is at least comparable to the lower end of the range reported for LDCT.
eNose breath analysis technologies have reported higher accuracy, sensitivity, and specificity, namely 0.87, 0.86, and 0.89, respectively; however, these results are based on a comparison of lung cancer patients with a homogeneous control group of patients with COPD (32), which does not fully reflect real clinical settings. Other innovative approaches, such as liquid biopsy, also demonstrate good screening performance (33), with specificity ranging from 80% to 99.8% and sensitivity from 21% to 91%, depending on the molecular target (34). Liquid biopsy primarily focuses on genomic or proteomic markers in the bloodstream. Thus, combining breath analysis, blood biomarkers, and CT imaging could improve lung cancer screening and diagnostic performance and contribute to a more integrated multi-omics approach. However, considering the economic aspects, patient convenience, and the unique metabolic insights offered, LC breath analysis via SESI-HRMS could on its own add value to LDCT in future personalized clinical workflows. For instance, breath metabolome analysis, when combined with risk assessment tools like the PLCOm2012 or USPSTF2013 questionnaires, could be employed to pre-select individuals for low-dose CT screening (7). This may be particularly valuable in populations currently not covered by standard screening recommendations, such as individuals younger than 55 years or never-smokers. In this context, breath analysis could serve as a complementary tool to LDCT by identifying individuals most likely to benefit from subsequent LDCT, thereby improving cost-effectiveness, increasing the positive predictive value of LDCT, and reducing unnecessary radiation exposure without replacing CT-based screening.
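How a pre-selection test shifts predictive values can be made concrete with Bayes' rule. The sketch below takes the sensitivity (0.80) and specificity (0.71) reported for the overall LC model and combines them with a prevalence of 2%, which is a purely hypothetical value chosen for illustration and is not derived from this study. Because the estimates stem from a matched case-control design, applying them to a screening population is itself an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity of the overall LC model; prevalence is hypothetical.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.71, prevalence=0.02)
print(round(ppv, 4), round(npv, 4))
```

Under these assumptions, individuals with a positive breath test would have a higher probability of disease than the unselected population, which is the mechanism by which pre-selection could raise the positive predictive value of a subsequent LDCT.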
Since the 1980s (35), there have been efforts to investigate LC breath biomarkers and translate these findings into clinical practice. Various technologies have been explored, which can be categorised into four main groups: (I) pattern recognition using sensor arrays; (II) mass spectrometry approaches; (III) exhaled breath condensate (EBC) assays; and (IV) breath analysis using animal olfaction. However, none of these technologies has yet been successfully established in clinical settings, nor has any of them yielded a validated lung cancer biomarker for clinical use (36). There are several reasons for this. First, lung cancer is inherently a highly heterogeneous disease, making biomarker development particularly challenging. Second, there are significant hurdles in breath analysis itself. The concentrations of VOCs and SVOCs in exhaled breath are very low and exhibit highly dynamic profiles. This variability is influenced by the ventilation-perfusion ratio, especially for endogenous compounds, emphasizing the need for standardised breath measurement, as methodologies remain inconsistent across studies. These methodological challenges make it difficult to reproduce and confirm results in clinical validation studies. Third, the processing of breath samples plays a significant role. Collecting and processing samples in portable collectors, such as adsorption tubes or breath bags, is prone to confounding errors. Last, and perhaps most crucially, the full range of molecules present in exhaled breath has not yet been identified, and the biological origin of most proposed LC breath biomarkers remains unclear. Rather than a single VOC serving as a definitive biomarker, it is more likely that a pattern of VOCs at various intensities will form a biomarker “fingerprint”. Accurate assessment and interpretation of such multivariate data will require advanced statistical and machine learning approaches.
In the past five years, significant progress has been made in sensor pattern recognition and mass spectrometry, with technologies being validated (32,37) and promising approaches under investigation, such as exogenous VOCs (EVOCs) analysis (38). In this study, several challenges associated with breath analysis were likewise addressed using SESI-HRMS. SESI-HRMS offers exceptionally high sensitivity and is capable of detecting compounds at concentrations in the parts-per-quadrillion range (39). Additionally, the technology is coupled with a capnograph, enabling the standardised collection of alveolar exhaled air and thus consistent and reliable sampling. Another key advantage of SESI-HRMS is its ability to analyse breath samples in real time, directly during routine clinical workflows, thereby avoiding confounding factors associated with sample transport, processing, or storage. Furthermore, SESI-HRMS provides broader coverage of detectable compounds than other exhaled breath analysis methods, including both VOCs and SVOCs, and offers a wide range of mass-to-charge (m/z) features. The identification of breath compounds is made possible through soft ambient ionization processes and precise compound/mass assignment. This allows for rapid tentative identification of molecules present in the breath, with the level of confidence graded according to Schymanski et al. Consequently, SESI-HRMS stands out as a highly powerful tool for extracting high-throughput metabolic data (11), offering valuable insights into metabolic processes.
This prospective, matched case-control study has inherent limitations relevant for clinical translation. First, the cross-sectional design and moderate sample size restrict the assessment of intra-individual variability in breath metabolomics; longitudinal studies with repeated measures are needed to address this and to support clinical robustness. Second, given the exploratory nature of the work, independent external validation in larger and more targeted hypothesis-driven cohorts is required before clinical applicability can be inferred. While SESI-HRMS enables sensitive, non-invasive metabolic profiling, further standardization of breath collection, such as the use of internal or external standards, may reduce variability and improve reproducibility. In addition, mechanistic insight into the transition of metabolites from cellular metabolism to exhaled breath remains limited. Further in vitro studies, such as those previously demonstrated for SESI-HRMS-based cellular research (40), are needed to better understand the cellular origin and release mechanisms of breath metabolites. Future experiments combining SESI-HRMS with complementary in vitro metabolic platforms, such as Seahorse extracellular flux analysis, could provide valuable insights into real-time cellular metabolism and its relationship to breath-derived metabolic signatures. Finally, metabolite annotations were assigned at evidence levels 4–5 according to Schymanski et al., providing tentative identification; confirmation using orthogonal techniques such as MS/MS or ultra-high-performance liquid chromatography would strengthen metabolite identification confidence and support translational relevance.
Conclusions
In conclusion, this prospective exploratory study indicates that SESI-HRMS exhaled breath analysis is a promising approach for further clinical research. Real-time SESI-HRMS proved to be an elegant, non-invasive analytical method that could be readily implemented in current clinical workflows. The study showed potential discrimination between the breath patterns of LC patients and those of matched controls, demonstrating that high-throughput metabolic information can be extracted from breath and revealing metabolic insights into LC and its different histologies. Alongside genomic profiling, these findings could complement current clinical workflows, advancing lung cancer screening and diagnostics towards a multi-omics approach.
Acknowledgments
We thank the patients and volunteers who participated. Furthermore, we thank Ms. Bajrami, Ms. Basler, Mr. Rhouma, Mr. Zhang, Mr. Clarenbach, Mr. Zenobi, Mr. Thoma, Mr. Grimaldi, Ms. Steinack, Ms. Brennecke-Wagner, Mr. Bradichic, Mr. Lüchinger, Mr. Volk, Mrs. Gisler, Mrs. Pedrocci, Mrs. Willimeck Moser, Mr. Osswald, Ms. Streckenbach, Ms. Nowak, Mr. Gao, Mrs. Curioni-Fontecedro, Ms. Schmitt-Opitz, and Mr. Schneiter for their advice and support of the project. This work is part of the Zurich Exhalomics project under the umbrella of University Medicine Zurich/Hochschulmedizin Zürich.
Language editing was done via ChatGPT (GPT-4o, Prompt: “Improve the text in grammar, reader flow and clarity within a British scientific style.”). After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.
Footnote
Reporting Checklist: The authors have completed the STARD and STROBE reporting checklists. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-aw-1187/rc
Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-aw-1187/dss
Peer Review File: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-aw-1187/prf
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-aw-1187/coif). F.S. is employed part-time at the spin-out DBI AG. K.D.S. is a part-time employee of DBI AG, which provides services in the field of breath analysis. S.U. received grants or contracts from Swiss National Science Foundation, MSD Switzerland, Swiss Lung League, Gebro Swiss, Orpha Swiss and Janssen SA; received consulting fees from MSD Switzerland, Orpha Swiss, Gebro Swiss and Janssen SA; received payment for expert testimony from Orpha Swiss, MSD Switzerland, Janssen SA, Gebro Swiss and Novartis SA; and had patents planned from Orpha Swiss, MSD Switzerland, Janssen SA, and Gebro Swiss. P.S. reports stock shares in DBI AG; and served as a co-founder and board member of DBI AG, a company that provides services in the field of breath analysis. M.K. reports a grant from the Lotte und Adolf Hotz-Sprenger Foundation and a grant from GSK for a project on COPD and breath analysis; reports shares in DBI AG; and served as a board member for Novartis, GSK and Roche. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the regional ethics committee of Zurich (KEK-ZH 2016-00384) and informed consent was obtained from all individual participants.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63.
- Watson JD, Crick FHC. Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature 1953;171:737-8.
- National Human Genome Research Institute. The Human Genome Project. 2024. Available online: https://www.genome.gov/human-genome-project
- Calkins GN. Zur Frage der Entstehung maligner Tumoren. By Th. Boveri. Jena, Gustav Fischer. 1914. 64 pages. Science 1914;40:857-9.
- Shestakova KM, Moskaleva NE, Boldin AA, et al. Targeted metabolomic profiling as a tool for diagnostics of patients with non-small-cell lung cancer. Sci Rep 2023;13:11072.
- Madama D, Martins R, Pires AS, et al. Metabolomic Profiling in Lung Cancer: A Systematic Review. Metabolites 2021;11:630.
- Schmidt F, Kohlbrenner D, Malesevic S, et al. Mapping the landscape of lung cancer breath analysis: A scoping review (ELCABA). Lung Cancer 2023;175:131-40.
- Amann A, Corradi M, Mazzone P, et al. Lung cancer biomarkers in exhaled breath. Expert Rev Mol Diagn 2011;11:207-17.
- Hanna GB, Boshier PR, Markar SR, et al. Accuracy and Methodologic Challenges of Volatile Organic Compound-Based Exhaled Breath Tests for Cancer Diagnosis: A Systematic Review and Meta-analysis. JAMA Oncol 2019;5:e182815.
- Vadala R, Pattnaik B, Bangaru S, et al. A review on electronic nose for diagnosis and monitoring treatment response in lung cancer. J Breath Res 2023;
- Wüthrich C, Giannoukos S. Advances in secondary electrospray ionization for breath analysis and volatilomics. International Journal of Mass Spectrometry 2024;498:117213.
- Schymanski EL, Jeon J, Gulde R, et al. Identifying small molecules via high resolution mass spectrometry: communicating confidence. Environ Sci Technol 2014;48:2097-8.
- Singh KD, Osswald M, Ziesenitz VC, et al. Personalised therapeutic management of epileptic patients guided by pathway-driven breath metabolomics. Commun Med (Lond) 2021;1:21.
- Basler S, Fricke K, Sievi NA, et al. Exploring breath metabolomics as a non-invasive tool for detecting pulmonary vascular disease. Eur Heart J Open 2025;5:oeaf060.
- Gisler A, Singh KD, Marten A, et al. Urinary marker of oxidative stress in children correlates with molecules in exhaled breath. Front Mol Biosci 2025;12:1511119.
- Tang Z, Yang J, Su B, et al. Metabolite Fusion between Breath and Blood Enables More In-Depth Understanding of the Endogenous Metabolome. Anal Chem 2025;97:19427-36.
- Singh KD, Tancev G, Decrue F, et al. Standardization procedures for real-time breath analysis by secondary electrospray ionization high-resolution mass spectrometry. Anal Bioanal Chem 2019;411:4883-98.
- Gisler A, Singh KD, Zeng J, et al. An interoperability framework for multicentric breath metabolomic studies. iScience 2022;25:105557.
- Storey JD. A Direct Approach to False Discovery Rates. Journal of the Royal Statistical Society Series B: Statistical Methodology 2002;64:479-98.
- Mashima T, Seimiya H, Tsuruo T. De novo fatty-acid synthesis and related pathways as molecular targets for cancer therapy. Br J Cancer 2009;100:1369-72.
- Koundouros N, Poulogiannis G. Reprogramming of fatty acid metabolism in cancer. Br J Cancer 2020;122:4-22.
- Du A, Wang Z, Huang T, et al. Fatty acids in cancer: Metabolic functions and potential treatment. MedComm – Oncology 2023;2:e25.
- Visca P, Sebastiani V, Botti C, et al. Fatty acid synthase (FAS) is a marker of increased risk of recurrence in lung carcinoma. Anticancer Res 2004;24:4169-73.
- Kim KS, Moon SW, Moon MH, et al. Metabolic profiles of lung adenocarcinoma via peripheral blood and diagnostic model construction. Sci Rep 2023;13:7304.
- Huang D, Tang E, Zhang T, et al. Characteristics of Fatty Acid Metabolism in Lung Adenocarcinoma to Guide Clinical Treatment. Front Immunol 2022;13:916284.
- Xiao S, Zhang L, Zheng L, et al. EP03.01-03 Differences in Glucose and Energy Metabolism between Lung Squamous Cell Carcinoma and Lung Adenocarcinoma. J Thorac Oncol 2023;18:S440.
- Goodwin J, Neugent ML, Lee SY, et al. The distinct metabolic phenotype of lung squamous cell carcinoma defines selective vulnerability to glycolytic inhibition. Nat Commun 2017;8:15503.
- PubChem Pathway Summary for Pathway SMP0063603, Butyrate Metabolism [Internet]. PathBank. 2024 [cited 01 October 2024]. Available online: https://pubchem.ncbi.nlm.nih.gov/pathway/PathBank:SMP0063603
- Cargill KR, Hasken WL, Gay CM, et al. Alternative Energy: Breaking Down the Diverse Metabolic Features of Lung Cancers. Front Oncol 2021;11:757323.
- Chalishazar MD, Wait SJ, Huang F, et al. MYC-Driven Small-Cell Lung Cancer is Metabolically Distinct and Vulnerable to Arginine Depletion. Clin Cancer Res 2019;25:5107-21.
- Jonas DE, Reuland DS, Reddy SM, et al. Screening for Lung Cancer With Low-Dose Computed Tomography: An Evidence Review for the U.S. Preventive Services Task Force. Rockville, MD: Agency for Healthcare Research and Quality. 2021; AHRQ Publication No. 20-05266-EF-1 (Evidence Synthesis, No. 198).
- de Vries R, Farzan N, Fabius T, et al. Prospective Detection of Early Lung Cancer in Patients With COPD in Regular Care by Electronic Nose Analysis of Exhaled Breath. Chest 2023;164:1315-24.
- Casagrande GMS, Silva MO, Reis RM, et al. Liquid Biopsy for Lung Cancer: Up-to-Date and Perspectives for Screening Programs. Int J Mol Sci 2023;24:2505.
- Zhu W, Love K, Gray SW, et al. Liquid Biopsy Screening for Early Detection of Lung Cancer: Current State and Future Directions. Clin Lung Cancer 2023;24:209-17.
- Gordon SM, Szidon JP, Krotoszynski BK, et al. Volatile organic compounds in exhaled air from patients with lung cancer. Clin Chem 1985;31:1278-82.
- Fan X, Zhong R, Liang H, et al. Exhaled VOC detection in lung cancer screening: a comprehensive meta-analysis. BMC Cancer 2024;24:775.
- Kort S, Brusse-Keizer M, Schouwink H, et al. Diagnosing Non-Small Cell Lung Cancer by Exhaled Breath Profiling Using an Electronic Nose: A Multicenter Validation Study. Chest 2023;163:697-706.
- Marc van der Schee JM, Christiaan F Labuschagne, Rob Smith, Mariana Ferreira Leal, Madeleine Ball, Billy Boyle, Max Allsworth, Philip A. Crosbie, Robert C. Rintoul, on behalf of the Evolution Team. Proof-of-mechanism for a diagnostic probe generating D5-ethanol as an on-breath reporter molecule for lung cancer – Evolution phase 1. In: Medical O, editor. Breath Biopsy Conference. Cambridge UK: Owlstone Medical; 2023.
- Martínez-Lozano P, Rus J, Fernández de la Mora G, et al. Secondary electrospray ionization (SESI) of ambient vapors for explosive detection at concentrations below parts per trillion. J Am Soc Mass Spectrom 2009;20:287-94.
- Choueiry F, Zhu J. Secondary electrospray ionization-high resolution mass spectrometry (SESI-HRMS) fingerprinting enabled treatment monitoring of pulmonary carcinoma cells in real time. Anal Chim Acta 2022;1189:339230.

