Whole genome characterization of patient-derived lung cancer organoids
Highlight box
Key findings
• Established a panel of lung cancer organoids (LCOs) from patients with early, locally advanced, and advanced-stage non-small cell lung cancer (NSCLC).
• Whole-genome sequencing (WGS) revealed frequent mutations in TP53, TTN, MUC16, and FLG, with ~80% of variants in non-coding regions.
• Early and locally advanced-stage LCOs showed higher tumor mutation burden (TMB) and microsatellite instability (MSI) than advanced-stage LCOs.
What is known and what is new?
• Lung cancer is genomically complex, with several driver mutations and some use of organoids as preclinical models.
• This study provides the first detailed WGS-based genomic landscape of LCOs across disease stages, emphasizing the higher TMB and MSI in early and locally advanced-stage LCOs compared to advanced-stage, reflecting greater clonal diversity prior to therapeutic intervention.
What is the implication, and what should change now?
• Comprehensive genomic profiling of LCOs offers new insights into lung cancer genomic structure.
• Future research and precision medicine strategies should leverage these well-characterized LCO models to explore functional genomics and guide therapy development.
Introduction
Background
Lung cancer remains the leading cause of cancer-related mortality worldwide, responsible for an estimated 1.8 million deaths annually and accounting for approximately 18% of all cancer deaths globally (1). The high fatality rate is largely driven by late-stage diagnosis, when curative interventions are limited, and by the frequent development of resistance to conventional treatments including chemotherapy and radiotherapy (2). Over the past few decades, significant advances have been made in understanding the molecular mechanisms underlying lung cancer initiation and progression. This has led to the identification of various driver mutations in genes such as Kirsten rat sarcoma viral oncogene homolog (KRAS), epidermal growth factor receptor (EGFR), anaplastic lymphoma kinase (ALK), B-Raf proto-oncogene (BRAF), and ROS proto-oncogene 1 (ROS1), which play critical roles in tumor initiation, progression, and therapeutic resistance (3,4). Consequently, several targeted therapies have been developed to specifically inhibit the activity of these molecular targets (5). Compared with cytotoxic chemotherapy, targeted therapy offers a more tailored and effective treatment approach for lung cancer patients (6).
Precision medicine is defined as tailoring medical treatment to the individual cancer characteristics of each patient by delivering targeted therapies based on the tumor's unique genetic makeup. Personalized treatment can greatly improve patient outcomes and minimize adverse side effects by ensuring that patients receive the most effective therapy for their specific tumor type (7). Despite the accelerated growth in the identification of drug targets, including the discovery of KRAS inhibitors (8-10), there is considerable room for improvement in identifying novel molecular targets and developing effective therapies against acquired drug resistance.
In recent years, advances in next-generation sequencing (NGS) technologies have facilitated the rapid and comprehensive analysis of tumor genomic profiles. NGS approaches, such as whole-exome sequencing (WES) or whole-genome sequencing (WGS), enable researchers to simultaneously analyze thousands of genes in a single experiment, providing a high throughput analysis of tumor characteristics with respect to molecular targets in respective tumors (11). The integration of genomic profiling data with clinical data has the potential to improve our understanding of lung cancer biology, ultimately leading to the development of more effective therapies and personalized treatment strategies. Traditional cancer therapies, such as chemotherapy and radiotherapy, often cause non-specific cytotoxicity and tissue damage, leading to a range of side effects and emergence of drug resistance mechanisms. Conversely, targeted therapies are designed to selectively inhibit specific molecular targets that are known to be involved in cancer growth and progression. While single-targeted therapies can be effective in some cases, they often suffer from limitations such as the development of resistance and limited efficacy in tumors with complex genomic profiles. It is possible to develop combination therapies that can overcome these limitations and improve treatment outcomes. The advent of deep learning models trained on extensive datasets enables the identification of intricate molecular patterns and relationships that traditional drug screening methods may overlook. When combined with drug repurposing screens, these techniques can uncover novel insights from complex multi-omics data. These insights can reveal previously unrecognized mechanisms of action for existing drugs, opening new avenues for repurposing (12). 
Multi-targeted therapies can increase the likelihood of tumor eradication by addressing multiple oncogenic pathways, thereby reducing the chances of drug resistance development and recurrence. Additionally, targeting multiple molecular pathways can enhance the efficacy of treatment by addressing the complex interplay among these pathways (13,14).
The recent development of patient-derived cancer organoids has expanded the range of preclinical models and significantly transformed standard cell culture practice in oncology. Lung cancer organoids (LCOs) cultivated from resected tumor or pleural fluid samples can accurately replicate the histological and molecular features of the original tumor. Patient-derived cancer organoids are invaluable tools that hold tremendous potential application in clinical practice for implementation of precision medicine in oncology (15,16).
Rationale and knowledge gap
Despite advances in targeted therapies and genomic profiling for lung cancer, there remains a critical knowledge gap in understanding the comprehensive genomic landscape—particularly non-coding mutations—of patient-derived LCOs, limiting their potential as precision models for studying tumor biology and informing individualized treatment strategies.
Objective
In this study, we aim to establish a panel of LCOs derived from resected tumors or malignant pleural effusions (MPEs) and to fully characterize their genomic profiles using WGS. The primary goal is to demonstrate the feasibility and potential utility of integrating WGS of patient-derived LCOs with deep learning-based drug response prediction as a proof-of-concept framework. We present this article in accordance with the TRIPOD reporting checklist (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-738/rc).
Methods
Lung cancer patient sample collection and processing
Clinical samples were obtained at Queen Mary Hospital, Hong Kong SAR. Patients with clinically diagnosed lung cancer were recruited. Patients with stage I, II and III disease were defined as early and locally advanced-stage; patients with stage IV disease were defined as advanced-stage (Figure 1A). Fresh tumor tissue from early and locally advanced-stage patients or MPE from advanced-stage lung cancer patients was collected and transferred to the laboratory within four hours to preserve viability. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study protocol was approved by the Ethics Committee of the University of Hong Kong and Hong Kong Hospital Authority Hong Kong West Cluster Institutional Review Board (IRB Reference Number UW 16-104). Informed consent was obtained from all patients before sample collection.
LCOs culture
LCO culture was performed as previously described (17). The details are as follows: surgical tumor specimens were obtained during lung resection surgery and immediately placed in RPMI medium. The tumor tissue was divided for formalin-fixed paraffin-embedded (FFPE) processing and cancer organoid culture, respectively. A portion of the tissue was washed in RPMI medium three times and subsequently cut and minced into smaller pieces without enzymatic dissociation. The resulting tissue suspension was strained through a 70 µm cell strainer. Next, PneumaCult-Ex Plus medium (Ex+ medium, STEMCELL Technologies, Vancouver, Canada) was added to the filtered cell suspension, which was then centrifuged at 400 g for 5 minutes. The cell pellet was rinsed with Ex+ medium and resuspended in 300 µL medium. The cell count was determined using Trypan Blue and a hemocytometer. The diluted cell suspension was mixed with Ex+ medium containing 70% cold Cultrex reduced growth factor basement membrane extract (BME) type 2 (Bio-Techne, Minneapolis, United States). Three to four 50 µL droplets of BME-cell suspension were placed on pre-warmed 6-well plates and allowed to solidify at 37 ℃ and 5% CO2 for at least 15 minutes. Once the gel had formed, it was covered with Ex+ medium. The medium was changed every 3 days. To eliminate any potential normal cell contamination, cancer organoids were manually selected under a microscope during the first month, and only tumor clusters were used for further culture. LCOs that grew to approximately 300 to 500 µm were harvested and passaged.
For culturing LCOs from MPE, the collected MPE (50 to 200 mL) was centrifuged at 400 g for 10 minutes. Red blood cells were removed using RBC lysis buffer (eBioscience, San Diego, United States). The cleaned cell pellets were then cultured in BME and Ex+ medium, following the same procedure as for tumor tissue.
LCOs derived from tumor tissues were labelled with prefix TS, and LCOs derived from MPE were labelled with prefix FA. All LCOs were authenticated using a polymorphic short tandem repeat (STR) DNA profiling assay (Labcorp DNA Identification Lab, Burlington, United States). Mycoplasma contamination was examined in cell cultures every 6 months.
Histological staining
LCOs were harvested and pre-embedded in Histogel (ThermoFisher Scientific, Waltham, United States) and fixed in 4% paraformaldehyde, followed by paraffin embedding. The paraffin sections were processed in the same way as routine pathology specimens and stained with hematoxylin and eosin. Images were captured using the camera mounted on the CX53 microscope (Olympus, Tokyo, Japan).
Whole genome sequencing (WGS) and bioinformatic analysis
Genomic DNA was extracted from organoids and peripheral blood mononuclear cells (PBMCs) of the same patients using the PureLink Genomic DNA mini kit (ThermoFisher Scientific). WGS of the DNA samples was performed by BGI Tech Solutions (Hong Kong) Co., Ltd. (Hong Kong SAR, China). In brief, the qualified genomic DNA samples were randomly fragmented by Covaris technology, and the size of the library fragments was about 150 bp. The fragments were subjected to end-repair and then 3’-adenylated. Adaptors were ligated to the ends of these 3’-adenylated fragments. The ligated fragments were amplified by polymerase chain reaction (PCR), and single-stranded PCR products were produced via denaturation. These products were then circularized: single-stranded cyclized products were retained, while uncyclized linear DNA molecules were digested. Single-stranded circular DNA molecules were replicated via rolling circle amplification to generate DNA nanoballs (DNBs), each containing multiple copies of the DNA. DNBs of sufficient quality were then loaded into patterned nanoarrays using a high-intensity DNA nanochip technique and sequenced through combinatorial probe-anchor synthesis (cPAS). Each library was sequenced to ensure an average coverage of 30× per sample.
Raw sequence data were obtained using the DNBSEQ base-calling software with default parameters. The raw reads were generated as paired-end reads and stored in FASTQ format. Raw data were filtered to remove adapter sequences, low-quality bases, and un-sequenced bases, which could significantly affect the outcome. Clean data from each sample were then mapped to the human reference genome with the Burrows-Wheeler Aligner (BWA) software (18) to produce an initial alignment file in BAM format. To ensure accurate variant calling, the Best Practices for variant analysis with the Genome Analysis Toolkit (GATK) (19) were followed. Base quality score recalibration and duplicate-read marking were performed using GATK. The sequencing depth and coverage for each individual sample were calculated based on the sequence alignments. In addition, strict quality control (QC) was applied throughout the pipeline to guarantee qualified sequencing data. The Mutect2 software (20) was then used to detect somatic mutations, including single nucleotide variants (SNVs) and insertions/deletions (InDels), by comparing LCO sequences against matched normal PBMC sequences. Calls were then filtered to remove cross-contamination, sequencing errors, and possible human errors in the experimental process to ensure high-confidence results. Variants were annotated using GATK Funcotator. FACETS (21) and Ensembl VEP (22) were used to detect and annotate copy number variations (CNVs). Manta (23) was applied to detect structural variants (SVs). MSIsensor (24) was employed to evaluate the microsatellite instability (MSI) status of the paired samples. Maftools (25) was used to analyse driver gene mutations. Significantly mutated genes and mutation interactions were identified by Genome MuSiC software version 0.4 (26). Drug target annotation was performed using CIViC (20).
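The alignment and somatic-calling steps described above can be sketched as an ordered list of command lines for one tumor/normal pair. This is only an illustrative outline, not the pipeline actually run by the sequencing provider: the file naming scheme, the `samtools sort` step, the read-group string, the known-sites resource, and the function name `somatic_pipeline_commands` are all assumptions, and the commands are built but not executed.

```python
from typing import List


def somatic_pipeline_commands(tumor: str, normal: str, ref: str,
                              known_sites: str) -> List[List[str]]:
    """Build (but do not run) command lines for one LCO/PBMC pair.

    File names such as '<sample>_1.fq.gz' are hypothetical conventions.
    """
    cmds: List[List[str]] = []
    for name in (tumor, normal):
        # Read group with sample name; Mutect2 needs SM to identify the normal.
        read_group = f"@RG\\tID:{name}\\tSM:{name}\\tPL:DNBSEQ"
        cmds += [
            # Align paired-end reads to the reference genome.
            ["bwa", "mem", "-R", read_group, "-o", f"{name}.sam",
             ref, f"{name}_1.fq.gz", f"{name}_2.fq.gz"],
            # Coordinate-sort and convert to BAM (assumed intermediate step).
            ["samtools", "sort", "-o", f"{name}.sorted.bam", f"{name}.sam"],
            # Mark duplicate reads.
            ["gatk", "MarkDuplicates", "-I", f"{name}.sorted.bam",
             "-O", f"{name}.dedup.bam", "-M", f"{name}.dup_metrics.txt"],
            # Base quality score recalibration: model, then apply.
            ["gatk", "BaseRecalibrator", "-I", f"{name}.dedup.bam",
             "-R", ref, "--known-sites", known_sites,
             "-O", f"{name}.recal.table"],
            ["gatk", "ApplyBQSR", "-I", f"{name}.dedup.bam", "-R", ref,
             "--bqsr-recal-file", f"{name}.recal.table",
             "-O", f"{name}.bqsr.bam"],
        ]
    # Somatic SNV/InDel calling against the matched normal, then filtering.
    cmds.append(["gatk", "Mutect2", "-R", ref,
                 "-I", f"{tumor}.bqsr.bam", "-I", f"{normal}.bqsr.bam",
                 "-normal", normal, "-O", f"{tumor}.unfiltered.vcf.gz"])
    cmds.append(["gatk", "FilterMutectCalls", "-R", ref,
                 "-V", f"{tumor}.unfiltered.vcf.gz",
                 "-O", f"{tumor}.filtered.vcf.gz"])
    return cmds


# Hypothetical sample and resource names, for illustration only.
commands = somatic_pipeline_commands("TS638T-O", "TS638-PBMC",
                                     "reference.fa", "known_sites.vcf.gz")
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as data makes the order of steps explicit and easy to review before anything is submitted to a scheduler or passed to `subprocess.run`.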
Mutation profiles were integrated and the pathway enrichment was performed using R packages clusterProfiler (27) and msigdbr (28). Visualizations and heatmaps were generated by the ggplot2, oncoplot and pheatmap R packages.
Drug prediction and sensitivity assay
DrugCell is an interpretable deep learning model that predicts anti-cancer drug responses by simulating the hierarchical organization of human cancer cells. DrugCell uses a visible neural network (VNN) architecture in which the network design is guided by biological hierarchies. The input is a binary mutation profile representing mutated or wild-type genes in each cancer cell line. The network layers correspond to biological pathways, modules, and genes organized in a hierarchical graph derived from gene ontology and pathway knowledge. Drug features (chemical structure fingerprints) are integrated via a learned embedding and combined with the genotype representation to compute the response. The VNN uses nonlinear activations to model complex genotype-phenotype relations and outputs continuous drug response predictions. The model was trained on the responses of 1,235 human cancer cell lines and 684 drug structures (29). The mutation profiles of the LCOs were loaded into the DrugCell Oracle, and lists of potentially relevant drugs were generated for downstream experiments.
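The binary mutation profile used as model input can be illustrated with a minimal sketch. The six-gene index below is a hypothetical stand-in for the much larger fixed gene list a model such as DrugCell expects, and the function name `binary_mutation_profile` is an assumption made for this example.

```python
from typing import Iterable, List


def binary_mutation_profile(mutated_genes: Iterable[str],
                            gene_index: List[str]) -> List[int]:
    """Encode a sample as 1 (mutated) or 0 (wild-type) per indexed gene.

    Genes that are mutated but absent from the index are ignored.
    """
    mutated = set(mutated_genes)
    return [1 if gene in mutated else 0 for gene in gene_index]


# Hypothetical, shortened gene index; the order must match the model's input.
gene_index = ["TP53", "TTN", "MUC16", "FLG", "KRAS", "BRAF"]

# Hypothetical sample with three mutated genes, one of them outside the index.
profile = binary_mutation_profile(["TP53", "KRAS", "NTRK3"], gene_index)
print(profile)  # [1, 0, 0, 0, 1, 0]
```

Because the encoding only records presence or absence, variant type, allele frequency, CNVs, and SVs are all discarded at this step.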
LCOs (about 50 organoids per well) were seeded into 96-well plates and treated with the indicated drugs (0.5 nM to 50 µM) for 72 hours. Drug sensitivity was assessed using the Promega CellTiter-Glo® 3D Cell Viability Assay according to the manufacturer’s protocol. The half-maximal inhibitory concentration (IC50) of each drug was determined by nonlinear regression dose-response curve fitting. To enable direct comparison with the DrugCell predictive model output, which reports predicted drug response as an area under the curve (AUC) value, experimental IC50 values were converted to corresponding AUC metrics (AUC0.5) by GraphPad Prism software, which calculates the normalized area under the dose-response curve up to a concentration corresponding to 0.5 log units. This conversion provides a more comprehensive integration of drug response over the tested concentration range, reflecting the overall drug effect rather than a single-dose endpoint. A true positive (TP) prediction was defined when the AUC0.5 values from both the prediction and the experiment were smaller than 0.5. A true negative (TN) prediction was defined when the AUC0.5 values from both the prediction and the experiment were not smaller than 0.5. A false positive (FP) prediction was defined when the AUC0.5 value from the prediction was smaller than 0.5, but the AUC0.5 value from the experiment was not. A false negative (FN) prediction was defined when the AUC0.5 value from the prediction was not smaller than 0.5, but the AUC0.5 value from the experiment was. Sensitivity was calculated by dividing the number of TP by the sum of TP and FN. Specificity was calculated by dividing the number of TN by the sum of TN and FP. Accuracy was calculated by dividing the sum of TP and TN by the total number of TP, FP, FN and TN.
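The TP/TN/FP/FN definitions and the three performance metrics above can be expressed compactly. The following is a minimal sketch under the stated 0.5 cutoff; the function names and the five (predicted, experimental) AUC pairs are invented for illustration and are not study data.

```python
from typing import Dict, Iterable, Tuple

AUC_CUTOFF = 0.5  # values below the cutoff are treated as "sensitive"


def confusion_counts(pairs: Iterable[Tuple[float, float]],
                     cutoff: float = AUC_CUTOFF) -> Dict[str, int]:
    """Count TP, TN, FP, FN from (predicted AUC, experimental AUC) pairs."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for predicted, observed in pairs:
        predicted_sensitive = predicted < cutoff
        observed_sensitive = observed < cutoff
        if predicted_sensitive and observed_sensitive:
            counts["TP"] += 1
        elif not predicted_sensitive and not observed_sensitive:
            counts["TN"] += 1
        elif predicted_sensitive:
            counts["FP"] += 1  # predicted sensitive, experiment was not
        else:
            counts["FN"] += 1  # predicted insensitive, experiment was sensitive
    return counts


def performance(counts: Dict[str, int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy as defined in the text."""
    tp, tn, fp, fn = counts["TP"], counts["TN"], counts["FP"], counts["FN"]
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Illustrative (predicted, experimental) AUC pairs; not data from this study.
example_pairs = [(0.3, 0.2), (0.4, 0.7), (0.8, 0.9), (0.6, 0.4), (0.2, 0.1)]
example_counts = confusion_counts(example_pairs)
print(example_counts)  # {'TP': 2, 'TN': 1, 'FP': 1, 'FN': 1}
print(performance(example_counts))
```

Note that a value exactly equal to 0.5 counts as "not smaller than 0.5" and is therefore treated as insensitive, matching the definitions in the text.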
Statistical analysis
Multiple group comparisons were performed with one-way analysis of variance (ANOVA) followed by Tukey’s post hoc analyses. GraphPad Prism version 6.01 (GraphPad Software, Boston, MA, USA) was used for all statistical analyses. A two-sided P value of less than 0.05 was considered to indicate statistical significance.
Results
Establishment and characterization of LCOs from lung cancer patients
Patient-derived LCOs were established from early and locally advanced-stage primary tumors and advanced-stage metastatic tumors from 14 patients with NSCLC (Figure 1A), mostly lung adenocarcinomas, using a previously established method (17). Most LCOs showed cystic or dense morphologies, except TS234T-O, which showed a stellate structure (Figure 1B,1C). Bright-field imaging and haematoxylin and eosin (H&E) staining illustrated the morphological characteristics of the established LCOs. Immunohistochemistry showed that LCOs and original tumor tissue exhibited positive staining for TTF-1 and negative staining for p40, consistent with adenocarcinoma. Glandular or luminal structures were present in all samples except for FA156.2-O, which was derived from pleural effusion and lacked discernible architecture (Figure S1).
Genomic characterization of LCOs
WGS analysis was performed to characterize the genomic profiles of these LCOs. Mutation profiling showed that most LCOs, except TS485T-O, harbored different types of TP53 mutations. About 80% of the LCOs carried titin (TTN) mutations, and about 50% carried mucin 16 (MUC16) and filaggrin (FLG) mutations (Figure 2A). Recurrent missense mutations of these genes (Figure 2B) in these LCOs (Figure 2C) underscored their potential roles in lung cancer pathogenesis (30). LCOs derived from early and locally advanced-stage tumors generally contained more SNVs and InDels than those derived from advanced-stage tumors (Figure S2). In addition, the cytosine to thymine (C > T) transition was predominant (Figure 2D), similar to a previous observation in Southern China (31). Among all the identified somatic variants, the majority were single nucleotide polymorphisms (SNPs), with a small proportion of insertions (Ins) and deletions (Del) (Figure 2E). Druggable targets, including BRAF V600E and KRAS G12C, were identified in FA233-O and TS335.1T-O, respectively. However, non-druggable mutations were also found in some driver genes (Table 1). Indeed, approximately 80% of the SNVs and InDels observed in the LCOs occurred in non-coding regions of the genome, while only about 12% affected coding genes (Figure 2F). This suggests a predominance of passenger events in non-coding DNA and underscores the need to investigate how non-coding regulatory elements contribute to lung cancer development.
Table 1
| LCOs | SNVs | Variants | Targeted therapy |
|---|---|---|---|
| Druggable targets | | | |
| FA233-O | BRAF | V600E | Dabrafenib + trametinib |
| TS335.1T-O | KRAS | G12C | Sotorasib, adagrasib |
| Non-druggable targets | | | |
| FA124-O | EGFR | L813R | |
| FA233-O | EGFR | 702_708LREATSP>S | |
| FA233-O | EGFR | T745M | |
| TS225T-O | NTRK3 | R613S | |
| TS225T-O | RET | K617N | |
| TS335.1T-O | NTRK3 | V780F | |
| TS683T-O | ALK | P294fs | |
LCO, lung cancer organoid; SNV, single nucleotide variant.
The mutation burden summarized in Table 2 reveals that early and locally advanced-stage tumor-derived LCOs (TS series) generally exhibit a high tumor mutation burden (TMB) and are classified as hyper-mutated, whereas FA124-O is the only MPE-derived LCO (FA series) classified as hyper-mutated. However, a high TMB does not necessarily correlate with high MSI status as measured by MSIsensor. Several TS samples, despite their elevated TMB, display a non-MSI-H phenotype according to MSIsensor scores. Additionally, the number of mutations detected in DNA mismatch repair (MMR) genes is variable and often low among these hyper-mutated samples. Key MMR genes include MutL homolog 1 (MLH1), MutS homolog 2 (MSH2), MutS homolog 6 (MSH6), postmeiotic segregation increased 2 (PMS2), postmeiotic segregation increased 1 (PMS1), and Exonuclease 1 (EXO1). This indicates that a high mutational load in early and locally advanced-stage tumors can arise through mechanisms other than classic MSI driven by MMR gene mutations. Therefore, elevated TMB and hypermutation in early and locally advanced-stage lung tumors may reflect a broader spectrum of mutational processes beyond those directly linked to MSI or MMR gene alterations. This may be explained by the high degree of clonal diversity in early and locally advanced-stage tumors, which is preserved in their respective LCOs and may result in a greater number of detectable mutations. Additionally, patients with advanced-stage tumors, including most FA samples from MPE except FA124-O, had undergone multiple lines of therapy before sample collection. These treatments can induce selective pressures that eliminate highly mutated cells, resulting in a lower detectable mutation burden in residual tumor cells. The high transition-to-transversion ratio observed in FA124-O and TS638T-O (Figure S3), which also harbored higher TMB, may be explained by exposure to exogenous mutagens, since transversions (purine-to-pyrimidine or vice versa) are more likely linked to exogenous factors such as chemical mutagens or radiation.
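For readers less familiar with the transition-to-transversion ratio, the following minimal sketch shows how it is computed from a list of SNVs. Transitions are purine-to-purine (A↔G) or pyrimidine-to-pyrimidine (C↔T) substitutions; all other single-base substitutions are transversions. The function name and the five example SNVs are hypothetical and are not taken from the LCO data.

```python
from typing import Iterable, Tuple

# Purine<->purine and pyrimidine<->pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Return transitions / transversions for (ref, alt) base pairs."""
    transitions = 0
    transversions = 0
    for ref, alt in snvs:
        if ref == alt:
            continue  # not a substitution; skip
        if (ref, alt) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    return transitions / transversions if transversions else float("inf")


# Hypothetical SNVs: three transitions (C>T, G>A, A>G), two transversions.
example_snvs = [("C", "T"), ("G", "A"), ("C", "A"), ("A", "G"), ("G", "T")]
print(ti_tv_ratio(example_snvs))  # 1.5
```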
Table 2
| Sample | Hyper-mutated | TMB | MSIsensor (>3.5) | Mutations in MMR gene |
|---|---|---|---|---|
| TS638T-O | Hyper | 78.16 | MSI-H (13.90) | 0 |
| FA124-O | Hyper | 73.85 | MSI-H (17.41) | 1 |
| TS576T-O | Hyper | 52.15 | Non-MSI-H (2.32) | 3 |
| TS343T-O | Hyper | 36.92 | Non-MSI-H (2.81) | 2 |
| TS234T-O | Hyper | 35.83 | Non-MSI-H (2.96) | 0 |
| TS225T-O | Hyper | 33.12 | Non-MSI-H (1.38) | 0 |
| TS335.1T-O | Hyper | 20.38 | Non-MSI-H (2.20) | 1 |
| TS583T-O | Regular | 11.81 | Non-MSI-H (0.35) | 1 |
| TS485T-O | Hyper | 10.46 | Non-MSI-H (0.73) | 0 |
| FA156.2-O | Regular | 10.23 | Non-MSI-H (3.19) | 1 |
| TS301T-O | Regular | 8.92 | MSI-H (3.94) | 0 |
| FA233-O | Regular | 8.09 | MSI-H (4.12) | 2 |
| FA206-O | Regular | 7.77 | Non-MSI-H (1.10) | 1 |
| FA166-O | Regular | 7.26 | MSI-H (4.18) | 1 |
Summary on the mutation burden of the 14 LCOs derived from lung cancer patients. LCO, lung cancer organoid; TMB, tumor mutation burden; MSI-H, microsatellite instability-high; MMR, mismatch repair.
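The relationship between hypermutation and MSI status in Table 2 can be checked directly. The sketch below re-enters the Table 2 values and applies the MSIsensor cutoff shown in the table header (score > 3.5 is MSI-H); the variable and function names are assumptions made for this example.

```python
# (sample, hyper-mutation class, TMB, MSIsensor score), copied from Table 2.
TABLE_2 = [
    ("TS638T-O", "Hyper", 78.16, 13.90),
    ("FA124-O", "Hyper", 73.85, 17.41),
    ("TS576T-O", "Hyper", 52.15, 2.32),
    ("TS343T-O", "Hyper", 36.92, 2.81),
    ("TS234T-O", "Hyper", 35.83, 2.96),
    ("TS225T-O", "Hyper", 33.12, 1.38),
    ("TS335.1T-O", "Hyper", 20.38, 2.20),
    ("TS583T-O", "Regular", 11.81, 0.35),
    ("TS485T-O", "Hyper", 10.46, 0.73),
    ("FA156.2-O", "Regular", 10.23, 3.19),
    ("TS301T-O", "Regular", 8.92, 3.94),
    ("FA233-O", "Regular", 8.09, 4.12),
    ("FA206-O", "Regular", 7.77, 1.10),
    ("FA166-O", "Regular", 7.26, 4.18),
]

MSI_H_CUTOFF = 3.5  # MSIsensor score threshold from the Table 2 header


def msi_status(score: float, cutoff: float = MSI_H_CUTOFF) -> str:
    """Classify an MSIsensor score as MSI-H or Non-MSI-H."""
    return "MSI-H" if score > cutoff else "Non-MSI-H"


# Hyper-mutated LCOs that are nevertheless not MSI-H.
hyper_non_msi_h = [name for name, cls, _tmb, score in TABLE_2
                   if cls == "Hyper" and msi_status(score) == "Non-MSI-H"]

# MSI-H LCOs that are not hyper-mutated.
regular_msi_h = [name for name, cls, _tmb, score in TABLE_2
                 if cls == "Regular" and msi_status(score) == "MSI-H"]

print(len(hyper_non_msi_h), hyper_non_msi_h)
print(len(regular_msi_h), regular_msi_h)
```

On the Table 2 values, six of the eight hyper-mutated LCOs fall below the MSI-H cutoff, and three MSI-H LCOs are not hyper-mutated, which is the discordance discussed in the text.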
Co-occurrence of two high-frequency mutated genes may indicate synergistic effects in cancer development. Such synergy would result in higher-than-expected co-alteration rates in tumor genomes. Co-occurrence of mutations was found in several genes that are associated with normal lung functions under physiological conditions, including MUC16 (32), CFAP54 (33), RYR2 (34), and OBSCN (35) (Figure S4), suggesting that mutations in these genes may synergistically contribute to lung cancer pathogenesis.
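One common way to ask whether two genes are co-mutated more often than expected by chance is a one-sided Fisher's exact test on the 2×2 table of samples. The sketch below is a generic illustration of that test, not necessarily the procedure used by the software in this study; the function name and the example counts for a 14-sample cohort are hypothetical.

```python
from math import comb


def cooccurrence_p_value(both: int, only_a: int, only_b: int,
                         neither: int) -> float:
    """One-sided Fisher's exact test for co-occurrence.

    Returns the probability of observing at least `both` samples with
    mutations in gene A and gene B, given the marginal totals.
    """
    n = both + only_a + only_b + neither
    a_total = both + only_a          # samples with gene A mutated
    b_total = both + only_b          # samples with gene B mutated
    max_both = min(a_total, b_total)
    denominator = comb(n, b_total)
    # Sum hypergeometric probabilities for overlaps >= the observed one.
    numerator = sum(
        comb(a_total, k) * comb(n - a_total, b_total - k)
        for k in range(both, max_both + 1)
    )
    return numerator / denominator


# Hypothetical counts: 6 samples with both genes mutated, 1 with only A,
# 1 with only B, and 6 with neither.
p = cooccurrence_p_value(both=6, only_a=1, only_b=1, neither=6)
print(round(p, 4))  # 0.0146
```

With only 14 samples, such tests have limited power, and the resulting P values would normally need correction for the number of gene pairs tested.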
Besides point mutations, CNV analysis showed that the majority of CNVs affected protein-coding regions, directly impacting protein expression (Figure 3A). Except for TS583T-O, all LCOs displayed high CNV gains in key cancer driver genes (Figure 3B). In TS583T-O, extensive CNV losses in those driver genes indicate it is a chromosomal instability (CIN)-driven tumor that has accumulated large-scale structural alterations. Ploidy measurements further confirmed the hypodiploid status of TS583T-O (Figure 3C) and polyploidy in other LCOs. In contrast, SV analysis revealed that TS638T-O and FA124-O exhibited markedly higher SV levels than the other LCOs (Figure 4A). Most SVs were located in intergenic regions or involved genes encoding non-druggable proteins (Figure 4B and Table 3).
Table 3
| Sample | Gene | SV type | Treatment |
|---|---|---|---|
| FA124-O | EGFR | Duplication | Erlotinib, gefitinib, osimertinib, afatinib |
| FA166-O | ALK | BND | Crizotinib, ceritinib, alectinib, brigatinib, lorlatinib |
| TS225T-O | BCR | BND | Imatinib, dasatinib, nilotinib, bosutinib, ponatinib |
| TS234T-O | ERBB2 | Duplication | Trastuzumab, pertuzumab, ado-trastuzumab emtansine, lapatinib |
| TS335.1T-O | BCR | BND | Imatinib, dasatinib, nilotinib, bosutinib, ponatinib |
| TS343T-O | ALK | BND | Crizotinib, ceritinib, alectinib, brigatinib, lorlatinib |
| TS343T-O | ERBB2 | Duplication | Trastuzumab, pertuzumab, ado-trastuzumab emtansine, lapatinib |
| TS485T-O | ALK | INV | Crizotinib, ceritinib, alectinib, brigatinib, lorlatinib |
| TS485T-O | ALK | Duplication | Crizotinib, ceritinib, alectinib, brigatinib, lorlatinib |
| TS583T-O | NOTCH1 | Deletion | OMP-52M51 |
Table detailing the SVs in specific genes across the established LCOs along with corresponding targeted treatments. BND, breakend; LCO, lung cancer organoid; SV, structural variant.
Pathway-enrichment analysis of LCOs revealed four distinct groups. C1-LCOs exhibited dedifferentiated, stem-cell-like phenotypes characterized by mutations in canonical developmental pathways (Wnt/β-catenin, Hedgehog, myogenesis) while also retaining robust luminal-subtype signaling, including apical-surface organization, cell-cell junctions, and protein-secretion pathways. C2-LCOs also displayed a pronounced stem-cell-like phenotype but with diminished luminal features. The C3-LCO (TS225T-O) was marked by strong stromal interactions and activation of tissue-repair programs. C4-LCOs (TS343T-O, TS638T-O, FA124-O) did not show significant enrichment for any specific pathways in our analysis (Figure 5).
Application of deep learning model for drug repurposing screen with patient-derived LCOs
Comprehensive genomic profiling of these patient-derived LCOs identified several clinically actionable variants. Building directly on these molecular insights, we next set out to predict which alternative therapies would best exploit the genomic variations found in each LCO. To that end, we fed the SNV/InDel profiles into the VNN model DrugCell Oracle (29) to predict the anti-cancer drug responses of the LCOs. The top 10 predicted drugs and two conventional chemotherapies, cisplatin and docetaxel, were then selected for drug sensitivity tests on the established LCOs. The general properties of the predicted drugs are listed in Table S1. The predicted normalized AUC values of the drugs against each LCO are shown in Figure 6A, and the experimental AUC values for each predicted drug are shown in Figure 6B. These quantitative data provide a clearer view of actual drug potency and prediction variability among organoids. While some agents showed low nanomolar IC50 values indicative of strong in vitro efficacy, others required higher concentrations (Figure S5). The DrugCell model showed 68.28% sensitivity, 55.58% specificity, and 59.34% accuracy (Figure 6C); raw counts are shown in Table S2. The data suggest that DrugCell reasonably identified potentially responsive drugs, but there is still significant room for improvement, particularly in specificity and in the overall accuracy of predicting therapeutic responses. These results underscore the value of integrating comprehensive genomic data with advanced deep-learning models in drug screening, paving the way for cost-effective, rational treatment selection and, ultimately, personalized cancer therapy.
Discussion
Key findings
The establishment of LCOs from lung cancer patients provided a robust platform for investigating the genetic landscape and therapeutic responses in non-small cell lung cancer (NSCLC). This study successfully generated 14 patient-derived LCOs from local lung cancer patient samples, including pleural fluids and surgically resected lung tumor specimens, which displayed a wide range of genetic mutations, highlighting the genetic diversity inherent in NSCLC. While our analysis provides valuable insights into the mutational landscape and therapeutic potentials, the relatively small cohort size and moderate predictive accuracy of the DrugCell model limit the conclusiveness of specific genomic patterns or drug response determinations. The primary objective of this work is to showcase the feasibility and potential of integrating comprehensive genomic data from organoids with computational approaches for drug response prediction, rather than to deliver definitive clinical guidance. Our findings emphasize the current challenges and highlight areas for improvement, such as incorporating CNVs, SVs, and tumor microenvironment factors, to enhance model accuracy. This framework establishes a foundation for more comprehensive studies that are required to validate and extend these initial observations.
WGS analysis revealed TP53 as the most prevalent mutation, present in almost all cell lines except TS485T-O. The high prevalence of TP53 mutations aligned with its known role as a critical tumor suppressor gene frequently altered in lung cancer (36). The identification of multiple types of TP53 mutations, including missense, frameshift deletions, and splice site mutations, underscored the complexity of its inactivation in lung cancer pathogenesis (37). TTN mutations were the second most common mutations in this cohort, present in 12 LCOs (79%). TTN mutations found in these LCOs were predominantly missense mutations, which may impact the structural integrity and function of the titin protein, suggesting possible roles in lung cancer progression (38). Other significant mutations included MUC16, FLG, and LRP1B, which are known to be associated with various malignancies (39,40). The mutation rate was generally lower in cell lines derived from non-smokers compared to those from smokers or ex-smokers, except for FA161.2-O, which harbored all top 10 significant mutations. This finding supported the notion that smoking significantly contributes to the mutational burden in lung cancer (41). Co-mutation analysis (Figure S4) showed that LRP1B mutations were mutually exclusive with other mutated genes. Because LRP1B is a known tumor suppressor gene associated with lung cancer (42,43), this pattern suggests an alternative oncogenic route. In addition, the distinctively high level of SVs found in TS638T-O and FA124-O may correlate with increased genomic instability (44) and tumor aggressiveness (45). Even though SVs can now be quantified (44), integrating these results with other sequencing data for drug prediction remains challenging.
In this study, a deep-learning model, DrugCell, was used to assess the therapeutic responses of patient-derived LCOs to various anti-cancer treatments in relation to their respective mutation profiles. Sepantronium bromide and leptomycin B emerged as the most effective drugs across all cell lines, with panobinostat also showing significant efficacy in 11 LCOs. Sepantronium bromide (also known as YM155) is a small molecule survivin suppressant that has been investigated for its potential use in cancer therapy. Survivin is a protein that inhibits apoptosis and is commonly overexpressed in various types of cancer, including lung cancer (46). Leptomycin B is a naturally occurring compound that functions as a specific inhibitor of nuclear export by binding to and inhibiting the nuclear export protein CRM1 (Exportin 1). This inhibition can lead to the accumulation of tumor suppressor proteins and other regulatory proteins in the nucleus, potentially inducing apoptosis in cancer cells. While sepantronium bromide and leptomycin B demonstrated strong efficacy in vitro, their clinical use is hampered by dose-limiting toxicities, underscoring a general limitation of in silico and organoid drug sensitivity predictions. Achieving therapeutic drug levels at non-toxic doses remains a challenge for several candidate agents identified here. Future studies integrating pharmacokinetic and toxicity data with sensitivity profiles will be essential to translate these findings into viable clinical treatments. Both sepantronium bromide and leptomycin B have been studied primarily in preclinical settings and early-phase clinical trials (47,48). These compounds showed promising anti-cancer activity in in vitro (49) and in vivo (50) models. However, their clinical development has faced significant challenges due to toxicity and side effects.
Panobinostat is a histone deacetylase (HDAC) inhibitor that was approved by the U.S. Food and Drug Administration (FDA) for the treatment of multiple myeloma, particularly in combination with other therapies (51). Notably, traditional chemotherapy drugs like cisplatin and docetaxel exhibited less favourable outcomes when compared with other emerging treatments such as immunotherapy (52), consistent with previous reports of their limited efficacy in certain lung cancer subtypes.
Strengths and limitations
The in vitro cytotoxicity assays largely corroborated the predictions made by DrugCell, but some discrepancies were noted. For instance, certain LCOs predicted to be insensitive to thapsigargin and obatoclax mesylate demonstrated sensitivity in vitro, indicating the need for further refinement of predictive models. The clustering analysis of cytotoxicity data revealed distinct response patterns, with LCOs segregating into two main clusters (Figure 5). This clustering provided insights into the heterogeneity of drug responses and suggested potential subgroup-specific therapeutic strategies. Indeed, the rapid development of computational drug response prediction, using approaches such as neural network/deep-learning methods, biomarker or signature-based methods, and traditional machine learning, has significantly improved the accuracy of therapeutic response predictions. Yet, there is currently no robust in silico framework that predicts drug sensitivities from WGS data. Most existing predictors were trained on two-dimensional (2D) cell lines using targeted gene panels or exomes, and they tend to focus almost exclusively on SNVs. As a result, SVs and CNVs, both of which can drive drug resistance or sensitivity, are routinely under-utilized or ignored.
Comparison with similar research
In the evolving landscape of precision oncology, a critical knowledge gap remains in the development of prediction models that fully leverage WGS data. Current models predominantly rely on SNVs, which, while significant, capture only a portion of the genomic alterations driving cancer. SVs, such as translocations and deletions, and CNVs, including gene amplifications or losses, profoundly influence gene function, expression, and tumor behavior, yet they are frequently overlooked. This limited scope restricts our ability to understand the full complexity of cancer genomics and predict treatment outcomes accurately. Integrating comprehensive genomic data from WGS, which encompasses SNVs, SVs, CNVs, and other alterations, offers a transformative opportunity to build more robust prediction models. Such models could reveal intricate interactions between mutation types, enhance drug response predictions, and uncover novel therapeutic targets, ultimately advancing personalized cancer care. Further research to develop and validate these integrative models is essential to bridge this gap and unlock the full potential of WGS in precision oncology.
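To make the idea of integrating several alteration classes concrete, the sketch below encodes SNV, SV, and CNV calls as a single binary gene-level feature matrix that a prediction model could consume. It is an illustrative encoding only: the function name, the `(sample, gene, alteration type)` call format, and the sample labels are assumptions, and real pipelines would typically carry richer information such as copy number or breakpoint context.

```python
from collections import defaultdict

# Alteration classes to encode; one binary column per (gene, class) pair.
ALTERATION_TYPES = ("SNV", "SV", "CNV")


def build_feature_matrix(calls, genes):
    """Return (columns, {sample: [0/1, ...]}) from (sample, gene, type) calls."""
    columns = [(gene, alt) for gene in genes for alt in ALTERATION_TYPES]
    index = {col: i for i, col in enumerate(columns)}
    matrix = defaultdict(lambda: [0] * len(columns))
    for sample, gene, alt_type in calls:
        row = matrix[sample]  # every sample with any call gets a row
        position = index.get((gene, alt_type))
        if position is not None:  # genes outside the panel are ignored
            row[position] = 1
    return columns, dict(matrix)


# Hypothetical calls, not data from this study.
calls = [
    ("LCO_A", "TP53", "SNV"),
    ("LCO_A", "EGFR", "CNV"),
    ("LCO_B", "TP53", "SV"),
    ("LCO_B", "GENE_X", "SNV"),
]
columns, features = build_feature_matrix(calls, genes=["TP53", "EGFR"])
for sample, row in sorted(features.items()):
    print(sample, row)
```

Keeping one column per gene and alteration class, rather than collapsing everything into a single "altered" flag, lets a downstream model weigh, for example, an amplification differently from a point mutation in the same gene.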
Explanations of findings
The findings from this study underscored the utility of integrating genomic data with advanced computational models to predict therapeutic responses in LCOs (53). The moderate sensitivity but imperfect accuracy observed in the selected treatment panel highlighted the need for improving the precision of these predictive models. Several possible reasons may account for the discrepancies. The most notable reason was that the drug response dataset used to train DrugCell was derived from classical two-dimensional (2D) cell culture, whereas the drug responses in this study were measured in cancer organoids. Even though the genomic landscape may not differ significantly between 2D and three-dimensional (3D) cultures, the epigenetic and post-transcriptional backgrounds in these two culture methods may strongly affect drug responses. This highlighted the essential need for using more advanced in vitro conditions that better mimic the tumor microenvironment in future development. To enhance sensitivity, we propose incorporating additional biological and molecular data, such as CNVs, epigenetic modifications, and tumor microenvironment factors, into the model. This could help capture a more comprehensive picture of the tumor’s characteristics and improve the model’s ability to identify effective treatments.
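The sensitivity and accuracy discussed above can be computed by comparing predicted and observed calls for each LCO and drug pair. The sketch below shows one way to do this, assuming both the model output and the cytotoxicity readout have already been dichotomized into sensitive or insensitive; the function name, the dictionary layout, and the example labels are hypothetical, and no thresholds from this study are implied.

```python
def concordance(predicted, observed):
    """Compare predicted and observed calls (True = sensitive) on shared keys."""
    pairs = [(bool(predicted[k]), bool(observed[k]))
             for k in predicted if k in observed]
    tp = sum(p and o for p, o in pairs)            # predicted sensitive, observed sensitive
    tn = sum(not p and not o for p, o in pairs)    # predicted insensitive, observed insensitive
    fp = sum(p and not o for p, o in pairs)        # predicted sensitive, observed insensitive
    fn = sum(not p and o for p, o in pairs)        # predicted insensitive, observed sensitive

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Hypothetical (LCO, drug) calls, not results from this study.
predicted = {("LCO_A", "drug_1"): True, ("LCO_A", "drug_2"): False,
             ("LCO_B", "drug_1"): True, ("LCO_B", "drug_2"): False}
observed = {("LCO_A", "drug_1"): True, ("LCO_A", "drug_2"): True,
            ("LCO_B", "drug_1"): False, ("LCO_B", "drug_2"): False}
metrics = concordance(predicted, observed)
print(metrics)
```

In this framing, an LCO predicted to be insensitive that responds in vitro, as described for thapsigargin and obatoclax mesylate, counts as a false negative and lowers sensitivity.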
Implications and actions needed
Nevertheless, the approach in this study enabled pre-selection of the most promising therapies and improved the precision of rational drug screening, distinguishing it from most other drug screening studies, which tested drug sensitivity without any pre-selection. This pre-selection reduced the need for extensive trial-and-error efforts, thereby potentially reducing the time and cost associated with drug screening. Additionally, these models can aid in repurposing existing drugs for new indications, as demonstrated by the identification of non-anticancer drugs like leptomycin B and thapsigargin in this study, providing additional options for patients and potentially reducing side effects. Future research should focus on enhancing model accuracy by incorporating additional comprehensive genomic and biological data, such as CNVs, epigenetic modifications, and tumor microenvironment factors (54). Nonetheless, broader experimental validation encompassing standard-of-care drugs and additional patient samples will be critical to substantiate clinical applicability.
Conclusions
In conclusion, this work provides a valuable proof-of-concept demonstrating how WGS of patient-derived LCOs combined with deep learning drug response models can facilitate rational drug screening and precision medicine. Although limited by sample size and model performance, the study underscores the potential and existing limitations of current computational approaches, serving as a stepping stone towards improved predictive tools in lung cancer therapy. Future research expanding sample numbers and integrating additional genomic and biological data will be critical to unlocking the full potential of this integrative strategy to optimize individualized cancer treatment.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-738/rc
Data Sharing Statement: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-738/dss
Peer Review File: Available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-738/prf
Funding: The research work described in this report was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tlcr.amegroups.com/article/view/10.21037/tlcr-2025-738/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study protocol was approved by the Ethics Committee of the University of Hong Kong and Hong Kong Hospital Authority Hong Kong West Cluster Institutional Review Board (IRB Reference Number UW 16-104). Informed consent was obtained from all patients before sample collection.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63.
- Jeon H, Wang S, Song J, et al. Update 2025: Management of Non Small-Cell Lung Cancer. Lung 2025;203:53.
- Lim ZF, Ma PC. Emerging insights of tumor heterogeneity and drug resistance mechanisms in lung cancer targeted therapy. J Hematol Oncol 2019;12:134.
- Gautam Roy P, Reingold D, Pathak N, et al. Recent Advances in the Management of EGFR-Mutated Advanced Non-Small Cell Lung Cancer-A Narrative Review. Curr Oncol 2025;32:448.
- de Jager VD, Timens W, Bayle A, et al. Future perspective for the application of predictive biomarker testing in advanced stage non-small cell lung cancer. Lancet Reg Health Eur 2024;38:100839.
- Arbour KC, Riely GJ. Systemic Therapy for Locally Advanced and Metastatic Non-Small Cell Lung Cancer: A Review. JAMA 2019;322:764-74.
- Srivastava S, Jayaswal N, Kumar S, et al. Unveiling the potential of proteomic and genetic signatures for precision therapeutics in lung cancer management. Cell Signal 2024;113:110932.
- Arbour KC, Punekar S, Garrido-Laguna I, et al. 652O Preliminary clinical activity of RMC-6236, a first-in-class, RAS-selective, tri-complex RAS-MULTI(ON) inhibitor in patients with KRAS mutant pancreatic ductal adenocarcinoma (PDAC) and non-small cell lung cancer (NSCLC). Ann Oncol 2023;34:S458.
- Jänne PA, Riely GJ, Gadgeel SM, et al. Adagrasib in Non-Small-Cell Lung Cancer Harboring a KRAS(G12C) Mutation. N Engl J Med 2022;387:120-31.
- de Langen AJ, Johnson ML, Mazieres J, et al. Sotorasib versus docetaxel for previously treated non-small-cell lung cancer with KRAS(G12C) mutation: a randomised, open-label, phase 3 trial. Lancet 2023;401:733-46.
- Sen T, Takahashi N, Chakraborty S, et al. Emerging advances in defining the molecular and therapeutic landscape of small-cell lung cancer. Nat Rev Clin Oncol 2024;21:610-27.
- Parisi D, Adasme MF, Sveshnikova A, et al. Drug repositioning or target repositioning: A structural perspective of drug-target-indication relationship for available repurposed drugs. Comput Struct Biotechnol J 2020;18:1043-55.
- Mei T, Wang T, Zhou Q. Multi-omics and artificial intelligence predict clinical outcomes of immunotherapy in non-small cell lung cancer patients. Clin Exp Med 2024;24:60.
- Kaur P, Singh SK, Mishra MK, et al. Promising Combinatorial Therapeutic Strategies against Non-Small Cell Lung Cancer. Cancers (Basel) 2024;16:2205.
- Thorel L, Perréard M, Florent R, et al. Patient-derived tumor organoids: a new avenue for preclinical research and precision medicine in oncology. Exp Mol Med 2024;56:1531-51.
- Taverna JA, Hung CN, Williams M, et al. Ex vivo drug testing of patient-derived lung organoids to predict treatment responses for personalized medicine. Lung Cancer 2024;190:107533.
- Kwok HH, Li H, Yang J, et al. Single-cell transcriptomic analysis uncovers intratumoral heterogeneity and drug-tolerant persister in ALK-rearranged lung adenocarcinoma. Cancer Commun (Lond) 2023;43:951-5.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009;25:1754-60.
- Van der Auwera GA, O'Connor BD. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra. Sebastopol, CA: O'Reilly Media; 2020.
- McKenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010;20:1297-303.
- Shen R, Seshan VE. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res 2016;44:e131.
- McLaren W, Gil L, Hunt SE, et al. The Ensembl Variant Effect Predictor. Genome Biol 2016;17:122.
- Chen X, Schulz-Trieglaff O, Shaw R, et al. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 2016;32:1220-2.
- Niu B, Ye K, Zhang Q, et al. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics 2014;30:1015-6.
- Mayakonda A, Lin DC, Assenov Y, et al. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res 2018;28:1747-56.
- Dees ND, Zhang Q, Kandoth C, et al. MuSiC: identifying mutational significance in cancer genomes. Genome Res 2012;22:1589-98.
- Yu G. Thirteen years of clusterProfiler. Innovation (Camb) 2024;5:100722.
- Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545-50.
- Kuenzi BM, Park J, Fong SH, et al. Predicting Drug Response and Synergy Using a Deep Learning Model of Human Cancer Cells. Cancer Cell 2020;38:672-684.e6.
- Zhang P, Wang W, Liu L, et al. Analysis of prognostic model based on immunotherapy related genes in lung adenocarcinoma. Sci Rep 2022;12:22077.
- Liu C, Li K, Sui Y, et al. Different gene alterations in patients with non-small-cell lung cancer between the eastern and southern China. Heliyon 2023;9:e20171.
- Kanwal M, Ding XJ, Song X, et al. MUC16 overexpression induced by gene mutations promotes lung cancer cell growth and invasion. Oncotarget 2018;9:12226-39.
- McKenzie CW, Craige B, Kroeger TV, et al. CFAP54 is required for proper ciliary motility and assembly of the central pair apparatus in mice. Mol Biol Cell 2015;26:3140-9.
- Ren W, Li Y, Chen X, et al. RYR2 mutation in non-small cell lung cancer prolongs survival via down-regulation of DKK1 and up-regulation of GS1-115G20.1: A weighted gene Co-expression network analysis and risk prognostic models. IET Syst Biol 2022;16:43-58.
- Laberge JM, Flageole H, Pugash D, et al. Outcome of the prenatally diagnosed congenital cystic adenomatoid lung malformation: a Canadian experience. Fetal Diagn Ther 2001;16:178-86.
- Joerger AC, Stiewe T, Soussi T. TP53: the unluckiest of genes? Cell Death Differ 2025;32:219-24.
- Li H, Yang L, Wang Y, et al. Integrative analysis of TP53 mutations in lung adenocarcinoma for immunotherapies and prognosis. BMC Bioinformatics 2023;24:155.
- Oh JH, Jang SJ, Kim J, et al. Spontaneous mutations in the single TTN gene represent high tumor mutation burden. NPJ Genom Med 2020;5:33.
- Zhai X, Xia Z, Du G, et al. LRP1B suppresses HCC progression through the NCSTN/PI3K/AKT signaling axis and affects doxorubicin resistance. Genes Dis 2023;10:2082-96.
- Li X, Tang Z, Li Z, et al. Somatic mutations that affect early genetic progression and immune microenvironment in gastric carcinoma. Pathol Res Pract 2024;257:155310.
- Wang X, Ricciuti B, Nguyen T, et al. Association between Smoking History and Tumor Mutation Burden in Advanced Non-Small Cell Lung Cancer. Cancer Res 2021;81:2566-73.
- He Z, Feng W, Wang Y, et al. LRP1B mutation is associated with tumor immune microenvironment and progression-free survival in lung adenocarcinoma treated with immune checkpoint inhibitors. Transl Lung Cancer Res 2023;12:510-29.
- Xiao D, Li F, Pan H, et al. Integrative analysis of genomic sequencing data reveals higher prevalence of LRP1B mutations in lung adenocarcinoma patients with COPD. Sci Rep 2017;7:2121.
- Cook GW, Benton MG, Akerley W, et al. Structural variation and its potential impact on genome instability: Novel discoveries in the EGFR landscape by long-read sequencing. PLoS One 2020;15:e0226340.
- Lee AY, Ewing AD, Ellrott K, et al. Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection. Genome Biol 2018;19:188.
- Aoyama Y, Kaibara A, Takada A, et al. Population pharmacokinetic modeling of sepantronium bromide (YM155), a small molecule survivin suppressant, in patients with non-small cell lung cancer, hormone refractory prostate cancer, or unresectable stage III or IV melanoma. Invest New Drugs 2013;31:443-51.
- Kelly RJ, Thomas A, Rajan A, et al. A phase I/II study of sepantronium bromide (YM155, survivin suppressor) with paclitaxel and carboplatin in patients with advanced non-small-cell lung cancer. Ann Oncol 2013;24:2601-6.
- Mutka SC, Yang WQ, Dong SD, et al. Identification of nuclear export inhibitors with potent anticancer activity in vivo. Cancer Res 2009;69:510-7.
- Liu Z, Gao W. Leptomycin B reduces primary and acquired resistance of gefitinib in lung cancer cells. Toxicol Appl Pharmacol 2017;335:16-27.
- Gao W, Lu C, Chen L, et al. Overexpression of CRM1: A Characteristic Feature in a Transformed Phenotype of Lung Carcinogenesis and a Molecular Target for Lung Cancer Adjuvant Therapy. J Thorac Oncol 2015;10:815-25.
- Pan D, Mouhieddine TH, Upadhyay R, et al. Outcomes with panobinostat in heavily pretreated multiple myeloma patients. Semin Oncol 2023;50:40-8.
- Herbst RS, Baas P, Kim DW, et al. Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): a randomised controlled trial. Lancet 2016;387:1540-50.
- Blandino G, Satchi-Fainaro R, Tinhofer I, et al. Cancer Organoids as reliable disease models to drive clinical development of novel therapies. J Exp Clin Cancer Res 2024;43:334.
- Lenhof K, Eckhart L, Rolli LM, et al. Trust me if you can: a survey on reliability and interpretability of machine learning approaches for drug sensitivity prediction in cancer. Brief Bioinform 2024;25:bbae379.

