new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Apr 22

3DSAM-adapter: Holistic Adaptation of SAM from 2D to 3D for Promptable Medical Image Segmentation

Despite that the segment anything model (SAM) achieved impressive results on general-purpose semantic segmentation with strong generalization ability on daily images, its demonstrated performance on medical image segmentation is less precise and not stable, especially when dealing with tumor segmentation tasks that involve objects of small sizes, irregular shapes, and low contrast. Notably, the original SAM architecture is designed for 2D natural images, therefore would not be able to extract the 3D spatial information from volumetric medical data effectively. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme for architecture modification, we transfer the SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. The fine-tuning process is conducted in a parameter-efficient manner, wherein most of the pre-trained parameters remain frozen, and only a few lightweight spatial adapters are introduced and tuned. Regardless of the domain gap between natural and medical data and the disparity in the spatial arrangement between 2D and 3D, the transformer trained on natural images can effectively capture the spatial patterns present in volumetric medical images with only lightweight adaptations. We conduct experiments on four open-source tumor segmentation datasets, and with a single click prompt, our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, colon cancer segmentation, and achieve similar performance for liver tumor segmentation. We also compare our adaptation method with existing popular adapters, and observed significant performance improvement on most datasets.

  • 8 authors
·
Jun 23, 2023

Performance of a Deep Learning-Based Segmentation Model for Pancreatic Tumors on Public Endoscopic Ultrasound Datasets

Background: Pancreatic cancer is one of the most aggressive cancers, with poor survival rates. Endoscopic ultrasound (EUS) is a key diagnostic modality, but its effectiveness is constrained by operator subjectivity. This study evaluates a Vision Transformer-based deep learning segmentation model for pancreatic tumors. Methods: A segmentation model using the USFM framework with a Vision Transformer backbone was trained and validated with 17,367 EUS images (from two public datasets) in 5-fold cross-validation. The model was tested on an independent dataset of 350 EUS images from another public dataset, manually segmented by radiologists. Preprocessing included grayscale conversion, cropping, and resizing to 512x512 pixels. Metrics included Dice similarity coefficient (DSC), intersection over union (IoU), sensitivity, specificity, and accuracy. Results: In 5-fold cross-validation, the model achieved a mean DSC of 0.651 +/- 0.738, IoU of 0.579 +/- 0.658, sensitivity of 69.8%, specificity of 98.8%, and accuracy of 97.5%. For the external validation set, the model achieved a DSC of 0.657 (95% CI: 0.634-0.769), IoU of 0.614 (95% CI: 0.590-0.689), sensitivity of 71.8%, and specificity of 97.7%. Results were consistent, but 9.7% of cases exhibited erroneous multiple predictions. Conclusions: The Vision Transformer-based model demonstrated strong performance for pancreatic tumor segmentation in EUS images. However, dataset heterogeneity and limited external validation highlight the need for further refinement, standardization, and prospective studies.

  • 14 authors
·
Jan 9

TextSAM-EUS: Text Prompt Learning for SAM to Accurately Segment Pancreatic Tumor in Endoscopic Ultrasound

Pancreatic cancer carries a poor prognosis and relies on endoscopic ultrasound (EUS) for targeted biopsy and radiotherapy. However, the speckle noise, low contrast, and unintuitive appearance of EUS make segmentation of pancreatic tumors with fully supervised deep learning (DL) models both error-prone and dependent on large, expert-curated annotation datasets. To address these challenges, we present TextSAM-EUS, a novel, lightweight, text-driven adaptation of the Segment Anything Model (SAM) that requires no manual geometric prompts at inference. Our approach leverages text prompt learning (context optimization) through the BiomedCLIP text encoder in conjunction with a LoRA-based adaptation of SAM's architecture to enable automatic pancreatic tumor segmentation in EUS, tuning only 0.86% of the total parameters. On the public Endoscopic Ultrasound Database of the Pancreas, TextSAM-EUS with automatic prompts attains 82.69% Dice and 85.28% normalized surface distance (NSD), and with manual geometric prompts reaches 83.10% Dice and 85.70% NSD, outperforming both existing state-of-the-art (SOTA) supervised DL models and foundation models (e.g., SAM and its variants). As the first attempt to incorporate prompt learning in SAM-based medical image segmentation, TextSAM-EUS offers a practical option for efficient and robust automatic EUS segmentation. Code is available at https://github.com/HealthX-Lab/TextSAM-EUS .

  • 7 authors
·
Jul 24, 2025

Meta-information-aware Dual-path Transformer for Differential Diagnosis of Multi-type Pancreatic Lesions in Multi-phase CT

Pancreatic cancer is one of the leading causes of cancer-related death. Accurate detection, segmentation, and differential diagnosis of the full taxonomy of pancreatic lesions, i.e., normal, seven major types of lesions, and other lesions, is critical to aid the clinical decision-making of patient management and treatment. However, existing works focus on segmentation and classification for very specific lesion types (PDAC) or groups. Moreover, none of the previous work considers using lesion prevalence-related non-imaging patient information to assist the differential diagnosis. To this end, we develop a meta-information-aware dual-path transformer and exploit the feasibility of classification and segmentation of the full taxonomy of pancreatic lesions. Specifically, the proposed method consists of a CNN-based segmentation path (S-path) and a transformer-based classification path (C-path). The S-path focuses on initial feature extraction by semantic segmentation using a UNet-based network. The C-path utilizes both the extracted features and meta-information for patient-level classification based on stacks of dual-path transformer blocks that enhance the modeling of global contextual information. A large-scale multi-phase CT dataset of 3,096 patients with pathology-confirmed pancreatic lesion class labels, voxel-wise manual annotations of lesions from radiologists, and patient meta-information, was collected for training and evaluations. Our results show that our method can enable accurate classification and segmentation of the full taxonomy of pancreatic lesions, approaching the accuracy of the radiologist's report and significantly outperforming previous baselines. Results also show that adding the common meta-information, i.e., gender and age, can boost the model's performance, thus demonstrating the importance of meta-information for aiding pancreatic disease diagnosis.

  • 8 authors
·
Mar 1, 2023

PRISM: A Promptable and Robust Interactive Segmentation Model with Visual Prompts

In this paper, we present PRISM, a Promptable and Robust Interactive Segmentation Model, aiming for precise segmentation of 3D medical images. PRISM accepts various visual inputs, including points, boxes, and scribbles as sparse prompts, as well as masks as dense prompts. Specifically, PRISM is designed with four principles to achieve robustness: (1) Iterative learning. The model produces segmentations by using visual prompts from previous iterations to achieve progressive improvement. (2) Confidence learning. PRISM employs multiple segmentation heads per input image, each generating a continuous map and a confidence score to optimize predictions. (3) Corrective learning. Following each segmentation iteration, PRISM employs a shallow corrective refinement network to reassign mislabeled voxels. (4) Hybrid design. PRISM integrates hybrid encoders to better capture both the local and global information. Comprehensive validation of PRISM is conducted using four public datasets for tumor segmentation in the colon, pancreas, liver, and kidney, highlighting challenges caused by anatomical variations and ambiguous boundaries in accurate tumor identification. Compared to state-of-the-art methods, both with and without prompt engineering, PRISM significantly improves performance, achieving results that are close to human levels. The code is publicly available at https://github.com/MedICL-VU/PRISM.

  • 5 authors
·
Apr 23, 2024

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

While making a tremendous impact in various fields, deep neural networks usually require large amounts of labeled data for training which are expensive to collect in many applications, especially in the medical domain. Unlabeled data, on the other hand, is much more abundant. Semi-supervised learning techniques, such as co-training, could provide a powerful tool to leverage unlabeled data. In this paper, we propose a novel framework, uncertainty-aware multi-view co-training (UMCT), to address semi-supervised learning on 3D data, such as volumetric data from medical imaging. In our work, co-training is achieved by exploiting multi-viewpoint consistency of 3D data. We generate different views by rotating or permuting the 3D data and utilize asymmetrical 3D kernels to encourage diversified features in different sub-networks. In addition, we propose an uncertainty-weighted label fusion mechanism to estimate the reliability of each view's prediction with Bayesian deep learning. As one view requires the supervision from other views in co-training, our self-adaptive approach computes a confidence score for the prediction of each unlabeled sample in order to assign a reliable pseudo label. Thus, our approach can take advantage of unlabeled data during training. We show the effectiveness of our proposed semi-supervised method on several public datasets from medical image segmentation tasks (NIH pancreas & LiTS liver tumor dataset). Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

  • 9 authors
·
Nov 29, 2018

RadGPT: Constructing 3D Image-Text Tumor Datasets

With over 85 million CT scans performed annually in the United States, creating tumor-related reports is a challenging and time-consuming task for radiologists. To address this need, we present RadGPT, an Anatomy-Aware Vision-Language AI Agent for generating detailed reports from CT scans. RadGPT first segments tumors, including benign cysts and malignant tumors, and their surrounding anatomical structures, then transforms this information into both structured reports and narrative reports. These reports provide tumor size, shape, location, attenuation, volume, and interactions with surrounding blood vessels and organs. Extensive evaluation on unseen hospitals shows that RadGPT can produce accurate reports, with high sensitivity/specificity for small tumor (<2 cm) detection: 80/73% for liver tumors, 92/78% for kidney tumors, and 77/77% for pancreatic tumors. For large tumors, sensitivity ranges from 89% to 97%. The results significantly surpass the state-of-the-art in abdominal CT report generation. RadGPT generated reports for 17 public datasets. Through radiologist review and refinement, we have ensured the reports' accuracy, and created the first publicly available image-text 3D medical dataset, comprising over 1.8 million text tokens and 2.7 million images from 9,262 CT scans, including 2,947 tumor scans/reports of 8,562 tumor instances. Our reports can: (1) localize tumors in eight liver sub-segments and three pancreatic sub-segments annotated per-voxel; (2) determine pancreatic tumor stage (T1-T4) in 260 reports; and (3) present individual analyses of multiple tumors--rare in human-made reports. Importantly, 948 of the reports are for early-stage tumors.

  • 10 authors
·
Jan 8, 2025

DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation

Automatic organ segmentation is an important yet challenging problem for medical image analysis. The pancreas is an abdominal organ with very high anatomical variability. This inhibits previous segmentation methods from achieving high accuracies, especially compared to other organs such as the liver, heart or kidneys. In this paper, we present a probabilistic bottom-up approach for pancreas segmentation in abdominal computed tomography (CT) scans, using multi-level deep convolutional networks (ConvNets). We propose and evaluate several variations of deep ConvNets in the context of hierarchical, coarse-to-fine classification on image patches and regions, i.e. superpixels. We first present a dense labeling of local image patches via P{-}ConvNet and nearest neighbor fusion. Then we describe a regional ConvNet (R_1{-}ConvNet) that samples a set of bounding boxes around each image superpixel at different scales of contexts in a "zoom-out" fashion. Our ConvNets learn to assign class probabilities for each superpixel region of being pancreas. Last, we study a stacked R_2{-}ConvNet leveraging the joint space of CT intensities and the P{-}ConvNet dense probability maps. Both 3D Gaussian smoothing and 2D conditional random fields are exploited as structured predictions for post-processing. We evaluate on CT images of 82 patients in 4-fold cross-validation. We achieve a Dice Similarity Coefficient of 83.6pm6.3% in training and 71.8pm10.7% in testing.

  • 7 authors
·
Jun 21, 2015

CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection

Gastrointestinal malignancies constitute a leading cause of cancer-related mortality worldwide, with advanced-stage prognosis remaining particularly dismal. Originating as a groundbreaking technique for early gastric cancer treatment, Endoscopic Submucosal Dissection has evolved into a versatile intervention for diverse gastrointestinal lesions. While computer-assisted systems significantly enhance procedural precision and safety in ESD, their clinical adoption faces a critical bottleneck: reliable surgical phase recognition within complex endoscopic workflows. Current state-of-the-art approaches predominantly rely on multi-stage refinement architectures that iteratively optimize temporal predictions. In this paper, we present Clinical Prior Knowledge-Constrained Diffusion (CPKD), a novel generative framework that reimagines phase recognition through denoising diffusion principles while preserving the core iterative refinement philosophy. This architecture progressively reconstructs phase sequences starting from random noise and conditioned on visual-temporal features. To better capture three domain-specific characteristics, including positional priors, boundary ambiguity, and relation dependency, we design a conditional masking strategy. Furthermore, we incorporate clinical prior knowledge into the model training to improve its ability to correct phase logical errors. Comprehensive evaluations on ESD820, Cholec80, and external multi-center demonstrate that our proposed CPKD achieves superior or comparable performance to state-of-the-art approaches, validating the effectiveness of diffusion-based generative paradigms for surgical phase recognition.

  • 7 authors
·
Jul 4, 2025

MSWAL: 3D Multi-class Segmentation of Whole Abdominal Lesions Dataset

With the significantly increasing incidence and prevalence of abdominal diseases, there is a need to embrace greater use of new innovations and technology for the diagnosis and treatment of patients. Although deep-learning methods have notably been developed to assist radiologists in diagnosing abdominal diseases, existing models have the restricted ability to segment common lesions in the abdomen due to missing annotations for typical abdominal pathologies in their training datasets. To address the limitation, we introduce MSWAL, the first 3D Multi-class Segmentation of the Whole Abdominal Lesions dataset, which broadens the coverage of various common lesion types, such as gallstones, kidney stones, liver tumors, kidney tumors, pancreatic cancer, liver cysts, and kidney cysts. With CT scans collected from 694 patients (191,417 slices) of different genders across various scanning phases, MSWAL demonstrates strong robustness and generalizability. The transfer learning experiment from MSWAL to two public datasets, LiTS and KiTS, effectively demonstrates consistent improvements, with Dice Similarity Coefficient (DSC) increase of 3.00% for liver tumors and 0.89% for kidney tumors, demonstrating that the comprehensive annotations and diverse lesion types in MSWAL facilitate effective learning across different domains and data distributions. Furthermore, we propose Inception nnU-Net, a novel segmentation framework that effectively integrates an Inception module with the nnU-Net architecture to extract information from different receptive fields, achieving significant enhancement in both voxel-level DSC and region-level F1 compared to the cutting-edge public algorithms on MSWAL. Our dataset will be released after being accepted, and the code is publicly released at https://github.com/tiuxuxsh76075/MSWAL-.

  • 16 authors
·
Mar 17, 2025

Meningioma segmentation in T1-weighted MRI leveraging global context and attention mechanisms

Meningiomas are the most common type of primary brain tumor, accounting for approximately 30% of all brain tumors. A substantial number of these tumors are never surgically removed but rather monitored over time. Automatic and precise meningioma segmentation is therefore beneficial to enable reliable growth estimation and patient-specific treatment planning. In this study, we propose the inclusion of attention mechanisms over a U-Net architecture: (i) Attention-gated U-Net (AGUNet) and (ii) Dual Attention U-Net (DAUNet), using a 3D MRI volume as input. Attention has the potential to leverage the global context and identify features' relationships across the entire volume. To limit spatial resolution degradation and loss of detail inherent to encoder-decoder architectures, we studied the impact of multi-scale input and deep supervision components. The proposed architectures are trainable end-to-end and each concept can be seamlessly disabled for ablation studies. The validation studies were performed using a 5-fold cross validation over 600 T1-weighted MRI volumes from St. Olavs University Hospital, Trondheim, Norway. For the best performing architecture, an average Dice score of 81.6% was reached for an F1-score of 95.6%. With an almost perfect precision of 98%, meningiomas smaller than 3ml were occasionally missed hence reaching an overall recall of 93%. Leveraging global context from a 3D MRI volume provided the best performances, even if the native volume resolution could not be processed directly. Overall, near-perfect detection was achieved for meningiomas larger than 3ml which is relevant for clinical use. In the future, the use of multi-scale designs and refinement networks should be further investigated to improve the performance. A larger number of cases with meningiomas below 3ml might also be needed to improve the performance for the smallest tumors.

  • 5 authors
·
Jan 19, 2021

ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases

Hepato-pancreato-biliary (HPB) disorders represent a global public health challenge due to their high morbidity and mortality. Although large language models (LLMs) have shown promising performance in general medical question-answering tasks, the current evaluation benchmarks are mostly derived from standardized examinations or manually designed questions, lacking HPB coverage and clinical cases. To address these issues, we systematically eatablish an HPB disease evaluation benchmark comprising 3,535 closed-ended multiple-choice questions and 337 open-ended real diagnosis cases, which encompasses all the 33 main categories and 465 subcategories of HPB diseases defined in the International Statistical Classification of Diseases, 10th Revision (ICD-10). The multiple-choice questions are curated from public datasets and synthesized data, and the clinical cases are collected from prestigious medical journals, case-sharing platforms, and collaborating hospitals. By evalauting commercial and open-source general and medical LLMs on our established benchmark, namely ClinBench-HBP, we find that while commercial LLMs perform competently on medical exam questions, they exhibit substantial performance degradation on HPB diagnosis tasks, especially on complex, inpatient clinical cases. Those medical LLMs also show limited generalizability to HPB diseases. Our results reveal the critical limitations of current LLMs in the domain of HPB diseases, underscoring the imperative need for future medical LLMs to handle real, complex clinical diagnostics rather than simple medical exam questions. The benchmark will be released at https://clinbench-hpb.github.io.

  • 6 authors
·
May 30, 2025

Optimizing Brain Tumor Segmentation with MedNeXt: BraTS 2024 SSA and Pediatrics

Identifying key pathological features in brain MRIs is crucial for the long-term survival of glioma patients. However, manual segmentation is time-consuming, requiring expert intervention and is susceptible to human error. Therefore, significant research has been devoted to developing machine learning methods that can accurately segment tumors in 3D multimodal brain MRI scans. Despite their progress, state-of-the-art models are often limited by the data they are trained on, raising concerns about their reliability when applied to diverse populations that may introduce distribution shifts. Such shifts can stem from lower quality MRI technology (e.g., in sub-Saharan Africa) or variations in patient demographics (e.g., children). The BraTS-2024 challenge provides a platform to address these issues. This study presents our methodology for segmenting tumors in the BraTS-2024 SSA and Pediatric Tumors tasks using MedNeXt, comprehensive model ensembling, and thorough postprocessing. Our approach demonstrated strong performance on the unseen validation set, achieving an average Dice Similarity Coefficient (DSC) of 0.896 on the BraTS-2024 SSA dataset and an average DSC of 0.830 on the BraTS Pediatric Tumor dataset. Additionally, our method achieved an average Hausdorff Distance (HD95) of 14.682 on the BraTS-2024 SSA dataset and an average HD95 of 37.508 on the BraTS Pediatric dataset. Our GitHub repository can be accessed here: Project Repository : https://github.com/python-arch/BioMbz-Optimizing-Brain-Tumor-Segmentation-with-MedNeXt-BraTS-2024-SSA-and-Pediatrics

  • 9 authors
·
Nov 24, 2024 2

Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images

Brain tumors pose a significant threat to human life, therefore it is very much necessary to detect them accurately in the early stages for better diagnosis and treatment. Brain tumors can be detected by the radiologist manually from the MRI scan images of the patients. However, the incidence of brain tumors has risen amongst children and adolescents in recent years, resulting in a substantial volume of data, as a result, it is time-consuming and difficult to detect manually. With the emergence of Artificial intelligence in the modern world and its vast application in the medical field, we can make an approach to the CAD (Computer Aided Diagnosis) system for the early detection of Brain tumors automatically. All the existing models for this task are not completely generalized and perform poorly on the validation data. So, we have proposed two novel Deep Learning Architectures - (a) SAETCN (Self-Attention Enhancement Tumor Classification Network) for the classification of different kinds of brain tumors. We have achieved an accuracy of 99.38% on the validation dataset making it one of the few Novel Deep learning-based architecture that is capable of detecting brain tumors accurately. We have trained the model on the dataset, which contains images of 3 types of tumors (glioma, meningioma, and pituitary tumors) and non-tumor cases. and (b) SAS-Net (Self-Attentive Segmentation Network) for the accurate segmentation of brain tumors. We have achieved an overall pixel accuracy of 99.23%.

  • 2 authors
·
Dec 6, 2025 2

Brain Cancer Segmentation Using YOLOv5 Deep Neural Network

An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. Any portion of the brain or skull can develop a brain tumor, including the brain's protective coating, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten years, numerous developments in the field of computer-aided brain tumor diagnosis have been made. Recently, instance segmentation has attracted a lot of interest in numerous computer vision applications. It seeks to assign various IDs to various scene objects, even if they are members of the same class. Typically, a two-stage pipeline is used to perform instance segmentation. This study shows brain cancer segmentation using YOLOv5. Yolo takes dataset as picture format and corresponding text file. You Only Look Once (YOLO) is a viral and widely used algorithm. YOLO is famous for its object recognition properties. You Only Look Once (YOLO) is a popular algorithm that has gone viral. YOLO is well known for its ability to identify objects. YOLO V2, V3, V4, and V5 are some of the YOLO latest versions that experts have published in recent years. Early brain tumor detection is one of the most important jobs that neurologists and radiologists have. However, it can be difficult and error-prone to manually identify and segment brain tumors from Magnetic Resonance Imaging (MRI) data. For making an early diagnosis of the condition, an automated brain tumor detection system is necessary. The model of the research paper has three classes. They are respectively Meningioma, Pituitary, Glioma. The results show that, our model achieves competitive accuracy, in terms of runtime usage of M2 10 core GPU.

  • 3 authors
·
Dec 27, 2022

The Liver Tumor Segmentation Benchmark (LiTS)

In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances with various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that not a single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dices scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.

  • 109 authors
·
Jan 13, 2019

New combinational therapies for cancer using modern statistical mechanics

We investigate a new dynamical system that describes tumor-host interaction. The equation that describes the untreated tumor growth is based on non-extensive statistical mechanics. Recently, this model has been shown to fit successfully exponential, Gompertz, logistic, and power-law tumor growths. We have been able to include as many hallmarks of cancer as possible. We study also the dynamic response of cancer under therapy. Using our model, we can make predictions about the different outcomes when we change the parameters, and/or the initial conditions. We can determine the importance of different factors to influence tumor growth. We discover synergistic therapeutic effects of different treatments and drugs. Cancer is generally untreatable using conventional monotherapy. We consider conventional therapies, oncogene-targeted therapies, tumor-suppressors gene-targeted therapies, immunotherapies, anti-angiogenesis therapies, virotherapy, among others. We need therapies with the potential to target both tumor cells and the tumors' microenvironment. Drugs that target oncogenes and tumor-suppressor genes can be effective in the treatment of some cancers. However, most tumors do reoccur. We have found that the success of the new therapeutic agents can be seen when used in combination with other cancer-cell-killing therapies. Our results have allowed us to design a combinational therapy that can lead to the complete eradication of cancer.

  • 19 authors
·
Feb 2, 2019

Scaling Artificial Intelligence for Multi-Tumor Early Detection with More Reports, Fewer Masks

Early tumor detection save lives. Each year, more than 300 million computed tomography (CT) scans are performed worldwide, offering a vast opportunity for effective cancer screening. However, detecting small or early-stage tumors on these CT scans remains challenging, even for experts. Artificial intelligence (AI) models can assist by highlighting suspicious regions, but training such models typically requires extensive tumor masks--detailed, voxel-wise outlines of tumors manually drawn by radiologists. Drawing these masks is costly, requiring years of effort and millions of dollars. In contrast, nearly every CT scan in clinical practice is already accompanied by medical reports describing the tumor's size, number, appearance, and sometimes, pathology results--information that is rich, abundant, and often underutilized for AI training. We introduce R-Super, which trains AI to segment tumors that match their descriptions in medical reports. This approach scales AI training with large collections of readily available medical reports, substantially reducing the need for manually drawn tumor masks. When trained on 101,654 reports, AI models achieved performance comparable to those trained on 723 masks. Combining reports and masks further improved sensitivity by +13% and specificity by +8%, surpassing radiologists in detecting five of the seven tumor types. Notably, R-Super enabled segmentation of tumors in the spleen, gallbladder, prostate, bladder, uterus, and esophagus, for which no public masks or AI models previously existed. This study challenges the long-held belief that large-scale, labor-intensive tumor mask creation is indispensable, establishing a scalable and accessible path toward early detection across diverse tumor types. We plan to release our trained models, code, and dataset at https://github.com/MrGiovanni/R-Super

  • 23 authors
·
Oct 16, 2025

A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation

Artificial intelligence-assisted imaging analysis has made substantial strides in tumor diagnosis and management. Here we present PASTA, a pan-tumor CT foundation model that achieves state-of-the-art performance on 45 of 46 representative oncology tasks -- including lesion segmentation, tumor detection in plain CT, tumor staging, survival prediction, structured report generation, and cross-modality transfer learning, significantly outperforming the second-best models on 35 tasks. This remarkable advancement is driven by our development of PASTA-Gen, an innovative synthetic tumor generation framework that produces a comprehensive dataset of 30,000 CT scans with pixel-level annotated lesions and paired structured reports, encompassing malignancies across ten organs and five benign lesion types. By leveraging this rich, high-quality synthetic data, we overcome a longstanding bottleneck in the development of CT foundation models -- specifically, the scarcity of publicly available, high-quality annotated datasets due to privacy constraints and the substantial labor required for scaling precise data annotation. Encouragingly, PASTA demonstrates exceptional data efficiency with promising practical value, markedly improving performance on various tasks with only a small amount of real-world data. The open release of both the synthetic dataset and PASTA foundation model effectively addresses the challenge of data scarcity, thereby advancing oncological research and clinical translation.

  • 16 authors
·
Feb 10, 2025

Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification

Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer. Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years +/- 11 [s.d.]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity. Results: On NIH-LC (25 patients; mean age, 59 years +/- 14 [s.d.]; 14 male) and NIH-OV (166 patients; mean age, 65 years +/- 9 [s.d.]; all female), the model achieved Dice scores of 0.855 +/- 0.061 (CI: 0.831-0.878) and 0.826 +/- 0.153 (CI: 0.764-0.887), with median volume estimation errors of 19.6% (IQR: 13.2-29.0) and 5.3% (IQR: 2.4-9.7) respectively. On UofW-LC (124 patients; mean age, 46 years +/- 12 [s.d.]; 73 female), the model had a Dice score of 0.830 +/- 0.107 (CI: 0.798-0.863) and median volume estimation error of 9.7% (IQR: 4.5-15.1). The model showed strong agreement with expert assessments, with r^2 values of 0.79, 0.98, and 0.97 across the test sets. Conclusion: The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments.

  • 7 authors
·
Jun 22, 2024

How We Won BraTS-SSA 2025: Brain Tumor Segmentation in the Sub-Saharan African Population Using Segmentation-Aware Data Augmentation and Model Ensembling

Brain tumors, particularly gliomas, pose significant chall-enges due to their complex growth patterns, infiltrative nature, and the variability in brain structure across individuals, which makes accurate diagnosis and monitoring difficult. Deep learning models have been developed to accurately delineate these tumors. However, most of these models were trained on relatively homogenous high-resource datasets, limiting their robustness when deployed in underserved regions. In this study, we performed segmentation-aware offline data augmentation on the BraTS-Africa dataset to increase the data sample size and diversity to enhance generalization. We further constructed an ensemble of three distinct architectures, MedNeXt, SegMamba, and Residual-Encoder U-Net, to leverage their complementary strengths. Our best-performing model, MedNeXt, was trained on 1000 epochs and achieved the highest average lesion-wise dice and normalized surface distance scores of 0.86 and 0.81 respectively. However, the ensemble model trained for 500 epochs produced the most balanced segmentation performance across the tumour subregions. This work demonstrates that a combination of advanced augmentation and model ensembling can improve segmentation accuracy and robustness on diverse and underrepresented datasets. Code available at: https://github.com/SPARK-Academy-2025/SPARK-2025/tree/main/SPARK2025_BraTs_MODELS/SPARK_NeuroAshanti

  • 8 authors
·
Oct 3, 2025

GlucoLens: Explainable Postprandial Blood Glucose Prediction from Diet and Physical Activity

Postprandial hyperglycemia, marked by the blood glucose level exceeding the normal range after meals, is a critical indicator of progression toward type 2 diabetes in prediabetic and healthy individuals. A key metric for understanding blood glucose dynamics after eating is the postprandial area under the curve (PAUC). Predicting PAUC in advance based on a person's diet and activity level and explaining what affects postprandial blood glucose could allow an individual to adjust their lifestyle accordingly to maintain normal glucose levels. In this paper, we propose GlucoLens, an explainable machine learning approach to predict PAUC and hyperglycemia from diet, activity, and recent glucose patterns. We conducted a five-week user study with 10 full-time working individuals to develop and evaluate the computational model. Our machine learning model takes multimodal data including fasting glucose, recent glucose, recent activity, and macronutrient amounts, and provides an interpretable prediction of the postprandial glucose pattern. Our extensive analyses of the collected data revealed that the trained model achieves a normalized root mean squared error (NRMSE) of 0.123. On average, GlucoLense with a Random Forest backbone provides a 16% better result than the baseline models. Additionally, GlucoLens predicts hyperglycemia with an accuracy of 74% and recommends different options to help avoid hyperglycemia through diverse counterfactual explanations. Code available: https://github.com/ab9mamun/GlucoLens.

  • 7 authors
·
Mar 5, 2025