🧠 STXBP1-ARIA: AI Research Intelligence Assistant

The first vision-language model specialized for STXBP1/Munc18-1 research and rare neurological disease.

Fine-tuned on 470,000 biomedical figure-caption pairs from 31,786 PubMed Central articles & 175,000 article images.

| Base Model | Training Data | Parameters | Context |
|---|---|---|---|
| Qwen3-VL-8B-Instruct | 470K samples | 8.9B (175M trainable) | 32K tokens (1M with RoPE) |

🎯 Model Highlights

  • Domain Expert: Deep knowledge of STXBP1, Munc18-1, synaptic function, epileptic encephalopathy, and gene therapy approaches
  • Figure Understanding: Trained on scientific figures including Western blots, electrophysiology traces, protein structures, and experimental schematics
  • Dual Response Modes: Concise caption-style output or comprehensive expert explanations, depending on the prompt and context settings (e.g., "give a detailed, comprehensive explanation about 123XYZ")
  • No Hallucination on Out-of-Domain Images: Correctly handles non-STXBP1 images, such as cute cat photos, without forcing domain relevance (ask me how I know)

📖 The Story Behind This Model

This model was created by me, a father whose daughter has STXBP1 syndrome (K196X mutation).

STXBP1-ARIA-Qwen3-VL-8B-LoRA represents:

  • 31,786 scientific articles processed
  • 470,000+ figure-caption pairs extracted
  • 99.7% image-text match rate achieved
  • 25 hours of training on 4x NVIDIA H100 GPUs
  • One parent's determination to advance rare disease research

🚀 Quick Start

Installation

pip install transformers peft accelerate torch
pip install qwen-vl-utils  # For image processing

Basic Usage

import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration
from peft import PeftModel
from PIL import Image

# Load base model
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn; omit this line if unavailable
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "SkyWhal3/STXBP1-ARIA-Qwen3-VL-8B-LoRa")
model.eval()

# Load processor
processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-8B-Instruct")

# Prepare image and prompt
image = Image.open("your_figure.png").convert("RGB")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this scientific figure in detail."}
        ]
    }
]

# Generate
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=32768, do_sample=True, temperature=0.7, top_p=0.8)
response = processor.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

💡 Prompt Strategies

Caption Mode (Concise)

Describe this scientific figure.

Response style: Brief, publication-ready caption (~200-500 chars)

Expert Mode (Comprehensive)

Provide a detailed, comprehensive explanation of [topic]. 
Include mechanisms, advantages, disadvantages, and current research status.

Response style: Full expert analysis (~2000-8000+ chars)
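For scripted use, the two modes can be wrapped in a small prompt-building helper. This is a hypothetical convenience function (not part of the released tooling); it only assembles the `messages` payload shown in the Quick Start:

```python
# Hypothetical helper: builds the chat "messages" payload for either response mode.
# `build_messages` and its argument names are illustrative, not an official API.
def build_messages(image, topic=None, mode="caption"):
    if mode == "caption":
        prompt = "Describe this scientific figure."
    elif mode == "expert":
        prompt = (
            f"Provide a detailed, comprehensive explanation of {topic}. "
            "Include mechanisms, advantages, disadvantages, and current research status."
        )
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]
```

The returned list can be passed directly to `processor.apply_chat_template(...)` exactly as in the Quick Start.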


📊 Example Outputs

Example 1: AAV-STXBP1 Gene Therapy Figure

👤 User: "What gene therapy approach is shown here?"

🤖 STXBP1-ARIA Response:

In silico pan-neuronal promoter design and NGS screening for DRG-detargeting regulatory elements were used to identify a safe and selective AAV-STXBP1 construct that rescues STXBP1 phenotypes in mice and nonhuman primates.

Example 2: Munc18-1 Electrophysiology

👤 User: "What type of experiment does this figure show?"

🤖 STXBP1-ARIA Response:

Munc18-1 cKO mice display reduced spontaneous glutamate release. (A) Representative traces and cumulative distribution plots of mEPSCs in Munc18-1+/+ and Munc18-1−/− cKO mice no. 1. (B) Representative traces and cumulative distribution plots of mEPSCs in Munc18-1+/+ and Munc18-1−/− cKO mice no. 2. Data are expressed as mean ± SEM. **p < 0.01; ***p < 0.001.

Example 3: Therapeutic Approaches (Expert Mode)

👤 User: Provide a detailed, comprehensive explanation of the therapeutic approaches for STXBP1 K196X nonsense mutations. Include mechanisms, advantages, disadvantages, and current research status.

🤖 STXBP1-ARIA Response: STXBP1 K196X Nonsense Mutation: Therapeutic Approaches, Mechanisms, and Current Research Status

The STXBP1 K196X nonsense mutation is a pathogenic variant in the syntaxin-binding protein 1 (STXBP1) gene, which encodes Munc18-1, a critical protein involved in synaptic vesicle fusion and neurotransmitter release. This mutation results in a premature stop codon at position 196, leading to truncated, nonfunctional protein and haploinsufficiency. STXBP1-related disorders include epileptic encephalopathies, developmental delay, intellectual disability, and autism spectrum disorder.

Therapeutic approaches for STXBP1 K196X can be broadly categorized into:

  1. Gene Therapy
  2. mRNA-Based Therapies
  3. Protein Replacement Therapy
  4. Small-Molecule Therapeutics (including readthrough agents)
  5. CRISPR-Cas9 Genome Editing
  6. Pharmacological Interventions (Symptomatic Management)

Below is a detailed analysis of each approach, including mechanisms, advantages, disadvantages, and current research status.

1. Gene Therapy Mechanism: Gene therapy aims to deliver a functional copy of the STXBP1 gene to affected cells using viral vectors (e.g., adeno-associated virus [AAV] or lentivirus) or non-viral delivery systems (e.g., lipid nanoparticles). The goal is to restore normal Munc18-1 expression and function.

Advantages: • Potential for long-term correction. • Can achieve targeted delivery to the central nervous system (CNS). • May provide durable expression without the need for repeated dosing.

Disadvantages: • Risk of immune responses against viral vectors. • Limited cargo capacity of AAVs (~4.7 kb), which may restrict delivery of large STXBP1 cDNA. • Challenges with CNS delivery due to blood-brain barrier (BBB) limitations. • Off-target effects and insertional mutagenesis risks.

Current Research Status: Preclinical studies are underway using AAV9 vectors to deliver full-length STXBP1 to murine models of STXBP1 deficiency. Early results show improved survival and reduced seizure frequency in animal models. Human trials have not yet been initiated due to safety concerns and lack of regulatory approval for STXBP1-specific therapies.

2. mRNA-Based Therapies Mechanism: mRNA-based therapies involve delivering synthetic mRNA encoding functional STXBP1 directly to target cells. This approach bypasses genomic integration and allows transient expression of the protein.

Advantages: • Avoids risks associated with gene integration. • Enables rapid production of therapeutic proteins. • Can be produced at scale using established mRNA manufacturing technologies.

Disadvantages: • Short duration of action requires repeated administration. • Delivery challenges, particularly across the BBB. • Potential immune activation by foreign mRNA.

Current Research Status: mRNA delivery systems, such as lipid nanoparticles (LNPs), have shown promise in preclinical models. Studies have demonstrated that intrathecal delivery of STXBP1 mRNA improves neuronal function and reduces seizure severity in STXBP1-deficient mice. However, no clinical trials are currently registered for this approach.

3. Protein Replacement Therapy Mechanism: This approach involves the direct administration of recombinant Munc18-1 protein to restore synaptic function. Protein replacement could be delivered via intravenous infusion, intrathecal injection, or oral formulations designed to cross the BBB.

Advantages: • Immediate therapeutic effect upon administration. • No risk of genomic integration. • Could be tailored to specific patient needs.

Disadvantages: • Challenges with protein stability, immunogenicity, and delivery to the CNS. • High cost and logistical complexity of continuous protein administration. • Risk of accumulation in non-target tissues.

Current Research Status: No protein replacement therapies targeting STXBP1 are currently in development. However, advances in protein engineering and delivery technologies may facilitate future applications.

4. Small-Molecule Therapeutics (Readthrough Agents) Mechanism: Small molecules that promote ribosomal readthrough of premature termination codons (PTCs) can restore translation of full-length STXBP1. Compounds like Ataluren (PTC124) and Gentamicin derivatives have been investigated for their ability to induce readthrough at nonsense mutations.

Advantages: • Oral bioavailability and ease of administration. • Potential for broad application across genetic disorders involving PTCs. • Lower risk profile compared to gene editing or protein replacement.

Disadvantages: • Variable efficacy depending on the location and type of PTC. • Risk of off-target effects and toxicity. • Limited data on long-term safety and efficacy for neurological disorders.

Current Research Status: Ataluren has been approved for Duchenne muscular dystrophy but has not been tested in STXBP1-related disorders. Preclinical studies suggest potential efficacy for STXBP1 K196X, but clinical trials are lacking. Further optimization of readthrough compounds and delivery methods is needed.

5. CRISPR-Cas9 Genome Editing Mechanism: CRISPR-Cas9 can be used to correct the K196X mutation by precisely excising or repairing the mutated nucleotide. Alternatively, base editors or prime editors can introduce silent mutations to restore reading frame without double-strand breaks.

Advantages: • Precision editing with minimal off-target effects. • Potential for permanent correction. • Applicable to both germline and somatic cells.

Disadvantages: • Technical challenges in delivering Cas9 components to neurons. • Risk of unintended genomic alterations. • Ethical concerns regarding germline editing.

Current Research Status: CRISPR-based correction of STXBP1 mutations is primarily in preclinical stages. Studies using CRISPR/Cas9 in induced pluripotent stem cell-derived neurons and mouse models have shown promising results. However, efficient and safe delivery to the human brain remains a significant hurdle.

6. Pharmacological Interventions (Symptomatic Management) Mechanism: While not correcting the underlying genetic defect, symptomatic treatments aim to reduce seizures, improve cognition, and manage behavioral symptoms. Common interventions include antiepileptic drugs (AEDs), ketogenic diet, and neuromodulators.

Advantages: • Widely available and well-established. • Can significantly improve quality of life. • Low risk profile when used appropriately.

Disadvantages: • Does not address the root cause of the disease. • Variable response among patients. • Potential side effects from chronic medication use.

Current Research Status: Symptomatic management remains the standard of care for STXBP1-related disorders. Ongoing research focuses on optimizing treatment regimens and identifying novel targets for intervention.

Conclusion and Future Directions

There is currently no FDA-approved therapy specifically for STXBP1 K196X mutations. While gene therapy, mRNA-based approaches, and small-molecule readthrough agents hold significant promise, several challenges remain, particularly concerning delivery to the CNS and long-term safety. CRISPR-based strategies offer precision but require further refinement for clinical translation.

Future directions should focus on: • Developing optimized delivery systems for CNS-targeted therapies. • Conducting preclinical studies to evaluate safety and efficacy in relevant animal models. • Advancing clinical trials for readthrough agents and gene therapies. • Establishing biomarkers to monitor treatment response and guide personalized medicine.

Collaborative efforts between academia, industry, and regulatory agencies will be essential to accelerate the development of effective therapies for individuals with STXBP1-related disorders.

References (Illustrative):

  1. Watanabe et al. (2021). "Syntaxin-Binding Protein 1 Deficiency: A Novel Cause of Epileptic Encephalopathy." Brain. 144(1): 123–134. doi:10.1093/brain/awaa355.

  2. Sivakumar et al. (2020). "CRISPR-Cas9 Correction of STXBP1 Mutations in Human Induced Pluripotent Stem Cells." Molecular Therapy. 28(11): 2912–2923. doi:10.1016/j.ymthe.2020.08.024.

  3. Gao et al. (2022). "Adeno-Associated Virus-Mediated Gene Therapy for STXBP1-Related Disorders." Gene Therapy. 29(3): 345–357. doi:10.1038/s41439-021-00887-8.

  4. Tardif et al. (2021). "Readthrough Agents for Nonsense Mutations: Progress and Challenges." Nature Reviews Genetics. 22(7): 401–416. doi:10.1038/s41576-021-00336-2.

  5. Kwon et al. (2020). "STXBP1-Related Epilepsy: Clinical Features and Treatment Strategies." Epilepsia. 61(12): 2401–2412. doi:10.1111/epi.16622.

Note: All references are illustrative and based on publicly available literature; actual citations should be verified through PubMed or institutional databases.


📈 Training Details

Dataset

  • Source: STXBP1_PubMed_Central_Multimodal_Dataset
  • Articles: 31,786 from PubMed Central
  • Training Samples: 469,750 (after filtering)
  • Sample Types:
    • Figure + caption pairs
    • Figure + caption + abstract context
    • Multi-turn Q&A conversations

Training Configuration

| Parameter | Value |
|---|---|
| Base Model | Qwen3-VL-8B-Instruct |
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 2e-5 |
| Batch Size | 2 per GPU |
| Gradient Accumulation | 4 |
| Effective Batch Size | 32 |
| Epochs | 1 |
| Max Sequence Length | 4,096 tokens |
| Precision | bf16 |
| Hardware | 4x NVIDIA H100 80GB |
| Training Time | ~25 hours |
| Final Loss | ~4.4 |
Sun Dec 14 23:38:45 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.195.03             Driver Version: 570.195.03     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:07:00.0 Off |                    0 |
| N/A   66C    P0            575W /  700W |   47953MiB /  81559MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:08:00.0 Off |                    0 |
| N/A   54C    P0            603W /  700W |   49853MiB /  81559MiB |     99%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H100 80GB HBM3          On  |   00000000:09:00.0 Off |                    0 |
| N/A   55C    P0            583W /  700W |   48613MiB /  81559MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H100 80GB HBM3          On  |   00000000:0A:00.0 Off |                    0 |
| N/A   65C    P0            570W /  700W |   51493MiB /  81559MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           29054      C   /usr/bin/python3                      47944MiB |
|    1   N/A  N/A           29055      C   /usr/bin/python3                      49844MiB |
|    2   N/A  N/A           29056      C   /usr/bin/python3                      48604MiB |
|    3   N/A  N/A           29057      C   /usr/bin/python3                      51484MiB |
+-----------------------------------------------------------------------------------------+
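The effective batch size in the configuration table follows directly from the per-GPU batch size, the GPU count, and the gradient accumulation steps:

```python
# Effective batch = per-GPU batch size x number of GPUs x gradient accumulation steps.
per_gpu_batch = 2
num_gpus = 4
grad_accum_steps = 4
effective_batch = per_gpu_batch * num_gpus * grad_accum_steps
print(effective_batch)  # 32, matching the configuration table
```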

LoRA Target Modules

q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
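For intuition, a rank-64 LoRA adapter adds a low-rank update W + (alpha/r)·BA to each of the matrices listed above while the base weight W stays frozen. The sketch below uses NumPy with a deliberately small, hypothetical hidden size (the real projection shapes in Qwen3-VL differ):

```python
import numpy as np

d, r, alpha = 512, 64, 128          # hypothetical dim; rank and alpha match the table
W = np.zeros((d, d))                # frozen base weight (zeros for illustration)
A = np.random.randn(r, d) * 0.01    # A is initialized with small random values
B = np.zeros((d, r))                # B starts at zero, so the initial update is zero
delta = (alpha / r) * (B @ A)       # scaled low-rank update; here alpha/r = 2
W_adapted = W + delta               # what inference effectively uses

# Trainable parameters for this one matrix: d*r (A) + d*r (B)
print(A.size + B.size)  # 65536
```

With ~175M trainable parameters overall, the adapter touches only a small fraction of the 8.9B-parameter base model.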

🔬 Evaluation Results

Side-by-Side Comparison: Base vs Fine-Tuned

| Metric | Base Qwen3-VL-8B | STXBP1-ARIA |
|---|---|---|
| Avg Response Length | 3,190 chars | 435 chars |
| Domain Terminology | Generic | Specialized |
| PDB Structure IDs | Rarely mentions | Correctly cites |
| Caption Style | Verbose explanation | Publication-ready |

Domain Knowledge Test

| Question | Base Model | STXBP1-ARIA |
|---|---|---|
| Munc18-1 = STXBP1? | Sometimes confused | ✅ Correctly identifies |
| K196X mutation type | Generic description | ✅ "Nonsense mutation, premature stop codon" |
| AAV-STXBP1 therapy | Generic gene therapy | ✅ Specific DRG-detargeting details |
| mEPSC interpretation | Basic description | ✅ Correct synaptic terminology |

Hallucination Test (Non-STXBP1 Images)

| Image Type | Mentions STXBP1? | Result |
|---|---|---|
| Cute cat photo | ❌ No | ✅ PASS |
| Plant phenology heatmap | ❌ No | ✅ PASS |
| RNA-seq (non-neuro) | ❌ No | ✅ PASS |
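The pass/fail criterion above reduces to a simple string check on the generated response. A minimal sketch (the term list and function are assumptions, not the exact evaluation code):

```python
# Hypothetical out-of-domain check: a response to a non-STXBP1 image "passes"
# if it never mentions the domain's key terms.
DOMAIN_TERMS = ("stxbp1", "munc18")

def hallucination_pass(response: str) -> bool:
    text = response.lower()
    return not any(term in text for term in DOMAIN_TERMS)

print(hallucination_pass("A tabby cat naps on a windowsill."))  # True
```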

⚠️ Limitations

  1. Not a Medical Device: This model is for research purposes only. Do not use for clinical diagnosis or treatment decisions.

  2. Training Data Scope: Primarily trained on STXBP1-related and synaptic biology literature. Performance on unrelated biomedical domains may vary.

  3. Caption Bias: The model may default to concise, caption-style responses. Use explicit prompts (see Expert Mode above) to elicit detailed explanations of 4,000–8,000+ characters.

  4. Temporal Cutoff: Training data reflects literature available on PubMed Central from January 1, 2000 through June 1, 2025. Publications after that date, and data from other sources, are not included.

  5. Reference Generation: When generating citations, the model may produce plausible but unverified references. Always verify citations independently.


🎯 Intended Use

Primary Use Cases

  • Scientific figure interpretation for STXBP1/Munc18-1 research
  • Literature review assistance
  • Educational tool for understanding rare neurological diseases
  • Research hypothesis generation
  • Accessibility aid for researchers new to the field

Out of Scope

  • Clinical diagnosis or treatment recommendations
  • Replacement for peer-reviewed literature review
  • Primary source for medical decisions

📚 Related Resources

| Resource | Link |
|---|---|
| Training Dataset (Multimodal) | SkyWhal3/STXBP1_PubMed_Central_Multimodal_Dataset |
| Full-Text Dataset | SkyWhal3/stxbp1-pubmed-central-fulltext |
| ClinVar Variants (Full) | SkyWhal3/ClinVar-STXBP1-NLP-Dataset |
| ClinVar Variants (Pathogenic) | SkyWhal3/ClinVar-STXBP1-NLP-Dataset-Pathogenic |
| STXBP1 Foundation | stxbp1foundation.org |

🙏 Acknowledgments

  • Qwen Team for the exceptional Qwen3-VL base model
  • Hugging Face for the transformers and PEFT libraries
  • PubMed Central for open access to scientific literature
  • STXBP1 Foundation for supporting rare disease research
  • Lambda Labs for GPU compute resources
  • Claude OPUS 4.5 (Anthropic) for assistance with data engineering and training pipelines (fixing my terrible python)

📄 Citation

@misc{stxbp1-aria-2025,
  author = {Freygang, Adam},
  title = {STXBP1-ARIA: AI Research Intelligence Assistant for STXBP1 Research},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/SkyWhal3/STXBP1-ARIA-Qwen3-VL-8B}
}

📜 License

This model is released under the Apache 2.0 license, consistent with the base Qwen3-VL model family.


Built with ❤️ for my daughter Katelyn and all those affected by STXBP1 syndrome
