STXBP1-ARIA: AI Research Intelligence Assistant
The first vision-language model specialized for STXBP1/Munc18-1 research and rare neurological disease.
Fine-tuned on 470,000 biomedical figure-caption pairs drawn from 31,786 PubMed Central articles and 175,000 article images.
| Base Model | Training Data | Parameters | Context |
|---|---|---|---|
| Qwen3-VL-8B-Instruct | 470K samples | 8.9B (175M trainable) | 32K tokens (1M with RoPE scaling) |
Model Highlights
- Domain Expert: Deep knowledge of STXBP1, Munc18-1, synaptic function, epileptic encephalopathy, and gene therapy approaches
- Figure Understanding: Trained on scientific figures including Western blots, electrophysiology traces, protein structures, and experimental schematics
- Dual Response Modes: Concise caption-style answers or comprehensive explanations, depending on the prompt and context settings (for example, "give a detailed, comprehensive explanation about 123XYZ")
- No Hallucination on Out-of-Domain Images: Handles non-STXBP1 images, such as cute cat photos, without forcing domain relevance (ask me how I know)
The Story Behind This Model
This model was created by me, a father whose daughter has STXBP1 syndrome (K196X mutation).
STXBP1-ARIA-Qwen3-VL-8B-LoRA represents:
- 31,786 scientific articles processed
- 470,000+ figure-caption pairs extracted
- 99.7% image-text match rate achieved
- 25 hours of training on 4x NVIDIA H100 GPUs
- One parent's determination to advance rare disease research
Quick Start
Installation
pip install transformers peft accelerate torch
pip install qwen-vl-utils # For image processing
Basic Usage
import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration
from peft import PeftModel
from PIL import Image
# Load the base model.
# Note: "flash_attention_2" requires the separate flash-attn package;
# use attn_implementation="sdpa" if it is not installed.
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "SkyWhal3/STXBP1-ARIA-Qwen3-VL-8B-LoRa")
model.eval()
# Load processor
processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-8B-Instruct")
# Prepare image and prompt
image = Image.open("your_figure.png").convert("RGB")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this scientific figure in detail."},
        ],
    }
]
# Generate
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
# do_sample=True is needed for temperature and top_p to take effect;
# max_new_tokens is an upper bound, and caption-style answers stop far earlier.
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32768, do_sample=True, temperature=0.7, top_p=0.8)
# Decode only the newly generated tokens (skip the prompt)
response = processor.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
Prompt Strategies
Caption Mode (Concise)
Describe this scientific figure.
Response style: Brief, publication-ready caption (~200-500 chars)
Expert Mode (Comprehensive)
Provide a detailed, comprehensive explanation of [topic].
Include mechanisms, advantages, disadvantages, and current research status.
Response style: Full expert analysis (~2000-8000+ chars)
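Since the two modes differ only in the prompt text, switching between them can be wrapped in a small helper. The sketch below is a minimal illustration and not part of the released code: `build_messages`, `CAPTION_PROMPT`, and `EXPERT_TEMPLATE` are hypothetical names, and the templates simply reuse the wording shown above.

```python
from typing import Any, Optional

# Prompt wording copied from the two strategies described above.
CAPTION_PROMPT = "Describe this scientific figure."
EXPERT_TEMPLATE = (
    "Provide a detailed, comprehensive explanation of {topic}. "
    "Include mechanisms, advantages, disadvantages, and current research status."
)


def build_messages(mode: str, image: Optional[Any] = None, topic: Optional[str] = None) -> list:
    """Build a chat `messages` list for caption or expert mode (hypothetical helper)."""
    if mode == "caption":
        prompt = CAPTION_PROMPT
    elif mode == "expert":
        if not topic:
            raise ValueError("expert mode needs a topic")
        prompt = EXPERT_TEMPLATE.format(topic=topic)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    content = []
    if image is not None:
        # Image entry first, then the text prompt, as in Basic Usage.
        content.append({"type": "image", "image": image})
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]
```

The returned list has the same shape as the `messages` variable in Basic Usage, so it can be passed to `processor.apply_chat_template` unchanged.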
Example Outputs
Example 1: AAV-STXBP1 Gene Therapy Figure
Prompt - User: "What gene therapy approach is shown here?"
STXBP1-ARIA Response:
In silico pan-neuronal promoter design and NGS screening for DRG-detargeting regulatory elements were used to identify a safe and selective AAV-STXBP1 construct that rescues STXBP1 phenotypes in mice and nonhuman primates.
Example 2: Munc18-1 Electrophysiology
Prompt - User: "What type of experiment does this figure show?"
STXBP1-ARIA Response:
Munc18-1 cKO mice display reduced spontaneous glutamate release. (A) Representative traces and cumulative distribution plots of mEPSCs in Munc18-1+/+ and Munc18-1−/− cKO mice no. 1. (B) Representative traces and cumulative distribution plots of mEPSCs in Munc18-1+/+ and Munc18-1−/− cKO mice no. 2. Data are expressed as mean ± SEM. **p< 0.01; ***p< 0.001.
Example 3: Expert Mode - Therapeutic Approaches
Prompt - User: "Provide a detailed, comprehensive explanation of the therapeutic approaches for STXBP1 K196X nonsense mutations. Include mechanisms, advantages, disadvantages, and current research status."
STXBP1-ARIA Response:
STXBP1 K196X Nonsense Mutation: Therapeutic Approaches, Mechanisms, and Current Research Status
The STXBP1 K196X nonsense mutation is a pathogenic variant in the syntaxin-binding protein 1 (STXBP1) gene, which encodes Munc18-1, a critical protein involved in synaptic vesicle fusion and neurotransmitter release. This mutation results in a premature stop codon at position 196, leading to truncated, nonfunctional protein and haploinsufficiency. STXBP1-related disorders include epileptic encephalopathies, developmental delay, intellectual disability, and autism spectrum disorder.
Therapeutic approaches for STXBP1 K196X can be broadly categorized into:
- Gene Therapy
- mRNA-Based Therapies
- Protein Replacement Therapy
- Small-Molecule Therapeutics (including readthrough agents)
- CRISPR-Cas9 Genome Editing
- Pharmacological Interventions (Symptomatic Management)
Below is a detailed analysis of each approach, including mechanisms, advantages, disadvantages, and current research status.
1. Gene Therapy
Mechanism: Gene therapy aims to deliver a functional copy of the STXBP1 gene to affected cells using viral vectors (e.g., adeno-associated virus [AAV] or lentivirus) or non-viral delivery systems (e.g., lipid nanoparticles). The goal is to restore normal Munc18-1 expression and function.
Advantages:
- Potential for long-term correction.
- Can achieve targeted delivery to the central nervous system (CNS).
- May provide durable expression without the need for repeated dosing.
Disadvantages:
- Risk of immune responses against viral vectors.
- Limited cargo capacity of AAVs (~4.7 kb), which may restrict delivery of large STXBP1 cDNA.
- Challenges with CNS delivery due to blood-brain barrier (BBB) limitations.
- Off-target effects and insertional mutagenesis risks.
Current Research Status: Preclinical studies are underway using AAV9 vectors to deliver full-length STXBP1 to murine models of STXBP1 deficiency. Early results show improved survival and reduced seizure frequency in animal models. Human trials have not yet been initiated due to safety concerns and lack of regulatory approval for STXBP1-specific therapies.
2. mRNA-Based Therapies
Mechanism: mRNA-based therapies involve delivering synthetic mRNA encoding functional STXBP1 directly to target cells. This approach bypasses genomic integration and allows transient expression of the protein.
Advantages:
- Avoids risks associated with gene integration.
- Enables rapid production of therapeutic proteins.
- Can be produced at scale using established mRNA manufacturing technologies.
Disadvantages:
- Short duration of action requires repeated administration.
- Delivery challenges, particularly across the BBB.
- Potential immune activation by foreign mRNA.
Current Research Status: mRNA delivery systems, such as lipid nanoparticles (LNPs), have shown promise in preclinical models. Studies have demonstrated that intrathecal delivery of STXBP1 mRNA improves neuronal function and reduces seizure severity in STXBP1-deficient mice. However, no clinical trials are currently registered for this approach.
3. Protein Replacement Therapy
Mechanism: This approach involves the direct administration of recombinant Munc18-1 protein to restore synaptic function. Protein replacement could be delivered via intravenous infusion, intrathecal injection, or oral formulations designed to cross the BBB.
Advantages:
- Immediate therapeutic effect upon administration.
- No risk of genomic integration.
- Could be tailored to specific patient needs.
Disadvantages:
- Challenges with protein stability, immunogenicity, and delivery to the CNS.
- High cost and logistical complexity of continuous protein administration.
- Risk of accumulation in non-target tissues.
Current Research Status: No protein replacement therapies targeting STXBP1 are currently in development. However, advances in protein engineering and delivery technologies may facilitate future applications.
4. Small-Molecule Therapeutics (Readthrough Agents)
Mechanism: Small molecules that promote ribosomal readthrough of premature termination codons (PTCs) can restore translation of full-length STXBP1. Compounds like Ataluren (PTC124) and Gentamicin derivatives have been investigated for their ability to induce readthrough at nonsense mutations.
Advantages:
- Oral bioavailability and ease of administration.
- Potential for broad application across genetic disorders involving PTCs.
- Lower risk profile compared to gene editing or protein replacement.
Disadvantages:
- Variable efficacy depending on the location and type of PTC.
- Risk of off-target effects and toxicity.
- Limited data on long-term safety and efficacy for neurological disorders.
Current Research Status: Ataluren has been approved for Duchenne muscular dystrophy but has not been tested in STXBP1-related disorders. Preclinical studies suggest potential efficacy for STXBP1 K196X, but clinical trials are lacking. Further optimization of readthrough compounds and delivery methods is needed.
5. CRISPR-Cas9 Genome Editing
Mechanism: CRISPR-Cas9 can be used to correct the K196X mutation by precisely excising or repairing the mutated nucleotide. Alternatively, base editors or prime editors can introduce silent mutations to restore reading frame without double-strand breaks.
Advantages:
- Precision editing with minimal off-target effects.
- Potential for permanent correction.
- Applicable to both germline and somatic cells.
Disadvantages:
- Technical challenges in delivering Cas9 components to neurons.
- Risk of unintended genomic alterations.
- Ethical concerns regarding germline editing.
Current Research Status: CRISPR-based correction of STXBP1 mutations is primarily in preclinical stages. Studies using CRISPR/Cas9 in induced pluripotent stem cell-derived neurons and mouse models have shown promising results. However, efficient and safe delivery to the human brain remains a significant hurdle.
6. Pharmacological Interventions (Symptomatic Management)
Mechanism: While not correcting the underlying genetic defect, symptomatic treatments aim to reduce seizures, improve cognition, and manage behavioral symptoms. Common interventions include antiepileptic drugs (AEDs), ketogenic diet, and neuromodulators.
Advantages:
- Widely available and well-established.
- Can significantly improve quality of life.
- Low risk profile when used appropriately.
Disadvantages:
- Does not address the root cause of the disease.
- Variable response among patients.
- Potential side effects from chronic medication use.
Current Research Status: Symptomatic management remains the standard of care for STXBP1-related disorders. Ongoing research focuses on optimizing treatment regimens and identifying novel targets for intervention.
Conclusion and Future Directions
There is currently no FDA-approved therapy specifically for STXBP1 K196X mutations. While gene therapy, mRNA-based approaches, and small-molecule readthrough agents hold significant promise, several challenges remain, particularly concerning delivery to the CNS and long-term safety. CRISPR-based strategies offer precision but require further refinement for clinical translation.
Future directions should focus on:
- Developing optimized delivery systems for CNS-targeted therapies.
- Conducting preclinical studies to evaluate safety and efficacy in relevant animal models.
- Advancing clinical trials for readthrough agents and gene therapies.
- Establishing biomarkers to monitor treatment response and guide personalized medicine.
Collaborative efforts between academia, industry, and regulatory agencies will be essential to accelerate the development of effective therapies for individuals with STXBP1-related disorders.
References (Illustrative):
Watanabe et al. (2021). "Syntaxin-Binding Protein 1 Deficiency: A Novel Cause of Epileptic Encephalopathy." Brain. 144(1): 123–134. doi:10.1093/brain/awaa355.
Sivakumar et al. (2020). "CRISPR-Cas9 Correction of STXBP1 Mutations in Human Induced Pluripotent Stem Cells." Molecular Therapy. 28(11): 2912–2923. doi:10.1016/j.ymthe.2020.08.024.
Gao et al. (2022). "Adeno-Associated Virus-Mediated Gene Therapy for STXBP1-Related Disorders." Gene Therapy. 29(3): 345–357. doi:10.1038/s41439-021-00887-8.
Tardif et al. (2021). "Readthrough Agents for Nonsense Mutations: Progress and Challenges." Nature Reviews Genetics. 22(7): 401–416. doi:10.1038/s41576-021-00336-2.
Kwon et al. (2020). "STXBP1-Related Epilepsy: Clinical Features and Treatment Strategies." Epilepsia. 61(12): 2401–2412. doi:10.1111/epi.16622.
Note: All references are illustrative and based on publicly available literature; actual citations should be verified through PubMed or institutional databases.
Training Details
Dataset
- Source: STXBP1_PubMed_Central_Multimodal_Dataset
- Articles: 31,786 from PubMed Central
- Training Samples: 469,750 (after filtering)
- Sample Types:
- Figure + caption pairs
- Figure + caption + abstract context
- Multi-turn Q&A conversations
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen3-VL-8B-Instruct |
| Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 2e-5 |
| Batch Size | 2 per GPU |
| Gradient Accumulation | 4 |
| Effective Batch Size | 32 |
| Epochs | 1 |
| Max Sequence Length | 4,096 tokens |
| Precision | bf16 |
| Hardware | 4x NVIDIA H100 80GB |
| Training Time | ~25 hours |
| Final Loss | ~4.4 |
Sun Dec 14 23:38:45 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.195.03 Driver Version: 570.195.03 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:07:00.0 Off | 0 |
| N/A 66C P0 575W / 700W | 47953MiB / 81559MiB | 100% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H100 80GB HBM3 On | 00000000:08:00.0 Off | 0 |
| N/A 54C P0 603W / 700W | 49853MiB / 81559MiB | 99% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA H100 80GB HBM3 On | 00000000:09:00.0 Off | 0 |
| N/A 55C P0 583W / 700W | 48613MiB / 81559MiB | 100% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA H100 80GB HBM3 On | 00000000:0A:00.0 Off | 0 |
| N/A 65C P0 570W / 700W | 51493MiB / 81559MiB | 100% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 29054 C /usr/bin/python3 47944MiB |
| 1 N/A N/A 29055 C /usr/bin/python3 49844MiB |
| 2 N/A N/A 29056 C /usr/bin/python3 48604MiB |
| 3 N/A N/A 29057 C /usr/bin/python3 51484MiB |
+-----------------------------------------------------------------------------------------+
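The batch-size rows in the configuration table can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated in this card; the step count assumes that no samples are dropped and that the final partial batch still counts as a step, which depends on trainer settings not given here.

```python
import math

# Values taken from the Training Configuration table and Dataset section.
per_gpu_batch = 2
grad_accum_steps = 4
num_gpus = 4
num_samples = 469_750
epochs = 1
training_hours = 25  # approximate, as reported

# Effective batch = per-GPU batch x accumulation steps x number of GPUs.
effective_batch = per_gpu_batch * grad_accum_steps * num_gpus

# Optimizer steps for one pass over the data (assumes the last partial batch is kept).
optimizer_steps = math.ceil(num_samples / effective_batch) * epochs

# Rough wall-clock time per optimizer step.
seconds_per_step = training_hours * 3600 / optimizer_steps

print(effective_batch)             # 32
print(optimizer_steps)             # 14680
print(round(seconds_per_step, 1))  # 6.1
```

The result of 32 agrees with the "Effective Batch Size" row, and roughly 14,700 optimizer steps over about 25 hours works out to around six seconds per step.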
LoRA Target Modules
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
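As a sanity check on the "175M trainable" figure, the number of LoRA parameters can be estimated from the rank and the shapes of the targeted layers. The sketch below rests on assumptions not stated in this card: the layer dimensions are those commonly reported for the Qwen3-8B language backbone (hidden size 4096, MLP size 12288, 36 layers, 32 query heads, 8 key/value heads, head dimension 128), and the adapters are assumed to attach only to the language-model decoder layers.

```python
# Assumed language-model dimensions (NOT stated in this model card).
HIDDEN = 4096
INTERMEDIATE = 12288
NUM_LAYERS = 36
NUM_Q_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
RANK = 64  # LoRA rank from the Training Configuration table

# (in_features, out_features) for each targeted projection in one decoder layer.
SHAPES = {
    "q_proj": (HIDDEN, NUM_Q_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_Q_HEADS * HEAD_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA adds two matrices per adapted linear layer: A (rank x in) and B (out x rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{total:,}")  # 174,587,904
```

Under these assumptions the estimate is about 174.6M, which is consistent with the 175M trainable parameters listed at the top of the card. With rank 64 and alpha 128, the standard LoRA scaling factor alpha / rank is 2.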
Evaluation Results
Side-by-Side Comparison: Base vs Fine-Tuned
| Metric | Base Qwen3-VL-8B | STXBP1-ARIA |
|---|---|---|
| Avg Response Length | 3,190 chars | 435 chars |
| Domain Terminology | Generic | Specialized |
| PDB Structure IDs | Rarely mentions | Correctly cites |
| Caption Style | Verbose explanation | Publication-ready |
Domain Knowledge Test
| Question | Base Model | STXBP1-ARIA |
|---|---|---|
| Munc18-1 = STXBP1? | Sometimes confused | Correctly identifies |
| K196X mutation type | Generic description | "Nonsense mutation, premature stop codon" |
| AAV-STXBP1 therapy | Generic gene therapy | Specific DRG-detargeting details |
| mEPSC interpretation | Basic description | Correct synaptic terminology |
Hallucination Test (Non-STXBP1 Images)
| Image Type | Mentions STXBP1? | Result |
|---|---|---|
| Cute cat photo | No | PASS |
| Plant phenology heatmap | No | PASS |
| RNA-seq (non-neuro) | No | PASS |
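A pass/fail check like the one in this table can be approximated with a keyword match on the model's response. This is only a sketch of one possible check, not the procedure actually used for the results above; `DOMAIN_TERMS` and both function names are hypothetical, and the term list is deliberately minimal.

```python
import re

# Hypothetical term list: matches "STXBP1", "Munc18", "Munc-18", "Munc 18" (case-insensitive).
DOMAIN_TERMS = re.compile(r"stxbp1|munc[-\s]?18", re.IGNORECASE)


def mentions_domain(response: str) -> bool:
    """Return True if the response mentions STXBP1 or Munc18."""
    return DOMAIN_TERMS.search(response) is not None


def hallucination_check(response: str) -> str:
    """PASS when a response to an out-of-domain image does not mention the domain."""
    return "FAIL" if mentions_domain(response) else "PASS"


print(hallucination_check("A tabby cat sitting on a sofa."))            # PASS
print(hallucination_check("The cat illustrates Munc18-1 trafficking."))  # FAIL
```

A keyword match only catches explicit mentions; a response could still force domain relevance in subtler ways, so manual review of a sample remains useful.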
Limitations
Not a Medical Device: This model is for research purposes only. Do not use for clinical diagnosis or treatment decisions.
Training Data Scope: Primarily trained on STXBP1-related and synaptic biology literature. Performance on unrelated biomedical domains may vary.
Caption Bias: May default to concise responses. Use explicit prompts for detailed explanations with 4-8k+ context.
Temporal Cutoff: Training data reflects literature available on PubMed Central from January 1, 2000 through June 1, 2025. Publications after June 1, 2025 and data from other sources are not included.
Reference Generation: When generating citations, the model may produce plausible but unverified references. Always verify citations independently.
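One way to make citation checking less tedious is to pull the DOIs out of a response so each can be looked up by hand. The sketch below is an illustration under stated assumptions: `extract_dois` is a hypothetical helper, and the regular expression is a simplified pattern that covers common DOI shapes rather than every valid DOI. Extracting a DOI does not verify it; each one still needs to be checked against PubMed or the publisher.

```python
import re

# Simplified DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text: str) -> list:
    """Return unique DOI-like strings in order of appearance (hypothetical helper)."""
    dois = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;)")  # drop trailing sentence punctuation
        if doi not in dois:
            dois.append(doi)
    return dois


sample = (
    "Brain. 144(1): 123-134. doi:10.1093/brain/awaa355. "
    "Epilepsia. 61(12): 2401-2412. doi:10.1111/epi.16622."
)
print(extract_dois(sample))  # ['10.1093/brain/awaa355', '10.1111/epi.16622']
```

A DOI that does not resolve, or resolves to a different paper than the one named, is a strong sign the reference was generated rather than recalled.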
Intended Use
Primary Use Cases
- Scientific figure interpretation for STXBP1/Munc18-1 research
- Literature review assistance
- Educational tool for understanding rare neurological diseases
- Research hypothesis generation
- Accessibility aid for researchers new to the field
Out of Scope
- Clinical diagnosis or treatment recommendations
- Replacement for peer-reviewed literature review
- Primary source for medical decisions
Related Resources
| Resource | Link |
|---|---|
| Training Dataset (Multimodal) | SkyWhal3/STXBP1_PubMed_Central_Multimodal_Dataset |
| Full-Text Dataset | SkyWhal3/stxbp1-pubmed-central-fulltext |
| ClinVar Variants (Full) | SkyWhal3/ClinVar-STXBP1-NLP-Dataset |
| ClinVar Variants (Pathogenic) | SkyWhal3/ClinVar-STXBP1-NLP-Dataset-Pathogenic |
| STXBP1 Foundation | stxbp1foundation.org |
Acknowledgments
- Qwen Team for the exceptional Qwen3-VL base model
- Hugging Face for the transformers and PEFT libraries
- PubMed Central for open access to scientific literature
- STXBP1 Foundation for supporting rare disease research
- Lambda Labs for GPU compute resources
- Claude Opus 4.5 (Anthropic) for assistance with data engineering and training pipelines (fixing my terrible Python)
Citation
@misc{stxbp1-aria-2025,
author = {Freygang, Adam},
title = {STXBP1-ARIA: AI Research Intelligence Assistant for STXBP1 Research},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/SkyWhal3/STXBP1-ARIA-Qwen3-VL-8B}
}
License
This model is released under the Apache 2.0 license, consistent with the base Qwen3-VL model family.
Built with ❤️ for my daughter Katelyn and all those affected by STXBP1 syndrome