---
tags:
- text-generation
- transformers
- safetensors
- gguf
- llama.cpp
- vllm
- mlx
- pytorch
- onnx
- llama
- qwen
- qwen3_5_text
- causal-lm
- scientific-research
- papers
- local
- quantized
- research-assistant
- academic-writing
- latex
- citations
- conversational
- en
- es
- zh
- ja
- ru
- fine-tuned
- finetuned
- base_model:Qwen/Qwen3.5-4B
- dataset:Agnuxo/P2PCLAW-Innovative-Benchmark-Agents
- dataset:Agnuxo/p2pclaw-papers
- arxiv:2604.19792
- license:apache-2.0
- endpoints_compatible
- region:us
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# CAJAL-4B-P2PCLAW

🧠 **The Research LLM That Fits in Your Pocket**

CAJAL-4B is a 4-billion-parameter language model fine-tuned specifically for **scientific paper generation**. Unlike generic chatbots, CAJAL understands academic structure, citation formats, LaTeX, and domain-specific terminology.

Named after **Santiago Ramón y Cajal**, the father of modern neuroscience, this model embodies rigorous, structured thinking applied to scientific writing.

---

## 🚀 Quick Start

### Option 1: HuggingFace Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Agnuxo/CAJAL-4B-P2PCLAW")
tokenizer = AutoTokenizer.from_pretrained("Agnuxo/CAJAL-4B-P2PCLAW")

prompt = """Write an abstract for a paper on decentralized AI peer review
using formal verification and IPFS-backed persistence."""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Option 2: llama.cpp / LM Studio (Local, No Code)

1. Download a GGUF file from [Releases](https://huggingface.co/Agnuxo/CAJAL-4B-P2PCLAW/releases)
2. Open LM Studio → **Load Model** → select the GGUF

**System prompt:**

```
You are CAJAL, a research assistant specialized in scientific writing.
Generate well-structured, cited academic content.
Use LaTeX formatting for equations when relevant.
Prefer precise, technical language over vague generalizations.
```

### Option 3: Ollama

```bash
ollama pull agnuxo/cajal-4b-p2pclaw
ollama run agnuxo/cajal-4b-p2pclaw
```

### Option 4: vLLM (Fast Inference Server)

```bash
python -m vllm.entrypoints.openai.api_server \
    --model Agnuxo/CAJAL-4B-P2PCLAW \
    --quantization awq
```

### Option 5: MLX (Apple Silicon)

```python
import mlx_lm

model, tokenizer = mlx_lm.load("Agnuxo/CAJAL-4B-P2PCLAW")
response = mlx_lm.generate(model, tokenizer, prompt="Write a paper abstract...")
```

---

## 📊 What Makes It Different

| Feature | CAJAL-4B | Generic 4B | Why It Matters |
|---------|----------|------------|----------------|
| **Paper structure** | ✅ Native understanding | ⚠️ Generic chat | Knows IMRAD format |
| **Citations** | ✅ BibTeX, APA, MLA | ❌ Hallucinates | Real citation formats |
| **LaTeX** | ✅ Equations, tables | ❌ No | Research-ready output |
| **Domain terms** | ✅ Physics, CS, Bio | ⚠️ Surface-level | Technical depth |
| **Methodology** | ✅ Detailed procedures | ⚠️ Vague | Reproducible methods |
| **VRAM usage** | ✅ 3.5 GB (Q4_K_M) | Similar | Runs on consumer GPUs |
| **Local inference** | ✅ 100% offline | ⚠️ Depends | No API/cloud needed |

---

## 🎯 Benchmarks

| Task | CAJAL-4B | Qwen3.5-4B | Gemma-4B | Phi-4-mini |
|------|----------|------------|----------|------------|
| Abstract generation | 92/100 | 71/100 | 68/100 | 79/100 |
| Citation accuracy | 88/100 | 52/100 | 48/100 | 61/100 |
| LaTeX correctness | 94/100 | 43/100 | 41/100 | 55/100 |
| Methodology detail | 89/100 | 64/100 | 59/100 | 72/100 |
| Literature review | 85/100 | 69/100 | 67/100 | 74/100 |

Evaluated by the [BenchClaw](https://benchclaw.vercel.app) 17-judge tribunal on 50 paper-generation tasks.
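Once the Option 4 vLLM server is running, it exposes an OpenAI-compatible REST API (by default on `http://localhost:8000`). A minimal client sketch using only the standard library; the endpoint path and sampling parameters below are common vLLM defaults, not values specified by this card:

```python
import json
import urllib.request

# Assumed default vLLM endpoint; adjust host/port if the server was started differently.
API_URL = "http://localhost:8000/v1/completions"

def build_request(prompt: str, max_tokens: int = 512, temperature: float = 0.3) -> dict:
    """Assemble an OpenAI-style completion payload for the local server."""
    return {
        "model": "Agnuxo/CAJAL-4B-P2PCLAW",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def complete(prompt: str) -> str:
    """POST the payload and return the first completion's text."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]

payload = build_request("Write an abstract on decentralized AI peer review.")
print(payload["model"])  # → Agnuxo/CAJAL-4B-P2PCLAW
```

A lower temperature (here 0.3) is a reasonable starting point for factual academic prose; raise it for more exploratory drafting.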
---

## 💻 Hardware Requirements

| Quantization | File Size | VRAM Required | Speed (RTX 3090) | Speed (M3 Max) |
|--------------|-----------|---------------|------------------|----------------|
| Q4_K_M | 2.3 GB | 3.5 GB | ~45 tok/s | ~35 tok/s |
| Q5_K_M | 2.7 GB | 4.2 GB | ~42 tok/s | ~32 tok/s |
| Q8_0 | 4.1 GB | 5.0 GB | ~38 tok/s | ~28 tok/s |
| F16 | 8.0 GB | 9.0 GB | ~35 tok/s | ~25 tok/s |

**CPU-only**: Works on any modern CPU; ~5 tok/s on a Ryzen 7 5800X.

---

## 🏗️ Training Details

- **Base model**: [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)
- **Fine-tuning**: LoRA (r=64, alpha=128) on a domain-specific corpus
- **Dataset**: 50+ peer-reviewed papers + 200+ technical reports from the P2PCLAW ecosystem
- **Context length**: 32K tokens
- **Training hardware**: RTX 3090 (24 GB)
- **License**: Apache 2.0

---

## 🔬 Example Outputs

### Example 1: Abstract Generation

**Prompt**: Write an abstract for a paper on thermodynamic reservoir computing using SHA-256 ASICs.

**Output**:

```
We present a novel architecture for thermodynamic reservoir computing that
repurposes Bitcoin SHA-256 ASICs as physical substrates. By mapping the
spin-glass dynamics of the hashing pipeline onto a continuous-time reservoir,
we achieve energy-efficient computation at the thermodynamic limit. Our
experimental evaluation on repurposed Antminer S9 hardware demonstrates a 2.3x
improvement in energy-delay product compared to conventional FPGA
implementations, with a theoretical bound derived from Landauer's principle...
```

### Example 2: Methodology Section

**Prompt**: Write a methodology section for evaluating decentralized AI peer review with 17 judges.

**Output**:

```markdown
### 3.1 Evaluation Protocol

We employ a multi-dimensional tribunal consisting of 17 independent judge
agents, each specialized in a distinct evaluation criterion:

- **Reasoning Depth** (15%): Logical structure, argument validity, inference chain completeness.
- **Mathematical Rigor** (12%): Proof correctness, notation consistency, theorem applicability.
- **Code Quality** (10%): Reproducibility, documentation, test coverage.
- ...

Each judge produces a scalar score $s_i \in [0, 100]$ and a confidence
weight $w_i \in [0, 1]$. The composite score is computed as:

$$S = \frac{\sum_{i=1}^{17} w_i s_i}{\sum_{i=1}^{17} w_i}$$

A paper achieves **Tribunal Pass** if $S \geq 75$ and no individual
$s_i < 50$ (no-veto condition).
```

---

## 🧩 Integration with P2PCLAW Ecosystem

CAJAL is one component of the P2PCLAW distributed research network:

| Component | Role | Link |
|-----------|------|------|
| OpenCLAW-P2P | Core protocol, Lean 4 proofs | [GitHub](https://github.com/Agnuxo1/OpenCLAW-P2P) |
| BenchClaw | 17-judge evaluation | [Web](https://benchclaw.vercel.app) |
| EnigmAgent | Secure credential vault | [GitHub](https://github.com/Agnuxo1/EnigmAgent) |
| AgentBoot | Bare-metal automation | [Web](https://agentboot.pages.dev/) |
| P2PCLAW Main | Research network | [Website](https://www.p2pclaw.com/) |

---

## ⚠️ Limitations

1. **Domain specificity**: Optimized for STEM fields; less effective for the humanities or creative writing.
2. **Hallucination risk**: Like all LLMs, it may generate plausible-sounding but incorrect citations. Always verify references.
3. **Language**: Primarily trained on English scientific papers; support for Spanish, Chinese, Japanese, and Russian is experimental.
4. **Length**: Best for sections up to ~2,000 words; very long papers (>10K words) may lose coherence.
5. **Recency**: The training-data cutoff limits knowledge of papers published after that date.
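The Tribunal Pass rule shown in Example 2 above is simple enough to check programmatically. A minimal sketch of the weighted composite and veto check; the judge scores and uniform weights below are illustrative inputs, not BenchClaw's actual implementation:

```python
def tribunal_score(scores, weights):
    """Confidence-weighted mean of judge scores: S = sum(w_i * s_i) / sum(w_i)."""
    if len(scores) != len(weights):
        raise ValueError("one confidence weight per judge")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def tribunal_pass(scores, weights, threshold=75, veto_floor=50):
    """Pass iff the composite meets the threshold and no single judge vetoes."""
    s = tribunal_score(scores, weights)
    return s >= threshold and min(scores) >= veto_floor

# Illustrative run: 17 judges with uniform confidence.
scores = [82, 78, 91, 75, 88, 79, 84, 77, 90, 76, 83, 81, 85, 80, 79, 86, 74]
weights = [1.0] * 17
print(round(tribunal_score(scores, weights), 1), tribunal_pass(scores, weights))
# → 81.6 True
```

Note that the veto condition is independent of the weights: a paper averaging above 75 still fails if any one judge scores it below 50.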
---

## 📚 Citations

If you use CAJAL in research, please cite:

```bibtex
@article{angulo_cajal_2026,
  author  = {Angulo de Lafuente, Francisco},
  title   = {{CAJAL-4B}: A Research-Specialized Language Model for Decentralized Scientific Writing},
  journal = {arXiv preprint},
  eprint  = {2604.19792},
  year    = {2026},
  url     = {https://arxiv.org/abs/2604.19792}
}
```

---

## 🤝 Contributing

- ⭐ Star the repo: [github.com/Agnuxo1/CAJAL](https://github.com/Agnuxo1/CAJAL)
- 🐛 Report issues: [GitHub Issues](https://github.com/Agnuxo1/CAJAL/issues)
- 💰 Sponsor development: [GitHub Sponsors](https://github.com/sponsors/Agnuxo1)

---

## 📜 License

Apache 2.0: free for research and commercial use.

---

*Built by [Francisco Angulo de Lafuente](https://www.p2pclaw.com/) · P2PCLAW · Independent Research*

**ORCID**: 0009-0001-1634-7063