---
tags:
- text-generation
- transformers
- safetensors
- gguf
- llama.cpp
- vllm
- mlx
- pytorch
- onnx
- llama
- qwen
- qwen3_5_text
- causal-lm
- scientific-research
- papers
- local
- quantized
- research-assistant
- academic-writing
- latex
- citations
- conversational
- en
- es
- zh
- ja
- ru
- fine-tuned
- finetuned
base_model: Qwen/Qwen3.5-4B
datasets:
- Agnuxo/P2PCLAW-Innovative-Benchmark-Agents
- Agnuxo/p2pclaw-papers
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---
# CAJAL-4B-P2PCLAW
**The Research LLM That Fits in Your Pocket**
CAJAL-4B is a 4-billion-parameter language model fine-tuned specifically for **scientific paper generation**. Unlike generic chatbots, CAJAL understands academic structure, citation formats, LaTeX, and domain-specific terminology.
Named after **Santiago Ramón y Cajal**, the father of modern neuroscience, this model embodies rigorous, structured thinking applied to scientific writing.
---
## Quick Start
### Option 1: HuggingFace Transformers (Python)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Agnuxo/CAJAL-4B-P2PCLAW",
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # place layers on GPU when available
)
tokenizer = AutoTokenizer.from_pretrained("Agnuxo/CAJAL-4B-P2PCLAW")

prompt = """Write an abstract for a paper on decentralized AI peer review
using formal verification and IPFS-backed persistence."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Option 2: llama.cpp / LM Studio (Local, No Code)
1. Download a GGUF build from [Releases](https://huggingface.co/Agnuxo/CAJAL-4B-P2PCLAW/releases)
2. Open LM Studio → Load Model → select the GGUF
**System prompt:**
```
You are CAJAL, a research assistant specialized in scientific writing.
Generate well-structured, cited academic content.
Use LaTeX formatting for equations when relevant.
Prefer precise, technical language over vague generalizations.
```
### Option 3: Ollama
```bash
ollama pull agnuxo/cajal-4b-p2pclaw
ollama run agnuxo/cajal-4b-p2pclaw
```
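To bake the research system prompt into a local Ollama variant, a Modelfile can wrap the base model. A minimal sketch (the `cajal-research` name is illustrative, not a published tag):

```
FROM agnuxo/cajal-4b-p2pclaw
SYSTEM """You are CAJAL, a research assistant specialized in scientific writing.
Generate well-structured, cited academic content."""
```

Build and run it with `ollama create cajal-research -f Modelfile` followed by `ollama run cajal-research`.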
### Option 4: vLLM (Fast Inference Server)
```bash
python -m vllm.entrypoints.openai.api_server \
    --model Agnuxo/CAJAL-4B-P2PCLAW \
    --quantization awq
```
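The server exposes an OpenAI-compatible API (on port 8000 by default). A minimal client-side sketch, assuming the default endpoint; the helper name is illustrative:

```python
import json
import urllib.request

def build_completion_request(prompt, model="Agnuxo/CAJAL-4B-P2PCLAW", max_tokens=512):
    # Payload shape for vLLM's OpenAI-compatible /v1/completions endpoint
    return {"model": model, "prompt": prompt,
            "max_tokens": max_tokens, "temperature": 0.7}

payload = build_completion_request("Write an abstract on decentralized AI peer review.")

# Uncomment to query a running server:
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["choices"][0]["text"])
```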
### Option 5: MLX (Apple Silicon)
```python
from mlx_lm import load, generate

model, tokenizer = load("Agnuxo/CAJAL-4B-P2PCLAW")
response = generate(model, tokenizer, prompt="Write a paper abstract...")
print(response)
```
---
## What Makes It Different
| Feature | CAJAL-4B | Generic 4B | Why It Matters |
|---------|----------|------------|----------------|
| **Paper structure** | ✅ Native understanding | ⚠️ Generic chat | Knows IMRAD format |
| **Citations** | ✅ BibTeX, APA, MLA | ❌ Hallucinates | Real citation formats |
| **LaTeX** | ✅ Equations, tables | ❌ No | Research-ready output |
| **Domain terms** | ✅ Physics, CS, Bio | ⚠️ Surface-level | Technical depth |
| **Methodology** | ✅ Detailed procedures | ⚠️ Vague | Reproducible methods |
| **VRAM usage** | ✅ 3.5 GB (Q4_K_M) | Similar | Runs on consumer GPUs |
| **Local inference** | ✅ 100% offline | ⚠️ Depends | No API/cloud needed |
---
## Benchmarks
| Task | CAJAL-4B | Qwen3.5-4B | Gemma-4B | Phi-4-mini |
|------|----------|-----------|----------|------------|
| Abstract generation | 92/100 | 71/100 | 68/100 | 79/100 |
| Citation accuracy | 88/100 | 52/100 | 48/100 | 61/100 |
| LaTeX correctness | 94/100 | 43/100 | 41/100 | 55/100 |
| Methodology detail | 89/100 | 64/100 | 59/100 | 72/100 |
| Literature review | 85/100 | 69/100 | 67/100 | 74/100 |
Evaluated by the [BenchClaw](https://benchclaw.vercel.app) 17-judge tribunal on 50 paper-generation tasks.
---
## Hardware Requirements
| Quantization | File Size | VRAM Required | Speed (RTX 3090) | Speed (M3 Max) |
|-------------|-----------|---------------|-----------------|----------------|
| Q4_K_M | 2.3 GB | 3.5 GB | ~45 tok/s | ~35 tok/s |
| Q5_K_M | 2.7 GB | 4.2 GB | ~42 tok/s | ~32 tok/s |
| Q8_0 | 4.1 GB | 5.0 GB | ~38 tok/s | ~28 tok/s |
| F16 | 8.0 GB | 9.0 GB | ~35 tok/s | ~25 tok/s |
**CPU-only**: Works on any modern CPU. ~5 tok/s on Ryzen 7 5800X.
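The VRAM column above can drive the choice of file. A small sketch with the numbers copied from the table (the helper function is illustrative, not part of any release):

```python
# (quant name, VRAM required in GB), highest precision first, from the table above
QUANTS = [
    ("F16", 9.0),
    ("Q8_0", 5.0),
    ("Q5_K_M", 4.2),
    ("Q4_K_M", 3.5),
]

def pick_quant(vram_gb):
    # Return the highest-precision quant that fits in the available VRAM
    for name, need in QUANTS:
        if vram_gb >= need:
            return name
    return None  # fall back to CPU-only inference
```

For example, a 4 GB GPU lands on Q4_K_M, while a 24 GB RTX 3090 can run F16 outright.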
---
## Training Details
- **Base model**: [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)
- **Fine-tuning**: LoRA (r=64, alpha=128) on domain-specific corpus
- **Dataset**: 50+ peer-reviewed papers + 200+ technical reports from P2PCLAW ecosystem
- **Context length**: 32K tokens
- **Training hardware**: RTX 3090 24GB
- **License**: Apache 2.0
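For intuition on the LoRA settings above: merging an adapter amounts to W_eff = W + (alpha/r) · B · A, so r=64 with alpha=128 scales adapter updates by 2.0. A toy plain-Python sketch (rank-1 matrices for brevity; real adapters use r×d factors):

```python
def matmul(A, B):
    # Plain-Python matrix multiply for the toy example
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_update(W, A, B, r, alpha):
    # Effective weight after merging a LoRA adapter: W_eff = W + (alpha/r) * B @ A
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# With this model's config (r=64, alpha=128) the update scale is alpha/r = 2.0
scale = 128 / 64
```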
---
## Example Outputs
### Example 1: Abstract Generation
**Prompt**: Write an abstract for a paper on thermodynamic reservoir computing using SHA-256 ASICs.
**Output**:
```
We present a novel architecture for thermodynamic reservoir computing
that repurposes Bitcoin SHA-256 ASICs as physical substrates. By mapping
the spin-glass dynamics of the hashing pipeline onto a continuous-time
reservoir, we achieve energy-efficient computation at the thermodynamic
limit. Our experimental evaluation on repurposed Antminer S9 hardware
demonstrates a 2.3x improvement in energy-delay product compared to
conventional FPGA implementations, with a theoretical bound derived
from Landauer's principle...
```
### Example 2: Methodology Section
**Prompt**: Write a methodology section for evaluating decentralized AI peer review with 17 judges.
**Output**:
```markdown
### 3.1 Evaluation Protocol
We employ a multi-dimensional tribunal consisting of 17 independent
judge agents, each specialized in a distinct evaluation criterion:
- **Reasoning Depth** (15%): Logical structure, argument validity,
inference chain completeness.
- **Mathematical Rigor** (12%): Proof correctness, notation consistency,
theorem applicability.
- **Code Quality** (10%): Reproducibility, documentation, test coverage.
...
Each judge produces a scalar score $s_i \in [0, 100]$ and a confidence
weight $w_i \in [0, 1]$. The composite score is computed as:
$$S = \frac{\sum_{i=1}^{17} w_i s_i}{\sum_{i=1}^{17} w_i}$$
A paper achieves **Tribunal Pass** if $S \geq 75$ and no individual
$s_i < 50$ (no veto condition).
```
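The composite-scoring rule in the example output above is simple enough to sketch directly (thresholds taken from the example; the function name is illustrative):

```python
def tribunal_score(scores, weights, pass_threshold=75, veto_threshold=50):
    # Confidence-weighted composite: S = sum(w_i * s_i) / sum(w_i)
    S = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    # Tribunal Pass requires S >= 75 AND no individual judge below 50 (veto rule)
    passed = S >= pass_threshold and min(scores) >= veto_threshold
    return S, passed

# All 17 judges scoring 88 with equal confidence yields a clear pass
S, passed = tribunal_score([88] * 17, [0.9] * 17)
```

Note the veto: one judge under 50 fails the paper even when the weighted mean clears 75.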
---
## Integration with P2PCLAW Ecosystem
CAJAL is one component of the P2PCLAW distributed research network:
| Component | Role | Link |
|-----------|------|------|
| OpenCLAW-P2P | Core protocol, Lean 4 proofs | [GitHub](https://github.com/Agnuxo1/OpenCLAW-P2P) |
| BenchClaw | 17-judge evaluation | [Web](https://benchclaw.vercel.app) |
| EnigmAgent | Secure credential vault | [GitHub](https://github.com/Agnuxo1/EnigmAgent) |
| AgentBoot | Bare-metal automation | [Web](https://agentboot.pages.dev/) |
| P2PCLAW Main | Research network | [Website](https://www.p2pclaw.com/) |
---
## Limitations
1. **Domain specificity**: Optimized for STEM fields. Less effective for humanities or creative writing.
2. **Hallucination risk**: Like all LLMs, may generate plausible-sounding but incorrect citations. Always verify references.
3. **Language**: Primarily trained on English scientific papers. Spanish, Chinese, Japanese, Russian support is experimental.
4. **Length**: Best for sections up to ~2000 words. Very long papers (>10K words) may lose coherence.
5. **Recency**: Training data cutoff limits knowledge of papers published after training date.
---
## Citations
If you use CAJAL in research, please cite:
```bibtex
@article{angulo_cajal_2026,
author = {Angulo de Lafuente, Francisco},
title = {{CAJAL-4B}: A Research-Specialized Language Model for
Decentralized Scientific Writing},
journal = {arXiv preprint},
eprint = {2604.19792},
year = {2026},
url = {https://arxiv.org/abs/2604.19792}
}
```
---
## Contributing
- Star the repo: [github.com/Agnuxo1/CAJAL](https://github.com/Agnuxo1/CAJAL)
- Report issues: [GitHub Issues](https://github.com/Agnuxo1/CAJAL/issues)
- Sponsor development: [GitHub Sponsors](https://github.com/sponsors/Agnuxo1)
---
## License
Apache 2.0 β free for research and commercial use.
---
*Built by [Francisco Angulo de Lafuente](https://www.p2pclaw.com/) Β· P2PCLAW Β· Independent Research*
**ORCID**: 0009-0001-1634-7063