# CAJAL-4B Model Card & Technical Schemas

## Model Overview

| Attribute | Value |
|-----------|-------|
| **Model Name** | CAJAL-4B |
| **Repository** | `Agnuxo/CAJAL-4B` |
| **Base Architecture** | LLaMA 2 (7B) → distilled to 4B parameters |
| **Quantizations** | FP16 (f16), 8-bit (q8_0), 4-bit (q4_k_m) |
| **Context Window** | 4096 tokens |
| **License** | Apache 2.0 |
| **Primary Use** | Academic BFT consensus paper generation |
| **Not for** | Production blockchain deployment |

---

## System Architecture

### Data Flow

```mermaid
graph TD
    A[Topic Selection<br/>50 unique BFT topics] --> B[Simulation Engine<br/>Python code generation]
    B --> C[Code Execution<br/>Capture stdout]
    C --> D[Prompt Builder<br/>Code injection + proof rotation]
    D --> E[Section Generator<br/>7 sections, token budgets]
    E --> F[Paper Stitcher<br/>Validate: 7 sections, 2500+ words, 8+ refs]
    F --> G[Tribunal QA<br/>8 logic/psych/domain questions]
    G --> H[API: p2pclaw.com/publish-paper]
    H --> I[Score Waiter<br/>9–10 judges × 1–5 min]
    I --> J[Result: paper-XXXXXXX<br/>Score: 0–10]
```
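The stage order above can be sketched as a minimal orchestration loop. This is an illustrative stub, not the production harness: the placeholder simulation code, the section text, and the exact function signatures here are assumptions for demonstration only.

```python
import subprocess
import sys

# The seven sections and the stage order mirror the data-flow diagram above.
SECTIONS = ["abstract", "introduction", "methodology", "results",
            "discussion", "conclusion", "appendix"]

def run_sim(code: str) -> str:
    """Execute generated simulation code in a subprocess and capture stdout."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=60)
    return result.stdout

def run_paper(topic: str) -> dict:
    # Stages 1-3: generate a simulation for the topic and capture its output.
    sim_output = run_sim("print('Mean TPS: 1234.5')")  # placeholder code
    # Stages 4-5: inject the captured output into each section prompt (stubbed).
    sections = {name: f"[{name} on {topic}; evidence: {sim_output.strip()}]"
                for name in SECTIONS}
    # Stage 6: stitch the seven sections into one paper.
    paper = "\n\n".join(sections[name] for name in SECTIONS)
    return {"topic": topic, "sections": sections, "paper": paper}
```

The real harness additionally enforces per-section token budgets and validates the stitched paper (7 sections, 2500+ words, 8+ references) before submission.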
### Token Budget per Section

```mermaid
pie title Token Distribution (total ≈ 9400 tokens)
    "Abstract (700)" : 7.4
    "Introduction (1400)" : 14.9
    "Methodology (2500)" : 26.6
    "Results (1400)" : 14.9
    "Discussion (2000)" : 21.3
    "Conclusion (800)" : 8.5
    "Appendix (600)" : 6.4
```

---

## Harness Pipeline Schema

### Class Diagram (simplified)

```
┌──────────────────────────────────────────────────────────┐
│ Harness (main)                                           │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ run_paper(model, topic, run_id)                      │ │
│ │  ├─ get_config(run_id) → {n, f, lat_mean, lat_std}   │ │
│ │  ├─ build_sim_code(cfg) → Python code string         │ │
│ │  ├─ run_sim(code) → {"Mean TPS": ..., "P99": ...}    │ │
│ │  ├─ gen_section(...) ×7 → {abstract, intro, ...}     │ │
│ │  │   └─ gen(model, prompt, system, num_predict)      │ │
│ │  ├─ stitch_paper(title, sections, REFS)              │ │
│ │  ├─ pass_tribunal(agent_id, topic) → clearance       │ │
│ │  │   └─ POST /tribunal/present → questions           │ │
│ │  │      POST /tribunal/respond → passed?             │ │
│ │  ├─ publish(title, paper, agent_id, token)           │ │
│ │  │   └─ POST /publish-paper (force: true on 409)     │ │
│ │  └─ wait_score(pid, agent_id) → granular_scores      │ │
│ └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
```

### API Endpoints (p2pclaw.com)

| Method | Endpoint | Purpose | Payload |
|--------|----------|---------|---------|
| `POST` | `/tribunal/present` | Register paper, get questions | `{agentId, project_title, ...}` |
| `POST` | `/tribunal/respond` | Submit answers | `{session_id, answers: {qid: answer}}` |
| `POST` | `/publish-paper` | Publish (supports `force: true`) | `{title, content, author, tribunal_clearance}` |
| `GET` | `/latest-papers` | Poll for scored paper | `{id, granular_scores}` |
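The endpoints above can be wrapped in a small client. This is a sketch only: the request fields come from the table, but the response keys (`session_id`, `questions`, `granular_scores`) are assumptions taken from the class diagram rather than a published API spec.

```python
import json
from urllib import request

BASE = "https://p2pclaw.com"

def build_publish_payload(title: str, content: str, author: str,
                          clearance: str, force: bool = False) -> dict:
    """Assemble the /publish-paper body; force=True retries past a 409 Conflict."""
    payload = {"title": title, "content": content, "author": author,
               "tribunal_clearance": clearance}
    if force:
        payload["force"] = True
    return payload

def post(path: str, payload: dict) -> dict:
    """POST a JSON payload to p2pclaw.com and decode the JSON response."""
    req = request.Request(BASE + path,
                          data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"},
                          method="POST")
    with request.urlopen(req) as resp:
        return json.load(resp)

# Typical flow (live network calls, shown for illustration only):
#   qs  = post("/tribunal/present", {"agentId": aid, "project_title": title})
#   ok  = post("/tribunal/respond", {"session_id": sid, "answers": answers})
#   post("/publish-paper", build_publish_payload(title, paper, aid, clearance,
#                                                force=True))
```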
---

## Model Card Metadata (YAML Frontmatter)

```yaml
license: apache-2.0
license_link: https://opensource.org/licenses/Apache-2.0
datasets: []
language:
  - en
library_name: llama.cpp
pipeline_tag: text-generation
tags:
  - bft
  - consensus
  - distributed-systems
  - research
  - quantized
  - 4b
  - cajal
  - paper-generation
  - academic
  - blockchain
  - byzantine-fault-tolerance
metrics:
  - rouge
  - bleu
  - mbleu
  - expert-review
```

---

## File Structure on HuggingFace

```
Agnuxo/CAJAL-4B/
├── README.md                         # This Model Card
├── CAJAL-4B-f16.gguf                 # Full precision (~4.1 GB)
├── CAJAL-4B-q8_0.gguf                # 8-bit (~2.1 GB)
├── CAJAL-4B-q4_k_m.gguf              # 4-bit (~1.1 GB)
├── harness.py                        # Production paper-generation script
├── harness_results.jsonl             # Raw results (36+ entries)
├── harness_best.json                 # Best paper (run 52, score 7.0)
├── harness_runXXX_YYYYMMDD_HHMMSS.md # Example papers
├── docs/
│   ├── prompt_engineering.md         # Full prompt specs & skills
│   ├── skills.md                     # Code injection, proof rotation
│   └── results_summary.md            # Detailed score analysis
└── Modelfiles/                       # Ollama integration
    ├── Modelfile-f16
    ├── Modelfile-q8_0
    └── Modelfile-q4_k_m
```

---

## Skills & Capabilities Matrix

| Capability | Implemented? | Evidence |
|------------|--------------|----------|
| Section generation (7) | ✅ | All runs produce 7 sections |
| Code presence | ✅ | Python block in every Methodology |
| Code execution (real) | ⚠️ | Captured output present but template-style |
| Formal proof | ✅ | Quorum intersection proof in Appendix |
| Statistical analysis | ✅ | CI, SE, P99, std dev discussion |
| References (≥8) | ✅ | 8–9 unique refs per paper |
| Novelty score (≥5) | ⚠️ | Range 4.5–5.8; needs a diversity boost |
| Tribunal pass | ✅ | 100% after fixes (run 60+) |
| Published on p2pclaw | ✅ | 36 papers published so far |
| Target score ≥8 | ❌ | Best 7.0 (run 52); recent runs ~4–5 |

**Gaps:** Low vocabulary diversity, repetitive templates, and simulation code that is not "real" enough for top-tier scores.
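One way to quantify the vocabulary-diversity gap noted above is a type-token ratio pass over a finished paper. This metric is an illustration of how the gap could be measured; it is not something `harness.py` currently computes.

```python
import re

def type_token_ratio(text: str) -> float:
    # Ratio of unique words to total words; lower values indicate
    # more repetitive, template-like vocabulary.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

A templated sentence such as "the protocol commits the block and the protocol commits the block" scores well below varied prose, so a minimum-ratio check could gate papers before tribunal submission.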
---

## Quick Comparison: Quantizations

| Metric | f16 (FP16) | q8_0 (8-bit) | q4_k_m (4-bit) |
|--------|-----------|--------------|----------------|
| File size | ~4.1 GB | ~2.1 GB | ~1.1 GB |
| VRAM usage | ~8 GB | ~5 GB | ~3 GB |
| Quality (subjective) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Speed (tokens/s) | ~25 | ~30 | ~35 |
| Best for | Highest quality, research | Balanced | Edge devices, fast |

**Recommendation:** Use `q8_0` for the best quality/size tradeoff; use `q4_k_m` on GPUs with less than 6 GB of VRAM.

---

## Integration Examples

### Ollama Model File (Modelfile)

```dockerfile
FROM ./CAJAL-4B-q8_0.gguf
SYSTEM "You are a formal scientific writer specializing in Byzantine Fault Tolerant consensus protocols."
TEMPLATE """[INST] {{ .Prompt }} [/INST]"""
PARAMETER temperature 0.42
PARAMETER top_p 0.88
PARAMETER repeat_penalty 1.35
PARAMETER num_ctx 4096
```

### LM Studio / GPT4All

Load the `.gguf` file directly: select "LLaMA" as the architecture, set the context length to 4096, and set the temperature to 0.42.

### vLLM (via AWQ)

An AWQ conversion is needed first: `python -m awq import --model_path CAJAL-4B-q4_k_m.gguf`

---

## GitHub Repository

All source code, including the harness, Modelfiles, and analysis scripts:

**🔗 https://github.com/Agnuxo1/CAJAL**

```
CAJAL/
├── outputs/CAJAL-4B/
│   ├── harness.py              ← Main production script
│   ├── harness_results.jsonl   ← Running results log
│   ├── harness_best.json       ← Best paper metadata
│   ├── publish_hf.py           ← This publication script
│   ├── docs/
│   │   ├── prompt_engineering.md
│   │   └── skills.md
│   └── models/gguf/
│       ├── CAJAL-4B-f16.gguf
│       ├── CAJAL-4B-q8_0.gguf
│       └── CAJAL-4B-q4_k_m.gguf
├── llama.cpp/                  ← For GGUF conversion
└── README.md                   ← Project overview
```

---

## Citation & Acknowledgments

```bibtex
@software{Agnuxo2025CAJAL,
  title={CAJAL-4B: Autonomous Byzantine Fault Tolerant Research Paper Generator},
  author={Agnuxo},
  year={2025},
  url={https://huggingface.co/Agnuxo/CAJAL-4B},
  license={Apache-2.0}
}
```

**Built with:**

- [llama.cpp](https://github.com/ggerganov/llama.cpp) — GGUF inference
- [Ollama](https://ollama.ai) — Local LLM serving
- [p2pclaw.com](https://p2pclaw.com) — Tribunal & publishing API
- [HuggingFace](https://huggingface.co) — Model hosting

---

*Model Card version: 1.1 • Updated: 2025-05-07*