---
license: apache-2.0
base_model: google/gemma-4-e4b
tags:
- bible
- theology
- gemma
- gguf
- ollama
- cpt
- sft
- dpo
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# BibleAI

BibleAI is a Gemma 4 E4B model refined for Bible, theology, church history, and faith Q&A using a full CPT -> SFT -> DPO pipeline.

## Identity

- Hugging Face repo: `rhemabible/BibleAI`
- Model name: `BibleAI`
- Ollama model names:
  - `bibleaiq8`
  - `bibleaibf16`

## Training Summary

### Stage 1: CPT Foundation

- Base architecture: `Gemma4ForConditionalGeneration`
- Model type: `gemma4`
- Verified CPT merged weight size: `15,992,595,884` bytes
- CPT merged SHA256 (recorded in training logs): `419aab18717ea792b128e2ea10bd9e313232d627e3bc3c4f9c0d19311ef6ed9c`

### Stage 2: SFT (Instruction Tuning)

- Data source: `combined_train.jsonl`
- Training examples: `15,289`
- Eval examples: `1,601`
- Epochs: `3`
- LoRA rank: `64`
- Batch size per device: `4`
- Gradient accumulation steps: `4`
- Effective total batch size: `16`
- Trainable parameters: `169,607,168 / 8,165,763,616 (2.08%)`
- Final eval loss: `0.4368`
- Final train loss: `0.1852`

### Stage 3: DPO (Preference Optimization)

- Data source: `dpo_pairs.jsonl`
- Preference pairs: `967`
- Epochs: `2`
- DPO beta: `0.1`
- Learning rate: `5e-06`
- LoRA rank: `32`
- Batch size per device: `2`
- Gradient accumulation steps: `4`
- Effective total batch size: `8`
- Trainable parameters: `84,803,584 / 8,080,960,032 (1.05%)`
- Final train loss: `0.06077`

## System Prompt

```text
You are BibleAI.

Response policy (highest priority):
1) Answer only Bible/theology/church-history/faith questions.
2) Be concise by default.
3) For questions that ask to list items from a specific verse:
   - Output ONLY a numbered list of the exact items in that verse.
   - Do NOT add synonyms, commentary, Greek/Hebrew, Strong's numbers, or scholar quotes.
   - Add one final line with the verse reference.
4) Do not fabricate verses, facts, or language details. If uncertain, say so.
5) If the user asks for deeper analysis, then provide it.
```

## Chat Template

```text
{{- if .System }}system
{{ .System }}
{{- end }}user
{{ .Prompt }}
model
```

Template files in this release:

- `ollama/Modelfile.q8`
- `ollama/Modelfile.bf16`
- `adapters/sft_final/chat_template.jinja`
- `adapters/dpo_final/chat_template.jinja`
- `ollama/Modelfile.canonical_project_reference`

## Model Variants

- `model.safetensors` (merged HF weights)
- `gguf/final_merged.Q8_0.gguf`
- `gguf/final_merged.BF16.gguf`

## Checksums

- `model.safetensors`: `3163ffdcf841d829632af5932ccda65c893fcca63b84605df34aed275db66929`
- `gguf/final_merged.Q8_0.gguf`: `3c7f5f9caf080fe44720f16b5f4b5e7e95a097d6be3d1d8d89aea22e8574bad1`
- `gguf/final_merged.BF16.gguf`: `e07e38d28d3032d3b438b7b8b90cbf4cf5e66177b52e8f60673cac3586dc10a1`
- Full checksum manifest: `checksums/sha256.txt`

## Quickstart

### Ollama

```bash
ollama create bibleaiq8 -f ollama/Modelfile.q8
ollama create bibleaibf16 -f ollama/Modelfile.bf16
```

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "rhemabible/BibleAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
)
```

## Included Release Artifacts

- Root model files: `config.json`, `model.safetensors`, `tokenizer.json`, `tokenizer_config.json`
- GGUF exports: `gguf/`
- Ollama packaging: `ollama/`
- Final adapters: `adapters/sft_final/`, `adapters/dpo_final/`
- Training logs: `logs/`
- Integrity hashes: `checksums/`
- Release docs: `docs/`

## Intended Scope

- Bible study and scripture-centered theological support
- Church history and faith-oriented Q&A
- Citation-oriented responses that avoid fabricated references
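The SHA-256 values listed under Checksums can be verified locally before loading the weights, guarding against corrupted or partial downloads. A minimal sketch using Python's standard `hashlib` (file names are assumed to be relative to the repository root; `checksums/sha256.txt` remains the authoritative manifest):

```python
import hashlib
import os

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks so large
    weight files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Expected digests copied from the Checksums section above.
EXPECTED = {
    "model.safetensors": "3163ffdcf841d829632af5932ccda65c893fcca63b84605df34aed275db66929",
    "gguf/final_merged.Q8_0.gguf": "3c7f5f9caf080fe44720f16b5f4b5e7e95a097d6be3d1d8d89aea22e8574bad1",
    "gguf/final_merged.BF16.gguf": "e07e38d28d3032d3b438b7b8b90cbf4cf5e66177b52e8f60673cac3586dc10a1",
}

if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        if not os.path.exists(name):
            print(f"{name}: not downloaded, skipped")
            continue
        status = "OK" if sha256_of_file(name) == expected else "MISMATCH"
        print(f"{name}: {status}")
```

Any `MISMATCH` result means the file should be re-downloaded before use.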