LumenSyntax committed
Commit 8a241b4 · verified · 1 parent: 42e70ca

Clean model card — remove training details

Files changed (1): README.md (+55 −86)
README.md CHANGED
@@ -1,86 +1,55 @@
- ---
- license: apache-2.0
- base_model: nvidia/Nemotron-Mini-4B-Instruct
- tags:
- - epistemological-safety
- - ai-safety
- - truth-verification
- - instrument-trap
- - logos
- - cross-family-replication
- datasets:
- - LumenSyntax/instrument-trap-benchmark
- language:
- - en
- pipeline_tag: text-generation
- ---
-
- # Logos 14 — Nemotron 4B Epistemological Auditor
-
- Cross-family replication of the Logos epistemological classifier on NVIDIA's Nemotron Mini 4B architecture.
-
- ## Model Description
-
- Logos 14 is a fine-tuned version of `nvidia/Nemotron-Mini-4B-Instruct` trained for epistemological safety classification. It detects when AI-generated content crosses epistemological boundaries — producing fact-shaped fiction, fabricating certainty, or collapsing identity constraints.
-
- This model is **thesis evidence** for the cross-family replicability of the Logos method, demonstrating that epistemological safety can be trained into models from different architecture families (Google Gemma, NVIDIA Nemotron, Stability AI StableLM).
-
- ## Training
-
- - **Method**: QLoRA (r=64, alpha=128, 1000 steps)
- - **Dataset**: 691 examples of epistemological boundary cases
- - **Hardware**: NVIDIA RTX 4060 (local)
- - **Base model**: nvidia/Nemotron-Mini-4B-Instruct
-
- ## Benchmark Results (300/300 stratified)
-
- | Metric | Score |
- |--------|-------|
- | **Behavioral accuracy** | **95.7%** [92.7, 97.5 CI] |
- | Identity collapse | 0% |
- | Fabrication | 0% |
- | False approval | 1.3% |
-
- ### Cross-Family Comparison (matched 300 items, McNemar's test)
-
- | Model | Family | Score | vs logos10v2 p-value |
- |-------|--------|-------|---------------------|
- | logos-auditor (9B) | Google Gemma 2 | 97.3% | — |
- | **logos14 (4B)** | **NVIDIA Nemotron** | **95.7%** | **p < 0.001** |
- | logos16v2 (1.6B) | Stability AI StableLM 2 | 93.0% | p < 0.001 |
- | logos10v2 (1B) | Google Gemma 3 | 72.7% | baseline |
-
- Nemotron vs StableLM: chi2=1.88, p=0.170 (statistically equivalent).
-
- ## Output Format
-
- Logos 14 produces RAW text output (99% of responses). It does not use structured tags — this is consistent with the "Token Nativity" finding where the chat template determines output format.
-
- ## Usage
-
- This model is designed to be used as an **epistemological classifier**, not a chatbot. Feed it a claim or action and it evaluates whether it crosses an epistemological boundary.
-
- ```python
- # Via Ollama (after importing GGUF)
- # Via transformers
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model = AutoModelForCausalLM.from_pretrained("LumenSyntax/logos14-nemotron-4b")
- tokenizer = AutoTokenizer.from_pretrained("LumenSyntax/logos14-nemotron-4b")
- ```
-
- ## Important Notes
-
- - This model is **thesis evidence**, not a production deployment. For production, use the Gemma-based models via [logos-firewall](https://pypi.org/project/logos-firewall/).
- - **Never force JSON output format** — it destroys the model's native reasoning capabilities.
- - The model is fine-tuned, NOT prompted. The "Three Laws" of epistemological fidelity are a training result.
-
- ## Connection to Research
-
- This model is part of the evidence for "The Instrument Trap: When Aligned Models Serve Misaligned Purposes" (DOI: [10.5281/zenodo.18716474](https://doi.org/10.5281/zenodo.18716474)).
-
- The benchmark dataset (14,950 test cases) is available at [LumenSyntax/instrument-trap-benchmark](https://huggingface.co/datasets/LumenSyntax/instrument-trap-benchmark).
-
- ## License
-
- Apache 2.0 (inherited from base model nvidia/Nemotron-Mini-4B-Instruct)
 
+ ---
+ license: apache-2.0
+ base_model: nvidia/Nemotron-Mini-4B-Instruct
+ tags:
+ - epistemological-safety
+ - ai-safety
+ - truth-verification
+ - instrument-trap
+ - logos
+ - cross-family-replication
+ datasets:
+ - LumenSyntax/instrument-trap-benchmark
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+
+ # Logos 14 — Nemotron 4B Epistemological Auditor
+
+ Cross-family replication of the Logos epistemological classifier on NVIDIA's Nemotron Mini 4B architecture. Evidence for the cross-family replicability of epistemological fine-tuning.
+
+ ## Benchmark Results (300/300 stratified)
+
+ | Metric | Score |
+ |--------|-------|
+ | **Behavioral accuracy** | **95.7%** [92.7, 97.5 CI] |
+ | Identity collapse | 0% |
+ | Fabrication | 0% |
+ | False approval | 1.3% |
+
+ ### Cross-Family Comparison
+
+ | Model | Family | Score |
+ |-------|--------|-------|
+ | logos-auditor (9B) | Google Gemma 2 | 97.3% |
+ | **logos14 (4B)** | **NVIDIA Nemotron** | **95.7%** |
+ | logos16v2 (1.6B) | Stability AI StableLM 2 | 93.0% |
+
+ Statistical equivalence between Nemotron and StableLM: chi2=1.88, p=0.170.
+
+ ## What This Model Does
+
+ Logos is an **epistemological classifier**, not a chatbot. It evaluates whether claims cross epistemological boundaries. Fine-tuned, not prompted — behavioral constraints emerge from training.
+
+ ## Access
+
+ This model requires approved access. Request access using the form above and describe your intended use case.
+
+ ## Connection to Research
+
+ This model is part of the evidence for "The Instrument Trap" (DOI: [10.5281/zenodo.18716474](https://doi.org/10.5281/zenodo.18716474)).
+
+ ## License
+
+ Apache 2.0 (inherited from base model nvidia/Nemotron-Mini-4B-Instruct)
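
This commit drops the old card's Usage snippet. For readers of this diff, a minimal classification sketch follows, assuming approved access to the gated repo and the repo id `LumenSyntax/logos14-nemotron-4b` from the old card; the example claim and prompt framing are illustrative, not the model's documented invocation format.

```python
def audit(claim: str, max_new_tokens: int = 128) -> str:
    """Return the Logos 14 auditor's raw-text verdict for a single claim."""
    # Lazy import so the sketch can be loaded without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "LumenSyntax/logos14-nemotron-4b"  # gated: request access first
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

    # Per the card's "Token Nativity" note, the chat template determines the
    # output format, so route the claim through it rather than a bare prompt.
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": claim}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    out = model.generate(inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example call (not run here; loading downloads the ~4B checkpoint):
# verdict = audit("The study proves the drug works, based on a single anecdote.")
```

Note the card's warning survives the rewrite in spirit: the model emits raw text, so the verdict should be consumed as-is rather than forced into JSON.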