athena129 committed on
Commit
e9c6861
·
verified ·
1 Parent(s): 10c8866

model card: scrub CyberSecQwen companion mentions; tighten opener; remove internal version numbers

Files changed (1)
  1. README.md +7 -14
README.md CHANGED
@@ -49,11 +49,9 @@ model-index:
49
 
50
  ## Model Information
51
 
52
- Gemma4Defense-2B is a 2.3B-parameter language model specialized for defensive cybersecurity tasks, fine-tuned from Google's [Gemma-4-E2B-it](https://huggingface.co/google/gemma-4-E2B-it). It is purpose-built for two evaluation skills measured by [CTI-Bench](https://github.com/xashru/cti-bench): mapping CVE descriptions to their CWE category (CTI-RCM) and answering cyber threat intelligence multiple-choice questions (CTI-MCQ).
53
 
54
- Under the evaluation protocol of [Foundation-Sec-8B (arXiv:2504.21039)](https://arxiv.org/abs/2504.21039), Gemma4Defense-2B retains **98.6% of Foundation-Sec-Instruct-8B's CTI-RCM accuracy** while exceeding its CTI-MCQ by **+10.5 points**, at approximately one-quarter the parameter count.
55
-
56
- A companion model trained with the same recipe on Qwen3-4B-Instruct-2507 — [CyberSecQwen-4B](https://huggingface.co/athena129/CyberSecQwen-4B) — converges to the same CTI-RCM accuracy within 0.9 points (0.6664 vs 0.6754), demonstrating that the result is recipe-driven rather than substrate-specific.
57
 
58
  | | |
59
  |---|---|
@@ -153,7 +151,7 @@ The model was trained on a combined cybersecurity corpus of approximately **12,5
153
  - **CTI-RCM 2021 (decontaminated)** — CVE → CWE classification examples drawn from MITRE/NVD public records dated 2021. Items appearing in the CTI-Bench evaluation splits were explicitly removed prior to training. (~6,776 records)
154
  - **CVE / CTI synthetic Q&A** — defensive-analyst-style cyber question–answer pairs grounded in CVE descriptions, designed to teach domain reasoning while preserving terse-answer formats. (~5,776 records)
155
 
156
- Decontamination matters here: an earlier internal version (v3) of this work showed roughly 72% test-set overlap when trained on undeduplicated CTI corpora, producing inflated CTI-RCM scores that did not generalize. The released v3.4 model trains exclusively on the 2021 cohort with overlap items removed.
157
 
158
  ### Methodology
159
 
@@ -165,7 +163,7 @@ Key methodological choices that informed the released recipe:
165
  - **Decontaminated training data.** An earlier internal iteration showed ~72% test-set overlap when trained on undeduplicated CTI corpora, producing inflated CTI-RCM scores that did not generalize. The released model trains exclusively on the 2021 cohort with CTI-Bench overlap items removed.
166
  - **Instruction-tuned base, not pre-trained base.** Direct SFT on the IT checkpoint preserves the existing format priors (terse-answer multiple-choice convention) better than SFT on the pre-trained base; comparable runs on base checkpoints showed substantial CTI-MCQ format-binding decay (~−14 to −38 pp in the worst case) at the same corpus scale.
167
  - **Multi-trial benchmarking.** All headline numbers are means of 5 independent trials with random sampling seeds at temperature 0.3; standard deviations are reported alongside.
168
- - **Cross-substrate validation.** The identical training corpus and hyperparameters were independently applied to Qwen3-4B-Instruct-2507 ([CyberSecQwen-4B](https://huggingface.co/athena129/CyberSecQwen-4B)). The two models converge to within 0.9 points on CTI-RCM, providing a built-in robustness check that the result is recipe-driven rather than substrate-specific.
169
 
170
  ### Training Setup
171
 
@@ -178,13 +176,13 @@ Key methodological choices that informed the released recipe:
178
  | Weight decay | 0.01 |
179
  | Per-device batch size | 2 |
180
  | Gradient accumulation | 8 (effective batch = 16) |
181
- | Epochs | 10 (cumulative across v3.1 → v3.4 incremental training, with adapter resumption) |
182
  | Max sequence length | 4096 |
183
  | Precision | bfloat16 |
184
  | Attention implementation | sdpa |
185
  | Random seed | 42 |
186
 
187
- Notes on attention: Gemma-4 has dual head_dim per layer (256 on sliding-attention layers, 512 on global-attention layers). On AMD MI300X (gfx942), FlashAttention-2 via Composable Kernels is bounded at head_dim=256 by the hardware shared-memory budget, so this model was trained with PyTorch's `sdpa` implementation rather than FA2. The companion CyberSecQwen-4B model uses FA2 because Qwen3-4B's head_dim=128 fits within the limit.
188
 
189
  The base model was Gemma-4-E2B-it, an instruction-tuned variant. Training was performed on AMD MI300X 192GB hardware via the AMD Developer Cloud, using PyTorch + ROCm + Hugging Face transformers, peft, and trl 0.29.1 inside the official `vllm/vllm-openai-rocm` Docker image.
190
 
@@ -221,7 +219,6 @@ All numbers below were measured by us under the protocol above (with the noted s
221
  | Foundation-Sec-Instruct-8B | 8B | **0.685** | **0.500** | 0-shot, our TARGET |
222
  | CyberPal-2.0-20B (cyber-pal-security/CyberOss-2.0-20B) | 20B | 0.728* | 0.738* | independently verified at our protocol |
223
  | **Gemma4Defense-2B** (this model) | 2.3B | **0.6754 ± 0.0035** | **0.6042 ± 0.0090** | 5-trial mean ± std |
224
- | [CyberSecQwen-4B](https://huggingface.co/athena129/CyberSecQwen-4B) (companion) | 4B | 0.6664 ± 0.0023 | 0.5868 ± 0.0029 | same recipe, different substrate |
225
  | Gemma-4-E4B-it (raw) | 5.1B effective | 0.618 | 0.666 | 0-shot |
226
  | Gemma-4-E2B-it (raw) | 2.3B | 0.580 | 0.578 | 0-shot, our base |
227
  | Gemma-4-E4B-base (raw) | 5.1B effective | 0.588 | 0.666 | 5-shot |
@@ -233,7 +230,7 @@ All numbers below were measured by us under the protocol above (with the noted s
233
 
234
  - Beats Foundation-Sec-Instruct-8B on CTI-MCQ by +10.5 points at approximately one-quarter the parameter count.
235
  - Stays within ~1 point of Foundation-Sec-Instruct-8B on CTI-RCM under the same evaluation protocol.
236
- - Cross-substrate companion ([CyberSecQwen-4B](https://huggingface.co/athena129/CyberSecQwen-4B)) reproduces the CTI-RCM result within 0.9 points using the same recipe on a different model family.
237
  - Independent reproduction of CyberPal-2.0-20B at the Foundation-Sec protocol confirms its CTI-MCQ accuracy within 2 points of its paper claim.
238
 
239
  ## Limitations
@@ -258,10 +255,6 @@ All numbers below were measured by us under the protocol above (with the noted s
258
  4. **Monitor for drift.** As new CVE / CWE patterns emerge, periodically re-evaluate; consider supplementing with retrieval over a current vulnerability knowledge base for time-sensitive queries.
259
  5. **Apply standard prompt-injection mitigations** when wrapping the model in agentic workflows that accept external content (advisory feeds, scraped pages); domain-SFT does not confer prompt-injection resistance.
260
 
261
- ## Companion Model
262
-
263
- [CyberSecQwen-4B](https://huggingface.co/athena129/CyberSecQwen-4B) is a sister release fine-tuned with the same training corpus and hyperparameters, on the Qwen3-4B-Instruct-2507 base. The two models converge to within 0.9 points on CTI-RCM (0.6754 Gemma vs 0.6664 Qwen, 5-trial mean) — the same recipe produces equivalent task performance across two distinct model families. The Qwen variant is licensed Apache 2.0 and is available for use cases where the Gemma terms are not a fit.
264
-
265
  ## Citation
266
 
267
  If you use this model, please cite:
 
49
 
50
  ## Model Information
51
 
52
+ Gemma4Defense-2B is a 2.3B-parameter language model specialized for defensive cybersecurity tasks, fine-tuned from Google's [Gemma-4-E2B-it](https://huggingface.co/google/gemma-4-E2B-it). It is specialized for two cyber threat-intelligence tasks measured by [CTI-Bench](https://github.com/xashru/cti-bench): mapping CVE descriptions to their CWE category (CTI-RCM) and answering cyber threat-intelligence multiple-choice questions (CTI-MCQ).
53
 
54
+ Under the evaluation protocol of [Foundation-Sec-8B (arXiv:2504.21039)](https://arxiv.org/abs/2504.21039), Gemma4Defense-2B **exceeds Foundation-Sec-Instruct-8B on CTI-MCQ by +10.5 points** at approximately one-quarter the parameter count, while staying within ~1 point on CTI-RCM.
 
 
55
 
56
  | | |
57
  |---|---|
 
151
  - **CTI-RCM 2021 (decontaminated)** — CVE → CWE classification examples drawn from MITRE/NVD public records dated 2021. Items appearing in the CTI-Bench evaluation splits were explicitly removed prior to training. (~6,776 records)
152
  - **CVE / CTI synthetic Q&A** — defensive-analyst-style cyber question–answer pairs grounded in CVE descriptions, designed to teach domain reasoning while preserving terse-answer formats. (~5,776 records)
153
 
154
+ Decontamination matters here: an earlier internal iteration of this work showed roughly 72% test-set overlap when trained on undeduplicated CTI corpora, producing inflated CTI-RCM scores that did not generalize. The released model trains exclusively on the 2021 cohort with overlap items removed.
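The overlap removal described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `description` field, the normalization rule, and the sample records are all hypothetical, and the card does not say how overlap was actually matched or measured.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def decontaminate(train_records, eval_records, key="description"):
    """Drop training records whose normalized text also appears in the eval split.

    Returns the kept records and the number of records removed.
    Exact-match on normalized text is an assumption made for this sketch.
    """
    eval_keys = {normalize(r[key]) for r in eval_records}
    kept = [r for r in train_records if normalize(r[key]) not in eval_keys]
    return kept, len(train_records) - len(kept)


# Hypothetical records, for illustration only.
train = [
    {"description": "Buffer overflow in foo allows remote code execution."},
    {"description": "SQL injection in bar   login form."},
    {"description": "Path traversal in baz file handler."},
]
evaluation = [
    {"description": "sql injection in bar login form."},
    {"description": "Use after free in qux renderer."},
]

kept, removed = decontaminate(train, evaluation)
print(f"kept {len(kept)} records, removed {removed}")
```

Exact matching on normalized text catches only verbatim copies; near-duplicate detection would need a fuzzier comparison, which is outside this sketch.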
155
 
156
  ### Methodology
157
 
 
163
  - **Decontaminated training data.** An earlier internal iteration showed ~72% test-set overlap when trained on undeduplicated CTI corpora, producing inflated CTI-RCM scores that did not generalize. The released model trains exclusively on the 2021 cohort with CTI-Bench overlap items removed.
164
  - **Instruction-tuned base, not pre-trained base.** Direct SFT on the IT checkpoint preserves the existing format priors (terse-answer multiple-choice convention) better than SFT on the pre-trained base; comparable runs on base checkpoints showed substantial CTI-MCQ format-binding decay (~−14 to −38 pp in the worst case) at the same corpus scale.
165
  - **Multi-trial benchmarking.** All headline numbers are means of 5 independent trials with random sampling seeds at temperature 0.3; standard deviations are reported alongside.
166
+ - **Cross-substrate validation.** The identical training corpus and hyperparameters were independently applied to a separate 4B instruction-tuned base in a different model family; the two runs converge to within 0.9 points on CTI-RCM, providing a built-in robustness check that the result is recipe-driven rather than substrate-specific.
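As a sketch of the multi-trial reporting described in the list above, the snippet below aggregates five per-trial accuracies into a mean and standard deviation. The trial scores are hypothetical placeholders, and using the sample standard deviation (`statistics.stdev`) is an assumption; the card does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(trial_scores):
    """Return (mean, sample standard deviation) over independent trials."""
    if len(trial_scores) < 2:
        raise ValueError("need at least two trials to estimate a deviation")
    return mean(trial_scores), stdev(trial_scores)


# Hypothetical per-trial accuracies, one per sampling seed.
scores = [0.672, 0.674, 0.676, 0.678, 0.680]

avg, spread = summarize(scores)
print(f"{avg:.4f} ± {spread:.4f}")
```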
167
 
168
  ### Training Setup
169
 
 
176
  | Weight decay | 0.01 |
177
  | Per-device batch size | 2 |
178
  | Gradient accumulation | 8 (effective batch = 16) |
179
+ | Epochs | 10 (cumulative incremental training with adapter resumption) |
180
  | Max sequence length | 4096 |
181
  | Precision | bfloat16 |
182
  | Attention implementation | sdpa |
183
  | Random seed | 42 |
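The batch arithmetic in the table can be checked with a short calculation. This is a sketch under stated assumptions: a single device (consistent with the stated effective batch of 16), no sequence packing, and the approximate record counts from the data section (~6,776 and ~5,776). The resulting step counts are illustrative and are not reported in the card.

```python
import math

PER_DEVICE_BATCH = 2            # from the table
GRAD_ACCUM = 8                  # from the table
EPOCHS = 10                     # from the table (cumulative)
NUM_DEVICES = 1                 # assumption: single-GPU training
CORPUS_RECORDS = 6_776 + 5_776  # approximate counts from the data section

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(CORPUS_RECORDS / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(f"effective batch: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"optimizer steps over {EPOCHS} epochs: {total_steps}")
```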
184
 
185
+ Notes on attention: Gemma-4 has dual head_dim per layer (256 on sliding-attention layers, 512 on global-attention layers). On AMD MI300X (gfx942), FlashAttention-2 via Composable Kernels is bounded at head_dim=256 by the hardware shared-memory budget, so this model was trained with PyTorch's `sdpa` implementation rather than FA2.
186
 
187
  The base model was Gemma-4-E2B-it, an instruction-tuned variant. Training was performed on AMD MI300X 192GB hardware via the AMD Developer Cloud, using PyTorch + ROCm + Hugging Face transformers, peft, and trl 0.29.1 inside the official `vllm/vllm-openai-rocm` Docker image.
188
 
 
219
  | Foundation-Sec-Instruct-8B | 8B | **0.685** | **0.500** | 0-shot, our TARGET |
220
  | CyberPal-2.0-20B (cyber-pal-security/CyberOss-2.0-20B) | 20B | 0.728* | 0.738* | independently verified at our protocol |
221
  | **Gemma4Defense-2B** (this model) | 2.3B | **0.6754 ± 0.0035** | **0.6042 ± 0.0090** | 5-trial mean ± std |
 
222
  | Gemma-4-E4B-it (raw) | 5.1B effective | 0.618 | 0.666 | 0-shot |
223
  | Gemma-4-E2B-it (raw) | 2.3B | 0.580 | 0.578 | 0-shot, our base |
224
  | Gemma-4-E4B-base (raw) | 5.1B effective | 0.588 | 0.666 | 5-shot |
 
230
 
231
  - Beats Foundation-Sec-Instruct-8B on CTI-MCQ by +10.5 points at approximately one-quarter the parameter count.
232
  - Stays within ~1 point of Foundation-Sec-Instruct-8B on CTI-RCM under the same evaluation protocol.
233
+ - The identical recipe applied to a separate 4B instruction-tuned base in a different model family reproduces the CTI-RCM result within 0.9 points, providing a built-in robustness check that the result is recipe-driven, not substrate-specific.
234
  - Independent reproduction of CyberPal-2.0-20B at the Foundation-Sec protocol confirms its CTI-MCQ accuracy within 2 points of its paper claim.
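The comparisons against Foundation-Sec-Instruct-8B in the notes above can be recomputed from the rounded table entries, as in the sketch below. The variable names are illustrative. With these rounded inputs the CTI-MCQ gain comes out at about 10.4 points; the quoted +10.5 may reflect unrounded scores, which is an assumption here.

```python
# Scores as printed in the results table (rounded).
ours_rcm, ours_mcq = 0.6754, 0.6042   # Gemma4Defense-2B, 5-trial means
ref_rcm, ref_mcq = 0.685, 0.500       # Foundation-Sec-Instruct-8B
ours_params, ref_params = 2.3e9, 8e9  # parameter counts from the table

rcm_retention = ours_rcm / ref_rcm            # share of reference CTI-RCM accuracy
rcm_gap_points = (ref_rcm - ours_rcm) * 100   # shortfall in percentage points
mcq_gain_points = (ours_mcq - ref_mcq) * 100  # gain in percentage points
param_ratio = ours_params / ref_params        # relative model size

print(f"CTI-RCM retention: {rcm_retention:.1%}")
print(f"CTI-RCM gap: {rcm_gap_points:.2f} points")
print(f"CTI-MCQ gain: {mcq_gain_points:.2f} points")
print(f"parameter ratio: {param_ratio:.2f}")
```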
235
 
236
  ## Limitations
 
255
  4. **Monitor for drift.** As new CVE / CWE patterns emerge, periodically re-evaluate; consider supplementing with retrieval over a current vulnerability knowledge base for time-sensitive queries.
256
  5. **Apply standard prompt-injection mitigations** when wrapping the model in agentic workflows that accept external content (advisory feeds, scraped pages); domain-SFT does not confer prompt-injection resistance.
257
 
 
 
 
 
258
  ## Citation
259
 
260
  If you use this model, please cite: