Align README with paper: numbers, title, section refs
README.md
CHANGED
@@ -25,11 +25,10 @@ This is the **headline configuration on AnnoCTR** in the paper. The asymmetric-l
 
 On the **AnnoCTR** test set (33 scored documents):
 
-- **3-seed ensemble per-document F1 (τ=0.5): 63.
--
-- Exceeds CySecBERT's 62.75% reported in Buchel et al. (2025), without using CySecBERT's additional cybersecurity pre-training corpus.
+- **3-seed ensemble per-document F1 (τ=0.5): 63.53%**
+- Exceeds CySecBERT's 62.75% (Buchel et al. 2025) without CySecBERT's additional 4.3M cybersecurity pre-training texts.
 
-Full per-seed and ensemble metrics are in [`results.json`](./results.json).
+The per-seed table below shows the live artifact's individual seed F1s and ensemble F1; small variance from the headline (≤0.3 F1) reflects inference-time floating-point ordering on different hardware. Full per-seed and ensemble metrics are in [`results.json`](./results.json).
 
 ## Architecture
 
@@ -57,7 +56,7 @@ Map free-text CTI sentences to ATT&CK techniques. The model takes a single sente
 **Limitations:**
 - Trained on English-language CTI; behavior on other languages is not characterized.
 - The 118-label vocabulary is the canonical AnnoCTR set; sentences describing techniques outside this set will produce all-zero predictions.
-- AnnoCTR's extreme sparsity (78 of 113 train-present
+- AnnoCTR's extreme sparsity (78 of 113 train-present techniques have fewer than 10 positives) means rare-technique predictions are noisier than common-technique predictions. Per-technique threshold tuning (provided as an option in `inference_example.py`) does not consistently help for these ultra-rare techniques — see paper §3.1 (per-technique thresholding excluded from the recommended configuration).
 
 ## How to load and run
 
@@ -91,13 +90,13 @@ python inference_example.py
 | 42 | 59.82% | EMA |
 | 123 | 61.29% | EMA |
 | 456 | 63.57% | EMA |
-| **3-seed ensemble** | **63.
+| **3-seed ensemble** | **63.53%** | — |
 
 ## Citation
 
 ```bibtex
 @inproceedings{cassandra2026,
-title = {CASSANDRA:
+title = {CASSANDRA: How Many Parameters Suffice to Automate TTP Extractions from CTI Reports---Pushing Towards the Lower Bound},
 author = {Anonymous},
 booktitle = {Proceedings of the 2026 ACM SIGSAC Conference on Computer and Communications Security (CCS)},
 year = {2026},