Commit b9cd5f3 by kashif (HF Staff) · verified · 1 parent: 37becc5

card: add metadata conditioning, YaRN, likelihood, FNS note

Files changed (1)
  1. README.md +42 -0
README.md CHANGED
@@ -43,6 +43,14 @@ cd llama.cpp && cmake -B build && cmake --build build -j
43
  -n 64 --temp 0 -no-cnv
44
  ```
45
 
46
  ### Speculative decoding with Carbon-500M draft (~2x speedup)
47
 
48
  The 500M shares the HybridDNA vocab, so it's a near-ideal draft. Measured ~2.1x speedup at temp=0 with 87% accept rate on DNA prompts. Grab the draft GGUF first:
@@ -61,6 +69,40 @@ Then run with `--model-draft`:
61
  -n 256 --temp 0
62
  ```
63
 
64
  ## See also
65
 
66
  - Source weights: [HuggingFaceBio/Carbon-8B](https://huggingface.co/HuggingFaceBio/Carbon-8B)
 
43
  -n 64 --temp 0 -no-cnv
44
  ```
45
 
46
+ ### Metadata-conditioned generation
47
+
48
+ ```bash
49
+ ./build/bin/llama-completion -m carbon-8b-bf16.gguf \
50
+ -p '<vertebrate_mammalian><protein_coding_region><dna>ATGCGCTAG' \
51
+ -n 64 --temp 0 -no-cnv
52
+ ```
53
+
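The prompt in the command above is a run of metadata tags followed by an opening `<dna>` tag and the seed sequence. A minimal Python sketch of assembling such a prompt is shown here; `build_prompt` is a hypothetical helper, and the two tag names are simply the ones used in the command (the diff does not list which other tags the model accepts).

```python
def build_prompt(tags, sequence):
    """Concatenate metadata tags, an opening <dna> tag, and a seed sequence.

    The <dna> tag is left open so the model continues the sequence.
    """
    prefix = "".join(f"<{tag}>" for tag in tags)
    return f"{prefix}<dna>{sequence}"


# Tag names copied from the README example; other tags are not confirmed here.
prompt = build_prompt(["vertebrate_mammalian", "protein_coding_region"], "ATGCGCTAG")
print(prompt)
```

The resulting string can be passed to `-p` in place of the hand-written prompt.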
54
  ### Speculative decoding with Carbon-500M draft (~2x speedup)
55
 
56
  The 500M shares the HybridDNA vocab, so it's a near-ideal draft. Measured ~2.1x speedup at temp=0 with 87% accept rate on DNA prompts. Grab the draft GGUF first:
 
69
  -n 256 --temp 0
70
  ```
71
 
72
+ ### Likelihood scoring
73
+
74
+ The source card's Python `score()` function computes the mean log-probability per DNA token. In llama.cpp the closest tool for corpus-level scoring is `llama-perplexity`, which reports perplexity (`perplexity = exp(-mean_logprob)`):
75
+
76
+ ```bash
77
+ # one prompt per line in dna_corpus.txt, each wrapped in <dna>...</dna>
78
+ ./build/bin/llama-perplexity -m carbon-8b-bf16.gguf -f dna_corpus.txt --ppl-stride 0
79
+ ```
80
+
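To compare a perplexity figure with a mean log-probability, the relation quoted above can be applied in either direction. The sketch below assumes natural-log probabilities; the function names are illustrative and not part of llama.cpp or the source card.

```python
import math


def perplexity_from_logprobs(logprobs):
    """perplexity = exp(-mean_logprob), with natural-log token probabilities."""
    mean_logprob = sum(logprobs) / len(logprobs)
    return math.exp(-mean_logprob)


def mean_logprob_from_perplexity(perplexity):
    """Invert the relation: mean_logprob = -ln(perplexity)."""
    return -math.log(perplexity)


# Toy input: four tokens, each assigned probability 0.25.
toy_logprobs = [math.log(0.25)] * 4
print(perplexity_from_logprobs(toy_logprobs))
print(mean_logprob_from_perplexity(4.0))
```

A model that spreads probability uniformly over four choices at every step has perplexity 4, which is what the toy input encodes.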
81
+ Or use `llama-server` and request per-token probabilities with `n_probs` on the `/completion` endpoint:
82
+
83
+ ```bash
84
+ ./build/bin/llama-server -m carbon-8b-bf16.gguf --port 8080 &
85
+ curl -s http://localhost:8080/completion -d '{
86
+ "prompt": "<dna>GGGCTATAAAGGCCATCGATCGATCGATCGATCGATCGATCG</dna>",
87
+ "n_predict": 0,
88
+ "n_probs": 1
89
+ }' | jq '.completion_probabilities'
90
+ ```
91
+
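Per-token entries can be reduced to a single mean log-probability, the quantity the source card's `score()` is described as computing. The sketch below assumes each entry carries either a `logprob` or a `prob` field; the exact response schema differs between llama.cpp versions, so treat the field names and the sample data as hypothetical.

```python
import math


def mean_logprob(entries):
    """Average natural-log probability over a list of per-token entries.

    Each entry is assumed to hold 'logprob' (already a log) or 'prob'
    (a linear probability); both field names are assumptions.
    """
    logprobs = []
    for entry in entries:
        if "logprob" in entry:
            logprobs.append(entry["logprob"])
        elif "prob" in entry:
            logprobs.append(math.log(entry["prob"]))
        else:
            raise KeyError("entry has neither 'logprob' nor 'prob'")
    if not logprobs:
        raise ValueError("no token entries to average")
    return sum(logprobs) / len(logprobs)


# Hypothetical entries, not real model output.
sample = [{"logprob": -1.0}, {"prob": math.exp(-3.0)}]
print(mean_logprob(sample))
```

The explicit check for an empty list matters in practice: if the response contains no token entries, there is nothing to score.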
92
+ ### Long context with YaRN (65k tokens ≈ 393 kbp)
93
+
94
+ Mirrors the source card's `rope_scaling = {type: yarn, factor: 4.0, original_max_position_embeddings: 32768}`:
95
+
96
+ ```bash
97
+ ./build/bin/llama-completion -m carbon-8b-bf16.gguf \
98
+ -c 65536 --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768 \
99
+ -p '<dna>...' -n 64 --temp 0 -no-cnv
100
+ ```
101
+
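The numbers in this section can be sanity-checked with a few lines of arithmetic. The sketch assumes roughly 6 bp per token, which is inferred from the heading (65k tokens ≈ 393 kbp) and not stated elsewhere in the diff, and it uses the usual reading of YaRN in which the scaled window is `factor × original_max_position_embeddings`.

```python
ORIG_CTX = 32768       # --yarn-orig-ctx / original_max_position_embeddings
YARN_FACTOR = 4.0      # --rope-scale / rope_scaling factor
REQUESTED_CTX = 65536  # -c value from the command above
BP_PER_TOKEN = 6       # assumption inferred from "65k tokens ≈ 393 kbp"

# Upper bound on context implied by the scaling settings.
scaled_ctx = int(ORIG_CTX * YARN_FACTOR)

# Approximate DNA span covered by the requested context.
approx_bp = REQUESTED_CTX * BP_PER_TOKEN

print(f"scaled window: {scaled_ctx} tokens")
print(f"requested context: {REQUESTED_CTX} tokens ≈ {approx_bp / 1000:.0f} kbp")
print(f"fits within scaled window: {REQUESTED_CTX <= scaled_ctx}")
```

Under these assumptions the requested 65,536-token context sits inside the scaled window, so the `-c` value is a choice of memory budget and not a limit imposed by the scaling factor.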
102
+ ### Base-pair-level generation (FNS branch) — not supported
103
+
104
+ The `revision="fns"` example from the source card needs custom modeling code (factorized nucleotide supervision head), which only the Python transformers path can load. llama.cpp can't run that branch.
105
+
106
  ## See also
107
 
108
  - Source weights: [HuggingFaceBio/Carbon-8B](https://huggingface.co/HuggingFaceBio/Carbon-8B)