kashif HF Staff committed on
Commit
6f43ce0
verified
1 Parent(s): 536f259

Add model card

Files changed (1): README.md (added, +65 -0)
---
license: apache-2.0
library_name: gguf
base_model: HuggingFaceBio/Carbon-8B
language:
- dna
tags:
- dna
- genomic
- llama.cpp
- gguf
- hybriddna
---

# Carbon-8B GGUF

GGUF (bf16) conversion of [HuggingFaceBio/Carbon-8B](https://huggingface.co/HuggingFaceBio/Carbon-8B) for use with [llama.cpp](https://github.com/ggml-org/llama.cpp).

Carbon is a hybrid DNA / English language model that switches between Qwen3-4B-Base byte-level BPE for natural text and fixed 6-mer chunking for DNA inside `<dna>...</dna>` tags.

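To make the fixed 6-mer chunking concrete, here is a minimal shell sketch that splits a DNA string into consecutive, non-overlapping 6-character chunks. It only illustrates the idea and is not the HybridDNATokenizer implementation: the function name `chunk_6mers` is made up for this example, and how the real tokenizer treats a trailing chunk shorter than 6 bases is not stated here, so the sketch simply prints it unchanged.

```bash
# Illustrative only: split a DNA string into consecutive 6-character chunks.
# Assumes non-overlapping chunks; a trailing partial chunk is printed as-is.
chunk_6mers() {
  printf '%s\n' "$1" | fold -w 6
}

# Prints one chunk per line: ATGCGC, TAGCTA, CGATCG
chunk_6mers ATGCGCTAGCTACGATCG
```

Under this reading, an 18-base sequence maps to 3 chunks rather than 18 single-base tokens.
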
## Requires llama.cpp with HybridDNATokenizer support

Loading this GGUF requires a llama.cpp build that defines `LLAMA_VOCAB_TYPE_HYBRIDDNA`, which is not yet in upstream llama.cpp. Until the PR merges, build from the [`carbon-3b-tokenizer`](https://github.com/kashif/llama.cpp/tree/carbon-3b-tokenizer) branch:

```bash
git clone -b carbon-3b-tokenizer https://github.com/kashif/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build -j
```

## Files

| File | Quant | Size |
|---|---|---|
| `carbon-8b-bf16.gguf` | bf16 (lossless from source) | 16 GB |

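One way to fetch the file is the Hugging Face CLI (installed with `pip install -U "huggingface_hub[cli]"`). The sketch below assumes the file is hosted in the `HuggingFaceBio/Carbon-8B-GGUF` repository; the helper name `download_carbon_gguf` is hypothetical and exists only for this example.

```bash
# Sketch: download the GGUF listed in the table with the Hugging Face CLI.
# Assumption: the file lives in the HuggingFaceBio/Carbon-8B-GGUF repo.
download_carbon_gguf() {
  dest="${1:-.}"  # target directory, defaults to the current one
  huggingface-cli download HuggingFaceBio/Carbon-8B-GGUF \
    carbon-8b-bf16.gguf --local-dir "$dest"
}

# Usage (transfers about 16 GB):
#   download_carbon_gguf ./models
```

The function is only defined here, not called, so sourcing the snippet does not start a download.
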
## Usage

### Basic DNA completion

```bash
./build/bin/llama-completion -m carbon-8b-bf16.gguf \
  -p '<dna>ATGCGCTAGCTACGATCGATCGTAGCTAGCTAGCTAGCTACG' \
  -n 64 --temp 0 -no-cnv
```

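If the sequence lives in a FASTA file rather than on the command line, a small helper can build the prompt string. This is a sketch under stated assumptions: the name `fasta_to_prompt` and the file `seq.fa` are hypothetical, the input holds a single record, and the model is assumed to expect uppercase bases with no whitespace. The `<dna>` tag is left open, as in the command above, so the model continues the sequence.

```bash
# Sketch: turn a single-record FASTA on stdin into a '<dna>...' prompt.
# Assumptions: header lines start with '>', and uppercase bases without
# whitespace are what the model expects.
fasta_to_prompt() {
  seq=$(grep -v '^>' | tr -d ' \t\r\n' | tr '[:lower:]' '[:upper:]')
  printf '<dna>%s' "$seq"
}

# Prints: <dna>ATGCGCTAGCTA
printf '>example\natgcgc\nTAGCTA\n' | fasta_to_prompt

# Possible use with the command above (seq.fa is a placeholder):
#   ./build/bin/llama-completion -m carbon-8b-bf16.gguf \
#     -p "$(fasta_to_prompt < seq.fa)" -n 64 --temp 0 -no-cnv
```
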
### Speculative decoding with Carbon-500M draft (~2x speedup)

Carbon-500M shares the HybridDNA vocab with Carbon-8B, so it is a near-ideal draft model. Measured speedup is ~2.1x at temp=0, with an 87% accept rate on DNA prompts:

```bash
./build/bin/llama-speculative \
  -m carbon-8b-bf16.gguf \
  -md carbon-500m-bf16.gguf \
  -p '<dna>ATGCGCTAGCTACGATCGATCGTAGCTAGCTAGCTAGCTACG' \
  -n 256 --temp 0
```

## See also

- Source weights: [HuggingFaceBio/Carbon-8B](https://huggingface.co/HuggingFaceBio/Carbon-8B)
- Other GGUF variants: [500M](https://huggingface.co/HuggingFaceBio/Carbon-500M-GGUF) · [3B](https://huggingface.co/HuggingFaceBio/Carbon-3B-GGUF) · [8B](https://huggingface.co/HuggingFaceBio/Carbon-8B-GGUF)

## License

Apache-2.0, inherited from the source model.