Commit 12c3c9f (verified) by kashif (HF Staff) · 1 parent: 1428e66

card: add download step, point parity to ggml-org/vocabs

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -35,6 +35,12 @@ cd llama.cpp && cmake -B build && cmake --build build -j
 
 ## Usage
 
+### Download
+
+```bash
+hf download HuggingFaceBio/Carbon-3B-GGUF carbon-3b-bf16.gguf --local-dir .
+```
+
 ### Basic DNA completion
 
 ```bash
@@ -105,7 +111,7 @@ The `revision="fns"` example from the source card needs custom modeling code (fa
 
 ## Tokenization parity
 
-For every prompt in the [test fixture](https://github.com/kashif/llama.cpp/blob/carbon-3b-tokenizer/models/ggml-vocab-hybriddna.gguf.inp), llama.cpp produces byte-for-byte identical token IDs to the Python `HybridDNATokenizer` (loaded with `trust_remote_code=True`).
+llama.cpp produces byte-for-byte identical token IDs to the Python `HybridDNATokenizer` (loaded with `trust_remote_code=True`) on the standard `<dna>`/metadata/edge-case fixtures shipped in [`ggml-org/vocabs`](https://huggingface.co/ggml-org/vocabs).
 
 ## See also
 
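
The parity claim in the changed paragraph could be spot-checked with something like the sketch below. It is not part of the commit, and it assumes the fixture layout rather than confirming it: prompts in the `.inp` file are taken to be separated by a `__ggml_vocab_test__` line, and the matching `.out` file is taken to hold one line of space-separated token IDs per prompt. The helper names (`parse_prompts`, `parse_expected_ids`, `find_mismatches`) are hypothetical.

```python
from typing import List

# Assumed delimiter between prompts in a vocab test fixture (.inp file).
SEPARATOR = "\n__ggml_vocab_test__\n"


def parse_prompts(inp_text: str) -> List[str]:
    """Split the contents of an .inp fixture into individual prompts."""
    parts = inp_text.split(SEPARATOR)
    # A trailing separator leaves one empty element at the end; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


def parse_expected_ids(out_text: str) -> List[List[int]]:
    """Parse an .out fixture: one line of space-separated token IDs per prompt."""
    return [[int(tok) for tok in line.split()] for line in out_text.splitlines()]


def find_mismatches(expected: List[List[int]], actual: List[List[int]]) -> List[int]:
    """Return the indices of prompts whose token IDs differ."""
    if len(expected) != len(actual):
        raise ValueError(
            f"prompt count differs: {len(expected)} expected vs {len(actual)} actual"
        )
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
```

To use it, `actual` would be built by encoding each parsed prompt with the Python tokenizer and `expected` by parsing the IDs that llama.cpp produced for the same prompts; an empty result from `find_mismatches` means the two agree on every prompt.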