Davis426 committed
Commit 7b86930 · verified · 1 Parent(s): 2ec559e

Update combined model card

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -22,7 +22,7 @@ tags:
  - gguf
  ---
 
- # Healthcare LLM Assistant QLoRA fine-tunes
+ # Healthcare LLM Assistant - QLoRA fine-tunes
 
  Two parallel QLoRA fine-tunes of small instruct models on the same 9,000-pair mix of public biomedical Q&A, served side-by-side in the parent project's Streamlit UI for a 3-way bake-off against GPT-5.5.
 
@@ -127,7 +127,7 @@ Detailed numbers and charts live in the parent repo:
 
  Replace `<variant>` with `qwen` or `llama32` in the examples below.
 
- ### Option 1 Ollama (recommended for local serving)
+ ### Option 1: Ollama (recommended for local serving)
 
  ```bash
  # Fetch one variant's GGUF + Modelfile
@@ -147,7 +147,7 @@ For the Llama variant, swap every `qwen` for `llama32` (paths) and the Ollama ta
 
  You can register both side-by-side; one `ollama serve` daemon handles both tags concurrently (`OLLAMA_MAX_LOADED_MODELS` defaults to 3).
 
- ### Option 2 transformers + peft (Python)
+ ### Option 2: transformers + peft (Python)
 
  ```python
  from peft import PeftModel
@@ -171,7 +171,7 @@ out = model.generate(inputs, max_new_tokens=256)
  print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
  ```
 
- ### Option 3 llama.cpp directly
+ ### Option 3: llama.cpp directly
 
  ```bash
  huggingface-cli download Davis426/COMP8420-Healthcare-LLM-Assistant \