Update README.md
README.md
CHANGED
|
@@ -1,65 +1,85 @@
|
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
- en
|
| 4 |
-
license:
|
| 5 |
-
base_model:
|
| 6 |
tags:
|
| 7 |
- medical-assistant
|
| 8 |
- lora
|
| 9 |
-
-
|
| 10 |
-
- edge-ai
|
| 11 |
- health-tech
|
|
|
|
| 12 |
library_name: peft
|
| 13 |
pipeline_tag: text-generation
|
| 14 |
---
|
| 15 |
|
| 16 |
-
# Model Card for Klara-
|
| 17 |
|
| 18 |
-
Klara-
|
| 19 |
|
| 20 |
## Model Details
|
| 21 |
|
| 22 |
### Model Description
|
| 23 |
|
| 24 |
-
|
| 25 |
|
| 26 |
- **Developed by:** Koussay Chaanbi
|
| 27 |
-
- **Project Name:** Klara
|
| 28 |
- **Model type:** Fine-tuned Causal Language Model
|
| 29 |
-
- **Fine-tuning Technique:**
|
| 30 |
-
- **Base Model:**
|
| 31 |
-
- **Persona:** A professional
|
| 32 |
|
| 33 |
### Model Sources
|
| 34 |
|
| 35 |
-
- **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-
|
| 36 |
-
- **Deployment
|
| 37 |
|
| 38 |
## Uses
|
| 39 |
|
| 40 |
### Direct Use
|
| 41 |
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
3. Answer user queries regarding physiological health trends.
|
| 46 |
|
| 47 |
### Out-of-Scope Use
|
| 48 |
|
| 49 |
-
This model is
|
| 50 |
|
| 51 |
## Bias, Risks, and Limitations
|
| 52 |
|
| 53 |
-
- **
|
| 54 |
-
- **
|
| 55 |
-
- **Privacy:** Designed for local inference to mitigate the risks associated with transmitting sensitive health data to the cloud.
|
| 56 |
|
| 57 |
## How to Get Started with the Model
|
| 58 |
|
| 59 |
-
|
| 60 |
|
| 61 |
-
```
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 1 |
---
|
| 2 |
language:
|
| 3 |
- en
|
| 4 |
+
license: llama3
|
| 5 |
+
base_model: meta-llama/Meta-Llama-3-8B-Instruct
|
| 6 |
tags:
|
| 7 |
- medical-assistant
|
| 8 |
- lora
|
| 9 |
+
- qlora
|
|
|
|
| 10 |
- health-tech
|
| 11 |
+
- medical-llm
|
| 12 |
library_name: peft
|
| 13 |
pipeline_tag: text-generation
|
| 14 |
---
|
| 15 |
|
| 16 |
+
# Model Card for Klara-Llama3-8B-v1
|
| 17 |
|
| 18 |
+
Klara-Llama3-8B-v1 is a medical assistant model fine-tuned from Meta's Llama 3 8B Instruct. It serves as the natural-language interface for the Klara health monitoring ecosystem, interpreting physiological sensor data for the user.
|
| 19 |
|
| 20 |
## Model Details
|
| 21 |
|
| 22 |
### Model Description
|
| 23 |
|
| 24 |
+
This model builds on the reasoning capabilities of Llama 3 8B to provide medical guidance. It is fine-tuned with **QLoRA**, which keeps training and inference within reach of consumer-grade or high-end edge hardware while retaining the base model's capabilities.
|
| 25 |
|
| 26 |
- **Developed by:** Koussay Chaanbi
|
| 27 |
+
- **Project Name:** Klara
|
| 28 |
- **Model type:** Fine-tuned Causal Language Model
|
| 29 |
+
- **Fine-tuning Technique:** QLoRA (4-bit Quantized LoRA)
|
| 30 |
+
- **Base Model:** Meta-Llama-3-8B-Instruct
|
| 31 |
+
- **Persona:** A professional and precise medical assistant created by Koussay Chaanbi, specialized in real-time health data analysis.
|
| 32 |
|
| 33 |
### Model Sources
|
| 34 |
|
| 35 |
+
- **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-Llama3-8B-v1)
|
| 36 |
+
- **Deployment Target:** Desktop/Edge environments using 4-bit quantization (GGUF/EXL2).
|
| 37 |
|
| 38 |
## Uses
|
| 39 |
|
| 40 |
### Direct Use
|
| 41 |
|
| 42 |
+
1. **Crisis Interpretation:** Explaining the severity and nature of medical crises detected by the CNN-BiLSTM sensor model.
|
| 43 |
+
2. **Contextual Health Advice:** Providing preventative advice based on trends in Heart Rate and SpO2.
|
| 44 |
+
3. **Structured Reporting:** Summarizing health metrics for university or medical review.
|
|
|
|
| 45 |
|
| 46 |
### Out-of-Scope Use
|
| 47 |
|
| 48 |
+
This model is intended for research and supportive monitoring. It is not a substitute for clinical diagnostics or emergency medical services.
|
| 49 |
|
| 50 |
## Bias, Risks, and Limitations
|
| 51 |
|
| 52 |
+
- **Compute Requirements:** Unlike 1B models, Llama 3 8B needs roughly 16 GB of memory for its weights in 16-bit precision. 4-bit quantization reduces this to roughly 4-5 GB, which is what makes edge devices such as the Raspberry Pi feasible (8GB recommended).
|
| 53 |
+
- **Domain Specificity:** Although the model reasons well, it may still hallucinate specific clinical dosages unless its answers are grounded in a retrieval-augmented generation (RAG) system.
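The compute requirement above can be sanity-checked with simple arithmetic. The following is a minimal sketch, not part of the original README: the helper name `weight_memory_gib` is hypothetical, and the parameter count is the nominal "8B" figure rather than the exact one. It covers the weights only and ignores the KV cache, activations, and quantization metadata, so real usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: nominal parameter count implied by the "8B" model name.
N_PARAMS = 8.0e9

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

The 4-bit figure is what leaves headroom on an 8 GB device; the 16-bit figure does not fit at all.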
|
|
|
|
| 54 |
|
| 55 |
## How to Get Started with the Model
|
| 56 |
|
| 57 |
+
Load the base model in 4-bit with the `transformers` library (this requires `bitsandbytes`), then apply the Klara adapter with `peft`:
|
| 58 |
|
| 59 |
+
```python
|
| 60 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
|
| 61 |
+
import torch
|
| 62 |
+
|
| 63 |
+
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
|
| 64 |
+
adapter_id = "Koussay/Klara-Llama3-8B-v1-LoRA"
|
| 65 |
+
|
| 66 |
+
tokenizer = AutoTokenizer.from_pretrained(model_id)
|
| 67 |
+
model = AutoModelForCausalLM.from_pretrained(
|
| 68 |
+
model_id,
|
| 69 |
+
quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # replaces the deprecated load_in_4bit kwarg
|
| 70 |
+
device_map="auto",
|
| 71 |
+
torch_dtype=torch.bfloat16
|
| 72 |
+
)
|
| 73 |
+
|
| 74 |
+
# Apply the fine-tuned Klara adapters
|
| 75 |
+
from peft import PeftModel
|
| 76 |
+
model = PeftModel.from_pretrained(model, adapter_id)
|
| 77 |
+
|
| 78 |
+
messages = [
|
| 79 |
+
{"role": "system", "content": "You are Klara, a professional medical assistant created by Koussay Chaanbi."},
|
| 80 |
+
{"role": "user", "content": "The system detected a sudden drop in SpO2. What are the immediate steps?"}
|
| 81 |
+
]
|
| 82 |
+
|
| 83 |
+
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
|
| 84 |
+
outputs = model.generate(inputs, max_new_tokens=256)
|
| 85 |
+
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
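The `messages` list in the snippet above is hard-coded. In practice the user turn would presumably be built from the readings that triggered the alert. The following is a minimal sketch of that step, not the author's implementation: the helper `build_messages`, its parameters, and the sample readings are all hypothetical and chosen only for illustration. The system prompt is the one from the README.

```python
from typing import Dict, List

# System prompt taken verbatim from the README example.
SYSTEM_PROMPT = "You are Klara, a professional medical assistant created by Koussay Chaanbi."


def build_messages(heart_rate_bpm: int, spo2_percent: float, event: str) -> List[Dict[str, str]]:
    """Build a chat-template-ready message list from one sensor reading (hypothetical helper)."""
    user_content = (
        f"The system detected {event}. "
        f"Current readings: heart rate {heart_rate_bpm} bpm, SpO2 {spo2_percent:.0f}%. "
        "What are the immediate steps?"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


# Illustrative values only; these are not real measurements.
messages = build_messages(heart_rate_bpm=112, spo2_percent=89, event="a sudden drop in SpO2")
print(messages[1]["content"])
```

The returned list has the same shape as the hard-coded `messages`, so it can be passed to `tokenizer.apply_chat_template` unchanged.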