---
language:
- en
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
- medical-assistant
- lora
- qlora
- health-tech
- medical-llm
library_name: peft
pipeline_tag: text-generation
inference: false # Added to disable the Inference API and avoid token consumption
---
# Model Card for Klara-Llama3-8B-v1
**Note:** This model is part of a private graduation project (PFE). Access to weights and the Inference API is restricted to authorized users only.
## Model Details
### Model Description
Klara-Llama3-8B-v1 is a medical assistant model fine-tuned from Meta's Llama 3 8B Instruct. It serves as the natural-language interface for the Klara health monitoring ecosystem, interpreting physiological sensor data for the user.
- **Developed by:** Koussay Chaanbi
- **Project Name:** Klara
- **Model type:** Fine-tuned Causal Language Model
- **Fine-tuning Technique:** QLoRA (4-bit Quantized LoRA)
- **Base Model:** Meta-Llama-3-8B-Instruct
- **Persona:** A professional and precise medical assistant specialized in real-time health data analysis.
### Model Sources
- **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-Llama3-8B-v1)
- **Deployment Target:** Desktop/Edge environments using 4-bit quantization.
## Uses
### Direct Use
- **Crisis Interpretation:** Explaining the severity and nature of medical crises detected by companion sensor models.
- **Contextual Health Advice:** Providing preventative advice based on physiological trends.
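As a rough illustration of the crisis-interpretation flow, the sketch below turns a sensor alert into the chat messages passed to the model. The `SensorAlert` fields, the `build_messages` helper, and the example reading are hypothetical and not part of the Klara codebase; only the system prompt is taken from the usage example in this card.

```python
from dataclasses import dataclass

SYSTEM_PROMPT = "You are Klara, a professional medical assistant created by Koussay Chaanbi."


@dataclass
class SensorAlert:
    """Hypothetical alert emitted by a companion sensor model."""
    metric: str   # e.g. "SpO2"
    value: float  # latest reading (illustrative value below)
    unit: str     # e.g. "%"
    trend: str    # e.g. "sudden drop"


def build_messages(alert: SensorAlert) -> list[dict[str, str]]:
    """Format an alert as chat messages suitable for a chat template."""
    user_text = (
        f"The system detected a {alert.trend} in {alert.metric} "
        f"(latest reading: {alert.value:g} {alert.unit}). "
        "What are the immediate steps?"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]


# Made-up reading, used only to show the message shape.
messages = build_messages(SensorAlert("SpO2", 88.0, "%", "sudden drop"))
```

Keeping the alert as structured data and building the prompt in one place makes it easier to review exactly which values the model was shown when checking its answer.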
### Out-of-Scope Use
This model is **not** a substitute for professional clinical diagnostics or emergency medical services. It is intended for research and demonstration within the Klara project framework.
## Bias, Risks, and Limitations
- **Compute Requirements:** Running the model in 16-bit precision requires a GPU with substantial VRAM; 4-bit quantization (for example GGUF or EXL2) reduces the footprint for desktop and edge inference.
- **Medical Accuracy:** Users must verify all outputs; the model may hallucinate specific clinical values.
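To put the compute requirement in rough numbers, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes roughly 8 billion parameters and ignores the KV cache, activations, quantization metadata, and the LoRA adapter, so real usage will be higher. The helper name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3


N_PARAMS = 8e9  # assumption: about 8 billion parameters for Llama 3 8B

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights take about 14.9 GiB in bf16 and about 3.7 GiB at 4 bits, which is why 4-bit loading is the practical option on desktop-class GPUs.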
## How to Get Started with the Model
*Note: Access must be requested and approved via the "Gated Access" system.*
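```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Koussay/Klara-Llama3-8B-v1-LoRA"

# Load the base model in 4-bit (needs the bitsandbytes package and a CUDA GPU).
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach the Klara LoRA adapter to the quantized base model.
model = PeftModel.from_pretrained(model, adapter_id)

messages = [
    {"role": "system", "content": "You are Klara, a professional medical assistant created by Koussay Chaanbi."},
    {"role": "user", "content": "The system detected a sudden drop in SpO2. What are the immediate steps?"},
]

# add_generation_prompt=True appends the assistant header so the model replies.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```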