Kossayart/klara_ai

@@ -1,65 +1,85 @@
 ---
 language:
 - en
-license: apache-2.0
-base_model: tinyllama/tinyllama-1.1b-chat-v1.0 # Or smollm/smollm2-1.7b
 tags:
 - medical-assistant
 - lora
-- quantized
-- edge-ai
 - health-tech
 library_name: peft
 pipeline_tag: text-generation
 ---
-# Model Card for Klara-LLM-v1
-Klara-LLM-v1 is a lightweight, fine-tuned Large Language Model designed to act as the cognitive core for a smart health monitoring system. It translates physiological sensor data and crisis predictions into actionable, human-readable medical advice.
 ## Model Details
 ### Model Description
-This model is a specialized version of a small-parameter LLM (e.g., TinyLlama/SmolLM2), fine-tuned using **LoRA (Low-Rank Adaptation)**. It is specifically optimized for local deployment on edge hardware to ensure user data privacy and low-latency responses.
 - **Developed by:** Koussay Chaanbi
-- **Project Name:** Klara (formerly Lyna.ai)
 - **Model type:** Fine-tuned Causal Language Model
-- **Fine-tuning Technique:** LoRA / QLoRA
-- **Base Model:** TinyLlama-1.1B or SmolLM2-1.7B
-- **Persona:** A professional, empathetic medical assistant programmed to assist users in interpreting health metrics and managing medical crises.
 ### Model Sources
-- **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-LLM-v1)
-- **Deployment Platform:** Raspberry Pi 4/5 via Ollama or Llama.cpp
 ## Uses
 ### Direct Use
-The model is designed to:
-1. Provide context-aware medical explanations based on sensor inputs (HR, SpO2).
-2. Offer immediate guidance when a medical crisis is detected by the companion CNN-BiLSTM model.
-3. Answer user queries regarding physiological health trends.
 ### Out-of-Scope Use
-This model is not a licensed medical professional. It must not be used for life-critical decisions without human verification. It is intended for supportive health monitoring and research purposes within the Klara ecosystem.
 ## Bias, Risks, and Limitations
-- **Medical Accuracy:** While fine-tuned on medical logic, the model may occasionally hallucinate or provide generalized advice.
-- **Hardware Constraints:** Being a small-parameter model, its reasoning depth is more limited compared to larger models like Llama-3 or GPT-4.
-- **Privacy:** Designed for local inference to mitigate the risks associated with transmitting sensitive health data to the cloud.
 ## How to Get Started with the Model
-The model is typically served via **Ollama** or **llama.cpp**. You can interact with it using a structured system prompt:
-```text
-System Prompt:
-"You are Klara, a professional medical assistant created by Koussay Chaanbi.
-Your goal is to monitor the user's health using sensor data and provide
-clear, supportive advice during medical crises."

 ---
 language:
 - en
+license: llama3
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
 tags:
 - medical-assistant
 - lora
+- qlora
 - health-tech
+- medical-llm
 library_name: peft
 pipeline_tag: text-generation
 ---
+# Model Card for Klara-Llama3-8B-v1
+Klara-Llama3-8B-v1 is a medical assistant model fine-tuned from Meta's Llama 3 8B Instruct. It serves as the language interface for the Klara health monitoring ecosystem, interpreting physiological sensor data and explaining it in plain language.
 ## Model Details
 ### Model Description
+This model builds on the reasoning capabilities of Llama 3 8B to provide medical guidance. It is fine-tuned with **QLoRA**, which trains low-rank adapters on top of a 4-bit quantized base model, balancing the quality of its medical reasoning against the ability to run on consumer-grade or high-end edge hardware.
 - **Developed by:** Koussay Chaanbi
+- **Project Name:** Klara
 - **Model type:** Fine-tuned Causal Language Model
+- **Fine-tuning Technique:** QLoRA (4-bit Quantized LoRA)
+- **Base Model:** Meta-Llama-3-8B-Instruct
+- **Persona:** A professional and precise medical assistant created by Koussay Chaanbi, specialized in real-time health data analysis.
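To give a sense of why LoRA-style fine-tuning is cheap, the sketch below counts the adapter parameters that would be trained. The rank (`r = 16`) and the choice of `q_proj` and `v_proj` as target modules are assumptions for illustration only; the card does not state Klara's actual adapter settings. The layer shapes are those of Llama 3 8B (hidden size 4096, 32 layers, and a 1024-wide value projection because of grouped-query attention).

```python
# Illustrative only: rank and target modules are assumed, not taken from the Klara training setup.
HIDDEN_SIZE = 4096  # Llama 3 8B hidden dimension
KV_DIM = 1024       # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two matrices per adapted weight: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def adapter_param_count(rank: int) -> int:
    """Trainable parameters when q_proj and v_proj are adapted in every layer."""
    q_proj = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, rank)
    v_proj = lora_params(HIDDEN_SIZE, KV_DIM, rank)
    return NUM_LAYERS * (q_proj + v_proj)


if __name__ == "__main__":
    print(f"{adapter_param_count(rank=16):,} trainable parameters")
```

Under these assumptions the adapters hold about 6.8 million parameters, which is under 0.1% of the base model's roughly 8 billion weights. The frozen base weights are what QLoRA stores in 4-bit form.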
 ### Model Sources
+- **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-Llama3-8B-v1)
+- **Deployment Target:** Desktop/Edge environments using 4-bit quantization (GGUF/EXL2).
 ## Uses
 ### Direct Use
+1. **Crisis Interpretation:** Explaining the severity and nature of medical crises detected by the CNN-BiLSTM sensor model.
+2. **Contextual Health Advice:** Providing preventative advice based on trends in Heart Rate and SpO2.
+3. **Structured Reporting:** Summarizing health metrics for university or medical review.
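The uses listed above all start from sensor readings plus a crisis flag. One possible way to turn those into chat messages is sketched below. `SensorSnapshot`, its field names, and `build_messages` are hypothetical helpers, and the output format of the CNN-BiLSTM model is assumed to be a simple boolean, since the card does not describe it.

```python
from dataclasses import dataclass

SYSTEM_PROMPT = "You are Klara, a professional medical assistant created by Koussay Chaanbi."


@dataclass
class SensorSnapshot:
    """Hypothetical container for one reading; the field names are assumptions."""
    heart_rate_bpm: int
    spo2_percent: float
    crisis_detected: bool  # assumed output of the companion CNN-BiLSTM model


def build_messages(snapshot: SensorSnapshot, question: str) -> list[dict[str, str]]:
    """Combine a sensor snapshot and a user question into chat-style messages."""
    status = "A crisis was flagged" if snapshot.crisis_detected else "No crisis was flagged"
    context = (
        f"Heart rate: {snapshot.heart_rate_bpm} bpm. "
        f"SpO2: {snapshot.spo2_percent:.0f}%. "
        f"{status} by the sensor model."
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{context}\n{question}"},
    ]


if __name__ == "__main__":
    snapshot = SensorSnapshot(heart_rate_bpm=118, spo2_percent=89.0, crisis_detected=True)
    for message in build_messages(snapshot, "What are the immediate steps?"):
        print(message)
```

The returned list has the same role/content shape as the `messages` list in the getting-started snippet, so it could be passed to `tokenizer.apply_chat_template` in its place.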
 ### Out-of-Scope Use
+This model is intended for research and supportive monitoring. It is not a substitute for clinical diagnostics or emergency medical services.
 ## Bias, Risks, and Limitations
+- **Compute Requirements:** Unlike 1B models, Llama 3 8B needs roughly 16 GB of memory for its weights in 16-bit precision, so 4-bit quantization (roughly 4 to 5 GB of weights) is needed to run it on edge devices like the Raspberry Pi (8 GB of RAM recommended).
+- **Domain Specificity:** While excellent at reasoning, the model may still produce hallucinations regarding specific clinical dosages unless grounded in a RAG system.
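The compute requirement can be checked with back-of-the-envelope arithmetic. The sketch below estimates memory for the weights alone, assuming about 8.03 billion parameters for Llama 3 8B. It ignores the KV cache, activations, and the per-block scale factors that real 4-bit formats store, so actual usage will be somewhat higher.

```python
LLAMA3_8B_PARAMS = 8.03e9  # approximate parameter count of Llama 3 8B


def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit: {weight_memory_gb(LLAMA3_8B_PARAMS, bits):.1f} GB")
```

This gives roughly 16 GB at 16-bit and 4 GB at 4-bit, which is why an 8 GB device is only workable with 4-bit quantization.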
 ## How to Get Started with the Model
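+You can run this model using the `transformers` and `peft` libraries with 4-bit quantization (this requires `bitsandbytes`):
+```python
+import torch
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
+adapter_id = "Koussay/Klara-Llama3-8B-v1-LoRA"
+# Configure 4-bit loading explicitly; passing load_in_4bit straight to from_pretrained is deprecated
+bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    quantization_config=bnb_config,
+    device_map="auto",
+    torch_dtype=torch.bfloat16
+)
+# Apply the fine-tuned Klara adapters
+model = PeftModel.from_pretrained(model, adapter_id)
+messages = [
+    {"role": "system", "content": "You are Klara, a professional medical assistant created by Koussay Chaanbi."},
+    {"role": "user", "content": "The system detected a sudden drop in SpO2. What are the immediate steps?"}
+]
+# add_generation_prompt=True appends the assistant header so the model answers rather than continuing the user turn
+inputs = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
+).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"
+outputs = model.generate(**inputs, max_new_tokens=256)
+# Decode only the newly generated tokens, not the echoed prompt
+prompt_length = inputs["input_ids"].shape[-1]
+print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
+```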