Kossayart/klara_ai

@@ -11,54 +11,55 @@ tags:
 - medical-llm
 library_name: peft
 pipeline_tag: text-generation
 ---
 # Model Card for Klara-Llama3-8B-v1
-Klara-Llama3-8B-v1 is a sophisticated medical assistant model fine-tuned from Meta's Llama 3 8B. It serves as the intelligent interface for the Klara health monitoring ecosystem, providing expert-level interpretation of physiological sensor data.
 ## Model Details
 ### Model Description
-By leveraging the advanced reasoning capabilities of Llama 3 8B, this model provides high-fidelity medical guidance. It is fine-tuned using **QLoRA** to maintain a balance between high-performance medical logic and the ability to run on consumer-grade or high-end edge hardware.
 - **Developed by:** Koussay Chaanbi
 - **Project Name:** Klara
 - **Model type:** Fine-tuned Causal Language Model
 - **Fine-tuning Technique:** QLoRA (4-bit Quantized LoRA)
 - **Base Model:** Meta-Llama-3-8B-Instruct
-- **Persona:** A professional and precise medical assistant created by Koussay Chaanbi, specialized in real-time health data analysis.
 ### Model Sources
 - **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-Llama3-8B-v1)
-- **Deployment Target:** Desktop/Edge environments using 4-bit quantization (GGUF/EXL2).
 ## Uses
 ### Direct Use
-1. **Crisis Interpretation:** Explaining the severity and nature of medical crises detected by the CNN-BiLSTM sensor model.
-2. **Contextual Health Advice:** Providing preventative advice based on trends in Heart Rate and SpO2.
-3. **Structured Reporting:** Summarizing health metrics for university or medical review.
 ### Out-of-Scope Use
-This model is intended for research and supportive monitoring. It is not a substitute for clinical diagnostics or emergency medical services.
 ## Bias, Risks, and Limitations
-- **Compute Requirements:** Unlike 1B models, Llama 3 8B requires significant VRAM or 4-bit quantization to run effectively on edge devices like the Raspberry Pi (8GB recommended).
-- **Domain Specificity:** While excellent at reasoning, the model may still produce hallucinations regarding specific clinical dosages unless grounded in a RAG system.
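As a rough illustration of the compute requirement noted above, the memory needed for the weights alone can be estimated from the parameter count and the number of bits per weight. This is a back-of-the-envelope sketch, not a measurement: the parameter count of about 8.03 billion is an approximation, the helper name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and quantization overhead such as scales and zero points.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


# Assumption: approximate parameter count for Llama 3 8B.
LLAMA3_8B_PARAMS = 8.03e9

bf16_gb = weight_memory_gb(LLAMA3_8B_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(LLAMA3_8B_PARAMS, 4)   # 4-bit quantized weights

print(f"bf16 weights:  ~{bf16_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
```

Under these assumptions the weights take roughly 16 GB at 16-bit precision and roughly 4 GB at 4 bits, which is why 4-bit quantization is what makes an 8 GB device plausible at all; real usage will be higher once runtime buffers are included.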
 ## How to Get Started with the Model
-You can run this model using the `transformers` library with 4-bit quantization:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
 adapter_id = "Koussay/Klara-Llama3-8B-v1-LoRA"
@@ -71,8 +72,6 @@ model = AutoModelForCausalLM.from_pretrained(
     torch_dtype=torch.bfloat16
 )
-# Apply the fine-tuned Klara adapters
-from peft import PeftModel
 model = PeftModel.from_pretrained(model, adapter_id)
 messages = [

 - medical-llm
 library_name: peft
 pipeline_tag: text-generation
+inference: false # This line was added to disable the Inference API and avoid token consumption
 ---
 # Model Card for Klara-Llama3-8B-v1
+**Note:** This model is part of a private graduation project (PFE). Access to weights and the Inference API is restricted to authorized users only.
 ## Model Details
 ### Model Description
+Klara-Llama3-8B-v1 is a sophisticated medical assistant model fine-tuned from Meta's Llama 3 8B. It serves as the intelligent interface for the Klara health monitoring ecosystem, providing expert-level interpretation of physiological sensor data.
 - **Developed by:** Koussay Chaanbi
 - **Project Name:** Klara
 - **Model type:** Fine-tuned Causal Language Model
 - **Fine-tuning Technique:** QLoRA (4-bit Quantized LoRA)
 - **Base Model:** Meta-Llama-3-8B-Instruct
+- **Persona:** A professional and precise medical assistant specialized in real-time health data analysis.
 ### Model Sources
 - **Repository:** [Klara-Project on Hugging Face](https://huggingface.co/Koussay/Klara-Llama3-8B-v1)
+- **Deployment Target:** Desktop/Edge environments using 4-bit quantization.
 ## Uses
 ### Direct Use
+- **Crisis Interpretation:** Explaining the severity and nature of medical crises detected by companion sensor models.
+- **Contextual Health Advice:** Providing preventative advice based on physiological trends.
 ### Out-of-Scope Use
+This model is **not** a substitute for professional clinical diagnostics or emergency medical services. It is intended for research and demonstration within the Klara project framework.
 ## Bias, Risks, and Limitations
+- **Compute Requirements:** Requires significant VRAM or 4-bit quantization (GGUF/EXL2) for efficient inference.
+- **Medical Accuracy:** Users must verify all outputs; the model may hallucinate specific clinical values.
 ## How to Get Started with the Model
+*Note: Access must be requested and approved via the "Gated Access" system.*
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
+from peft import PeftModel
 model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
 adapter_id = "Koussay/Klara-Llama3-8B-v1-LoRA"
@@ -71,8 +72,6 @@ model = AutoModelForCausalLM.from_pretrained(
     torch_dtype=torch.bfloat16
 )
 model = PeftModel.from_pretrained(model, adapter_id)
 messages = [