health_function_call_llama3.2_3b_gguf: GGUF

A fine-tuned Llama 3.2 3B GGUF model designed for structured function calling on healthcare edge devices. It is trained to convert natural-language health queries into JSON-based function calls.

Base Model: Llama 3.2 3B Instruct (meta-llama/Llama-3.2-3B-Instruct)

Fine-Tuning: Parameter-efficient fine-tuning (PEFT). The adapter targets all linear layers (Q, K, V, O, gate, up, down), which let the model learn the query-to-function mapping logic while keeping the adapter footprint at 10.5 MB.
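
To give a feel for why an adapter over every linear layer can stay small, the sketch below counts trainable parameters for a LoRA-style adapter. This is an illustration under assumptions: the card does not say which PEFT method or rank was used, the layer dimensions are the commonly published Llama 3.2 3B values rather than anything stated here, and the ranks in the loop are hypothetical.

```python
# Assumed Llama 3.2 3B dimensions (not stated in this model card).
HIDDEN = 3072         # hidden size
INTERMEDIATE = 8192   # MLP intermediate size
KV_DIM = 1024         # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 28       # transformer blocks

# (in_features, out_features) for each targeted linear layer in one block.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank: int) -> int:
    """Trainable parameters of a LoRA-style adapter at the given rank.

    Each adapted layer adds two low-rank matrices, A (rank x in) and
    B (out x rank), so it contributes rank * (in + out) parameters.
    """
    per_block = sum(rank * (n_in + n_out) for n_in, n_out in TARGETS.values())
    return per_block * NUM_LAYERS


# Hypothetical ranks, shown only to illustrate how the count scales.
for rank in (1, 2, 4, 8):
    print(f"rank={rank}: {lora_param_count(rank):,} trainable parameters")
```

The count grows linearly with rank and stays a small fraction of the 3B base parameters. The on-disk size additionally depends on the storage dtype, so this sketch makes no attempt to reproduce the 10.5 MB figure.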

Quantization: Exported to GGUF (Q4_K_M) format.

Dataset: The model was trained on the MindCall Dataset, a curated synthetic collection of more than 5,000 high-fidelity health interaction pairs.

🚀 Key Features

Converts user queries → structured API calls
Lightweight GGUF format (runs locally via llama.cpp)
Optimized for deterministic outputs (low temperature)
Supports reasoning via <think> tags

📦 Model Files

Llama-3.2-3B-Instruct.Q4_K_M.gguf

⚡ Quick Start (Python)

Install dependencies

pip install llama-cpp-python huggingface_hub

Load the model

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ramgovindv/health_function_call_llama3.2_3b_gguf",
    filename="Llama-3.2-3B-Instruct.Q4_K_M.gguf",
)

Inference

query = "I am feeling dizzy for 2 days"

prompt = f"""
You are an API generator.
Return JSON in this format:
{{
"name": "function_name",
"parameters": {{
    "key": "value"
}}
}}
User query:
{query}
JSON:
"""

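response = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    temperature=0.1,  # low temperature keeps the output near-deterministic
)
# The reply text contains the <think> and <function> blocks
output = response["choices"][0]["message"]["content"]
print(output)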
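
If several queries need to be sent, the prompt above can be wrapped in a small helper. The sketch below is one possible way to do that; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names introduced for illustration and are not part of the model or of llama-cpp-python.

```python
# Same prompt text as above. Doubled braces survive str.format() as
# literal { and }, so only {query} is substituted.
PROMPT_TEMPLATE = """
You are an API generator.
Return JSON in this format:
{{
"name": "function_name",
"parameters": {{
    "key": "value"
}}
}}
User query:
{query}
JSON:
"""


def build_prompt(query: str) -> str:
    """Fill the template with a single user query (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(query=query.strip())


prompt = build_prompt("I am feeling dizzy for 2 days")
print(prompt)
```

The returned string can be passed as the `content` of the user message in `create_chat_completion`, exactly as in the inference snippet.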

Output

<think>
User has dizziness → likely need blood pressure check
</think>

<function>
{
  "name": "get_blood_pressure_data",
  "parameters": {
    "num_days": 2
  }
}
</function>

<think> → reasoning
<function> → actual function call
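
Since the reply mixes reasoning and the call, downstream code usually needs to separate the two. The following is a minimal sketch of one way to do that with the standard library, assuming the output always follows the tag layout shown above; `parse_model_output` is a hypothetical helper name, and the sample string is copied from the example output.

```python
import json
import re


def parse_model_output(text):
    """Split raw model output into (reasoning, function_call).

    Returns None for a part that is missing or, for the function call,
    not valid JSON.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    func = re.search(r"<function>(.*?)</function>", text, re.DOTALL)

    reasoning = think.group(1).strip() if think else None

    call = None
    if func:
        try:
            call = json.loads(func.group(1))
        except json.JSONDecodeError:
            call = None  # malformed JSON: let the caller decide what to do
    return reasoning, call


# Sample copied from the example output above.
sample = """<think>
User has dizziness → likely need blood pressure check
</think>

<function>
{
  "name": "get_blood_pressure_data",
  "parameters": {
    "num_days": 2
  }
}
</function>"""

reasoning, call = parse_model_output(sample)
print(reasoning)
print(call["name"], call["parameters"])
```

Returning `None` instead of raising keeps the caller in control: on an edge device it may be preferable to retry or fall back rather than crash on a malformed reply.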
