# OpenVINO Version of Llama-3.2-1B-Indian-History
Direct OpenVINO IR export of wizardoftrap/Llama-3.2-1B-Indian-history via Optimum.
## Files

- `openvino_model.xml`
- `openvino_model.bin`
- `openvino_config.json`
- tokenizer files
## Run Server

Make sure native OVMS is installed (see the OVMS documentation).

Pull the model:

```shell
ovms --pull --source_model wizardoftrap/Llama-3.2-1B-Indian-history-openvino --model_repository_path /OpenVinoSP/models --model_name Llama-3.2-1B-Indian-history-openvino --target_device GPU --task text_generation
```

Start the server:

```shell
ovms --model_path models/wizardoftrap/Llama-3.2-1B-Indian-history-openvino --model_name Llama-3.2-1B-Indian-history-openvino --port 9000 --rest_port 8000 --log_level DEBUG
```
## Usage

```python
from openai import OpenAI

client = OpenAI(
    api_key="not-needed",                 # OVMS does not require an API key
    base_url="http://localhost:8000/v3",  # REST endpoint on port 8000
)

MODEL_NAME = "Llama-3.2-1B-Indian-history-openvino"
SYSTEM_PROMPT = "You are a helpful assistant for Indian history."  # example system prompt

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Hey tell me about Jaliawala bagh"},
    ],
    temperature=0.7,
    max_tokens=512,
    top_p=0.9,
)

print(response.choices[0].message.content)
```
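Under the hood, the OpenAI client simply POSTs a JSON payload to the server's chat/completions endpoint (`http://localhost:8000/v3/chat/completions`, given the base URL above). A minimal sketch of that wire format, assuming the server from the previous section is running; the prompt strings here are illustrative, not part of this model card:

```python
import json

# Build the same chat-completions payload the OpenAI client sends to OVMS.
payload = {
    "model": "Llama-3.2-1B-Indian-history-openvino",
    "messages": [
        # Illustrative prompts (assumptions, not from the model card):
        {"role": "system", "content": "You are a helpful assistant for Indian history."},
        {"role": "user", "content": "Hey tell me about Jaliawala bagh"},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "top_p": 0.9,
}
body = json.dumps(payload)

# With the server running, the equivalent raw request would be:
# requests.post("http://localhost:8000/v3/chat/completions",
#               data=body, headers={"Content-Type": "application/json"})
```

This can be useful for clients that do not use the `openai` package, or for debugging requests with curl.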
## Model tree for wizardoftrap/Llama-3.2-1B-Indian-history-openvino

- Base model: meta-llama/Llama-3.2-1B-Instruct
- Finetuned: unsloth/Llama-3.2-1B-Instruct
- Finetuned: wizardoftrap/Llama-3.2-1B-Indian-history