OpenVINO Version of Llama-3.2-1B-Indian-History

Direct OpenVINO IR export of wizardoftrap/Llama-3.2-1B-Indian-history via Optimum.

Files

  • openvino_model.xml
  • openvino_model.bin
  • openvino_config.json
  • tokenizer files

Run Server (make sure the native OVMS binary is installed; see the OVMS docs):

 - Pull model:

   ovms --pull --source_model wizardoftrap/Llama-3.2-1B-Indian-history-openvino --model_repository_path /OpenVinoSP/models --model_name Llama-3.2-1B-Indian-history-openvino --target_device GPU --task text_generation

 - Start server:

   ovms --model_path models/wizardoftrap/Llama-3.2-1B-Indian-history-openvino --model_name Llama-3.2-1B-Indian-history-openvino --port 9000 --rest_port 8000 --log_level DEBUG

Usage

from openai import OpenAI

client = OpenAI(
    api_key="not-needed",  # OVMS does not require an API key
    base_url="http://localhost:8000/v3"  # REST endpoint on port 8000
)

MODEL_NAME = "Llama-3.2-1B-Indian-history-openvino"

# Placeholder system prompt; replace with whatever instructions you want.
SYSTEM_PROMPT = "You are a helpful assistant specializing in Indian history."

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Hey, tell me about Jallianwala Bagh"}
    ],
    temperature=0.7,
    max_tokens=512,
    top_p=0.9
)
print(response.choices[0].message.content)
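The OpenAI client above is just a convenience wrapper; the same call can be made as a raw POST to the server's /v3/chat/completions endpoint. A minimal sketch using only the standard library (the system prompt is a placeholder, and the server from the previous section is assumed to be running on port 8000):

```python
import json
import urllib.request

SYSTEM_PROMPT = "You are a helpful assistant specializing in Indian history."  # placeholder

# Same request body the OpenAI client sends under the hood.
payload = {
    "model": "Llama-3.2-1B-Indian-history-openvino",
    "messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Hey, tell me about Jallianwala Bagh"},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "top_p": 0.9,
}

body = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8000/v3/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment once the OVMS server is up:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

This is handy for environments where installing the openai package is not an option.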