Llama-3.1-8B-Instruct-OpenVINO-INT8 (Platinum Series)


This repository contains the Platinum Series high-precision OpenVINO INT8 release of Llama-3.1-8B-Instruct. This collection represents a major step forward for local AI, featuring a 128k context window and state-of-the-art reasoning capabilities. It is engineered for users who require zero-compromise reasoning accuracy while taking advantage of Intel's optimized inference acceleration.

πŸ“¦ Available Files & Quantization Details

| File Name | Quantization | Accuracy | Recommended For |
|---|---|---|---|
| openvino_model.bin | INT8_ASYM | 100% Fidelity | Enterprise Finance / Legal |
| Compression Mode | Per-Channel | Maximum Range | Accuracy-Critical Tasks |
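To make the INT8_ASYM / per-channel terminology above concrete, here is an illustrative NumPy sketch of asymmetric per-channel 8-bit weight quantization. This is a conceptual model of the scheme, not NNCF's actual implementation; the function names and the toy weight matrix are our own.

```python
import numpy as np

def quantize_int8_asym_per_channel(w: np.ndarray):
    """Asymmetric per-channel INT8 quantization (illustrative sketch).

    Each output channel (row) gets its own scale and zero point, so the
    full [min, max] range of that channel maps onto the [0, 255] grid.
    """
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 255.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant rows
    zero_point = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the uint8 codes back to approximate float weights."""
    return (q.astype(np.float32) - zero_point) * scale

# Toy weight matrix: each row (channel) has a very different value range,
# which is exactly the case per-channel scaling handles well.
w = np.array([[0.5, -1.0, 2.0],
              [10.0, -10.0, 0.0]], dtype=np.float32)
q, s, z = quantize_int8_asym_per_channel(w)
w_hat = dequantize(q, s, z)
print(np.abs(w - w_hat).max())  # maximum reconstruction error
```

Because each channel's scale matches its own range, the wide-range second row does not force coarse quantization steps onto the narrow-range first row, which is the accuracy argument for per-channel compression.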

🐍 Python Inference (OpenVINO GenAI)

To run this engine using Python:

```python
from openvino_genai import LLMPipeline

# Load the high-fidelity Platinum INT8 engine
pipe = LLMPipeline("Llama-3.1-8B-Instruct-OpenVINO-INT8", "CPU")

# Process a complex MCA notification
print(pipe.generate("List compliance requirements for Section 203 of the Companies Act.", max_new_tokens=1024))
```

πŸ’» C# / .NET Users (OpenVINO.GenAI.CSharp)

This collection is fully compatible with .NET applications via the OpenVINO.GenAI C# API, ideal for integrating into corporate tools.

```csharp
using OpenVino.GenAI;

// Load the high-fidelity Platinum INT8 engine
var pipeline = new LLMPipeline("Llama-3.1-8B-Instruct-OpenVINO-INT8", "CPU");

// Process complex MCA notifications
var result = pipeline.Generate("Summarize the 2026 MCA updates for listed companies.", max_new_tokens: 1024);
Console.WriteLine(result);
```

πŸ—οΈ Technical Forge

  • Optimization Tool: optimum-cli / NNCF (2026-03-29)
  • Bitwidth Distribution: 100% INT8_ASYM (Full Weight Compression)
  • Workstation: Dual-GPU (NVIDIA RTX 3090 24GB + RTX A4000 16GB)
  • Infrastructure: S: NVMe Scratch / K: 12TB Warehouse
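For users who want to reproduce a similar conversion themselves, the export can be driven through `optimum-cli` as noted above. The following is a hedged sketch: the exact flags can vary across optimum-intel releases, and the source model ID and output directory are assumptions, not the command used to build this release.

```shell
# Export Llama-3.1-8B-Instruct to OpenVINO IR with INT8 weight compression
# (requires optimum-intel; flags may differ between versions)
optimum-cli export openvino \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --weight-format int8 \
  Llama-3.1-8B-Instruct-OpenVINO-INT8
```

The command downloads the source weights, so it needs Hugging Face access to the gated Llama repository and substantial disk space.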

β˜• Support the Forge

Maintaining the production line for high-fidelity models requires significant hardware resources. If these tools power your research or industrial projects, please consider supporting the development:

| Platform | Support Link |
|---|---|
| Global & India | Support via Razorpay |

Scan to support via UPI (India Only):


Connect with the architect: Abhishek Jaiswal on LinkedIn
