# llama2-7b-owq-int4-fp16

This repository contains a quantized model artifact produced as part of a graduation project.

## Model Details
- Technique: OWQ
- Quantization: Mixed INT4/FP16
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Export date: 2026-03-24
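Conceptually, OWQ (Outlier-aware Weight Quantization) keeps a small number of sensitivity-critical "outlier" weight columns in FP16 and quantizes the remaining columns to INT4, which is where the mixed INT4/FP16 layout above comes from. The following NumPy sketch illustrates the idea only; the real OWQ method scores columns with a Hessian-based sensitivity metric, whereas this sketch uses a simple column-norm proxy, and all function names are illustrative:

```python
import numpy as np

def owq_quantize(W, n_outlier_cols=2, bits=4):
    """Split W into FP16 outlier columns and symmetric INT4 columns.

    Column selection uses the column norm as a stand-in for OWQ's
    Hessian-based sensitivity score (a simplification).
    """
    col_scores = np.linalg.norm(W, axis=0)
    outlier_idx = np.argsort(col_scores)[-n_outlier_cols:]
    mask = np.zeros(W.shape[1], dtype=bool)
    mask[outlier_idx] = True

    qmax = 2 ** (bits - 1) - 1          # 7 for symmetric INT4
    W_low = W[:, ~mask]
    scale = np.abs(W_low).max(axis=0) / qmax
    scale[scale == 0] = 1.0             # avoid division by zero
    q = np.clip(np.round(W_low / scale), -qmax - 1, qmax).astype(np.int8)

    return q, scale, W[:, mask].astype(np.float16), mask

def dequantize(q, scale, fp16_cols, mask):
    """Reassemble an FP32 approximation of the original matrix."""
    W = np.empty((q.shape[0], mask.size), dtype=np.float32)
    W[:, ~mask] = q.astype(np.float32) * scale
    W[:, mask] = fp16_cols.astype(np.float32)
    return W
```

The outlier columns round-trip at FP16 precision, while the INT4 columns incur at most half a quantization step of error per weight; that asymmetry is the core trade-off behind the perplexity numbers reported below.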
## Benchmark Summary
| Metric | Original | Quantized |
|---|---|---|
| Disk size (GB) | 5.98 | 2.41 |
| Avg inference time (s) | N/A | N/A |
| Tokens/sec | N/A | N/A |
| GPU memory (GB) | N/A | N/A |
| Perplexity | 4.3407 | 4.5979 |
## Comparison Highlights
- Speedup: 1.03x
- Memory reduction: 0.00%
- Disk/model size reduction: 59.78%
## Benchmark Notes

- The numbers above are copied from the local benchmark_results JSON in this project.
## Local Source

- Quantized folder: `Advanced-Techniques/OWQ/quantized/llama2-7b-owq`
- Benchmark JSON: `Advanced-Techniques/OWQ/benchmark_results/owq_benchmark_results.json`
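Since the card is generated from that benchmark JSON, a small script can recompute the derived figures from it. This is a sketch: the JSON key names are assumptions, as the actual schema is not shown here, so check the file before relying on them:

```python
import json

def size_reduction_pct(original_gb: float, quantized_gb: float) -> float:
    """Percent disk-size reduction; with the table's 5.98 -> 2.41 GB this
    gives ~59.7%, close to the 59.78% above (which was presumably
    computed from exact byte counts rather than rounded GB)."""
    return (1 - quantized_gb / original_gb) * 100

def load_results(path: str) -> dict:
    # Schema is an assumption; inspect the real JSON for its key names.
    with open(path) as f:
        return json.load(f)

if __name__ == "__main__":
    results = load_results(
        "Advanced-Techniques/OWQ/benchmark_results/owq_benchmark_results.json"
    )
    print(f"{size_reduction_pct(5.98, 2.41):.2f}% smaller on disk")
```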
## Usage

Load the model with a library and runtime that support the OWQ mixed INT4/FP16 format used in this repository.
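As a starting point, a hypothetical loading sketch is shown below. It assumes the quantized weights were exported in a transformers-compatible format and that the repo id matches this card's title; the prompt template is also an assumption — adapt all of this to the OWQ runtime actually used in the project:

```python
# Hypothetical usage sketch; assumes a transformers-loadable export.

def build_prompt(user_message: str) -> str:
    """Wrap a user message in a simple instruct-style template
    (assumption: the exported model accepts plain-text prompts)."""
    return f"### Instruction:\n{user_message}\n\n### Response:\n"

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Repo id assumed from this card's title; adjust if it differs.
    model_id = "emreyigitozturk/llama2-7b-owq-int4-fp16"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(
        build_prompt("Summarize OWQ in one sentence."),
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

If the base model's chat template survived export, `tokenizer.apply_chat_template` is likely preferable to the hand-written template above.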
## Limitations
- This model card is auto-generated from project files.
- You should validate quality, safety, and license compatibility before public release.