Qwen3-4B-Product-Extractor - GGUF Q4_K_M

This is a GGUF quantized version of the fine-tuned Qwen3-4B model for product data extraction, optimized for CPU inference.

Model Details

  • Base Model: unsloth/Qwen3-4B-Base
  • Fine-tuning: GRPO (Group Relative Policy Optimization)
  • Quantization: Q4_K_M
  • Estimated Size: ~2.5GB
  • Optimization: CPU inference, memory efficient

Performance

  • Speed: roughly 3x faster than full-precision (FP16) inference on CPU
  • Memory: roughly 4x smaller footprint (~2.5 GB vs. ~8 GB at FP16)
  • Quality: minor degradation versus full precision; typically negligible for structured extraction tasks

Usage with llama.cpp

# Download model
huggingface-cli download pragnesh002/Qwen3-4B-Product-Extractor-GGUF-Q4-K-M --local-dir ./model

# Run inference (the binary is named ./main on older llama.cpp builds)
./llama-cli -m ./model/*.gguf -p "Your prompt here"

Usage with Transformers (GGUF support)

Transformers can load GGUF checkpoints directly by passing the gguf_file argument. Note that the weights are dequantized to full precision on load, so this path is more convenient than it is memory-efficient:

from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "pragnesh002/Qwen3-4B-Product-Extractor-GGUF-Q4-K-M"
gguf_file = "model.gguf"  # replace with the actual .gguf filename in the repo

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    gguf_file=gguf_file,
    device_map="cpu",
)

Recommended Use Cases

This repository ships the Q4_K_M variant; the common GGUF quantization levels compare as follows:

  • Q4_K_M: best for deployment under size constraints (this repository)
  • Q5_K_M: balanced quality and size
  • Q8_0: high-quality applications
  • F16: maximum quality, research use
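As a rough sanity check on the sizes these levels imply, the on-disk footprint can be estimated from the parameter count and the effective bits per weight of each llama.cpp quantization. The bits-per-weight figures below are approximate averages, not exact values, and real file sizes vary by architecture:

```python
# Approximate effective bits/weight for common llama.cpp quantizations.
# These are rough averages; the exact GGUF file size varies by model.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5, "F16": 16.0}

def estimated_size_gb(n_params: float, quant: str) -> float:
    """Estimated file size in GB: params * bits-per-weight / 8 bits-per-byte."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{estimated_size_gb(4e9, quant):.1f} GB")
```

For 4B parameters this gives roughly 2.4 GB at Q4_K_M, consistent with the ~2.5 GB figure quoted above.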

Product Data Extraction

This model excels at extracting structured data from product catalogs:

prompt = '''Extract product data from:
Item: GR-AA10
Description: Wall Art
Manufacturer: Harper & Wilde
Output JSON:'''

# Expected output: structured JSON with product information
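Since the prompt asks the model to emit JSON after the "Output JSON:" marker, a small post-processing helper can pull the first JSON object out of the raw completion. This is a minimal sketch: the assumption that the completion contains a single well-formed `{...}` object comes from the prompt above, not from any guarantee the model provides.

```python
import json
import re

def extract_json(completion: str) -> dict:
    """Return the first parseable top-level JSON object in a model completion.

    Assumes the completion contains a {...} object, as requested by the
    "Output JSON:" prompt; raises ValueError if none parses.
    """
    for match in re.finditer(r"\{", completion):
        start = match.start()
        depth = 0
        # Walk forward, tracking brace depth, to find the matching close brace.
        for i in range(start, len(completion)):
            if completion[i] == "{":
                depth += 1
            elif completion[i] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(completion[start : i + 1])
                    except json.JSONDecodeError:
                        break  # not valid JSON; try the next '{'
    raise ValueError("no JSON object found in completion")

raw = 'Here is the data:\n{"item": "GR-AA10", "description": "Wall Art", "manufacturer": "Harper & Wilde"}'
print(extract_json(raw)["item"])  # → GR-AA10
```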