📑 Llama-3-8B Invoice Extractor (Merged)

  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

1. Model Description & Intended Use

This model is a fine-tuned version of Meta's Llama-3-8B, specifically optimized for Structured Information Extraction. It is designed to act as a "Parser Agent" that transforms messy, unstructured financial text (such as invoice descriptions, receipt notes, and purchase logs) into machine-readable, valid JSON objects.

Primary Intended Use:

  • Automated accounting and bookkeeping workflows.
  • Extracting data from OCR-processed receipts.
  • Building serverless financial data pipelines.

Target Schema:

{
  "item": "string",
  "quantity": "integer",
  "date": "string",
  "vendor": "string",
  "total": "float",
  "currency": "string"
}
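A downstream consumer can check model output against this schema programmatically. The sketch below is illustrative (the function name and the int-accepted-as-float convention are assumptions, not part of the model):

```python
import json

# Expected field -> Python type, mirroring the target schema above.
SCHEMA = {
    "item": str,
    "quantity": int,
    "date": str,
    "vendor": str,
    "total": float,
    "currency": str,
}

def matches_schema(raw: str) -> bool:
    """Return True if `raw` parses as JSON and every field has the expected type."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if set(record) != set(SCHEMA):
        return False
    # Accept ints where floats are expected (a total of 900 is as valid as 900.0).
    return all(
        isinstance(record[key], (int, float)) if expected is float
        else isinstance(record[key], expected)
        for key, expected in SCHEMA.items()
    )
```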

2. Training Data Information

The model was trained on the manuelaschrittwieser/invoice-extraction-dataset-v2, which consists of 601 high-quality examples.

  • Methodology: The dataset uses a synthetic generation pipeline with 7 distinct sentence templates to ensure robustness against different linguistic structures.
  • Diversity: Includes 20+ vendors, 20+ item categories, and 5 currency types (USD, EUR, GBP, CAD, JPY).
  • Format: The data was prepared in an instruction-input-output format to reinforce strict adherence to the JSON schema.
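The instruction-input-output layout can be rendered as a single training string. The exact template below is an assumption; it mirrors the `### Instruction` / `### Input` / `### Response` blocks used in the inference example in Section 6:

```python
def format_example(instruction: str, input_text: str, output_json: str) -> str:
    """Render one training example in an Alpaca-style instruction layout.

    The template is illustrative; the actual dataset formatting may differ.
    """
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_text}\n\n"
        f"### Response:\n{output_json}"
    )
```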

3. Training Procedure & Hyperparameters

The training was conducted using Parameter-Efficient Fine-Tuning (PEFT) with the QLoRA method via the Unsloth library, which significantly reduced VRAM usage while maintaining performance.

Hyperparameters:

  • Training Duration: 120 global steps (step-based rather than epoch-based)
  • Learning Rate: 2e-4
  • Batch Size: 2 (with Gradient Accumulation Steps = 4)
  • Optimizer: AdamW (8-bit)
  • Learning Rate Scheduler: Linear
  • LoRA Rank (r): 16
  • LoRA Alpha: 16
  • Precision: 4-bit Quantization (merged into 16-bit for this standalone version)
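With these settings, the effective batch size and approximate data coverage work out as follows (a quick sanity check derived from the hyperparameters above and the dataset size from Section 2):

```python
batch_size = 2      # per-device batch size
grad_accum = 4      # gradient accumulation steps
steps = 120         # global optimizer steps
dataset_size = 601  # examples in invoice-extraction-dataset-v2

effective_batch = batch_size * grad_accum  # examples per optimizer step
examples_seen = effective_batch * steps    # total examples processed
epochs = examples_seen / dataset_size      # approximate passes over the data

print(effective_batch, examples_seen, round(epochs, 2))  # → 8 960 1.6
```

So 120 steps corresponds to roughly 1.6 passes over the 601-example dataset.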

4. Evaluation Results

Baseline (Llama-3-8B) vs. Fine-Tuned

| Feature               | Baseline Model                    | Fine-Tuned Model (v2)             |
|-----------------------|-----------------------------------|-----------------------------------|
| Output Format         | Conversational / explanatory      | Strict JSON only                  |
| JSON Validity         | Often fails (includes extra text) | 100% valid JSON in test runs      |
| Entity Recognition    | High accuracy, but low precision  | High precision (mapped to schema) |
| Instruction Following | Moderate                          | High (stays within Response block) |

Tracking: Training performance was monitored via Weights & Biases (W&B), showing a consistent reduction in loss over 120 steps without signs of overfitting.

5. Limitations & Known Issues

  • Language: Optimized primarily for English. Performance on other languages is not guaranteed.
  • Hallucination: While the model is highly structured, it can occasionally misinterpret dates if they are provided in ambiguous formats (e.g., 01/02/03).
  • Context Length: Best performance is achieved with input lengths under 512 tokens.
  • Verification: Users should implement a JSON validation layer in production to handle the rare cases of malformed output.
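One way to implement the recommended validation layer is to extract the first braced span from the raw generation before parsing; the function name and retry-on-None convention below are illustrative, not part of the model:

```python
import json
import re

def extract_invoice_json(model_output: str):
    """Pull the first JSON object out of raw model output.

    A defensive layer for the rare malformed cases: even if stray text
    surrounds the object, the braced span is parsed and returned.
    The non-greedy match is sufficient because the target schema is flat
    (no nested objects). Returns None so the caller can retry or flag
    the input.
    """
    match = re.search(r"\{.*?\}", model_output, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```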

6. Code Example: Loading and Usage

Since this is a merged model, it can be used with standard transformers or vLLM.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "manuelaschrittwieser/llama-3-invoice-extractor-merged"

# Load Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    device_map="auto", 
    torch_dtype=torch.float16
)

# Inference Example
prompt = """### Instruction:
Extract invoice details into JSON.

### Input:
Bought 2 monitors at Dell for 900 USD on Jan 15 2024.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:\n")[-1])

Developed by: Manuela Schrittwieser, Project: Structured Data Extractor


Evaluation results

  • Final train loss on invoice-extraction-dataset-v2 (self-reported): 0.256