📑 Llama-3-8B Invoice Extractor (Merged)

  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

1. Model Description & Intended Use

This model is a fine-tuned version of Meta's Llama-3-8B, specifically optimized for Structured Information Extraction. It is designed to act as a "Parser Agent" that transforms messy, unstructured financial text (such as invoice descriptions, receipt notes, and purchase logs) into machine-readable, valid JSON objects.

Primary Intended Use:

  • Automated accounting and bookkeeping workflows.
  • Extracting data from OCR-processed receipts.
  • Building serverless financial data pipelines.

Target Schema:

{
  "item": "string",
  "quantity": "integer",
  "date": "string",
  "vendor": "string",
  "total": "float",
  "currency": "string"
}
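A downstream consumer can check model output against this schema programmatically. The sketch below is illustrative (the function name and the int-accepted-as-float convention are assumptions, not part of the model):

```python
import json

# Expected field -> Python type, mirroring the target schema above.
SCHEMA = {
    "item": str,
    "quantity": int,
    "date": str,
    "vendor": str,
    "total": float,
    "currency": str,
}

def matches_schema(raw: str) -> bool:
    """Return True if `raw` parses as JSON and every field has the expected type."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if set(record) != set(SCHEMA):
        return False
    # Accept ints where floats are expected (a total of 900 is as valid as 900.0).
    return all(
        isinstance(record[key], (int, float)) if expected is float
        else isinstance(record[key], expected)
        for key, expected in SCHEMA.items()
    )
```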

2. Training Data Information

The model was trained on the manuelaschrittwieser/invoice-extraction-dataset-v2, which consists of 601 high-quality examples.

  • Methodology: The dataset uses a synthetic generation pipeline with 7 distinct sentence templates to ensure robustness against different linguistic structures.
  • Diversity: Includes 20+ vendors, 20+ item categories, and 5 currency types (USD, EUR, GBP, CAD, JPY).
  • Format: The data was prepared in an instruction-input-output format to reinforce strict adherence to the JSON schema.
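The instruction-input-output layout can be rendered as a single training string. The exact template below is an assumption; it mirrors the `### Instruction` / `### Input` / `### Response` blocks used in the inference example in Section 6:

```python
def format_example(instruction: str, input_text: str, output_json: str) -> str:
    """Render one training example in an Alpaca-style instruction layout.

    The template is illustrative; the actual dataset formatting may differ.
    """
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{input_text}\n\n"
        f"### Response:\n{output_json}"
    )
```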

3. Training Procedure & Hyperparameters

The training was conducted using Parameter-Efficient Fine-Tuning (PEFT) with the QLoRA method via the Unsloth library, which significantly reduced VRAM usage while maintaining performance.

Hyperparameters:

  • Training Duration: 120 global steps (step-based rather than epoch-based)
  • Learning Rate: 2e-4
  • Batch Size: 2 (with Gradient Accumulation Steps = 4)
  • Optimizer: AdamW (8-bit)
  • Learning Rate Scheduler: Linear
  • LoRA Rank (r): 16
  • LoRA Alpha: 16
  • Precision: 4-bit Quantization (merged into 16-bit for this standalone version)
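With these settings, the effective batch size and approximate data coverage work out as follows (a quick sanity check derived from the hyperparameters above and the dataset size from Section 2):

```python
batch_size = 2      # per-device batch size
grad_accum = 4      # gradient accumulation steps
steps = 120         # global optimizer steps
dataset_size = 601  # examples in invoice-extraction-dataset-v2

effective_batch = batch_size * grad_accum  # examples per optimizer step
examples_seen = effective_batch * steps    # total examples processed
epochs = examples_seen / dataset_size      # approximate passes over the data

print(effective_batch, examples_seen, round(epochs, 2))  # → 8 960 1.6
```

So 120 steps corresponds to roughly 1.6 passes over the 601-example dataset.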

4. Evaluation Results

Baseline (Llama-3-8B) vs. Fine-Tuned

| Feature               | Baseline Model                    | Fine-Tuned Model (v2)             |
|-----------------------|-----------------------------------|-----------------------------------|
| Output Format         | Conversational / explanatory      | Strict JSON only                  |
| JSON Validity         | Often fails (includes extra text) | 100% valid JSON in test runs      |
| Entity Recognition    | High accuracy, but low precision  | High precision (mapped to schema) |
| Instruction Following | Moderate                          | High (stays within Response block) |

Tracking: Training performance was monitored via Weights & Biases (W&B), showing a consistent reduction in loss over 120 steps without signs of overfitting.

5. Limitations & Known Issues

  • Language: Optimized primarily for English. Performance on other languages is not guaranteed.
  • Hallucination: While the model is highly structured, it can occasionally misinterpret dates if they are provided in ambiguous formats (e.g., 01/02/03).
  • Context Length: Best performance is achieved with input lengths under 512 tokens.
  • Verification: Users should implement a JSON validation layer in production to handle the rare cases of malformed output.
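One way to implement the recommended validation layer is to extract the first braced span from the raw generation before parsing; the function name and retry-on-None convention below are illustrative, not part of the model:

```python
import json
import re

def extract_invoice_json(model_output: str):
    """Pull the first JSON object out of raw model output.

    A defensive layer for the rare malformed cases: even if stray text
    surrounds the object, the braced span is parsed and returned.
    The non-greedy match is sufficient because the target schema is flat
    (no nested objects). Returns None so the caller can retry or flag
    the input.
    """
    match = re.search(r"\{.*?\}", model_output, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```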

6. Code Example: Loading and Usage

Since this is a merged model, it can be used with standard transformers or vLLM.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "manuelaschrittwieser/llama-3-invoice-extractor-merged"

# Load Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, 
    device_map="auto", 
    torch_dtype=torch.float16
)

# Inference Example
prompt = """### Instruction:
Extract invoice details into JSON.

### Input:
Bought 2 monitors at Dell for 900 USD on Jan 15 2024.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:\n")[-1])

Developed by: Manuela Schrittwieser, Project: Structured Data Extractor


Evaluation results

  • Final train loss on invoice-extraction-dataset-v2 (self-reported): 0.256