Jina v5 H256 — LoRA Classification Adapters

Task-specific LoRA adapters for classification, built on top of the compressed gomyk/jina-v5-h256-distilled-conv embedding model.

Each MTEB Classification task has its own lightweight LoRA adapter (~405 KB) plus a classification head, achieving a +12.0%p average improvement over the frozen baseline embeddings.

LoRA Specification

| Property | Value |
|---|---|
| Base model | gomyk/jina-v5-h256-distilled-conv |
| Base architecture | EuroBERT (6L / 256d / 41,778 vocab) |
| Base parameters | 16,989,952 (all frozen) |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA scaling (alpha/rank) | 2.0 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| LoRA layers | 6 layers × 4 projections = 24 LoRA matrices |
| LoRA params per task | 98,304 (A + B matrices) |
| LoRA A shape | [256, 8] per projection |
| LoRA B shape | [8, 256] per projection |
| LoRA A init | Kaiming (He) normal |
| LoRA B init | Zeros (initial ΔW = 0) |
| Adapter file size | ~405 KB per task |
| Training overhead | 0.6% of base model params |

How LoRA Works in This Model

For each attention projection (q/k/v/o) in each of the 6 layers:

  Original:  y = W_frozen @ x              W: [256, 256]
  With LoRA: y = W_frozen @ x + (α/r) * (x @ A) @ B

  Where:
    A: [256, 8]   — down-projection (trainable)
    B: [8, 256]   — up-projection (trainable)
    α/r = 16/8 = 2.0  — scaling factor

  Total per projection: 256×8 + 8×256 = 4,096 params
  Total per layer:      4 projections × 4,096 = 16,384 params
  Total (6 layers):     6 × 16,384 = 98,304 LoRA params
  + Classification head: Linear(256 → num_classes)
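The per-projection update above can be sketched as a small PyTorch wrapper module (an illustration of the arithmetic, not the exact training code; the name `LoRALinear` is ours):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update (sketch)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        in_f, out_f = base.in_features, base.out_features
        self.A = nn.Parameter(torch.empty(in_f, r))   # down-projection [256, 8]
        self.B = nn.Parameter(torch.zeros(r, out_f))  # up-projection [8, 256], zero init
        nn.init.kaiming_normal_(self.A)               # Kaiming (He) init for A
        self.scaling = alpha / r                      # 16 / 8 = 2.0

    def forward(self, x):
        # y = W_frozen(x) + (alpha/r) * (x @ A) @ B
        return self.base(x) + self.scaling * (x @ self.A) @ self.B

layer = LoRALinear(nn.Linear(256, 256, bias=False))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 4,096 per projection; 24 projections -> 98,304
```

Because B is zero-initialized, the adapted layer starts out identical to the frozen one (ΔW = 0), so training begins from the base model's behavior.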

MTEB Classification Results

Average: 73.20% → 85.19% (+12.00%p)

| Task | Classes | Baseline | LoRA | Delta | Epochs |
|---|---:|---:|---:|---:|---:|
| AmazonCounterfactualClassification | 2 | 76.93% | 92.13% | +15.20%p | 5 |
| Banking77Classification | 77 | 77.83% | 86.26% | +8.42%p | 10 |
| ImdbClassification | 2 | 73.03% | 82.14% | +9.11%p | 5 |
| MTOPDomainClassification | 11 | 90.63% | 97.68% | +7.05%p | 10 |
| MassiveIntentClassification | 60 | 67.90% | 77.89% | +9.99%p | 10 |
| MassiveScenarioClassification | 18 | 72.97% | 86.30% | +13.34%p | 8 |
| ToxicConversationsClassification | 2 | 61.83% | 85.74% | +23.92%p | 4 |
| TweetSentimentExtractionClassification | 3 | 64.46% | 73.42% | +8.95%p | 4 |
| **Average** | | 73.20% | 85.19% | +12.00%p | |

Baseline: MTEB default evaluation (logistic regression on frozen embeddings).
LoRA: MTEB evaluation with task-specific LoRA adapters applied to the embedding model.
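For reference, the baseline protocol is a logistic-regression probe on frozen embeddings. The torch stand-in below mirrors that protocol; the random features are placeholders for mean-pooled embeddings from the frozen base model:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Placeholders: in practice these would be 256-d frozen embeddings per text
train_x, train_y = torch.randn(64, 256), torch.randint(0, 2, (64,))
test_x, test_y = torch.randn(16, 256), torch.randint(0, 2, (16,))

probe = nn.Linear(256, 2)  # multinomial logistic-regression head
opt = torch.optim.LBFGS(probe.parameters(), max_iter=100)

def closure():
    opt.zero_grad()
    loss = nn.functional.cross_entropy(probe(train_x), train_y)
    loss.backward()
    return loss

opt.step(closure)
acc = (probe(test_x).argmax(-1) == test_y).float().mean().item()
```

MTEB itself uses scikit-learn's `LogisticRegression` for this probe; the point is only that the base model's weights never change in the baseline setting.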

Per-Task Adapter Details

| Task | Classes | Trainable Params | Adapter Size | Head Size |
|---|---:|---:|---:|---:|
| AmazonCounterfactualClassification | 2 | 98,818 | 404.6 KB | 4.0 KB |
| Banking77Classification | 77 | 118,093 | 404.6 KB | 79.2 KB |
| ImdbClassification | 2 | 98,818 | 404.6 KB | 4.0 KB |
| MTOPDomainClassification | 11 | 101,131 | 404.6 KB | 13.0 KB |
| MassiveIntentClassification | 60 | 113,724 | 404.6 KB | 62.2 KB |
| MassiveScenarioClassification | 18 | 102,930 | 404.6 KB | 20.0 KB |
| ToxicConversationsClassification | 2 | 98,818 | 404.6 KB | 4.0 KB |
| TweetSentimentExtractionClassification | 3 | 99,075 | 404.6 KB | 5.0 KB |
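The trainable-parameter column follows directly from the shared LoRA count plus the head, Linear(256 → num_classes) with bias:

```python
def trainable_params(num_classes: int, lora_params: int = 98_304, hidden: int = 256) -> int:
    # Head contributes hidden * num_classes weights plus num_classes biases
    return lora_params + hidden * num_classes + num_classes

print(trainable_params(77))  # Banking77Classification -> 118,093
print(trainable_params(2))   # binary tasks -> 98,818
```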

Training Configuration

| Parameter | Value |
|---|---|
| Optimizer | AdamW (lr=2e-4, weight_decay=0.01) |
| Scheduler | CosineAnnealingLR |
| Max epochs | 10 |
| Batch size | 32 |
| Max sequence length | 128 |
| Early stopping | patience=3 (on test accuracy) |
| Gradient clipping | max_norm=1.0 |
| Loss function | Cross-Entropy |
| Dropout (head) | 0.1 |
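A hedged sketch of the training loop these settings imply; the toy model and random batch below are placeholders for the LoRA-wrapped encoder and a real dataloader:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Dropout(0.1), nn.Linear(256, 3))  # stand-in for LoRA model + head
params = [p for p in model.parameters() if p.requires_grad]  # only LoRA + head train
optimizer = torch.optim.AdamW(params, lr=2e-4, weight_decay=0.01)
max_epochs = 10
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=max_epochs)
criterion = nn.CrossEntropyLoss()

features = torch.randn(32, 256)        # one batch of pooled embeddings (batch size 32)
labels = torch.randint(0, 3, (32,))

for epoch in range(max_epochs):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(features), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)  # gradient clipping
    optimizer.step()
    scheduler.step()  # cosine annealing per epoch
```

Early stopping on test accuracy (patience 3) would wrap this loop but is omitted here for brevity.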

Repository Structure

```
.
├── README.md
├── classification_results.json          # Summary of all results
├── AmazonCounterfactualClassification/
│   ├── lora_adapter.pt                  # LoRA A/B matrices
│   ├── classifier_head.pt               # Linear classification head
│   └── meta.json                        # Training metadata
├── Banking77Classification/
│   ├── ...
├── ImdbClassification/
│   ├── ...
├── MTOPDomainClassification/
│   ├── ...
├── MassiveIntentClassification/
│   ├── ...
├── MassiveScenarioClassification/
│   ├── ...
├── ToxicConversationsClassification/
│   ├── ...
└── TweetSentimentExtractionClassification/
    ├── ...
```

Usage

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer

# 1. Load the base model and tokenizer
base_model = AutoModel.from_pretrained("gomyk/jina-v5-h256-distilled-conv", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("gomyk/jina-v5-h256-distilled-conv", trust_remote_code=True)

# 2. Download the LoRA adapter and head for the desired task
task = "Banking77Classification"  # pick the task you want
repo_id = "gomyk/jina-v5-h256-lora-classification"

lora_path = hf_hub_download(repo_id, f"{task}/lora_adapter.pt")
head_path = hf_hub_download(repo_id, f"{task}/classifier_head.pt")
meta_path = hf_hub_download(repo_id, f"{task}/meta.json")

with open(meta_path) as f:
    meta = json.load(f)

# 3. Merge the LoRA weights into the frozen attention projections
lora_state = torch.load(lora_path, map_location="cpu", weights_only=True)
idx = 0
for name, module in base_model.named_modules():
    for target in ["q_proj", "k_proj", "v_proj", "o_proj"]:
        child = getattr(module, target, None)
        if child is not None and isinstance(child, nn.Linear):
            A = lora_state[f"lora_{idx}_A"]
            B = lora_state[f"lora_{idx}_B"]
            scaling = lora_state[f"lora_{idx}_scaling"].item()
            # Merge in place: W_new = W + scaling * (A @ B)^T
            child.weight.data += (scaling * (A @ B)).T
            idx += 1

# 4. Load the classification head
num_classes = meta["num_classes"]
classifier = nn.Sequential(nn.Dropout(0.1), nn.Linear(256, num_classes))
classifier.load_state_dict(torch.load(head_path, map_location="cpu", weights_only=True))

# 5. Inference (mean pooling over the attention mask)
text = "I need to transfer money to another account"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
base_model.eval()
classifier.eval()
with torch.no_grad():
    outputs = base_model(**inputs)
    hidden = outputs.last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
    logits = classifier(pooled)
    pred = logits.argmax(dim=-1).item()
    print(f"Predicted class: {pred}")
```

License

Apache 2.0
