This model fine-tunes ModernBERT-base with LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning method.
It is designed for binary classification tasks where high recall and controlled false positive rates are important.
Training Configuration
- Seed: 42 (ensures reproducibility)
- Batch sizes: Train = 128, Eval = 256
- Max sequence length: 256
- Epochs: 1 (baseline run)
- Learning rate: 3e-4
- Weight decay: 0.01
- Warmup ratio: 0.05
- Gradient clipping: 1.0
- Early stopping patience: 3
- Total training steps: 5,241
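The optimizer, batching, and schedule settings above map directly onto Hugging Face `TrainingArguments`. A minimal sketch (the output directory is a placeholder, and this is an illustrative mapping rather than the actual training script; note that the max sequence length of 256 is applied at tokenization time, not here):

```python
from transformers import TrainingArguments

# Illustrative mapping of the configuration listed above; "out" is a placeholder path.
training_args = TrainingArguments(
    output_dir="out",
    seed=42,                          # reproducibility
    per_device_train_batch_size=128,
    per_device_eval_batch_size=256,
    num_train_epochs=1,               # baseline run
    learning_rate=3e-4,
    weight_decay=0.01,
    warmup_ratio=0.05,
    max_grad_norm=1.0,                # gradient clipping
)
```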
LoRA Setup
- Enabled: Yes
- Rank (r): 8
- Alpha: 16
- Dropout: 0.05
- Target modules: Attention (Wqkv, Wo) and MLP (Wi, Wo) layers
- Max drift ratio: 0.1
LoRA adapters allow efficient fine‑tuning by updating only small low‑rank matrices, reducing memory and compute requirements.
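With `peft`, the setup above corresponds roughly to the following `LoraConfig`. This is a sketch: the exact target-module names depend on how ModernBERT registers its attention and MLP projections, and `task_type` is an assumption based on the binary classification objective.

```python
from peft import LoraConfig

# Sketch of the LoRA setup above; "Wo" matches both the attention and MLP
# output projections in ModernBERT, so one entry covers both.
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor (alpha / r = 2.0)
    lora_dropout=0.05,
    target_modules=["Wqkv", "Wo", "Wi"],
    task_type="SEQ_CLS",                   # assumption: sequence classification head
)
# Wrap the base model with: get_peft_model(base_model, lora_config)
```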
Loss Function
Training uses Asymmetric Focal Loss, which emphasizes hard negatives while keeping positive weighting mild. This helps balance recall and false positive rate.
- Gamma_pos: 0.0 (minimal emphasis on positives)
- Gamma_neg: 4.0 (stronger emphasis on negatives)
- Clip: 0.05 (probability shift applied to negatives for stability)
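The loss can be sketched in NumPy. This is a simplified re-implementation for illustration, not the training code: the clip value shifts negative-class probabilities down before the focusing term is applied, so easy negatives contribute exactly zero loss.

```python
import numpy as np

def asymmetric_focal_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Simplified asymmetric focal loss for binary labels (illustrative sketch)."""
    p = 1.0 / (1.0 + np.exp(-logits))  # predicted probability of the positive class
    # Positive term: with gamma_pos = 0 this reduces to plain log-loss.
    loss_pos = targets * (1.0 - p) ** gamma_pos * np.log(np.clip(p, 1e-8, 1.0))
    # Negative term: shift probabilities down by `clip` so that easy negatives
    # (p near 0) vanish, then down-weight the rest with the gamma_neg focus.
    p_m = np.clip(p - clip, 0.0, 1.0)
    loss_neg = (1.0 - targets) * p_m ** gamma_neg * np.log(np.clip(1.0 - p_m, 1e-8, 1.0))
    return float(-(loss_pos + loss_neg).mean())
```

Confident correct predictions incur near-zero loss, while confident false positives are penalized heavily, matching the recall/false-positive trade-off described above.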
Validation is performed every 5000 steps, with early stopping to prevent overfitting.
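The validation cadence and patience can be expressed with `transformers` built-ins, roughly as follows (a sketch; the output directory is a placeholder, the best-model metric is an assumption, and older `transformers` versions spell `eval_strategy` as `evaluation_strategy`):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Evaluation cadence and early stopping from the configuration above.
eval_args = TrainingArguments(
    output_dir="out",               # placeholder path
    eval_strategy="steps",
    eval_steps=5000,                # validate every 5000 steps
    load_best_model_at_end=True,    # required by the early-stopping callback
    metric_for_best_model="loss",   # assumption: select on validation loss
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
# Pass both to Trainer(..., args=eval_args, callbacks=[early_stopping]).
```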
Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline
from peft import PeftModel

# Base ModernBERT model
base_model_name = "answerdotai/ModernBERT-base"
# LoRA adapter checkpoint
adapter_model_name = "AINovice2005/ModernBERT-base-lora-cicflow-1m-r8"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load base masked language model
base_model = AutoModelForMaskedLM.from_pretrained(base_model_name)

# Attach LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_model_name)

# Move to device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Build fill-mask pipeline
fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=tokenizer,
    device=0 if device == "cuda" else -1,
)

# Example usage
text = "The network traffic shows a [MASK] pattern."
outputs = fill_mask(text)
for o in outputs:
    print(f"Token: {o['token_str']}, Score: {o['score']:.4f}")
```
Intended Use
- Binary classification tasks where recall is critical.
- Efficient fine‑tuning scenarios with limited compute resources.
- Research and experimentation with parameter‑efficient methods.
Artifacts
- LoRA adapter
- Training configuration and evaluation logs