all-MiniLM-L6-v2 + GENbAIs Bio Adapters

Bio-inspired adapters that improve on LoRA, discovered through an intelligent neural architecture search over 50+ neuroscience-inspired mechanisms.

| | Baseline | Enhanced | Δ |
|---|---|---|---|
| Avg across 20 metrics | 0.7286 | 0.7492 | +0.0205 |
| Win rate | 17 wins / 3 losses | | |
| Best single gain | +14.66% (PAWS AP) | | |

What is this?

This model is sentence-transformers/all-MiniLM-L6-v2 enhanced with bio-inspired adapter mechanisms discovered through the GENbAIs framework (General Efficient Neural bio-Adapter Intelligent Search).

The enhancement was distilled back into a clean SentenceTransformer, so no custom code is needed — load it like any other sentence-transformers model.

Usage

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("lakinekaki/all-MiniLM-L6-v2-genbais")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

embeddings = model.encode(sentences)

# Compute similarity
print(cos_sim(embeddings[0], embeddings[1]))  # High similarity
print(cos_sim(embeddings[0], embeddings[2]))  # Low similarity
```

How It Was Made

1. Bio-Adapter Search

We searched through ~1,000 configurations out of a ~10²² search space using Thompson sampling with Bayesian pruning. The search explored 50+ neuroscience-inspired mechanisms including predictive coding, lateral inhibition, Hebbian learning, dendritic computation, and more.
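The search code itself is not part of this repo; as a rough sketch of the Thompson-sampling idea, each mechanism can keep a Beta posterior over its probability of beating the baseline, with the best posterior sample chosen at each step. Everything here is an assumption for illustration: the mechanism list is abbreviated, and `evaluate` is a hypothetical stand-in for training an adapter configuration and benchmarking it.

```python
import random

# Hypothetical sketch: Thompson sampling over bio-mechanism choices.
# Mechanism names are illustrative; evaluate() stands in for a full
# train-and-benchmark run and is NOT the actual GENbAIs code.
MECHANISMS = ["predictive_coding", "lateral_inhibition",
              "hebbian_learning", "dendritic_computation"]

posteriors = {m: [1.0, 1.0] for m in MECHANISMS}  # Beta(alpha, beta) priors

def evaluate(mechanism):
    # Stand-in: returns True if the configuration beat the baseline.
    return random.random() < 0.5

def pick_mechanism():
    # Thompson sampling: draw one sample per posterior, pick the argmax.
    samples = {m: random.betavariate(a, b) for m, (a, b) in posteriors.items()}
    return max(samples, key=samples.get)

for _ in range(1000):
    m = pick_mechanism()
    win = evaluate(m)
    a, b = posteriors[m]
    posteriors[m] = [a + win, b + (not win)]
```

Bayesian pruning would additionally discard arms whose posterior mass sits clearly below the current best, which is what keeps ~1,000 experiments tractable inside a ~10²² space.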

2. Stacked on Best LoRA

Bio adapters were stacked on top of the best LoRA configuration, found via grid search (5 configs × 3 seeds). This shows that the bio features provide an additive improvement beyond a state-of-the-art PEFT baseline.
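The stacking idea can be sketched as follows: a frozen pretrained weight carries a low-rank LoRA update, and a bio mechanism (here a simple lateral-inhibition rule) is applied to the layer output. The shapes, rank, and inhibition formula are assumptions for illustration, not the actual GENbAIs adapters.

```python
import numpy as np

# Illustrative only: frozen linear layer + LoRA path + a "bio adapter"
# (simple lateral inhibition) stacked on the output.
rng = np.random.default_rng(0)
d, r = 384, 8                            # MiniLM hidden size, LoRA rank
W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01   # trainable LoRA down-projection
B = np.zeros((r, d))                     # zero-init so LoRA starts as a no-op

def lateral_inhibition(h, k=0.1):
    # Each unit is suppressed in proportion to the mean activation
    # of the other units (an assumed, simplified inhibition rule).
    return h - k * (h.mean(axis=-1, keepdims=True) - h / h.shape[-1])

def forward(x):
    h = x @ W + x @ A @ B          # base path + low-rank LoRA path
    return lateral_inhibition(h)   # bio adapter stacked on top

x = rng.standard_normal((2, d))
out = forward(x)
print(out.shape)                   # (2, 384)
```

Because the bio adapter wraps the output of the LoRA-augmented layer, any gain it produces on top of a tuned LoRA baseline is, by construction, additive.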

3. Distillation

The enhanced model (base + LoRA + bio adapters) was distilled into a clean SentenceTransformer via MSE + cosine embedding loss, producing a standard model with no custom dependencies.
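The distillation objective can be sketched as an MSE term plus a cosine embedding term on matched student/teacher embeddings. The equal weighting and batch handling here are illustrative assumptions; the actual training code is not published.

```python
import numpy as np

# Sketch of the distillation loss: MSE + cosine embedding loss
# between student and teacher sentence embeddings.
def distill_loss(student_emb, teacher_emb, cos_weight=1.0):
    mse = np.mean((student_emb - teacher_emb) ** 2)
    # Cosine embedding loss for matched pairs: 1 - cos(student, teacher).
    s = student_emb / np.linalg.norm(student_emb, axis=1, keepdims=True)
    t = teacher_emb / np.linalg.norm(teacher_emb, axis=1, keepdims=True)
    cos = 1.0 - np.sum(s * t, axis=1).mean()
    return mse + cos_weight * cos

rng = np.random.default_rng(0)
teacher = rng.standard_normal((4, 384))
print(distill_loss(teacher.copy(), teacher))  # identical embeddings -> ~0.0
```

Minimizing both terms pushes the student to match the teacher's embeddings in magnitude (MSE) and in direction (cosine), which is what lets the result load as a plain SentenceTransformer.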

Full Benchmark Results

Evaluated on 20 metrics across STS, pair classification, and clustering tasks.
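For reference, the STS Spearman scores below are rank correlations between model cosine similarities and human gold scores. A self-contained toy sketch of the metric (tie handling omitted; this is not the actual evaluation harness):

```python
import numpy as np

# Spearman correlation via rank transform (no ties in this toy data).
def spearman(a, b):
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

gold = np.array([0.0, 1.0, 2.5, 4.0, 5.0])      # human similarity scores
pred = np.array([0.10, 0.32, 0.55, 0.70, 0.91])  # model cosine similarities
print(spearman(pred, gold))  # 1.0 -- the rankings agree perfectly
```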

Semantic Textual Similarity

| Dataset | Metric | Baseline | Enhanced | Δ | Δ% |
|---|---|---|---|---|---|
| stsb | spearman | 0.8203 | 0.8524 | +0.0321 | +3.91% |
| stsb_dev | spearman | 0.8672 | 0.8692 | +0.0021 | +0.24% |
| sick-r | spearman | 0.7758 | 0.7798 | +0.0039 | +0.51% |
| sts12 | spearman | 0.7237 | 0.7380 | +0.0143 | +1.98% |
| sts13 | spearman | 0.8060 | 0.8602 | +0.0541 | +6.72% |
| sts14 | spearman | 0.7559 | 0.8311 | +0.0752 | +9.95% |
| sts15 | spearman | 0.8539 | 0.8628 | +0.0089 | +1.04% |
| sts16 | spearman | 0.7899 | 0.8174 | +0.0275 | +3.48% |
| biosses | spearman | 0.8164 | 0.7868 | -0.0296 | -3.63% |

Pair Classification

| Dataset | Metric | Baseline | Enhanced | Δ | Δ% |
|---|---|---|---|---|---|
| paws | ap | 0.5844 | 0.6701 | +0.0857 | +14.66% |
| paws | f1 | 0.6140 | 0.6495 | +0.0356 | +5.79% |
| qqp | ap | 0.7640 | 0.7786 | +0.0145 | +1.90% |
| qqp | f1 | 0.7370 | 0.7466 | +0.0096 | +1.31% |
| mrpc | ap | 0.8369 | 0.8586 | +0.0217 | +2.59% |
| mrpc | f1 | 0.8175 | 0.8312 | +0.0137 | +1.68% |
| snli | ap | 0.6461 | 0.6357 | -0.0104 | -1.61% |
| snli | f1 | 0.6621 | 0.6467 | -0.0154 | -2.33% |
| mnli | ap | 0.6070 | 0.6415 | +0.0345 | +5.68% |
| mnli | f1 | 0.6053 | 0.6224 | +0.0171 | +2.82% |

Clustering

| Dataset | Metric | Baseline | Enhanced | Δ | Δ% |
|---|---|---|---|---|---|
| twentynewsgroups | v_measure | 0.4894 | 0.5053 | +0.0159 | +3.24% |

Summary

  • 17 wins, 3 losses across 20 metrics
  • Average improvement: +0.0205 absolute
  • Strongest gains on adversarial tasks — PAWS AP +14.66% suggests bio features capture genuine semantic structure beyond lexical overlap
  • Broad improvement across STS, pair classification, AND clustering (not task-specific overfitting)
  • Small regressions on biosses (-3.6%, tiny biomedical domain) and SNLI (-2.3%)

Key Insight

This is a hard-mode validation. all-MiniLM-L6-v2 is a 22M-parameter, 6-layer model that's already been distilled and heavily optimized by the sentence-transformers team — one of the toughest targets to improve. Getting meaningful gains here is like squeezing blood from a stone. Larger models (CLIP, LLaMA, Mistral) with more layers, more parameters, and more architectural redundancy offer significantly more room for bio-adapter improvement.

The largest improvements come on adversarial and challenging benchmarks (PAWS, STS13, STS14, MNLI) — exactly where standard fine-tuning tends to plateau. Bio-inspired mechanisms like lateral inhibition and predictive coding appear to capture deeper semantic relationships that pure gradient descent misses.

Technical Details

  • Base model: sentence-transformers/all-MiniLM-L6-v2 (22M params, 6 layers)
  • Bio mechanisms: 50+ neuroscience-inspired adapter types
  • Search: ~1,000 experiments via Thompson sampling (out of ~10²² possible)
  • Each adapter: adds up to ~1% of model parameters
  • Distillation: MSE + cosine loss on 50K sentences from STS-B + AllNLI + Quora
  • Evaluation time: 754.5s across all 20 benchmarks

Citation

```bibtex
@software{genbais2025,
  title={GENbAIs: General Efficient Neural bio-Adapter Intelligent Search},
  author={Kovacevic, Lazar},
  year={2025},
  url={https://genbais.com}
}
```

License

MIT
