Hygroskopisch/bge-m3-ifc-kbob-finetuned

Sentence-Transformers model finetuned from BAAI/bge-m3 for IFC-based construction material retrieval in KBOB/LCA workflows.

Model Summary

  • Model ID: Hygroskopisch/bge-m3-ifc-kbob-finetuned
  • Release: v3 (2026-04-16)
  • Base model: BAAI/bge-m3
  • Embedding dimension: 1024
  • Parameters: ~0.6B (safetensors, F32)
  • Max sequence length: 128
  • Similarity: cosine

The model is optimized for queries generated from IFC element metadata and maps them to KBOB-like material labels for downstream environmental impact workflows.
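As an illustration of this retrieval pattern, the sketch below encodes a small invented catalog of KBOB-like labels and ranks it against an IFC-derived query with sentence_transformers.util.semantic_search; the labels and query are examples only, not the project's actual catalog.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Hygroskopisch/bge-m3-ifc-kbob-finetuned")

# Hypothetical KBOB-like material labels; the real catalog is project-specific.
catalog = [
    "Hochbaubeton C25/30",
    "Ortbetonbohrpfahl C40/50",
    "Betonfertigteil C50/60",
]
catalog_emb = model.encode(catalog, convert_to_tensor=True)

# Query text assembled from IFC element metadata.
query_emb = model.encode("IfcPile BORED Stahlbeton C40/50 500 INSITU",
                         convert_to_tensor=True)

# Embeddings are L2-normalized, so ranking uses cosine similarity.
for hit in util.semantic_search(query_emb, catalog_emb, top_k=3)[0]:
    print(catalog[hit["corpus_id"]], round(hit["score"], 3))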

Intended Use

  • IFC-to-material retrieval in building and infrastructure datasets.
  • Candidate generation before manual validation in LCA pipelines.
  • Semantic search over construction material catalogs with domain-specific wording.

Out-of-Scope Use

  • Legal, compliance, or procurement decisions without human review.
  • Safety-critical engineering sign-off.
  • Use as a standalone source of truth for environmental declarations.

Responsible Use

  • Keep a human-in-the-loop for final material assignment.
  • Validate results against project context, standards, and local regulations.
  • Contact: sbert-lca@pm.me

Training Data

The v3 run used project-internal data artifacts and generated pair files.

Train/dev counts from run metadata:

  • Total pairs: 16386
  • Train pairs: 14748
  • Dev pairs: 1638

Evaluation Data

Evaluation artifacts for this release:

  • eval/normal_queries/summary_eval-bge-m3-ifc-kbob-finetuned_model-1d06a0d7_queries-b9bc9eb9_no-reranker-7521044b.csv
  • eval/normal_queries/details_eval-bge-m3-ifc-kbob-finetuned_model-1d06a0d7_queries-b9bc9eb9_no-reranker-7521044b.csv

Evaluation query count: 389

Evaluation Results

The following results mirror the full evaluation summary in the main project README for the v3 model.

Core metrics by query set

| Queries | Cases | Hit@1 | Hit@10 | Hit@20 | Hit@30 | Hit@50 | MRR@10 | MAP@10 | nDCG@10 | Recall@10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 389 | 97.43% | 99.49% | 99.74% | 99.74% | 100.00% | 0.984 | 0.932 | 0.954 | 0.960 |
| Typos | 389 | 88.43% | 94.86% | 98.20% | 98.97% | 99.49% | 0.909 | 0.844 | 0.876 | 0.890 |
| Missing Attribute | 389 | 75.32% | 92.80% | 96.40% | 98.20% | 98.71% | 0.803 | 0.750 | 0.794 | 0.860 |
| Missing + Typos | 389 | 68.12% | 88.17% | 94.34% | 96.92% | 98.46% | 0.739 | 0.682 | 0.731 | 0.805 |
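For readers less familiar with the rank metrics, a minimal sketch of Hit@k and MRR@k, assuming a list of 1-based ranks of the correct material per evaluation case (the example data is invented):

def hit_at_k(ranks, k):
    """Fraction of cases whose correct label appears within the top k."""
    return sum(1 for r in ranks if 0 < r <= k) / len(ranks)

def mrr_at_k(ranks, k=10):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if 0 < r <= k) / len(ranks)

ranks = [1, 1, 3, 12, 1]  # invented example; 0 marks "not retrieved"
print(hit_at_k(ranks, 1), hit_at_k(ranks, 10), mrr_at_k(ranks))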

95% confidence intervals (bootstrap from summary files):

| Queries | Hit@1 95% CI | Hit@10 95% CI | MRR@10 95% CI | nDCG@10 95% CI |
|---|---|---|---|---|
| Normal | [95.37%, 98.97%] | [98.71%, 100.00%] | [0.971, 0.994] | [0.939, 0.968] |
| Typos | [84.83%, 91.77%] | [92.80%, 96.66%] | [0.881, 0.935] | [0.847, 0.902] |
| Missing Attribute | [70.69%, 79.18%] | [89.97%, 94.99%] | [0.766, 0.835] | [0.759, 0.824] |
| Missing + Typos | [63.36%, 72.49%] | [84.95%, 91.14%] | [0.695, 0.778] | [0.690, 0.767] |
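These intervals are consistent with a standard percentile bootstrap over the 389 per-case outcomes; a sketch under that assumption (the resample count and seed are arbitrary, and the toy data below is invented):

import random

def bootstrap_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-case metric values."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n
                   for _ in range(n_resamples))
    return (means[int(alpha / 2 * n_resamples)],
            means[int((1 - alpha / 2) * n_resamples) - 1])

# Hit@1 as 0/1 indicators per evaluation case (toy data, not the real runs).
hits = [1] * 379 + [0] * 10
print(bootstrap_ci(hits))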

Query set definitions

The four query files test robustness under controlled perturbations.

| Queries | Transformation | Hard invariants |
|---|---|---|
| Normal | Unchanged query (reference run) | No perturbation |
| Missing file | Removes one allowed token from PredefinedType, Material, StrengthClass, or insitu/precast (Ortbeton/Fertigteil) | IfcEntity is never removed |
| Typos file | 1 to 2 typos per line, max. 1 typo per token/word | IfcEntity remains correct |
| Combined file | First removes one allowed token, then injects 1 to 2 typos into the remaining allowed tokens (max. 1 typo per token) | IfcEntity remains correct |

Summary of generated perturbation files:

| File | Changed lines | Typo distribution |
|---|---|---|
| Missing | 388 | – |
| Typos | 388 | 1 typo: 193, 2 typos: 195 |
| Missing + Typos | 388 | 1 typo: 309, 2 typos: 61 |
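For concreteness, a simplified sketch of the typo-injection rule (1 to 2 typos per line, at most one per token, IfcEntity untouched); the whitespace tokenization and adjacent-character swap are assumptions, not necessarily the exact generator used:

import random

def inject_typos(query: str, n_typos: int, seed: int = 0) -> str:
    """Swap two adjacent characters in up to n_typos distinct tokens."""
    rng = random.Random(seed)
    tokens = query.split()
    # Candidates: every token except the leading IfcEntity (hard invariant),
    # long enough for an adjacent-character swap.
    candidates = [i for i in range(1, len(tokens)) if len(tokens[i]) > 2]
    for i in rng.sample(candidates, min(n_typos, len(candidates))):
        t = tokens[i]
        j = rng.randrange(len(t) - 1)
        tokens[i] = t[:j] + t[j + 1] + t[j] + t[j + 2:]  # one typo per token
    return " ".join(tokens)

print(inject_typos("IfcPile BORED Stahlbeton C40/50 INSITU", n_typos=2))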

Detailed interpretation

Readability note: metrics are computed on 389 evaluation cases; the perturbation table above reports changed lines in the generated query files.

Degradation versus Normal Queries:

| Queries | Delta Hit@1 | Delta Hit@10 | Delta MRR@10 | Delta nDCG@10 |
|---|---|---|---|---|
| Typos | -9.00% | -4.63% | -0.075 | -0.078 |
| Missing Attribute | -22.11% | -6.69% | -0.181 | -0.160 |
| Missing + Typos | -29.31% | -11.32% | -0.245 | -0.223 |

Conclusion: token removal hurts more than pure typo noise; the combined perturbation is strongest, as expected.

Typos vs. Missing (direct comparison):

  • Hit@1: Missing is 13.11 percentage points below Typos (75.32% vs 88.43%).
  • Hit@10: Missing is 2.06 percentage points below Typos (92.80% vs 94.86%).
  • MRR@10: Missing is 0.106 below Typos (0.803 vs 0.909).
  • nDCG@10: Missing is 0.082 below Typos (0.794 vs 0.876).

Conclusion: missing semantic slots move correct results further down the ranking than typos.

Top-1 vs Top-10 recovery potential:

  • Normal: Hit@10 - Hit@1 = 2.06%.
  • Typos: Hit@10 - Hit@1 = 6.43%.
  • Missing Attribute: Hit@10 - Hit@1 = 17.48%.
  • Missing + Typos: Hit@10 - Hit@1 = 20.05%.

Conclusion: under perturbation, the correct material often remains in top-10 but drops from rank 1 more frequently.

Statistical separability (Hit@1 CIs):

  • Normal vs Typos: no overlap; interval gap 3.60% (95.37% vs 91.77%).
  • Typos vs Missing: no overlap; interval gap 5.65% (84.83% vs 79.18%).
  • Missing vs Missing + Typos: overlap 1.80% (70.69% to 72.49%).

Conclusion: the first two degradation steps are statistically separated; the final step (Missing vs. Missing + Typos) is smaller and its intervals overlap slightly, but the point estimates still degrade.

Practical implications:

  • High automation precision depends strongly on stable Material, StrengthClass, and CastingMethod (insitu/precast) slots.
  • For noisy IFC text, UI workflows should prioritize top-10 candidates and avoid relying on top-1 alone.
  • Main improvement lever is robust semantic token extraction/preservation, more than additional typo tolerance.

Usage (Sentence-Transformers)

To use this model, first install sentence-transformers:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer

sentences = [
  "IfcPile BORED Stahlbeton C40/50 500 INSITU",
  "Tiefgründung Ortbetonbohrpfahl 700",
]

model = SentenceTransformer("Hygroskopisch/bge-m3-ifc-kbob-finetuned")
embeddings = model.encode(sentences)  # array of shape (2, 1024), L2-normalized
print(embeddings)
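With sentence-transformers v3 or newer, pairwise cosine similarities can be computed directly on the model:

# Requires sentence-transformers >= 3.0 for SentenceTransformer.similarity().
similarities = model.similarity(embeddings, embeddings)
print(similarities)  # 2x2 cosine-similarity matrix for the two sentences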

Load a fixed released revision:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
  "Hygroskopisch/bge-m3-ifc-kbob-finetuned",
  revision="v3",
)

Training

Core training configuration (v3):

  • Epochs: 2
  • Batch size: 32
  • Learning rate: 2e-05
  • Warmup ratio: 0.1
  • FP16: true
  • Seed: 42
  • Device: cuda
  • Prefix mode: no_prefix

DataLoader length: 7418

Loss:

sentence_transformers.losses.MultipleNegativesRankingLoss with parameters:

{'scale': 20.0, 'similarity_fct': 'cos_sim'}

fit() parameters:

{
    "epochs": 2,
    "evaluation_steps": 0,
    "evaluator": "__main__.CombinedHit5Mrr10Evaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 1484,
    "weight_decay": 0.01
}
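
A hedged reconstruction of the training call from the parameters above, using the legacy fit() API; the pair construction is hypothetical (the v3 pair files are project-internal), and only the hyperparameters shown are taken from the run metadata:

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("BAAI/bge-m3")

# Hypothetical pair construction; the real v3 pair files are project-internal.
train_examples = [
    InputExample(texts=["IfcPile BORED Stahlbeton C40/50 INSITU",
                        "Ortbetonbohrpfahl"]),
    # ... one InputExample per (IFC query, material label) pair
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# In-batch negatives with cosine similarity, scale 20.0, as documented above.
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=2,
    scheduler="WarmupLinear",
    warmup_steps=1484,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
    use_amp=True,  # FP16, per the run configuration
    # The custom __main__.CombinedHit5Mrr10Evaluator is omitted here.
)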

Release Notes

v3 (2026-04-16)

  • Replaced previous published checkpoint with the new finetuned weights from the latest IFC/KBOB training run.
  • Updated training data pipeline artifacts and documented exact source file names used for this release.
  • Published baseline retrieval metrics on 389 evaluation queries (no cross-encoder reranker).
  • Behavior change: retrieval rankings can differ from previous versions; if you require reproducibility, pin revision v3.
  • Responsible-use contact added: sbert-lca@pm.me.

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
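Because the pipeline ends in Normalize(), encoded vectors are unit length and cosine similarity reduces to a dot product; a quick sanity check:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Hygroskopisch/bge-m3-ifc-kbob-finetuned")
emb = model.encode(["IfcPile BORED Stahlbeton C40/50 INSITU"])

print(emb.shape)               # (1, 1024): the documented embedding size
print(np.linalg.norm(emb[0]))  # ~1.0, because of the Normalize() module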

Citing & Authors

If you use this model in a report or publication, cite the project repository and this Hugging Face model page.
