# Hygroskopisch/bge-m3-ifc-kbob-finetuned
A Sentence-Transformers model fine-tuned from BAAI/bge-m3 for IFC-based construction-material retrieval in KBOB/LCA workflows.
## Model Summary
- Model ID: Hygroskopisch/bge-m3-ifc-kbob-finetuned
- Release: v3 (2026-04-16)
- Base model: BAAI/bge-m3
- Embedding dimension: 1024
- Max sequence length: 128
- Similarity: cosine
The model is optimized for queries generated from IFC element metadata and maps them to KBOB-like material labels for downstream environmental impact workflows.
## Intended Use
- IFC-to-material retrieval in building and infrastructure datasets.
- Candidate generation before manual validation in LCA pipelines.
- Semantic search over construction material catalogs with domain-specific wording.
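The candidate-generation pattern behind these use cases can be illustrated with a minimal retrieval sketch. The 2-dimensional toy vectors and labels below stand in for real 1024-dimensional model embeddings and KBOB material labels; the function names are hypothetical, not part of this project's code.

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, catalog, k=10):
    # catalog: list of (label, vector) pairs, e.g. KBOB material labels
    # paired with precomputed embeddings of their descriptions.
    scored = [(label, cosine(query_vec, vec)) for label, vec in catalog]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]

# Toy 2-d vectors standing in for 1024-d model embeddings.
catalog = [("Hochbaubeton", [1.0, 0.1]), ("Baustahl", [0.0, 1.0])]
print(top_k([0.9, 0.2], catalog, k=1))
```

In a real pipeline the catalog vectors come from `model.encode(...)` over the material labels, and the top-k list feeds the manual-validation step.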
## Out-of-Scope Use
- Legal, compliance, or procurement decisions without human review.
- Safety-critical engineering sign-off.
- Use as a standalone source of truth for environmental declarations.
## Responsible Use
- Keep a human-in-the-loop for final material assignment.
- Validate results against project context, standards, and local regulations.
- Contact: sbert-lca@pm.me
## Training Data
The v3 run used project-internal data artifacts and generated pair files.
- Query source files: Training/query_generation/generated_queries
- Expected mapping source files: Training/query_generation/generated_queries
- Hard-negative strategy: fallback mode with random_preselected selection, up to 2 hard negatives per record
Train/dev counts from run metadata:
- Total pairs: 16386
- Train pairs: 14748
- Dev pairs: 1638
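How the pair records might be assembled is sketched below, assuming a "random_preselected" mode means sampling at random from a pre-selected negative pool per query; the function and argument names are illustrative, not the project's actual code. The 90/10 split arithmetic at the end reproduces the reported counts.

```python
import random

def build_records(pairs, preselected, max_hard_negatives=2, seed=42):
    """pairs: (query, positive_label) tuples; preselected: dict mapping
    query -> candidate negative labels (the pre-selected pool)."""
    rng = random.Random(seed)
    records = []
    for query, positive in pairs:
        pool = [n for n in preselected.get(query, []) if n != positive]
        # Up to 2 hard negatives per record, drawn at random from the pool.
        negatives = rng.sample(pool, min(max_hard_negatives, len(pool)))
        records.append({"query": query, "positive": positive,
                        "negatives": negatives})
    return records

# A 90/10 split reproduces the reported counts: 16386 -> 14748 / 1638.
total = 16386
dev = total // 10        # 1638 dev pairs
train = total - dev      # 14748 train pairs
```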
## Evaluation Data
Evaluation artifacts for this release:
- eval/normal_queries/summary_eval-bge-m3-ifc-kbob-finetuned_model-1d06a0d7_queries-b9bc9eb9_no-reranker-7521044b.csv
- eval/normal_queries/details_eval-bge-m3-ifc-kbob-finetuned_model-1d06a0d7_queries-b9bc9eb9_no-reranker-7521044b.csv
Evaluation query count: 389
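The per-query details file makes the summary metrics reproducible. As a minimal sketch (the project's evaluation scripts may compute them differently), Hit@k and MRR@10 can be derived from the 1-based rank of the correct label for each evaluation case:

```python
def hit_at_k(ranks, k):
    # ranks: 1-based rank of the correct label per query, or None
    # if the correct label is not retrieved at all.
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr_at_k(ranks, k=10):
    # Reciprocal rank contributes 0 when the hit falls outside the cutoff.
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)

ranks = [1, 1, 3, None, 12]  # toy example, not the real details CSV
print(hit_at_k(ranks, 10), mrr_at_k(ranks, 10))
```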
## Evaluation Results
The following results mirror the full evaluation summary in the main project README for the v3 model.
### Core metrics by query set
| Queries | Cases | Hit@1 | Hit@10 | Hit@20 | Hit@30 | Hit@50 | MRR@10 | MAP@10 | nDCG@10 | Recall@10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 389 | 97.43% | 99.49% | 99.74% | 99.74% | 100.00% | 0.984 | 0.932 | 0.954 | 0.960 |
| Typos | 389 | 88.43% | 94.86% | 98.20% | 98.97% | 99.49% | 0.909 | 0.844 | 0.876 | 0.890 |
| Missing Attribute | 389 | 75.32% | 92.80% | 96.40% | 98.20% | 98.71% | 0.803 | 0.750 | 0.794 | 0.860 |
| Missing + Typos | 389 | 68.12% | 88.17% | 94.34% | 96.92% | 98.46% | 0.739 | 0.682 | 0.731 | 0.805 |
95% confidence intervals (bootstrap from summary files):
| Queries | Hit@1 95% CI | Hit@10 95% CI | MRR@10 95% CI | nDCG@10 95% CI |
|---|---|---|---|---|
| Normal | [95.37%, 98.97%] | [98.71%, 100.00%] | [0.971, 0.994] | [0.939, 0.968] |
| Typos | [84.83%, 91.77%] | [92.80%, 96.66%] | [0.881, 0.935] | [0.847, 0.902] |
| Missing Attribute | [70.69%, 79.18%] | [89.97%, 94.99%] | [0.766, 0.835] | [0.759, 0.824] |
| Missing + Typos | [63.36%, 72.49%] | [84.95%, 91.14%] | [0.695, 0.778] | [0.690, 0.767] |
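A percentile bootstrap over the per-query 0/1 outcomes is one standard way to obtain such intervals; the sketch below shows the idea under that assumption (the project's actual resampling count and seed are not documented here).

```python
import random

def bootstrap_ci(per_query_hits, n_resamples=2000, alpha=0.05, seed=0):
    # per_query_hits: 0/1 outcome per evaluation case (e.g. Hit@1).
    rng = random.Random(seed)
    n = len(per_query_hits)
    means = sorted(
        sum(rng.choices(per_query_hits, k=n)) / n for _ in range(n_resamples)
    )
    # Percentile interval: take the alpha/2 and 1 - alpha/2 quantiles.
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# 379 hits out of 389 cases, roughly the Normal Hit@1 setting.
hits = [1] * 379 + [0] * 10
lo, hi = bootstrap_ci(hits)
```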
### Query set definitions
The four query files test robustness under controlled perturbations.
| Queries | Transformation | Hard invariants |
|---|---|---|
| Normal | Unchanged query (reference run) | No perturbation |
| Missing Attribute | Removes one allowed token from PredefinedType, Material, StrengthClass, or insitu/precast (Ortbeton/Fertigteil) | IfcEntity is never removed |
| Typos | 1 to 2 typos per line, max 1 typo per token/word | IfcEntity remains correct |
| Missing + Typos | First remove one allowed token, then inject 1 to 2 typos into the remaining allowed tokens (max 1 typo per token) | IfcEntity remains correct |
Summary of generated perturbation files:
| File | Changed lines | Typo distribution |
|---|---|---|
| Missing | 388 | - |
| Typos | 388 | 1 typo: 193, 2 typos: 195 |
| Missing + Typos | 388 | 1 typo: 309, 2 typos: 61 |
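The typo rules above (1 to 2 typos per line, at most one per token, IfcEntity untouched) can be sketched as follows. The adjacent-character swap is an assumed typo mechanism, and treating the first token as the IfcEntity is an assumption about the query layout; neither is confirmed by the release notes.

```python
import random

def inject_typos(query, rng, min_typos=1, max_typos=2):
    """Swap two adjacent characters in up to `max_typos` distinct tokens.
    ASSUMPTION: the first token is the IfcEntity and is never modified."""
    tokens = query.split()
    entity, rest = tokens[0], tokens[1:]
    editable = [i for i, t in enumerate(rest) if len(t) >= 2]
    n = min(rng.randint(min_typos, max_typos), len(editable))
    for i in rng.sample(editable, n):  # max one typo per token
        t = rest[i]
        p = rng.randrange(len(t) - 1)
        rest[i] = t[:p] + t[p + 1] + t[p] + t[p + 2:]
    return " ".join([entity] + rest)

q = "IfcPile BORED Stahlbeton C40/50"
out = inject_typos(q, random.Random(7))
print(out)
```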
### Detailed interpretation
Readability note: metrics are computed on 389 evaluation cases; the perturbation table above reports changed lines in the generated query files.
Degradation versus Normal Queries:
| Queries | Delta Hit@1 | Delta Hit@10 | Delta MRR@10 | Delta nDCG@10 |
|---|---|---|---|---|
| Typos | -9.00% | -4.63% | -0.075 | -0.078 |
| Missing Attribute | -22.11% | -6.69% | -0.181 | -0.160 |
| Missing + Typos | -29.31% | -11.32% | -0.245 | -0.223 |
Conclusion: token removal hurts more than pure typo noise; the combined perturbation is strongest, as expected.
Typos vs. Missing (direct comparison):
- Hit@1: Missing is 13.11 percentage points below Typos (75.32% vs 88.43%).
- Hit@10: Missing is 2.06 percentage points below Typos (92.80% vs 94.86%).
- MRR@10: Missing is 0.106 below Typos (0.803 vs 0.909).
- nDCG@10: Missing is 0.082 below Typos (0.794 vs 0.876).
Conclusion: missing semantic slots move correct results further down the ranking than typos.
Top-1 vs Top-10 recovery potential:
- Normal: Hit@10 - Hit@1 = 2.06%.
- Typos: Hit@10 - Hit@1 = 6.43%.
- Missing Attribute: Hit@10 - Hit@1 = 17.48%.
- Missing + Typos: Hit@10 - Hit@1 = 20.05%.
Conclusion: under perturbation, the correct material often remains in top-10 but drops from rank 1 more frequently.
Statistical separability (Hit@1 CIs):
- Normal vs Typos: no overlap; interval gap 3.60% (95.37% vs 91.77%).
- Typos vs Missing: no overlap; interval gap 5.65% (84.83% vs 79.18%).
- Missing vs Missing + Typos: overlap 1.80% (70.69% to 72.49%).
Conclusion: the first two degradation steps are clearly separated; the final step is smaller but still negative.
Practical implications:
- High automation precision depends strongly on stable `Material`, `StrengthClass`, and `CastingMethod` slots.
- For noisy IFC text, UI workflows should prioritize top-10 candidates and avoid relying on top-1 alone.
- Main improvement lever is robust semantic token extraction/preservation, more than additional typo tolerance.
## Usage (Sentence-Transformers)
Using this model is straightforward once sentence-transformers is installed:

```bash
pip install -U sentence-transformers
```

Then you can use the model like this:
```python
from sentence_transformers import SentenceTransformer

sentences = [
    "IfcPile BORED Stahlbeton C40/50 500 INSITU",
    "Tiefgründung Ortbetonbohrpfahl 700",
]

model = SentenceTransformer("Hygroskopisch/bge-m3-ifc-kbob-finetuned")
embeddings = model.encode(sentences)
print(embeddings)
```
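Because the architecture ends with a `Normalize()` module, the returned embeddings are unit-length, so cosine similarity reduces to a plain dot product. A small numpy sketch, with toy 2-dimensional vectors standing in for real embeddings:

```python
import numpy as np

# Toy stand-ins for the model's 1024-dim embeddings.
emb = np.array([[0.6, 0.8], [1.0, 0.0]])
# The model's final Normalize() step makes each row unit-length,
# so a dot product between rows equals their cosine similarity.
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
scores = emb @ emb.T
print(scores)
```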
Load a fixed released revision:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "Hygroskopisch/bge-m3-ifc-kbob-finetuned",
    revision="v3",
)
```
## Training
Core training configuration (v3):
- Epochs: 2
- Batch size: 32
- Learning rate: 2e-05
- Warmup ratio: 0.1
- FP16: true
- Seed: 42
- Device: cuda
- Prefix mode: no_prefix
DataLoader length: 7418
Loss: `MultipleNegativesRankingLoss` (`sentence_transformers.losses.MultipleNegativesRankingLoss`) with parameters:

```python
{'scale': 20.0, 'similarity_fct': 'cos_sim'}
```
`fit()` parameters:

```json
{
    "epochs": 2,
    "evaluation_steps": 0,
    "evaluator": "__main__.CombinedHit5Mrr10Evaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 1484,
    "weight_decay": 0.01
}
```
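As a quick consistency check, the reported `warmup_steps` follows from applying the warmup ratio to the total optimizer steps (the rounding mode here is an assumption):

```python
import math

epochs = 2
steps_per_epoch = 7418   # DataLoader length from the run metadata
warmup_ratio = 0.1

total_steps = epochs * steps_per_epoch        # 14836
warmup_steps = math.ceil(total_steps * warmup_ratio)  # 1483.6 -> 1484
print(warmup_steps)
```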
## Release Notes
### v3 (2026-04-16)
- Replaced previous published checkpoint with the new finetuned weights from the latest IFC/KBOB training run.
- Updated training data pipeline artifacts and documented exact source file names used for this release.
- Published baseline retrieval metrics on 389 evaluation queries (no cross-encoder reranker).
- Behavior change: retrieval rankings can differ from previous versions; if you require reproducibility, pin revision v3.
- Responsible-use contact added: sbert-lca@pm.me.
## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
## Citing & Authors
If you use this model in a report or publication, cite the project repository and this Hugging Face model page.