---
language:
- en
license: apache-2.0
base_model:
- sentence-transformers/all-MiniLM-L6-v2
datasets:
- itsjhuang/watsonx-docs-document-type
tags:
- text-classification
- embeddings
- technical-documentation
metrics:
- accuracy
- f1
---

# Watsonx Docs Document Type Classifier

Binary classifier for IBM Watsonx technical documentation pages.
Given a documentation page, the model predicts whether it is:

- `conceptual` (0): primarily used to understand or look up information
- `how-to` (1): primarily used to complete a procedure or fix a problem

## Model Details

| | |
|---|---|
| Base embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| Classifier | LinearSVC (C=1.0, max_iter=2000) |
| Training dataset | [itsjhuang/watsonx-docs-document-type](https://huggingface.co/datasets/itsjhuang/watsonx-docs-document-type) |
| Input | title + first 800 words of document |
| Output | `conceptual` or `how-to` |

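The input format in the table above (title plus the first 800 words) can be sketched as follows. This is only an illustration: the helper name `build_input`, the newline separator, and the choice not to count the title toward the 800 words are assumptions, not details confirmed by the model card.

```python
def build_input(title, body, max_words=800):
    """Join a page title with the first `max_words` words of its body.

    Hypothetical helper: the separator and word-splitting rule are assumptions.
    """
    # Split on any whitespace and keep at most `max_words` words of the body.
    words = body.split()
    truncated = " ".join(words[:max_words])
    # Assumption: title and body are joined with a single newline.
    return f"{title.strip()}\n{truncated}".strip()
```

The returned string is what would be passed to the classifier as a single text input.
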
## Evaluation Results

Three embedding and classifier combinations (conditions A, B, and C) were trained and evaluated. Condition B was selected as the best model because it has the highest macro F1 on the test set.

| Condition | Embedding Model | Classifier | Train Acc | Train F1 | Test Acc | Test F1 |
|---|---|---|---:|---:|---:|---:|
| A | all-MiniLM-L6-v2 | Logistic Regression | 0.879 | 0.879 | 0.817 | 0.817 |
| B ✅ | all-MiniLM-L6-v2 | LinearSVC | 0.971 | 0.971 | 0.867 | 0.867 |
| C | bge-small-en-v1.5 | Logistic Regression | 0.864 | 0.864 | 0.833 | 0.833 |

Confusion matrices for each condition are available in the repository files.

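For readers unfamiliar with the selection metric, macro F1 is the unweighted mean of the per-class F1 scores, so both classes count equally regardless of how many pages each has. The sketch below is a plain-Python illustration of that definition, not the evaluation code used for this model; the function name `macro_f1` and the zero-score handling of empty classes are assumptions.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 scores over `labels`."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted examples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With labels encoded as `0` (`conceptual`) and `1` (`how-to`), calling `macro_f1(y_true, y_pred)` on the test split would give a number comparable to the Test F1 column.
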
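## Usage

The example assumes `best_model.joblib` has been downloaded from this repository into the working directory.

```python
import joblib
import numpy as np
from sentence_transformers import SentenceTransformer

# Embedding model and the trained classifier from this repository
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
clf = joblib.load("best_model.joblib")

def softmax(x):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

def predict(text):
    # text: page title followed by the first 800 words of the page
    embedding = embedder.encode([text], convert_to_numpy=True)
    scores = clf.decision_function(embedding)[0]
    # A binary LinearSVC returns one signed score; expand it to two class scores
    if np.ndim(scores) == 0:
        scores = np.array([-scores, scores])
    # Softmax over SVM margins gives relative confidence, not calibrated probabilities
    probs = softmax(scores)
    labels = ["conceptual", "how-to"]
    return dict(zip(labels, probs))
```
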
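The two-class softmax used in `predict` has a simple closed form: for a signed score `s`, `softmax([-s, s])[1]` equals `1 / (1 + exp(-2s))`, which is the same as `0.5 * (1 + tanh(s))`. The sketch below shows that mapping with the standard library only. The helper name `score_to_prediction` is hypothetical, and it assumes the classifier's positive class is `how-to` (label 1), as listed at the top of this card.

```python
import math

LABELS = ("conceptual", "how-to")  # assumed to match label ids 0 and 1

def score_to_prediction(score):
    """Map a signed decision score to (label, pseudo-probability of "how-to")."""
    # softmax([-s, s])[1] == 1 / (1 + exp(-2s)) == 0.5 * (1 + tanh(s));
    # the tanh form avoids overflow for large negative scores.
    p_how_to = 0.5 * (1.0 + math.tanh(score))
    # A positive margin selects the second class, a non-positive one the first.
    label = LABELS[1] if score > 0 else LABELS[0]
    return label, p_how_to
```

The value is a monotone transform of the SVM margin, so it is useful for ranking pages by confidence but should not be read as a calibrated probability.
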
## Limitations

- Trained on IBM Watsonx documentation only; may not generalize to other
  technical documentation domains.
- The boundary between weakly procedural pages and conceptual capability
  descriptions is ambiguous and remains a source of classification errors.

## Source Dataset

Derived from
[`ibm-research/watsonxDocsQA`](https://huggingface.co/datasets/ibm-research/watsonxDocsQA),
licensed under Apache 2.0.