Update README.md
README.md CHANGED

@@ -26,7 +26,7 @@ model-index:
   results:
   - task:
       type: translation-quality-estimation
-      name:
+      name: English-Thai Translation Quality Assessment
     dataset:
       type: wasanx/cometh_claude_augment
       name: COMETH Claude Augmentation Datasets
@@ -37,7 +37,7 @@ model-index:
       verified: false
   - task:
       type: translation-quality-estimation
-      name:
+      name: English-Thai Translation Quality Comparison
     dataset:
       type: wasanx/cometh_human_annot
       name: COMETH Baseline Comparison
@@ -51,14 +51,14 @@ model-index:
       value: 0.4639
       verified: false
 ---
-# ComeTH (คำไทย):
+# ComeTH (คำไทย): English-Thai Translation Quality Metrics
 
-ComeTH is a fine-tuned version of the COMET (Crosslingual Optimized Metric for Evaluation of Translation) model specifically optimized for
+ComeTH is a fine-tuned version of the COMET (Crosslingual Optimized Metric for Evaluation of Translation) model specifically optimized for English-Thai translation quality assessment. This model evaluates machine translation outputs by providing quality scores that correlate highly with human judgments.
 
 ## Model Overview
 
 - **Model Type**: Translation Quality Estimation
-- **Languages**:
+- **Languages**: English-Thai
 - **Base Model**: COMET (Unbabel/wmt22-cometkiwi-da)
 - **Encoder**: XLM-RoBERTa-based (microsoft/infoxlm-large)
 - **Architecture**: Unified Metric with sentence-level scoring
@@ -73,7 +73,7 @@ We offer two variants of ComeTH with different training approaches:
 - **ComeTH**: Fine-tuned on human MQM annotations (Spearman's ρ = 0.4639)
 - **ComeTH-Augmented**: Fine-tuned on human + Claude-assisted annotations (Spearman's ρ = 0.4795)
 
-Both models outperform the base COMET model (Spearman's ρ = 0.4570) on
+Both models outperform the base COMET model (Spearman's ρ = 0.4570) on English-Thai translation evaluation. The Claude-augmented version leverages LLM-generated annotations to enhance correlation with human judgments by 4.9% over the baseline.
 
 ## Technical Specifications
@@ -185,7 +185,7 @@ print(systems)
 
 ```
 @misc{
-  title = {COMETH:
+  title = {COMETH: English-Thai Translation Quality Metrics},
   author = {COMETH Team},
   year = {2025},
   howpublished = {Hugging Face Model Repository},
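For context on the figures this commit touches: the README reports Spearman's ρ between model scores and human judgments (0.4639, 0.4795, 0.4570) and a 4.9% relative improvement. A minimal pure-Python sketch of how that statistic is computed follows; the `average_ranks`/`spearman_rho` helpers and the segment scores are invented for illustration, not drawn from the COMETH datasets or codebase.

```python
# Spearman's rho: rank both score lists (averaging ranks over ties),
# then take the Pearson correlation of the ranks. This is the statistic
# behind the correlations reported in this commit's README.

def average_ranks(values):
    """Assign 1-based ranks, averaging ranks across tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation of the rank vectors of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical segment-level model scores vs. human MQM-derived scores:
model_scores = [0.81, 0.42, 0.65, 0.90, 0.33]
human_scores = [0.60, 0.55, 0.70, 0.95, 0.20]
print(round(spearman_rho(model_scores, human_scores), 4))  # 0.9

# The "4.9%" quoted in the diff is the relative gain of ComeTH-Augmented
# (rho = 0.4795) over the base COMET model (rho = 0.4570):
print(round((0.4795 - 0.4570) / 0.4570 * 100, 1))  # 4.9
```

Only the ordering of scores matters to the metric, which is why a rank correlation is the usual choice for comparing quality-estimation models against human judgments.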