Upload folder using huggingface_hub
- README.md +40 -7
- charset.txt +1 -1
- monocr.ckpt +2 -2
- monocr.json +5 -4
- onnx/monocr.json +7 -0
- onnx/monocr.onnx +2 -2
- pytorch/monocr.ckpt +2 -2
README.md
CHANGED
@@ -10,7 +10,7 @@ tags:
 - mnw
 - onnx
 - tflite
--
 - crnn
 ---
 
@@ -40,20 +40,53 @@ Unified SDKs are available for seamless integration into existing applications.
 | **TFLite (fp32)** | `tflite/float32.tflite` | High-precision mobile inference. |
 | **PyTorch** | `pytorch/monocr.ckpt` | Training, fine-tuning, and research. |
 
 ## Technical Specification
 
-- **
-- **
-- **
-- **
-- **Vocabulary**: 224 Mon characters, punctuation, and formatting symbols (see `charset.txt`).
 
 ## Integration Guidelines
 
 For developers building custom drivers:
 
 1. Refer to `charset.txt` for the index-to-character mapping (Index 0 is reserved for `<blank>`).
-2. Ensure input images are high-contrast and properly scaled to
 3. ONNX models use dynamic axes for width to support varying word lengths without padding.
 
 ## License
 - mnw
 - onnx
 - tflite
+- mobilenetv3
 - crnn
 ---
 
 | **TFLite (fp32)** | `tflite/float32.tflite` | High-precision mobile inference. |
 | **PyTorch** | `pytorch/monocr.ckpt` | Training, fine-tuning, and research. |
 
+## Performance Metrics
+
+| Metric | Value |
+| :------------------ | :-------------------------------------------------- |
+| **Train Loss** | 1.22 |
+| **Validation Loss** | 1.157 |
+| **CER** | 0.025 |
+| **WER** | 0.211 |
+| **Epochs** | 27 |
+| **Best Checkpoint** | `monocr-epoch=27-val_loss=1.157-val_cer=0.025.ckpt` |
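CER and WER are conventionally defined as edit distance divided by reference length, counted over characters and words respectively. The commit does not say which tool produced the figures in the table, so the following is only a minimal sketch of that conventional definition. The names `edit_distance`, `cer`, and `wer` are hypothetical and not part of this repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits divided by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref, hyp):
    """Word error rate: word edits divided by the number of reference words."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
```

This sketch counts Unicode code points, so a Mon syllable with stacked diacritics counts as several characters. Whether the reported CER of 0.025 was measured per code point or per grapheme is not stated in the commit.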
+
+## Dataset Summary
+
+- **Total samples**: 3,030,000
+- **Train size**: 3,000,000
+- **Validation size**: 30,000
+- **Data source description**: Procedural synthetic text generation across multiple Mon fonts, combined with real-world digit corpora.
+- **Augmentation strategy**: Image-level augmentations applied during training, including noise, blur, and transformations.
+
+## Model Specifications
+
+- **Architecture type**: MobileNetV3-Large Backbone + 2-layer BiLSTM + Linear CTC Head
+- **Parameter count**: 6.58M parameters
+- **Model size**: 100.73 MB (PyTorch Checkpoint)
+- **Training hardware**: NVIDIA GPU (Single GPU run)
+- **Training time**: ~2-4 days
+
+## Reproducibility
+
+- **Optimizer**: AdamW
+- **Learning rate**: 0.0001 (Warmup + Cosine Annealing)
+- **Batch size**: 48 (with Gradient Accumulation = 4)
+- **Loss function**: CTCLoss (with label smoothing $\epsilon=0.05$)
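The batch size and schedule in the Reproducibility list can be made concrete with a small sketch. Only the base learning rate (0.0001), batch size (48), and accumulation factor (4) come from the README. The warmup length, the linear warmup shape, the floor `min_lr`, and the function name `lr_at` are assumptions chosen for illustration.

```python
import math

BASE_LR = 1e-4   # from the README
BATCH_SIZE = 48  # from the README
GRAD_ACCUM = 4   # from the README
EFFECTIVE_BATCH = BATCH_SIZE * GRAD_ACCUM  # samples per optimizer step


def lr_at(step, total_steps, warmup_steps, base_lr=BASE_LR, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay to min_lr.

    warmup_steps, min_lr, and the linear warmup shape are assumptions;
    the README only says "Warmup + Cosine Annealing".
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    decay_steps = max(total_steps - warmup_steps, 1)
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With gradient accumulation, each optimizer step sees 48 × 4 = 192 samples. If every batch is full, the 3,000,000 training samples correspond to 15,625 optimizer steps per epoch.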
+
 ## Technical Specification
 
+- **Input Tensors**: Grayscale (1-channel), 128px Height, Variable Width.
+- **Image Preprocessing**: Aspect-ratio preserving resize to 128px height, followed by `[0, 1]` pixel normalization.
+- **Decoding Strategy**: Connectionist Temporal Classification (CTC) Beam Search Decoding (width=10).
+- **Vocabulary**: 315 characters (Mon, Burmese, digits, punctuation, and symbols). Encoding is standard UTF-8 (see `charset.txt`).
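The preprocessing described in these bullets (resize to 128px height with the aspect ratio preserved, then `[0, 1]` normalization) can be sketched in dependency-free Python. This is an illustration under stated assumptions, not the repository's pipeline: nearest-neighbour interpolation, division by 255, and the `[batch, channel, height, width]` layout are all assumptions, and the function names are hypothetical.

```python
def target_width(width, height, target_height=128):
    """Width that preserves the aspect ratio when height becomes target_height."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return max(1, round(width * target_height / height))


def resize_nearest(pixels, target_height=128):
    """Nearest-neighbour resize of a grayscale image given as rows of 0-255 ints."""
    src_h, src_w = len(pixels), len(pixels[0])
    dst_w = target_width(src_w, src_h, target_height)
    return [
        [pixels[min(y * src_h // target_height, src_h - 1)]
               [min(x * src_w // dst_w, src_w - 1)]
         for x in range(dst_w)]
        for y in range(target_height)
    ]


def to_input_tensor(pixels, target_height=128):
    """Return a nested list shaped [1, 1, H, W] with values scaled to [0, 1]."""
    resized = resize_nearest(pixels, target_height)
    normalized = [[value / 255.0 for value in row] for row in resized]
    return [[normalized]]  # assumed batch and channel axes
```

A 64×32 crop, for example, becomes a 256×128 tensor. A real driver would more likely use an image library with bilinear or area interpolation; the arithmetic for the target width stays the same.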
 
 ## Integration Guidelines
 
 For developers building custom drivers:
 
 1. Refer to `charset.txt` for the index-to-character mapping (Index 0 is reserved for `<blank>`).
+2. Ensure input images are high-contrast and properly scaled to 128px height.
 3. ONNX models use dynamic axes for width to support varying word lengths without padding.
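For a custom driver, the index-to-character mapping in step 1 might be applied as in the sketch below. It uses greedy (best-path) CTC decoding, which is simpler than the beam search (width=10) named in the README and is swapped in here only for brevity. It also assumes that the first character of `charset.txt` maps to index 1, since index 0 is the blank; that offset and both function names are assumptions.

```python
def build_index_map(charset):
    """Map class indices to characters, reserving index 0 for the CTC blank."""
    return {i + 1: ch for i, ch in enumerate(charset)}


def ctc_greedy_decode(indices, index_to_char, blank=0):
    """Collapse repeated indices, then drop blanks (best-path CTC decoding)."""
    chars = []
    previous = blank
    for idx in indices:
        # Emit a character only when it is not blank and not a repeat.
        if idx != blank and idx != previous:
            chars.append(index_to_char[idx])
        previous = idx
    return "".join(chars)
```

Here `indices` is the per-timestep argmax over the model's output classes. A blank between two identical indices is what allows a doubled character to survive the collapse step.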
 
 ## License
charset.txt
CHANGED
@@ -1 +1 @@
-!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´·¸¹»ÀÁÂÄÅÆÇÉÊÌÍÑÓÖרÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷
+!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖרÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−
monocr.ckpt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
 version https://git-lfs.github.com/spec/v1
+oid sha256:c126c884a0c42a2a14ac293a550dbe315b35446dfc53bcf9a650343b5a911f83
+size 105620581
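The three lines above are a Git LFS pointer rather than the checkpoint itself. As a quick sanity check, the pointer can be parsed and its size compared with the figure in the README. The helper name `parse_lfs_pointer` is hypothetical, and the sketch handles only the three keys that appear in this commit.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the new side of this hunk.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c126c884a0c42a2a14ac293a550dbe315b35446dfc53bcf9a650343b5a911f83
size 105620581
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / (1024 * 1024)
```

The 105,620,581 bytes come to about 100.73 MiB, which matches the "100.73 MB" model size in the README if that figure uses binary megabytes. `monocr.ckpt` and `pytorch/monocr.ckpt` carry the same `oid` in this commit, so both pointers refer to the same file contents.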
monocr.json
CHANGED
@@ -1,6 +1,7 @@
 {
-"charset": "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´·¸¹»ÀÁÂÄÅÆÇÉÊÌÍÑÓÖרÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷
-"img_height":
-"opset_version":
-"
 }
 {
+"charset": " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖרÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−",
+"img_height": 128,
+"opset_version": 17,
+"model_version": "2.0",
+"architecture": "MobileNetV3-Large + BiLSTM + CTC"
 }
onnx/monocr.json
ADDED
@@ -0,0 +1,7 @@
+{
+"charset": " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖרÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−",
+"img_height": 128,
+"opset_version": 17,
+"model_version": "2.0",
+"architecture": "MobileNetV3-Large + BiLSTM + CTC"
+}
onnx/monocr.onnx
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
 version https://git-lfs.github.com/spec/v1
+oid sha256:84b83958e51cb3a7a4fc07e8ac87c6f8040419bbd699bc890ccbb927fdf16a14
+size 26342200
pytorch/monocr.ckpt
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
 version https://git-lfs.github.com/spec/v1
+oid sha256:c126c884a0c42a2a14ac293a550dbe315b35446dfc53bcf9a650343b5a911f83
+size 105620581