Upload folder using huggingface_hub

Files changed (7)

README.md +40 -7
charset.txt +1 -1
monocr.ckpt +2 -2
monocr.json +5 -4
onnx/monocr.json +7 -0
onnx/monocr.onnx +2 -2
pytorch/monocr.ckpt +2 -2

README.md CHANGED

@@ -10,7 +10,7 @@ tags:
   - mnw
   - onnx
   - tflite
-  - resnet
   - crnn
 ---
@@ -40,20 +40,53 @@ Unified SDKs are available for seamless integration into existing applications.
 | **TFLite (fp32)** | `tflite/float32.tflite` | High-precision mobile inference.                   |
 | **PyTorch**       | `pytorch/monocr.ckpt`   | Training, fine-tuning, and research.               |
 ## Technical Specification
-- **Core Architecture**: ResNet-18 Backbone + 2-layer BiLSTM + Linear CTC Head.
-- **Input Tensors**: Grayscale (1-channel), 64px Height, Variable Width.
-- **Image Preprocessing**: Aspect-ratio preserving resize to 64px height, followed by `[0, 1]` pixel normalization.
-- **Decoding Strategy**: Connectionist Temporal Classification (CTC) Greedy Decoding.
-- **Vocabulary**: 224 Mon characters, punctuation, and formatting symbols (see `charset.txt`).
 ## Integration Guidelines
 For developers building custom drivers:
 1. Refer to `charset.txt` for the index-to-character mapping (Index 0 is reserved for `<blank>`).
-2. Ensure input images are high-contrast and properly scaled to 64px height.
 3. ONNX models use dynamic axes for width to support varying word lengths without padding.
 ## License

   - mnw
   - onnx
   - tflite
+  - mobilenetv3
   - crnn
 ---
 | **TFLite (fp32)** | `tflite/float32.tflite` | High-precision mobile inference.                   |
 | **PyTorch**       | `pytorch/monocr.ckpt`   | Training, fine-tuning, and research.               |
+## Performance Metrics
+| Metric              | Value                                               |
+| :------------------ | :-------------------------------------------------- |
+| **Train Loss**      | 1.22                                                |
+| **Validation Loss** | 1.157                                               |
+| **CER**             | 0.025                                               |
+| **WER**             | 0.211                                               |
+| **Epochs**          | 27                                                  |
+| **Best Checkpoint** | `monocr-epoch=27-val_loss=1.157-val_cer=0.025.ckpt` |
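The CER and WER values in the table are conventionally defined as edit distance divided by reference length. The commit does not include the evaluation code, so the following is only a minimal sketch of that conventional definition. The function names are hypothetical, and splitting on whitespace for WER is an assumption that may not match how words were segmented for Mon text.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions all cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate over Unicode code points."""
    return edit_distance(ref, hyp) / max(1, len(ref))


def wer(ref: str, hyp: str) -> float:
    """Word error rate; whitespace tokenization is an assumption."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / max(1, len(ref_words))
```

Note that `len` on a Python string counts code points, so stacked Mon and Burmese combining marks are each counted as separate characters in this sketch.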
+## Dataset Summary
+- **Total samples**: 3,030,000
+- **Train size**: 3,000,000
+- **Validation size**: 30,000
+- **Data source description**: Procedurally generated synthetic text rendered in multiple Mon fonts, combined with real-world digit corpora.
+- **Augmentation strategy**: Image-level augmentations (noise, blur, and transformations) applied during training.
+## Model Specifications
+- **Architecture type**: MobileNetV3-Large Backbone + 2-layer BiLSTM + Linear CTC Head
+- **Parameter count**: 6.58M parameters
+- **Model size**: 100.73 MiB (105,620,581 bytes, PyTorch checkpoint)
+- **Training hardware**: NVIDIA GPU (Single GPU run)
+- **Training time**: ~2-4 days
+## Reproducibility
+- **Optimizer**: AdamW
+- **Learning rate**: 0.0001 (Warmup + Cosine Annealing)
+- **Batch size**: 48 (with Gradient Accumulation = 4)
+- **Loss function**: CTCLoss (with label smoothing $\epsilon=0.05$)
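The schedule named in the list (warmup followed by cosine annealing) can be sketched as below. The README states only the base learning rate, batch size, and accumulation factor. The warmup length, total step count, linear warmup shape, and final learning rate of zero are all assumptions made for illustration, and `lr_at` is a hypothetical helper, not the training code. The effective batch size assumes 48 is the per-step batch.

```python
import math

BASE_LR = 1e-4   # from the Reproducibility list
BATCH_SIZE = 48  # from the Reproducibility list
GRAD_ACCUM = 4   # from the Reproducibility list

# Samples contributing to each optimizer step, assuming 48 is the per-step batch.
EFFECTIVE_BATCH = BATCH_SIZE * GRAD_ACCUM


def lr_at(step: int, total_steps: int, warmup_steps: int,
          base_lr: float = BASE_LR, min_lr: float = 0.0) -> float:
    """Linear warmup to base_lr, then cosine decay to min_lr (assumed shape)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(550, total_steps=1000, warmup_steps=100)` falls halfway through the cosine phase and returns roughly half of the base rate.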
 ## Technical Specification
+- **Input Tensors**: Grayscale (1-channel), 128px Height, Variable Width.
+- **Image Preprocessing**: Aspect-ratio preserving resize to 128px height, followed by `[0, 1]` pixel normalization.
+- **Decoding Strategy**: Connectionist Temporal Classification (CTC) Beam Search Decoding (width=10).
+- **Vocabulary**: 315 characters (Mon, Burmese, digits, punctuation, and symbols). Encoding is standard UTF-8 (see `charset.txt`).
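The preprocessing described in the list (resize to 128px height with the aspect ratio preserved, then scale pixels to `[0, 1]`) can be sketched in plain Python as follows. The README does not state the interpolation method or how the target width is rounded, so nearest-neighbour sampling and `round` are assumptions here, and `resize_keep_aspect` is a hypothetical name.

```python
from typing import List

TARGET_HEIGHT = 128  # from the updated Technical Specification


def resize_keep_aspect(pixels: List[List[int]],
                       target_h: int = TARGET_HEIGHT) -> List[List[float]]:
    """Resize an 8-bit grayscale image (rows of ints) and normalize to [0, 1]."""
    src_h = len(pixels)
    src_w = len(pixels[0])
    # Width is scaled by the same factor as height to keep the aspect ratio.
    target_w = max(1, round(src_w * target_h / src_h))
    out = []
    for y in range(target_h):
        src_row = pixels[min(src_h - 1, y * src_h // target_h)]
        out.append([
            src_row[min(src_w - 1, x * src_w // target_w)] / 255.0
            for x in range(target_w)
        ])
    return out
```

A 64x256 input becomes 128x512 under this sketch. A real driver would more likely use an image library with bilinear or area interpolation.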
 ## Integration Guidelines
 For developers building custom drivers:
 1. Refer to `charset.txt` for the index-to-character mapping (Index 0 is reserved for `<blank>`).
+2. Ensure input images are high-contrast and properly scaled to 128px height.
 3. ONNX models use dynamic axes for width to support varying word lengths without padding.
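For custom drivers, decoding the model output might look like the sketch below: a CTC prefix beam search without a language model, followed by the index-to-character lookup. The blank index and beam width come from the README. Everything else is an assumption: the function names are hypothetical, the input is taken to be per-frame log-probabilities, and index `i` is assumed to map to the `i-1`th character of `charset.txt` because index 0 is the blank.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

NEG_INF = float("-inf")
BLANK = 0        # index 0 is reserved for <blank> per the integration notes
BEAM_WIDTH = 10  # beam width stated in the Technical Specification


def log_sum_exp(*values: float) -> float:
    """Numerically stable log(sum(exp(v))) that tolerates -inf inputs."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def to_log_probs(probs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convert per-frame probabilities to log-probabilities."""
    return [[math.log(p) if p > 0 else NEG_INF for p in frame] for frame in probs]


def ctc_beam_search(log_probs: Sequence[Sequence[float]],
                    beam_width: int = BEAM_WIDTH,
                    blank: int = BLANK) -> List[int]:
    """Return the most probable label sequence under CTC collapsing rules."""
    # Each prefix tracks (log prob of paths ending in blank, ending in a label).
    beams: Dict[Tuple[int, ...], Tuple[float, float]] = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        cand = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_blank, p_label) in beams.items():
            for idx, lp in enumerate(frame):
                if idx == blank:
                    b, lab = cand[prefix]
                    cand[prefix] = (log_sum_exp(b, p_blank + lp, p_label + lp), lab)
                elif prefix and prefix[-1] == idx:
                    # A repeated label without a blank collapses into the prefix.
                    b, lab = cand[prefix]
                    cand[prefix] = (b, log_sum_exp(lab, p_label + lp))
                    # It extends the prefix only after a blank.
                    ext = prefix + (idx,)
                    b, lab = cand[ext]
                    cand[ext] = (b, log_sum_exp(lab, p_blank + lp))
                else:
                    ext = prefix + (idx,)
                    b, lab = cand[ext]
                    cand[ext] = (b, log_sum_exp(lab, p_blank + lp, p_label + lp))
        ranked = sorted(cand.items(), key=lambda kv: log_sum_exp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])
    best = max(beams.items(), key=lambda kv: log_sum_exp(*kv[1]))[0]
    return list(best)


def indices_to_text(indices: Sequence[int], charset: str) -> str:
    """Map label indices to characters, assuming index 1 is charset[0]."""
    return "".join(charset[i - 1] for i in indices)
```

A driver would pass the decoded indices and the contents of `charset.txt` to `indices_to_text`. The offset-by-one mapping should be checked against the official SDKs before relying on it.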
 ## License

charset.txt CHANGED

@@ -1 +1 @@

- !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´·¸¹»ÀÁÂÄÅÆÇÉÊÌÍÑÓÖ×ØÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāăćčĐđĒēėěğġĦħīİıņŋŌōőŒœŚśşŠšţũŪūŻŽž˥˦२๑๒๕๖๘་།༥ကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…⇒−

+ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖ×ØÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−

monocr.ckpt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d19efde62ec0c404cbb4aca9175ed4eefcaa5ed8d8e6634218a64e4cf8309281
-size 177671597

 version https://git-lfs.github.com/spec/v1
+oid sha256:c126c884a0c42a2a14ac293a550dbe315b35446dfc53bcf9a650343b5a911f83
+size 105620581
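The `.ckpt` and `.onnx` entries in this commit are Git LFS pointer files that record a SHA-256 digest and a byte size rather than the binary itself. A downloaded file can be checked against such a pointer with a short sketch like the following. The helper names are hypothetical, and the code assumes the three-line `version` / `oid` / `size` layout shown in the diff.

```python
import hashlib
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer_text: str) -> bool:
    """Return True if data has the size and SHA-256 digest named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    if len(data) != int(fields["size"]):
        return False
    return hashlib.sha256(data).hexdigest() == expected
```

For files of this size (about 100 MiB), hashing in fixed-size chunks read from disk would avoid holding the whole checkpoint in memory.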

monocr.json CHANGED

@@ -1,6 +1,7 @@
 {
-  "charset": "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´·¸¹»ÀÁÂÄÅÆÇÉÊÌÍÑÓÖ×ØÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāăćčĐđĒēėěğġĦħīİıņŋŌōőŒœŚśşŠšţũŪūŻŽž˥˦२๑๒๕๖๘་།༥ကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…⇒−",
-  "img_height": 64,
-  "opset_version": 16,
-  "note": "Exported with monocr.export.onnx"
 }

 {
+  "charset": " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖ×ØÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−",
+  "img_height": 128,
+  "opset_version": 17,
+  "model_version": "2.0",
+  "architecture": "MobileNetV3-Large + BiLSTM + CTC"
 }

onnx/monocr.json ADDED

@@ -0,0 +1,7 @@

+{
+  "charset": " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~£¥¦§©«¬°±²³´µ·¸¹º»¾ÀÁÂÄÅÆÇÉÊÌÍÑÓÖ×ØÜÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüþĀāıŒœŠšŽžƒːμπကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၓၚၛၜၝၞၟၠၡၢၣၤၥၨၪၰၱၲၳၴၵၷၸၹၺၻၼၾၿႀႄႅႆႇႈႉႊႏ႐႒႓႔႕႘႙ႜႝ႟–‘’‚“”•…−",
+  "img_height": 128,
+  "opset_version": 17,
+  "model_version": "2.0",
+  "architecture": "MobileNetV3-Large + BiLSTM + CTC"
+}

onnx/monocr.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d67db77db0e495fd7bb6169d92b1f3bac31f0bb128f8fd3045f2404607ce5d2a
-size 58012307

 version https://git-lfs.github.com/spec/v1
+oid sha256:84b83958e51cb3a7a4fc07e8ac87c6f8040419bbd699bc890ccbb927fdf16a14
+size 26342200

pytorch/monocr.ckpt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5ff35b5bef078fd69983804e8cb517c923230f1069c09c50bce4a355deda0868
-size 173430125

 version https://git-lfs.github.com/spec/v1
+oid sha256:c126c884a0c42a2a14ac293a550dbe315b35446dfc53bcf9a650343b5a911f83
+size 105620581