ENC-PSL/Medusa0.1Line-9B


TheoMoins committed 29 days ago

Commit f762f49 · verified · 1 Parent(s): 5fad348

Update README.md

Files changed (1)

README.md +1 -20


@@ -147,7 +147,7 @@ The models can also be used directly outside of DocWorkflow, though the CATMuS p
 from transformers import AutoProcessor, AutoModelForImageTextToText
 from PIL import Image
-model_id = "ENC-PSL/MEDUSA-9B-0.1"
 processor = AutoProcessor.from_pretrained(model_id)
 model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
@@ -195,25 +195,6 @@ Transcriptions follow the [CATMuS guidelines](https://catmus-guidelines.github.i
 ---
-## Training details
-| Parameter | Value |
-|---|---|
-| Base models | Qwen3.5-4B and Qwen3.5-9B |
-| Fine-tuning method | LoRA (via Unsloth) |
-| LoRA rank | 64 |
-| Training data levels | Gold + Platinum (mixed), then Platinum only |
-| Training epochs | 3 (mixed) + 1–3 (Platinum only) |
-| Max sequence length | 512 |
-| Max pixels per image | 401,408 |
-| Batch size | 32 (effective) |
-| Learning rate | 5 × 10⁻⁵ |
-| Framework | DocWorkflow + Unsloth |
-Total training data: ~643,000 lines across Gold, Platinum, and original data (see system report for full dataset list).
----
 ## Citation
 If you use MEDUSA in your research, please cite:

 from transformers import AutoProcessor, AutoModelForImageTextToText
 from PIL import Image
+model_id = "ENC-PSL/Medusa0.1Line-9B"
 processor = AutoProcessor.from_pretrained(model_id)
 model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
 ---
 ## Citation
 If you use MEDUSA in your research, please cite: