kadalicious22 committed
Commit 6af865d · verified · 1 Parent(s): f5678a3

Update README.md

  1. README.md +30 -31
README.md CHANGED
@@ -29,31 +29,31 @@ pipeline_tag: image-text-to-text
  [![Language](https://img.shields.io/badge/Language-ID%20%7C%20EN-green)](https://huggingface.co/kadalicious22/snapgate-VL-4B)
  [![Website](https://img.shields.io/badge/Website-snapgate.tech-purple)](https://snapgate.tech)

- **snapgate-code-4B** adalah model vision-language multimodal hasil fine-tuning dari [Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct) menggunakan **QLoRA**, dioptimalkan khusus untuk kebutuhan **developer** dan **desainer**, memahami gambar sekaligus teks dengan presisi tinggi.

- *Dikembangkan oleh [Snapgate](https://snapgate.tech) · Made with ❤️ in Indonesia 🇮🇩*

  </div>

  ---

- ## 🧠 Kemampuan Utama

- | Kemampuan | Deskripsi |
  |-----------|-----------|
- | 💻 **Code Generation & Review** | Menulis, menganalisis, debug, dan mengoptimalkan kode (Python, JS, TS, HTML/CSS, SQL, dll.) |
- | 🎨 **UI/UX Design Analysis** | Menganalisis screenshot antarmuka, memberikan saran desain, mengidentifikasi masalah UX |
- | 🖼️ **Design to Code** | Mengkonversi mockup, wireframe, atau screenshot UI menjadi kode HTML/CSS/React/Tailwind |
- | 🏗️ **Diagram & Architecture** | Memahami diagram alur, arsitektur sistem, ERD, dan flowchart teknis |
- | 📸 **Code from Image** | Membaca dan menjelaskan kode dari screenshot atau foto |
- | 📝 **Technical Documentation** | Membuat dokumentasi teknis yang jelas, terstruktur, dan profesional |

  ---

  ## 🔧 Training Configuration

  <details>
- <summary><b>Klik untuk lihat detail training</b></summary>

  | Parameter | Value |
  |-----------|-------|
@@ -71,10 +71,10 @@ pipeline_tag: image-text-to-text
  | 🎛️ Precision | `bfloat16` |
  | 🖥️ Hardware | NVIDIA T4 · Google Colab |
  | 📦 Dataset | 200 samples internal Snapgate |
- | 🏷️ Kategori | 10 kategori · 20 samples each |
  | 📊 Format | ShareGPT |

- **Kategori Dataset:**
  `code_generation` · `code_review` · `debugging` · `refactoring` · `ui_html_css` · `ui_react` · `ui_tailwind` · `design_system` · `ux_analysis` · `design_to_code`

  </details>
@@ -83,7 +83,7 @@ pipeline_tag: image-text-to-text

  ## 📊 Training Progress

- Loss turun konsisten selama training, dari **1.242 → 0.444** ✅

  ```
  Step 5 │███░░░░░░░░░░░░░░░░░│ Loss: 1.242
@@ -105,7 +105,7 @@ Step 75 │██████████████░░░░░░│ Los

  ---

- ## 🚀 Cara Penggunaan

  ### 1. Install Dependencies

@@ -129,11 +129,11 @@ model = Qwen3VLForConditionalGeneration.from_pretrained(
      trust_remote_code=True,
  )

- SYSTEM_PROMPT = """Kamu adalah Snapgate AI, asisten AI multimodal milik Snapgate \
- yang ahli dalam bidang coding dan UI/UX design."""
  ```

- ### 3. Inference dengan Gambar

  ```python
  from qwen_vl_utils import process_vision_info
@@ -144,7 +144,7 @@ messages = [
          "role": "user",
          "content": [
              {"type": "image", "image": "path/to/your/image.png"},
-             {"type": "text", "text": "Analisis UI dari gambar ini dan buat kode HTML/CSS-nya."},
          ],
      },
  ]
@@ -166,12 +166,12 @@ response = processor.batch_decode(generated, skip_special_tokens=True)[0]
  print(response)
  ```

- ### 4. Inference Teks Saja

  ```python
  messages = [
      {"role": "system", "content": SYSTEM_PROMPT},
-     {"role": "user", "content": "Buatkan fungsi Python untuk validasi email dengan regex."},
  ]

  text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
@@ -189,18 +189,18 @@ print(response)

  ---

- ## ⚠️ Limitasi

- - 📦 Di-training pada dataset internal Snapgate yang relatif kecil (200 samples), performa akan terus meningkat seiring penambahan data
- - 🌐 Dioptimalkan untuk Bahasa Indonesia dan Inggris; bahasa lain belum diuji
- - 🎯 Performa terbaik pada task coding dan UI analysis; kurang optimal untuk domain di luar itu (misal: sains, hukum, medis)
- - 🖥️ Direkomendasikan minimal GPU dengan 8GB VRAM untuk inference yang nyaman

  ---

- ## 📄 Lisensi

- Dirilis di bawah lisensi **Apache 2.0**, mengikuti lisensi base model [Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct).

  ---

@@ -210,7 +210,6 @@ Dirilis di bawah lisensi **Apache 2.0**, mengikuti lisensi base model [Qwen3-VL-
  |---|---|
  | 🌐 Website | [snapgate.tech](https://snapgate.tech) |
  | 🤗 Base Model | [Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct) |
- | 📧 Contact | Via website Snapgate |
-
- ---
 
 
 
  [![Language](https://img.shields.io/badge/Language-ID%20%7C%20EN-green)](https://huggingface.co/kadalicious22/snapgate-VL-4B)
  [![Website](https://img.shields.io/badge/Website-snapgate.tech-purple)](https://snapgate.tech)

+ **snapgate-code-4B** is a multimodal vision-language model fine-tuned from [Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct) using **QLoRA**, specifically optimized for **developers** and **designers**, understanding both images and text with high precision.

+ *Developed by [Snapgate](https://snapgate.tech) · Made with ❤️ in Indonesia 🇮🇩*

  </div>

  ---

+ ## 🧠 Core Capabilities

+ | Capability | Description |
  |-----------|-----------|
+ | 💻 **Code Generation & Review** | Write, analyze, debug, and optimize code (Python, JS, TS, HTML/CSS, SQL, etc.) |
+ | 🎨 **UI/UX Design Analysis** | Analyze interface screenshots, provide design suggestions, identify UX issues |
+ | 🖼️ **Design to Code** | Convert mockups, wireframes, or UI screenshots into HTML/CSS/React/Tailwind code |
+ | 🏗️ **Diagram & Architecture** | Understand flowcharts, system architecture, ERDs, and technical diagrams |
+ | 📸 **Code from Image** | Read and explain code from screenshots or photos |
+ | 📝 **Technical Documentation** | Generate clear, structured, and professional technical documentation |

  ---

  ## 🔧 Training Configuration

  <details>
+ <summary><b>Click to view training details</b></summary>

  | Parameter | Value |
  |-----------|-------|
 
  | 🎛️ Precision | `bfloat16` |
  | 🖥️ Hardware | NVIDIA T4 · Google Colab |
  | 📦 Dataset | 200 samples internal Snapgate |
+ | 🏷️ Categories | 10 categories · 20 samples each |
  | 📊 Format | ShareGPT |

+ **Dataset Categories:**
  `code_generation` · `code_review` · `debugging` · `refactoring` · `ui_html_css` · `ui_react` · `ui_tailwind` · `design_system` · `ux_analysis` · `design_to_code`

  </details>
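
The training table lists the dataset format as ShareGPT, with 10 categories of 20 samples each. As a rough illustration only, the sketch below builds records in the commonly used ShareGPT layout (a `conversations` list of `from`/`value` turns) and checks the 10 × 20 = 200 balance. The `category` field, the `make_record` helper, and the placeholder texts are assumptions made for this example; the actual Snapgate dataset schema is not shown in the README.

```python
from collections import Counter

# Category names as listed in the README.
CATEGORIES = [
    "code_generation", "code_review", "debugging", "refactoring",
    "ui_html_css", "ui_react", "ui_tailwind", "design_system",
    "ux_analysis", "design_to_code",
]
SAMPLES_PER_CATEGORY = 20


def make_record(category: str, prompt: str, answer: str) -> dict:
    """Build one ShareGPT-style record (hypothetical helper)."""
    return {
        "category": category,  # assumed metadata field, not part of ShareGPT itself
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": answer},
        ],
    }


# Placeholder dataset: real prompts and answers would replace these strings.
dataset = [
    make_record(cat, f"prompt {i} for {cat}", f"answer {i} for {cat}")
    for cat in CATEGORIES
    for i in range(SAMPLES_PER_CATEGORY)
]

# Count records per category to confirm the split is balanced.
counts = Counter(record["category"] for record in dataset)
print(len(dataset), "records across", len(counts), "categories")
```

A per-category count like this is a cheap sanity check to run before fine-tuning on a small, hand-balanced dataset.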
 

  ## 📊 Training Progress

+ Loss decreased consistently throughout training, from **1.242 → 0.444** ✅
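
To put the two reported values in relative terms, the short calculation below computes the drop from the starting loss (1.242) to the final loss (0.444). Only these two endpoints come from the README; the variable names are illustrative.

```python
# Loss values quoted in the README.
initial_loss = 1.242
final_loss = 0.444

# Absolute and relative reduction over the run.
absolute_drop = initial_loss - final_loss
relative_drop = absolute_drop / initial_loss

print(f"absolute drop: {absolute_drop:.3f}")
print(f"relative drop: {relative_drop:.1%}")
```

This describes training loss only; it says nothing about held-out performance.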

  ```
  Step 5 │███░░░░░░░░░░░░░░░░░│ Loss: 1.242
 

  ---

+ ## 🚀 Usage

  ### 1. Install Dependencies

 
      trust_remote_code=True,
  )

+ SYSTEM_PROMPT = """You are Snapgate AI, a multimodal AI assistant by Snapgate \
+ specialized in coding and UI/UX design."""
  ```

+ ### 3. Inference with Image

  ```python
  from qwen_vl_utils import process_vision_info
 
          "role": "user",
          "content": [
              {"type": "image", "image": "path/to/your/image.png"},
+             {"type": "text", "text": "Analyze the UI from this image and generate its HTML/CSS code."},
          ],
      },
  ]
 
  print(response)
  ```

+ ### 4. Text-Only Inference

  ```python
  messages = [
      {"role": "system", "content": SYSTEM_PROMPT},
+     {"role": "user", "content": "Write a Python function to validate email using regex."},
  ]

  text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
 

  ---

+ ## ⚠️ Limitations

+ - 📦 Trained on a relatively small internal Snapgate dataset (200 samples), so performance will improve as more data is added
+ - 🌐 Optimized for Indonesian and English; other languages have not been tested
+ - 🎯 Best performance on coding and UI analysis tasks; less optimal for other domains (e.g., science, law, medicine)
+ - 🖥️ A GPU with at least 8GB VRAM is recommended for comfortable inference

  ---

+ ## 📄 License

+ Released under the **Apache 2.0** license, following the base model license of [Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct).

  ---

 
 
210
  |---|---|
211
  | ๐ŸŒ Website | [snapgate.tech](https://snapgate.tech) |
212
  | ๐Ÿค— Base Model | [Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct) |
213
+ | ๐Ÿ“ง Contact | Via Snapgate website |
 
 
214
 
215
+ ---