Ismail99 committed
Commit d018134 · verified · 1 Parent(s): 2a39e6f

Update README.md

Files changed (1): README.md +73 -3
README.md CHANGED
@@ -1,3 +1,73 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language: en
+ library_name: tensorflow
+ tags:
+ - character-level
+ - text-generation
+ - gru
+ - tensorflow
+ - keras
+ pipeline_tag: text-generation
+ ---
+
+ # Character-Based Language Model (GRU)
+
+ A character-level text generation model trained on ~2,700 business news articles.
+ It generates text one character at a time using a GRU recurrent neural network.
+
+ ## Model Architecture
+
+ | Component | Details |
+ |-----------|---------|
+ | Embedding | 106 chars -> 128d vectors |
+ | GRU | 512 units, dropout=0.2 |
+ | Dense | 106 output classes |
+ | Total params | 1,054,058 |
+
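The total in the table can be checked by hand. The sketch below recomputes it in plain Python, assuming the Keras GRU default `reset_after=True` (two bias vectors per gate) and a Dense layer with a bias. Those defaults are assumptions; the README does not state them.

```python
# Sizes taken from the architecture table.
vocab_size = 106
embedding_dim = 128
gru_units = 512

# Embedding: one 128-d vector per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# GRU: 3 gates, each with an input kernel, a recurrent kernel, and
# (assuming Keras' default reset_after=True) two bias vectors.
gru_params = 3 * (embedding_dim * gru_units + gru_units * gru_units + 2 * gru_units)

# Dense output layer: weights plus one bias per output class.
dense_params = gru_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(f"{total_params:,}")  # 1,054,058
```

Under these assumptions the three layers add up exactly to the reported 1,054,058 parameters, with the GRU accounting for 986,112 of them.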
+ ## Usage
+
+ ```python
+ import json
+
+ import tensorflow as tf
+ from huggingface_hub import hf_hub_download
+
+ # Download the model and vocabulary files from the Hub
+ model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
+ vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")
+
+ # Load vocabulary and build lookup layers (characters <-> integer IDs)
+ with open(vocab_path, encoding="utf-8") as f:
+     vocab = json.load(f)
+
+ get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
+ get_chars = tf.keras.layers.StringLookup(
+     vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
+ )
+
+ # Load model. The custom CharModel class must be defined before this call;
+ # if it is not registered with Keras, pass custom_objects={"CharModel": CharModel}.
+ model = tf.keras.models.load_model(model_path)
+ ```
+
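The usage snippet stops once the model is loaded. Generating text then means repeatedly turning the model's 106 output scores into one sampled character ID. The README does not show that step, so the following is only a minimal sketch of temperature sampling in plain Python. The helper `sample_next_id`, its `temperature` parameter, and the toy logits are hypothetical and not taken from the repository.

```python
import math
import random


def sample_next_id(logits, temperature=1.0, rng=random):
    """Sample one character ID from unnormalized scores (logits).

    temperature must be > 0; lower values make the choice more greedy.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Toy logits for a 4-character vocabulary; a real call would use the
# model's output scores for the last time step instead.
toy_logits = [0.1, 2.5, 0.3, -1.0]
next_id = sample_next_id(toy_logits, temperature=0.8, rng=random.Random(0))
```

The sampled ID would then be mapped back to a character (for example with `get_chars`) and appended to the input for the next step.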
+ ## Training Details
+
+ - **Dataset**: 2,692 business news articles (4,483,812 characters after cleaning)
+ - **Sequence length**: 100 characters
+ - **Epochs**: up to 30 (early stopping, patience=5)
+ - **Optimizer**: Adam
+ - **Validation split**: 10%
+
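A fixed sequence length suggests the usual next-character setup, where each training input is paired with the same text shifted by one character. The README does not describe the data pipeline, so the sketch below is an assumption about how such pairs are typically built. `make_training_pairs` is a hypothetical helper, and a short sequence length is used to keep the example readable.

```python
def make_training_pairs(text, seq_length):
    """Cut text into non-overlapping chunks and shift each by one character."""
    chunk_size = seq_length + 1  # one extra character supplies the final target
    pairs = []
    for start in range(0, len(text) - chunk_size + 1, chunk_size):
        chunk = text[start:start + chunk_size]
        pairs.append((chunk[:-1], chunk[1:]))  # (input, target)
    return pairs


pairs = make_training_pairs("Stocks rose on Monday.", seq_length=10)
for source, target in pairs:
    print(repr(source), "->", repr(target))
```

Each target string is the input moved one position to the right, so at every position the model is trained to predict the character that follows.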
+ ## Limitations
+
+ This is a small character-level model trained on a narrow domain (business news).
+ It produces plausible-looking news-style text, but the content is not factually accurate.
+ Out-of-domain seeds will produce lower-quality output.
+
+ ## License
+
+ MIT