Update README.md
README.md
@@ -1,3 +1,73 @@
---
language: en
license: mit
library_name: tensorflow
tags:
- character-level
- text-generation
- gru
- tensorflow
- keras
pipeline_tag: text-generation
---

# Character-Based Language Model (GRU)

A character-level text generation model trained on ~2,700 business news articles.
It generates text one character at a time using a GRU recurrent neural network.

## Model Architecture

| Component | Details |
|-----------|---------|
| Embedding | 106 characters -> 128-dimensional vectors |
| GRU | 512 units, dropout=0.2 |
| Dense | 106 output classes |
| Total params | 1,054,058 |

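
The total in the table can be checked by hand. The sketch below assumes the Keras defaults, which the card does not state: a GRU with `reset_after=True` (two bias vectors per gate) and a `Dense` layer with a bias term. The helper names are illustrative and are not part of the model code.

```python
def embedding_params(vocab_size: int, embed_dim: int) -> int:
    """One embed_dim-sized vector per vocabulary entry."""
    return vocab_size * embed_dim


def gru_params(input_dim: int, units: int) -> int:
    """Three gates, each with an input kernel, a recurrent kernel,
    and two bias vectors (assumes Keras' default reset_after=True)."""
    return 3 * (input_dim * units + units * units + 2 * units)


def dense_params(input_dim: int, output_dim: int) -> int:
    """Weight matrix plus one bias per output class."""
    return input_dim * output_dim + output_dim


embedding = embedding_params(106, 128)  # 13,568
gru = gru_params(128, 512)              # 986,112
dense = dense_params(512, 106)          # 54,378
total = embedding + gru + dense

print(total)  # 1054058
```

Dropout adds no trainable parameters, so it does not appear in the sum. Under these assumptions the result matches the 1,054,058 reported above.
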
## Usage

```python
import json

import tensorflow as tf
from huggingface_hub import hf_hub_download

# Download the model and vocabulary files from the Hub
model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")

# Load the vocabulary and build the lookup layers
with open(vocab_path, encoding="utf-8") as f:
    vocab = json.load(f)

# Characters -> integer IDs
get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
# Integer IDs -> characters
get_chars = tf.keras.layers.StringLookup(
    vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
)

# Load the model. This requires the CharModel class definition: define or
# import it before this call and, unless it is registered with Keras,
# pass it via custom_objects={"CharModel": CharModel}.
model = tf.keras.models.load_model(model_path)
```

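
Generating text one character at a time amounts to a sampling loop: get scores for the next character, draw one, append it, repeat. The sketch below shows that loop in plain Python. It is not the author's generation code: `next_logits` is a hypothetical stand-in for whatever call returns the model's scores for the next character (the `CharModel` interface is not shown in this card), and the temperature parameter is an assumption.

```python
import math
import random
from typing import Callable, Optional, Sequence


def sample_index(logits: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Draw one index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(
    next_logits: Callable[[str], Sequence[float]],  # hypothetical model wrapper
    vocab: Sequence[str],
    seed: str,
    length: int,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Extend `seed` by `length` characters, sampling one character per step."""
    rng = rng or random.Random()
    text = seed
    for _ in range(length):
        logits = next_logits(text)
        text += vocab[sample_index(logits, temperature, rng)]
    return text


# Toy stand-in for the model: strongly prefers alternating "a" and "b".
TOY_VOCAB = ["a", "b", " "]


def toy_logits(text: str) -> Sequence[float]:
    return [-5.0, 5.0, -5.0] if text.endswith("a") else [5.0, -5.0, -5.0]


sample = generate(toy_logits, TOY_VOCAB, seed="a", length=6,
                  temperature=0.5, rng=random.Random(0))
print(sample)
```

Lower temperatures make the output more predictable; higher ones make it more varied. With a real recurrent model, one would typically carry the GRU state between steps instead of re-feeding the whole text each time.
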
## Training Details

- **Dataset**: 2,692 business news articles (~4,483,812 characters after cleaning)
- **Sequence length**: 100 characters
- **Epochs**: up to 30 (early stopping with patience=5)
- **Optimizer**: Adam
- **Validation split**: 10%

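
The settings above can be illustrated with a short sketch. It assumes the common character-model recipe, which the card does not confirm: non-overlapping windows of 101 characters, where the target is the input shifted by one character, the last 10% of examples held out for validation, and early stopping on validation loss. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[str, str]


def make_examples(text: str, seq_length: int = 100) -> List[Example]:
    """Cut text into (input, target) pairs; the target is the input shifted by one character."""
    window = seq_length + 1
    examples = []
    for start in range(0, len(text) - window + 1, window):
        chunk = text[start:start + window]
        examples.append((chunk[:-1], chunk[1:]))
    return examples


def split_train_val(examples: List[Example], val_fraction: float = 0.1):
    """Hold out the last `val_fraction` of the examples for validation."""
    n_train = len(examples) - int(len(examples) * val_fraction)
    return examples[:n_train], examples[n_train:]


def stopping_epoch(val_losses: Sequence[float], patience: int = 5) -> int:
    """Return the 1-based epoch at which training ends, given per-epoch validation losses."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


toy_text = "abcdefghij" * 101  # 1,010 characters
examples = make_examples(toy_text, seq_length=100)
train, val = split_train_val(examples, val_fraction=0.1)
print(len(examples), len(train), len(val))  # 10 9 1

losses = [1.0, 0.9, 0.8, 0.85, 0.84, 0.83, 0.82, 0.81, 0.7]
print(stopping_epoch(losses, patience=5))  # 8
```

In the second example the best loss occurs at epoch 3, so five epochs without improvement end training at epoch 8, before the budget of 30 epochs is reached.
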
## Limitations

This is a small character-level model trained on a narrow domain (business news).
It produces plausible-looking, news-style text, but the content is not factually accurate.
Out-of-domain seeds will produce lower-quality output.

## License

MIT