Ismail99/char-based-language-model

Ismail99 committed 7 days ago

Commit d018134 · verified · 1 parent: 2a39e6f

Update README.md

Files changed (1)

README.md +73 -3

@@ -1,3 +1,73 @@
----
-license: mit
----

+---
+license: mit
+language: en
+library_name: tensorflow
+tags:
+  - character-level
+  - text-generation
+  - gru
+  - tensorflow
+  - keras
+pipeline_tag: text-generation
+---
+# Character-Based Language Model (GRU)
+A character-level text generation model trained on ~2,700 business news articles.
+Generates text one character at a time using a GRU recurrent neural network.
+## Model Architecture
+| Component | Details |
+|-----------|--------|
+| Embedding | 106 characters -> 128-dimensional vectors |
+| GRU | 512 units, dropout=0.2 |
+| Dense | 106 output classes |
+| Total params | 1,054,058 |
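
The total in the table can be reproduced from the three layer sizes. The following is a quick arithmetic check, assuming the GRU uses the Keras default `reset_after=True` (two bias vectors per gate), which the README does not state explicitly. The variable names are illustrative.

```python
# Parameter count implied by the architecture table.
# Assumption: Keras GRU with reset_after=True (the default), which keeps
# separate input and recurrent bias vectors for each of the 3 gates.
vocab_size = 106   # characters in the vocabulary (from the table)
embed_dim = 128    # embedding width (from the table)
gru_units = 512    # GRU hidden size (from the table)

# Embedding: one embed_dim vector per vocabulary entry.
embedding_params = vocab_size * embed_dim

# GRU: 3 gates, each with an input kernel, a recurrent kernel, and 2 biases.
gru_params = 3 * (embed_dim * gru_units + gru_units * gru_units + 2 * gru_units)

# Dense: weight matrix plus one bias per output class.
dense_params = gru_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
```

Under that assumption the sum comes to 1,054,058, matching the figure reported in the table.
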
+## Usage
+```python
+import tensorflow as tf
+import json
+from huggingface_hub import hf_hub_download
+# Download files
+model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
+vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")
+# Load vocabulary and build lookup layers
+with open(vocab_path) as f:
+    vocab = json.load(f)
+get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
+get_chars = tf.keras.layers.StringLookup(
+    vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
+)
+# Load the model (the custom CharModel class must be defined or imported before this call)
+model = tf.keras.models.load_model(model_path)
+```
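
The usage snippet stops once the model is loaded, while the description says text is generated one character at a time. A minimal sketch of that autoregressive loop follows, using only the standard library. `next_logits` is a hypothetical stand-in for whatever call returns the model's scores for the next character; the real `CharModel` call signature is not shown in this README, so `sample_next_id`, `generate`, and the temperature parameter are illustrative assumptions rather than the author's code.

```python
import math
import random


def sample_next_id(logits, temperature=1.0, rng=random):
    """Sample one index from unnormalized logits after temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(next_logits, seed_ids, steps, temperature=1.0, rng=random):
    """Extend seed_ids by `steps` sampled ids, one id per iteration.

    next_logits(ids) is a hypothetical callable that returns a list of
    scores, one per vocabulary entry, for the id that follows `ids`.
    """
    ids = list(seed_ids)
    for _ in range(steps):
        ids.append(sample_next_id(next_logits(ids), temperature, rng))
    return ids
```

With the real model, `next_logits` would wrap a forward pass on ids produced by `get_ids`, and the generated ids would be mapped back to characters with `get_chars`. Lower temperatures make the sampling more conservative; higher ones make it more varied.
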
+## Training Details
+- **Dataset**: 2,692 business news articles (4,483,812 characters after cleaning)
+- **Sequence length**: 100
+- **Epochs**: up to 30 (early stopping with patience=5)
+- **Optimizer**: Adam
+- **Validation split**: 10%
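
The list above gives a sequence length of 100 and a 10% validation split but does not say how training examples are built. One common approach for character-level models, assumed here and not confirmed by the README, is to cut the encoded text into non-overlapping windows of `seq_length + 1` ids and use each window shifted by one step as the target. `make_examples` and `split_train_val` are hypothetical helper names.

```python
def make_examples(ids, seq_length=100):
    """Split ids into non-overlapping (input, target) pairs.

    Each window holds seq_length + 1 ids; the target is the input
    shifted left by one position, so the model predicts the next id.
    """
    chunk = seq_length + 1
    examples = []
    for start in range(0, len(ids) - chunk + 1, chunk):
        window = ids[start:start + chunk]
        examples.append((window[:-1], window[1:]))
    return examples


def split_train_val(examples, val_fraction=0.1):
    """Hold out the last val_fraction of the examples for validation."""
    n_val = int(len(examples) * val_fraction)
    cut = len(examples) - n_val
    return examples[:cut], examples[cut:]
```

Any trailing ids that do not fill a complete window are dropped. Whether the original training shuffled examples before splitting is not stated, so this sketch keeps them in order.
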
+## Limitations
+This is a small character-level model trained on a narrow domain (business news).
+Its output reads like news text but is not factually reliable.
+Seeds from outside that domain produce lower-quality output.
+## License
+MIT