---
language: en
license: mit
library_name: tensorflow
tags:
- character-level
- text-generation
- gru
- tensorflow
- keras
pipeline_tag: text-generation
---

# Character-Based Language Model (GRU)

A character-level text generation model trained on ~2,700 business news articles. It generates text one character at a time using a GRU recurrent neural network.

## Model Architecture

| Component | Details |
|-----------|---------|
| Embedding | 106 chars -> 128-d vectors |
| GRU | 512 units, dropout=0.2 |
| Dense | 106 output classes |
| Total params | 1,054,058 |

## Usage

```python
import json

import tensorflow as tf
from huggingface_hub import hf_hub_download

# Download the model and vocabulary files
model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")

# Load the vocabulary and build character <-> id lookup layers
with open(vocab_path) as f:
    vocab = json.load(f)

get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
get_chars = tf.keras.layers.StringLookup(
    vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
)

# Load the model (requires the CharModel class definition to be importable)
model = tf.keras.models.load_model(model_path)
```

## Training Details

- **Dataset**: 2,692 business news articles (~4,483,812 characters after cleaning)
- **Sequence length**: 100
- **Epochs**: 30 (with early stopping, patience=5)
- **Optimizer**: Adam
- **Validation split**: 10%

## Limitations

This is a small character-level model trained on a narrow domain (business news). It produces plausible-looking news-style text, but its output is not factually accurate. Out-of-domain seed text will produce lower-quality output.
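The `StringLookup` layers in the usage code map characters to integer ids and back. As a plain-Python illustration of that round trip (no TensorFlow required), using a hypothetical stand-in for the downloaded `vocab.json`:

```python
# Hypothetical stand-in for vocab.json: a short list of characters.
vocab = ["a", "b", "c", " ", "n", "e", "w", "s"]

# StringLookup reserves id 0 for out-of-vocabulary tokens ("[UNK]"),
# so known characters start at id 1.
char_to_id = {ch: i + 1 for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}

def encode(text):
    """Map each character to its id (0 for unknown characters)."""
    return [char_to_id.get(ch, 0) for ch in text]

def decode(ids):
    """Map ids back to characters; unknown ids render as [UNK]."""
    return "".join(id_to_char.get(i, "[UNK]") for i in ids)

ids = encode("a news")
print(ids)          # [1, 4, 5, 6, 7, 8]
print(decode(ids))  # "a news"
```

In the actual pipeline the same mapping is handled by `get_ids` and `get_chars`, which operate on TensorFlow string tensors rather than Python lists.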
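Generation proceeds one character at a time: the model emits logits over the 106-character vocabulary, and the next character id is sampled from the softmax of those logits, typically after dividing by a temperature. A minimal, framework-free sketch of that sampling step (the logits below are made up for illustration):

```python
import math
import random

def sample_from_logits(logits, temperature=1.0, rng=random):
    """Sample an index from the categorical distribution defined by logits.

    Lower temperature sharpens the distribution (closer to argmax);
    higher temperature flattens it (more diverse output).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point rounding

# With a very low temperature, sampling is effectively argmax:
print(sample_from_logits([0.1, 2.5, 0.3], temperature=0.01))  # 1
```

In the full loop you would encode the seed with `get_ids`, call the model to get logits for the last position, sample an id as above, map it back to a character with `get_chars`, append it to the output, and feed it in as the next input.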