Character-Based Language Model (GRU)

A character-level text generation model trained on ~2,700 business news articles. Generates text one character at a time using a GRU recurrent neural network.

Model Architecture

Component     Details
------------  --------------------------
Embedding     106 chars -> 128-d vectors
GRU           512 units, dropout=0.2
Dense         106 output classes
Total params  1,054,058
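As a sanity check, the parameter total in the table can be reproduced from the layer sizes alone; the breakdown below assumes the standard Keras GRU formulation (reset_after=True, with separate input and recurrent bias vectors):

```python
# Back-of-envelope parameter count from the table's layer sizes.
vocab_size, embed_dim, gru_units = 106, 128, 512

embedding_params = vocab_size * embed_dim                 # 13,568
# GRU: 3 gates, each with an input kernel, a recurrent kernel, and
# (with reset_after=True) two bias vectors.
gru_params = 3 * gru_units * (embed_dim + gru_units) + 2 * 3 * gru_units  # 986,112
dense_params = gru_units * vocab_size + vocab_size        # 54,378

total = embedding_params + gru_params + dense_params
print(f"{total:,}")  # 1,054,058 -- matches the table
```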

Usage

import tensorflow as tf
import json
from huggingface_hub import hf_hub_download

# Download files
model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")

# Load vocabulary and build lookup layers
with open(vocab_path) as f:
    vocab = json.load(f)

get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
get_chars = tf.keras.layers.StringLookup(
    vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
)

# Load the model (the custom CharModel class must be defined and
# registered before this call, e.g. passed via `custom_objects`)
model = tf.keras.models.load_model(model_path)
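Once the model and lookup layers are loaded, text can be sampled one character at a time. The loop below is a hedged sketch: `generate`, `temperature`, and the full-sequence re-feeding strategy are illustrative assumptions rather than part of this card, and the real CharModel may expose a more efficient stateful one-step API.

```python
import tensorflow as tf

def generate(model, get_ids, get_chars, seed, num_chars=200, temperature=1.0):
    # Encode the seed string into a dense (1, T) tensor of character ids.
    ids = get_ids(tf.strings.unicode_split([seed], "UTF-8")).to_tensor()
    pieces = [seed]
    for _ in range(num_chars):
        logits = model(ids)                      # (1, T, vocab_size)
        logits = logits[:, -1, :] / temperature  # distribution over the next char
        next_id = tf.random.categorical(logits, num_samples=1)  # (1, 1)
        pieces.append(get_chars(next_id)[0, 0].numpy().decode("utf-8"))
        ids = tf.concat([ids, next_id], axis=1)  # re-feed the full sequence
    return "".join(pieces)

# Example (seed text is illustrative):
# print(generate(model, get_ids, get_chars, "The market", num_chars=300))
```

Lower temperatures make the sampling greedier and more repetitive; higher values make it more diverse but noisier.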

Training Details

  • Dataset: 2,692 business news articles (~4,483,812 characters after cleaning)
  • Sequence length: 100
  • Epochs: 30 (with early stopping, patience=5)
  • Optimizer: Adam
  • Validation split: 10%
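The settings above map onto a standard Keras training setup. The following is a hedged reconstruction: the card does not include the dataset pipeline or the exact CharModel definition, so the Sequential stand-in and the commented `train_ds`/`val_ds` names are assumptions. With the card's layer sizes, the stand-in reproduces the listed total of 1,054,058 parameters.

```python
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, GRU_UNITS, SEQ_LEN = 106, 128, 512, 100

# Stand-in for CharModel with the card's layer sizes (an assumption).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.GRU(GRU_UNITS, return_sequences=True, dropout=0.2),
    tf.keras.layers.Dense(VOCAB_SIZE),
])
model.build(input_shape=(None, SEQ_LEN))

# Adam optimizer and next-character cross-entropy, per the card.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# Early stopping with patience=5, per the card.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)

# Dataset objects are not shown in the card:
# model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=[early_stop])
```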

Limitations

This is a small character-level model trained on a narrow domain (business news). It produces plausible-looking, news-style text, but the content is not factually accurate. Seeds outside this domain will produce lower-quality output.

license: mit
language: en
pipeline_tag: text-generation
