# NanoGPT Nietzsche - Bigram Language Model
This project fine-tunes a GPT model on Friedrich Nietzsche’s Thus Spoke Zarathustra, based on Karpathy’s NanoGPT implementation. The goal is to explore philosophical text generation and analyze how well a transformer model can replicate Nietzsche’s writing style.
## Features
- Implementation of a Transformer-based language model, built up from NanoGPT's bigram baseline.
- Training on Nietzsche's Thus Spoke Zarathustra.
- Generation of text with custom prompts.
## Model Overview
The model is a lightweight Transformer-based architecture with:
- Self-attention and feedforward layers.
- Positional embeddings for sequence modeling.
- Token embeddings for vocabulary handling.
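The actual model is the multi-head PyTorch implementation from NanoGPT; as a minimal NumPy sketch of the single-head causal self-attention at its core (function and variable names here are illustrative, not taken from the repository):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    """One attention head over a (T, C) sequence of token + position embeddings."""
    T = x.shape[0]
    q, k, v = x @ Wq, x @ Wk, x @ Wv          # project to queries / keys / values
    scores = q @ k.T / np.sqrt(k.shape[-1])   # scaled dot-product affinities
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                    # causal mask: no attending to future tokens
    return softmax(scores) @ v                # weighted average of value vectors
```

In the full model, several such heads run in parallel and are followed by a feedforward layer, with residual connections and layer norm around each.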
## Training Details

- Dataset: Thus Spoke Zarathustra (Project Gutenberg)
- Hyperparameters:
  - Embedding size: 384
  - Number of layers: 6
  - Number of heads: 6
  - Context window: 128 tokens
- Training loss after 5000 steps: 1.64
- Validation loss after 5000 steps: 1.85
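In NanoGPT-style code these settings typically live as module-level hyperparameters; a sketch of how the values above would appear (the variable names follow Karpathy's convention but are assumptions here, not copied from this repository):

```python
# Hyperparameters as reported above (NanoGPT-style names; illustrative).
n_embd = 384      # embedding size
n_layer = 6       # number of Transformer blocks
n_head = 6        # attention heads per block
block_size = 128  # context window in tokens

# Each attention head then operates in a subspace of size:
head_size = n_embd // n_head  # 384 // 6 = 64
```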
## Example Output

Prompt: `tell me about life!`

Output:

> tell me about life! O hair look at the suffering and forgetfulness.
>
> O Zarathustra, the stronger selfishness is the evidently all over infelling because he had it PRAD, thou lovest time with the values him.
>
> ...
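NanoGPT works at the character level, so a prompt like the one above has to be mapped to integer IDs before generation and decoded back to text afterwards. A minimal sketch of that round trip (the vocabulary here is built from a stand-in string rather than the real corpus):

```python
corpus = "tell me about life! O Zarathustra..."  # stand-in for the full text
chars = sorted(set(corpus))                      # character vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}     # char -> id
itos = {i: ch for ch, i in stoi.items()}         # id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

prompt_ids = encode("tell me about life!")  # feed these IDs to the model
```

The model's sampled IDs are passed back through `decode` to produce text like the sample above.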