
NanoGPT Nietzsche - Bigram Language Model

This project fine-tunes a GPT model, based on Andrej Karpathy's NanoGPT implementation, on Friedrich Nietzsche's Thus Spoke Zarathustra. The goal is to explore philosophical text generation and to analyze how well a small transformer model can replicate Nietzsche's writing style.

Features

  • Implementation of a Bigram Transformer-based Language Model.
  • Training on Nietzsche's Thus Spoke Zarathustra.
  • Generation of text with custom prompts.
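The "bigram" in the model name refers to NanoGPT's starting point: predicting the next character from only the current one. A minimal sketch of that baseline idea, using a short stand-in string instead of the actual Zarathustra corpus (an assumption for illustration):

```python
import random

# Toy stand-in for the training text (the real model uses Zarathustra).
corpus = "thus spoke zarathustra to the people in the market-place"

# Count next-character frequencies for every adjacent character pair.
counts = {}
for a, b in zip(corpus, corpus[1:]):
    row = counts.setdefault(a, {})
    row[b] = row.get(b, 0) + 1

def generate(start, length, seed=0):
    """Sample characters one at a time from the bigram table."""
    rng = random.Random(seed)
    out = start
    for _ in range(length):
        nexts = counts.get(out[-1])
        if not nexts:  # character never appears mid-corpus: stop
            break
        chars, weights = zip(*nexts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

print(generate("th", 20))
```

The transformer replaces this single-character lookup with attention over the whole context window, but the sampling loop is the same autoregressive idea.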

Model Overview

The model is a lightweight Transformer-based architecture with:

  • Self-attention and feedforward layers.
  • Positional embeddings for sequence modeling.
  • Token embeddings for vocabulary handling.
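The self-attention component above can be sketched with NumPy as a single causal head operating on toy shapes (the real model uses an embedding size of 384 with 6 heads; the small random weights here are stand-ins, not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
T, C, H = 8, 16, 16          # context length, embed dim, head dim (toy sizes)
x = rng.normal(size=(T, C))  # token embeddings + positional embeddings

# Random projection matrices standing in for learned weights.
Wq, Wk, Wv = (rng.normal(size=(C, H)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv

# Scaled dot-product scores with a causal mask: position t may only
# attend to positions <= t, which is what makes generation autoregressive.
scores = q @ k.T / np.sqrt(H)
mask = np.tril(np.ones((T, T), dtype=bool))
scores = np.where(mask, scores, -np.inf)

# Softmax over the allowed positions, then a weighted sum of the values.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v
print(out.shape)  # (8, 16)
```

In the full model, several such heads run in parallel and their outputs feed the feedforward layer; this sketch only shows the masking and softmax mechanics.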

Training Details

  • Dataset: Thus Spoke Zarathustra (Project Gutenberg)
  • Hyperparameters:
    • Embedding size: 384
    • Number of layers: 6
    • Number of heads: 6
    • Context window: 128 tokens
  • Training loss after 5000 steps: 1.64
  • Validation loss after 5000 steps: 1.85
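From the hyperparameters above, a back-of-the-envelope parameter budget can be sketched. The vocabulary size is an assumption (the card does not state it; NanoGPT's character-level demo uses 65 symbols), and biases and layer norms are omitted:

```python
# Rough NanoGPT-style parameter count for the configuration above.
# vocab_size = 65 is an assumption (NanoGPT's char-level Shakespeare demo).
n_embd, n_layer, n_head, block_size, vocab_size = 384, 6, 6, 128, 65

tok_emb = vocab_size * n_embd           # token embedding table
pos_emb = block_size * n_embd           # learned positional embeddings
attn    = 4 * n_embd * n_embd           # Wq, Wk, Wv, and output projection
mlp     = 2 * n_embd * (4 * n_embd)     # feedforward: expand 4x, project back
per_layer = attn + mlp
total = tok_emb + pos_emb + n_layer * per_layer

print(f"~{total / 1e6:.1f}M parameters")  # ~10.7M parameters
```

A model this size is small enough to train on a single consumer GPU, which fits the gap between the reported training (1.64) and validation (1.85) loss: a modest amount of overfitting on a single book.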

Example Output

Prompt: tell me about life!

Output:

tell me about life! O hair look at the suffering and forgetfulness.

O Zarathustra, the stronger selfishness is the evidently all over infelling because he had it PRAD, thou lovest time with the values him.

...
