Twitch Chat TinyLlama 1.1B GGUF

A fine-tuned TinyLlama model for generating authentic Twitch chat messages.

Model Description

This model was fine-tuned on ~1.4M Twitch chat messages from popular streamers using QLoRA. It generates natural-sounding Twitch chat responses with appropriate emotes and slang.

Training Details

  • Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Training Method: QLoRA (4-bit quantization)
  • Training Data: 418,610 examples from 5 streamers (lirik, admiralbahroo, moonmoon, northernlion, cohhcarnage)
  • Format: Alpaca instruction format
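The Alpaca instruction format referenced above follows a fixed three-section template. A minimal sketch of how a prompt in that format can be assembled (the `build_prompt` helper is illustrative, not part of the released training code):

```python
def build_prompt(instruction: str, context: str) -> str:
    """Assemble an Alpaca-style prompt with an empty response slot.

    Illustrative helper: the section headers match the Alpaca template
    used in the Usage example below.
    """
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Input:\n"
        f"{context}\n\n"
        "### Response:\n"
    )

prompt = build_prompt(
    "Generate a Twitch chat message reaction.",
    "Streamer is playing a hard game and just died.",
)
print(prompt)
```

The model then completes the text after `### Response:`, which is why generation is stopped on the next `###` marker.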

Files

  • model-q4_k_m.gguf - Quantized model (Q4_K_M, ~637 MB)

Usage with llama-cpp-python

from llama_cpp import Llama

# Load the quantized model; set n_threads to your CPU core count
llm = Llama(
    model_path="model-q4_k_m.gguf",
    n_ctx=512,    # short context is enough for single chat messages
    n_threads=8
)

prompt = """### Instruction:
Generate a Twitch chat message reaction.

### Input:
Streamer is playing a hard game and just died.

### Response:
"""

output = llm(prompt, max_tokens=50, stop=["###", "\n\n"])
print(output["choices"][0]["text"])
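Because generation halts on `"###"` or a blank line, the raw completion may still carry surrounding whitespace or a partial stop marker. A small, hypothetical post-processing helper (not part of this model card's API) that tidies the raw text before display:

```python
def clean_message(raw: str) -> str:
    """Trim whitespace and drop anything after a stray stop marker.

    Hypothetical helper for cleaning raw llama-cpp-python completions;
    assumes the Alpaca-style "###" section marker as the stop sequence.
    """
    text = raw.split("###")[0]  # defensive: discard partial stop sequences
    return text.strip()

print(clean_message("  LULW he actually died ###"))  # → "LULW he actually died"
```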

License

Apache 2.0 (same as base model)

Model Repository

Catmanjoe/twitch-chat-tinyllama-1.1b-gguf