---
license: mit
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
pipeline_tag: text-generation
tags:
- art
---

# tinygoop-1.1b

## Model Description

A fine-tuned version of TinyLlama-1.1B-Chat with room-temp IQ: quantized to 4 bits and trained on copypastas.

## Intended Use

- **Primary Use:** Not much; it can barely hold a conversation
- **Secondary Uses:** Brainrot generation, funny responses
- **Out-of-scope:** Professional/business applications, factual question answering, safety-critical applications

---

## Training Data

**Sources:**

- 334,165 copypastas
- The script from the television show "House"

### Hardware Used in Training

- **GPU:** NVIDIA GeForce RTX 4090
- **CUDA:** 12.1
- **Framework:** PyTorch 2.5.1+cu121
- **Transformers:** Latest
- **PEFT:** Latest
- **BitsAndBytes:** 4-bit quantization

---

### Basic Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "S-teven/tinygoop-1.1b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "hey"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# High temperature and top-p keep the sampling loose, which suits the
# chaotic register this model was trained for.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.2,
    top_p=0.95,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Hardware Requirements

| Precision | Memory Required | Hardware |
|-----------|-----------------|----------|
| 4-bit quantized | ~800 MB VRAM | Any modern GPU |
| FP32 (CPU) | ~4 GB RAM | Modern CPU (slow) |

---

## Limitations & Biases

**Content Warning:** This model was trained on copypasta data and may generate:

- Offensive or inappropriate content
- Nonsensical or chaotic responses
- Biases present in online communities

**Not suitable for:**

- Most things
- Professional or business use
- Educational applications
- Factual information retrieval
- Content requiring safety guarantees
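
---

### Loading in 4-bit

The hardware table above quotes ~800 MB of VRAM for the 4-bit path. Below is a minimal loading sketch, assuming the `bitsandbytes` package is installed; the exact quantization settings used during training are not published here, so the `nf4` choice is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "S-teven/tinygoop-1.1b"

# Assumed 4-bit settings; the card does not publish the training-time config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # common default, not confirmed for this model
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```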
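
### Chat-Template Prompting

Since the base model is TinyLlama-1.1B-Chat, this fine-tune may respond better to chat-formatted prompts than to the raw string used in Basic Usage. This is a sketch under the assumption that the fine-tune kept the base tokenizer's chat template (check `tokenizer.chat_template` to confirm); it reuses `tokenizer` and `model` from the snippet above.

```python
messages = [{"role": "user", "content": "hey"}]

# apply_chat_template wraps the message in the model's expected turn format.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.2,
    top_p=0.95,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```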