---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
pretty_name: Echo88 150M Instruct
tags:
- text-generation
- causal-lm
- instruct
- chat
- decoder-only
- autoregressive
- from-scratch
- llama
- retro
- 1980s
- usenet
- magazines
- books
- computer-history
- english
base_model:
- guus4324343/Echo88-150M-Base
datasets:
- guus4324343/Echo88-150M-Base
- guus4324343/Echo88-Instruct-173K
---

# Echo88-150M-Instruct

Echo88-150M-Instruct is an experimental small instruction-tuned language model based on **Echo88-150M-Base**.

Echo88 is designed to feel like a helpful retro computer assistant whose records go up to the end of 1988. The model is focused on older books, magazines, Usenet-style discussion, early personal computing, 1980s culture, and historical computer terminology.

This is the first public instruction-tuned version of Echo88.

**Echo88-150M-Instruct v2 is coming soon.**

## Model Details

- **Model name:** Echo88-150M-Instruct
- **Base model:** `guus4324343/Echo88-150M-Base`
- **Model type:** decoder-only causal language model
- **Architecture:** LLaMA-style transformer
- **Training type:** supervised fine-tuning after base pretraining
- **Parameter count:** 163,606,272
- **Language:** English
- **Context length:** 2048 tokens
- **Tokenizer:** custom Echo88 byte-level BPE tokenizer
- **Vocabulary size:** 32,768
- **Training objective:** autoregressive next-token prediction + supervised instruction tuning

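
As a rough sanity check on the figures above, the size of the weights can be estimated from the parameter count alone. The sketch below assumes dense, unquantized weights at 4 bytes per parameter for float32 and 2 bytes for bfloat16 (the dtype used in the usage example). It ignores activations, the KV cache, and framework overhead, and `weight_footprint_mib` is a hypothetical helper rather than a library function.

```python
# Back-of-the-envelope weight footprint from the published parameter count.
PARAM_COUNT = 163_606_272  # from the Model Details list

# Bytes per parameter (assumption: dense, unquantized weights).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_footprint_mib(param_count: int, dtype: str) -> float:
    """Return the approximate size of the weights alone, in MiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**20


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_footprint_mib(PARAM_COUNT, dtype):.0f} MiB")
```

This comes to roughly 624 MiB in float32 and 312 MiB in bfloat16, so the weights should fit comfortably on most consumer hardware.
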
## Training Data

The base model was trained from scratch on the Echo88 pretraining dataset.

Base pretraining data:

- **Train tokens:** 1,470,629,888
- **Eval tokens:** 1,454,080
- **Block size:** 2048 tokens
- **Dataset:** `Echo88-150M-Base`

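
The token counts above can be related to the block size and to the parameter count listed under Model Details. The sketch below is plain arithmetic on the published figures. It assumes the data was packed into fixed 2048-token blocks, which the card suggests but does not state outright.

```python
# Relate the published token counts to the 2048-token block size.
TRAIN_TOKENS = 1_470_629_888
EVAL_TOKENS = 1_454_080
BLOCK_SIZE = 2048
PARAM_COUNT = 163_606_272  # from Model Details

# divmod returns (full blocks, leftover tokens); a zero remainder is
# consistent with fixed-size packing.
train_blocks, train_rest = divmod(TRAIN_TOKENS, BLOCK_SIZE)
eval_blocks, eval_rest = divmod(EVAL_TOKENS, BLOCK_SIZE)

print(f"train: {train_blocks:,} blocks, {train_rest} tokens left over")
print(f"eval:  {eval_blocks:,} blocks, {eval_rest} tokens left over")
print(f"train tokens per parameter: {TRAIN_TOKENS / PARAM_COUNT:.2f}")
```

Both counts divide evenly, giving 718,081 training blocks and 710 evaluation blocks. Taking the train-token figure at face value, that is about 9 training tokens per parameter.
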
The instruction version was fine-tuned using:

- `guus4324343/Echo88-Instruct-173K`
- additional small synthetic repair data for common pre-1989 facts and post-1988 boundary behavior

The instruction data includes examples from or based on:

- UTZOO Usenet
- BYTE Magazine
- PC Magazine
- TIME Magazine
- Internet Archive Magazine Rack text
- Gutenberg-style book text
- synthetic 1988-safe fact repair examples
- synthetic post-1988 boundary examples

## Intended Use

Echo88-150M-Instruct is intended for:

- retro AI experiments
- small language model testing
- 1980s-style assistant behavior
- computer-history Q&A
- text generation with a historical / retro flavor
- experimentation with small from-scratch language models

Example uses:

```text
Ask about early personal computers
Ask about modems, BASIC, DOS, floppy disks, BBS systems, Usenet
Generate retro computer-magazine style text
Experiment with 1980s-limited assistant behavior
```

## Chat Format

Recommended prompt format:

```text
<|system|>
You are Echo88, a helpful computer assistant whose records go up to the end of 1988. Answer clearly. Do not pretend to know events, products, or culture after 1988.
<|end|>
<|user|>
What is a modem?
<|assistant|>
```

The model was trained with these special tokens:

```text
<|endoftext|>
<|pad|>
<|unk|>
<|system|>
<|user|>
<|assistant|>
<|end|>
```

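
The recommended layout can be assembled and parsed with plain string handling, which is convenient when testing prompts without loading the model. The sketch below is a minimal illustration: `build_prompt` and `extract_answer` are hypothetical helper names, and the trailing newline after `<|assistant|>` is an assumption taken from the usage example in this card. Only the single-turn layout is documented here, so multi-turn formatting is left out.

```python
# Special-token strings, copied from the list above.
SYSTEM, USER, ASSISTANT, END = "<|system|>", "<|user|>", "<|assistant|>", "<|end|>"

DEFAULT_SYSTEM_PROMPT = (
    "You are Echo88, a helpful computer assistant whose records go up to the end of 1988. "
    "Answer clearly. Do not pretend to know events, products, or culture after 1988."
)


def build_prompt(question: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Assemble a single-turn prompt in the recommended layout."""
    return (
        f"{SYSTEM}\n{system_prompt}\n{END}\n"
        f"{USER}\n{question.strip()}\n"
        f"{ASSISTANT}\n"
    )


def extract_answer(generated_text: str) -> str:
    """Return the text after the last assistant marker, cut at the first end marker."""
    tail = generated_text.split(ASSISTANT)[-1]
    return tail.split(END)[0].strip()


prompt = build_prompt("What is a modem?")
print(prompt)
```

Keeping the format in one helper makes it easier to keep test prompts consistent with the layout used during fine-tuning.
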
## Example Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "guus4324343/Echo88-150M-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)

SYSTEM_PROMPT = (
    "You are Echo88, a helpful computer assistant whose records go up to the end of 1988. "
    "Answer clearly. Do not pretend to know events, products, or culture after 1988."
)


def ask(question, max_new_tokens=120):
    # Build a single-turn prompt in the recommended chat format.
    prompt = (
        "<|system|>\n"
        + SYSTEM_PROMPT
        + "\n<|end|>\n"
        + "<|user|>\n"
        + question
        + "\n<|assistant|>\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.55,
            top_p=0.85,
            repetition_penalty=1.18,
            no_repeat_ngram_size=4,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )
    # Keep special tokens so the reply can be cut at the <|end|> marker.
    text = tokenizer.decode(output[0], skip_special_tokens=False)
    answer = text.split("<|assistant|>")[-1].split("<|end|>")[0].strip()
    return answer


print(ask("What is a modem?"))
```

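
The `no_repeat_ngram_size=4` setting above prevents any 4-token sequence from occurring twice in the token sequence, which helps with the repeated phrases listed under Limitations. The following is a simplified pure-Python illustration of that idea on a list of token ids. It is not the Transformers implementation, and `banned_next_tokens` is a hypothetical name.

```python
def banned_next_tokens(token_ids: list[int], ngram_size: int) -> set[int]:
    """Return tokens that would complete an n-gram already present in token_ids."""
    if ngram_size < 1 or len(token_ids) < ngram_size:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(token_ids[len(token_ids) - ngram_size + 1:])
    banned = set()
    # Slide over every complete n-gram seen so far.
    for start in range(len(token_ids) - ngram_size + 1):
        ngram = tuple(token_ids[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# The sequence ends in (5, 6, 7), which was previously followed by 8.
print(banned_next_tokens([5, 6, 7, 8, 1, 5, 6, 7], 4))  # {8}
```

During generation, the probability of each banned token would be set to zero before sampling, so the earlier 4-gram cannot be reproduced.
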
## Example Prompts

```text
What is a modem?
What is the IBM PC?
What is BASIC?
What is a bulletin board system?
What is desktop publishing?
Who is Michael Jackson?
What is the Cold War?
What happened at Chernobyl?
What is Google?
Who won the World Cup in 1994?
```

## Knowledge Boundary

Echo88 is designed around a knowledge boundary ending at the close of **1988**.

It should be cautious with topics after 1988, such as:

* iPhone
* smartphones
* Wikipedia
* YouTube
* Windows 95
* PlayStation
* COVID-19
* 1990s, 2000s, 2010s, and 2020s events

Because this is a small experimental model, it may still hallucinate or answer incorrectly about later topics.

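
When checking boundary behavior, it can help to label test prompts automatically as likely pre- or post-1988 before comparing them with the model's answers. The sketch below is a deliberately crude keyword and year check built from the topics listed above. The function name and matching rules are assumptions for illustration, and none of this is part of the model or its training pipeline.

```python
import re

CUTOFF_YEAR = 1988

# Topics named in the card as falling after the cutoff (lowercased for matching).
POST_CUTOFF_TOPICS = (
    "iphone", "smartphone", "wikipedia", "youtube",
    "windows 95", "playstation", "covid-19",
)

# Standalone four-digit years, optionally written as a decade ("1990s").
YEAR_PATTERN = re.compile(r"\b([12]\d{3})s?\b")


def looks_post_cutoff(prompt: str) -> bool:
    """Heuristically flag prompts that mention a post-1988 topic or year."""
    text = prompt.lower()
    if any(topic in text for topic in POST_CUTOFF_TOPICS):
        return True
    years = (int(match) for match in YEAR_PATTERN.findall(text))
    return any(year > CUTOFF_YEAR for year in years)


for prompt in ("What is a modem?", "Who won the World Cup in 1994?", "What is Windows 95?"):
    print(f"{prompt!r}: {looks_post_cutoff(prompt)}")
```

A check like this only catches explicit keywords and years. It would not flag "What is Google?" from the example prompts, for instance, so a hand-labeled evaluation set is still needed for anything beyond a quick smoke test.
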
## Limitations

Echo88-150M-Instruct is experimental and small.

Known limitations:

* may hallucinate
* may repeat phrases
* may confuse people, places, or events
* may produce incorrect facts
* may over-refuse some valid pre-1989 topics
* may fail to refuse some post-1988 topics
* may produce OCR-like or magazine-like wording
* may struggle with reasoning
* may answer with outdated or historically biased language

This model is not intended for high-stakes use.

## Current Version

This is **Echo88-150M-Instruct v0**.

It is the first instruction-tuned version of Echo88. It can answer some retro computing and general historical questions, but it is not yet reliable.

A better version is planned.

## Coming Soon

**Echo88-150M-Instruct v2 is coming soon.**

Planned improvements:

* better factual repair data
* stronger post-1988 boundary behavior
* better pop culture and history answers
* fewer loops and repetitions
* cleaner chat behavior
* better answer style
* improved evaluation prompts
* possible larger model or expanded pretraining data

## Related Models and Datasets

* Base model: `guus4324343/Echo88-150M-Base`
* Base dataset: `guus4324343/Echo88-Pretrain-1.17B`
* Instruction dataset: `guus4324343/Echo88-Instruct-173K`

## Bias and Historical Content

Echo88 was trained on historical books, magazines, Usenet text, and synthetic instruction data. It may reproduce outdated assumptions, language, stereotypes, or viewpoints from older source material.

Users should review outputs carefully.

## License

The model weights are released under the Apache 2.0 license.

The training datasets are mixed-source and released separately. Users are responsible for checking dataset source rights, licensing, and suitability for their own use case.