---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
pretty_name: Echo88 150M Base
tags:
- text-generation
- causal-lm
- base-model
- decoder-only
- autoregressive
- from-scratch
- llama
- retro
- 1980s
- usenet
- magazines
- books
- computer-history
- english
datasets:
- guus4324343/Echo88-Pretrain-1.17B
---
# Echo88-150M-Base

Echo88-150M-Base is a small English decoder-only causal language model trained from scratch on the Echo88 pretraining dataset.

Echo88 is designed as a retro language model: its training text reflects the language, culture, and computing of the period up to the late 1980s, drawn from magazines, Usenet discussions, and older books.

This is a **base model**, not an instruction-tuned chatbot. It is trained for next-token prediction and should be fine-tuned before being used as a helpful assistant.
## Model Details

- **Model name:** Echo88-150M-Base
- **Model type:** decoder-only causal language model
- **Architecture:** LLaMA-style transformer
- **Training type:** from scratch
- **Parameter count:** 163,606,272
- **Language:** English
- **Context length:** 2048 tokens
- **Tokenizer:** custom Echo88 byte-level BPE tokenizer
- **Vocabulary size:** 32,768
- **Training objective:** autoregressive next-token prediction
## Training Data

Echo88-150M-Base was trained on the Echo88 pretraining dataset.

The packed training set contains:

- **Train tokens:** 1,470,629,888
- **Eval tokens:** 1,454,080
- **Train blocks:** 718,081 blocks
- **Eval blocks:** 710 blocks
- **Block size:** 2048 tokens
- **Packed dtype:** uint16

The dataset includes a mixture of:

- public-domain book text
- Gutenberg-style older books
- UTZOO Usenet posts
- BYTE Magazine text
- PC Magazine text
- TIME Magazine text
- Internet Archive Magazine Rack OCR text
- computer and technology magazine text
- general historical magazine text

The dataset emphasizes the 1950s through the late 1980s, with a strong focus on early personal computing, printed magazines, Usenet, and older long-form writing.

Dataset used:

- `guus4324343/Echo88-Pretrain-1.17B`
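
The packed-set figures listed in this section are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only numbers from the card; the on-disk size estimate assumes the packed files contain nothing but raw uint16 token ids, which the card does not state.

```python
# Figures taken from the Training Data list in this card.
BLOCK_SIZE = 2048
TRAIN_BLOCKS = 718_081
EVAL_BLOCKS = 710
VOCAB_SIZE = 32_768

# Each packed block holds exactly BLOCK_SIZE tokens.
train_tokens = TRAIN_BLOCKS * BLOCK_SIZE
eval_tokens = EVAL_BLOCKS * BLOCK_SIZE

# uint16 stores values 0..65535, so every token id below VOCAB_SIZE fits.
UINT16_MAX = 2**16 - 1
fits_uint16 = (VOCAB_SIZE - 1) <= UINT16_MAX

# Assumption: files are raw uint16 ids, so 2 bytes per token.
train_bytes = train_tokens * 2

print(f"train tokens: {train_tokens:,}")
print(f"eval tokens:  {eval_tokens:,}")
print(f"token ids fit in uint16: {fits_uint16}")
print(f"estimated packed train size: {train_bytes / 2**30:.2f} GiB")
```

Because the vocabulary has 32,768 entries, uint16 halves the storage needed compared with int32 ids.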
## Intended Use

Echo88-150M-Base is intended for:

- causal language modeling
- retro / historical AI experiments
- small language model research
- continued pretraining
- instruction tuning
- 1980s-style assistant experiments
- computer-history language modeling
- training Echo88-150M-Instruct

Recommended flow:

```text
Echo88-150M-Base
→ supervised fine-tuning on Echo88-Instruct-173K
→ Echo88-150M-Instruct
```
## Not Instruction Tuned

This model is not instruction tuned.

It may not reliably follow commands, answer questions directly, or behave like a chat assistant. It is a base model that continues text.

Expected behavior:

* continues prompts
* completes paragraphs
* imitates old magazine/book/Usenet style
* may produce raw text instead of direct answers
* may hallucinate
* may repeat phrases
* may generate OCR-like artifacts

For chat behavior, use an instruction-tuned version, or create one by fine-tuning on:

* `guus4324343/Echo88-Instruct-173K`
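
Because the model only continues text, a question tends to work better when it is framed as the opening of a document than when it is phrased as a command. The sketch below builds such a prompt in the shape of a Usenet post. The helper name `frame_as_usenet_post`, the header fields, and the example newsgroup are illustrative assumptions; the card does not say whether headers were kept in the training text, and this framing has not been evaluated against this model.

```python
def frame_as_usenet_post(newsgroup: str, subject: str, opening: str) -> str:
    """Build a continuation prompt shaped like the start of a Usenet post.

    Hypothetical helper: the header layout is an assumption, not a format
    documented for the Echo88 dataset.
    """
    header = f"Newsgroups: {newsgroup}\nSubject: {subject}\n\n"
    # End on an unfinished sentence so the model has something to continue.
    return header + opening


prompt = frame_as_usenet_post(
    newsgroup="comp.sys.ibm.pc",
    subject="Re: Which modem should I buy?",
    opening="In my experience, the best choice for most people is",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a plain prompt.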
## Knowledge Boundary

Echo88 is built on a historical data mixture that ends in the late 1980s.

The model should not be expected to know modern topics such as:

* Wikipedia
* iPhone
* smartphones
* modern social media
* Windows 95 and later software
* COVID-19
* modern AI systems
* 2000s, 2010s, or 2020s events

Because this is a base model, it may still hallucinate if prompted about modern events. The later instruction-tuned model should be trained to respond more carefully to post-1988 topics.
## Example Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "guus4324343/Echo88-150M-Base"

# Load the tokenizer and the weights in bfloat16.
# device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base model: give it text to continue, not an instruction.
prompt = "The personal computer revolution of the 1980s"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation; gradients are not needed for inference.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=160,
        temperature=0.8,
        top_p=0.95,
        do_sample=True,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Training Configuration

Echo88-150M-Base was trained as a LLaMA-style decoder-only causal LM.

Main configuration:

```text
vocab_size: 32768
hidden_size: 768
intermediate_size: 2048
num_hidden_layers: 18
num_attention_heads: 12
num_key_value_heads: 4
max_position_embeddings: 2048
activation: SiLU / SwiGLU-style LLaMA MLP
normalization: RMSNorm
position encoding: RoPE
attention: grouped-query attention
```
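
The main configuration is enough to reproduce the 163,606,272 parameter count given under Model Details. The sketch below counts weights for a LLaMA-style layout under two assumptions that the card does not state explicitly: linear layers have no bias terms, and the input embedding and output head are separate (untied) matrices. The function name `llama_param_count` is illustrative.

```python
CONFIG = dict(
    vocab_size=32768,
    hidden_size=768,
    intermediate_size=2048,
    num_layers=18,
    num_heads=12,
    num_kv_heads=4,
)


def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads,
                      tie_embeddings=False):
    """Count parameters of a bias-free LLaMA-style decoder (assumed layout)."""
    head_dim = hidden_size // num_heads   # 768 / 12 = 64
    kv_dim = num_kv_heads * head_dim      # grouped-query attention: 4 * 64 = 256
    # Query and output projections are full width; key and value are narrower.
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # SwiGLU-style MLP has gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    per_layer = attn + mlp + norms
    embed = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(**CONFIG)
tied = llama_param_count(**CONFIG, tie_embeddings=True)
print(f"untied embeddings: {total:,}")
print(f"tied embeddings:   {tied:,}")
```

Only the untied variant matches the published figure. With 12 query heads and 4 key/value heads, each key/value head is shared by 3 query heads.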
Training setup:

```text
precision: bf16
sequence length: 2048
optimizer: AdamW
scheduler: cosine
weight decay: 0.1
gradient clipping: 1.0
max steps: 5610
training tokens: ~1.47B
```
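
The training setup lists a cosine scheduler and 5,610 steps, but not the batch size, peak learning rate, or warmup length. The sketch below derives the tokens per step implied by the published totals, assuming a single pass over the packed training set, and shows the general shape of a cosine schedule with linear warmup. `peak_lr`, `min_lr`, and `warmup_steps` are placeholder values chosen for illustration, not the values used to train this model.

```python
import math

TRAIN_TOKENS = 1_470_629_888
MAX_STEPS = 5610
BLOCK_SIZE = 2048

# Assumption: one pass over the data, so tokens are spread evenly over steps.
tokens_per_step = TRAIN_TOKENS // MAX_STEPS
sequences_per_step = tokens_per_step // BLOCK_SIZE
leftover_tokens = TRAIN_TOKENS - tokens_per_step * MAX_STEPS


def cosine_lr(step, peak_lr=3e-4, min_lr=3e-5,
              warmup_steps=200, max_steps=MAX_STEPS):
    """Linear warmup followed by cosine decay (placeholder hyperparameters)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


print(f"tokens per step:    {tokens_per_step:,}")
print(f"sequences per step: {sequences_per_step}")
print(f"leftover tokens:    {leftover_tokens:,}")
for step in (0, 200, 2905, MAX_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
```

Under the single-pass assumption, the totals correspond to 128 sequences of 2048 tokens per optimizer step, with one block left over.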
## Limitations

Echo88-150M-Base is experimental and small.

Known limitations:

* not instruction tuned
* may hallucinate
* may repeat text
* may produce OCR-like artifacts
* may reflect outdated historical language or views
* may struggle with complex reasoning
* may not reliably refuse post-1988 topics
* may produce incomplete or strange continuations
* may mix unrelated historical/computer facts

The model is intended for research, experimentation, and creative retro AI work. It is not intended for high-stakes use.
## Bias and Historical Content

The training data includes historical books, magazines, and Usenet text. As a result, the model may reproduce outdated language, assumptions, stereotypes, or viewpoints present in older source material.

Users should review outputs carefully.
## Model Family

Planned Echo88 model family:

```text
Echo88-150M-Base
Echo88-150M-Instruct
Echo88-150M-Chat
```
## License

The model weights are released under the Apache 2.0 license.

The training dataset is mixed-source and released separately, with its license listed as `other`. Users are responsible for checking dataset source rights, licensing, and suitability for their own use case.