Echo88-150M-Base
Echo88-150M-Base is a small English decoder-only causal language model trained from scratch on the Echo88 pretraining dataset.
Echo88 is designed as a retro language model inspired by the language, culture, computing, magazines, Usenet discussions, and older book text available up to the late 1980s.
This is a base model, not an instruction-tuned chatbot. It is trained for next-token prediction and should be fine-tuned before being used as a helpful assistant.
Model Details
- Model name: Echo88-150M-Base
- Model type: decoder-only causal language model
- Architecture: LLaMA-style transformer
- Training type: from scratch
- Parameter count: 163,606,272
- Language: English
- Context length: 2048 tokens
- Tokenizer: custom Echo88 byte-level BPE tokenizer
- Vocabulary size: 32,768
- Training objective: autoregressive next-token prediction
Training Data
Echo88-150M-Base was trained on the Echo88 pretraining dataset.
The packed training set contains:
- Train tokens: 1,470,629,888
- Eval tokens: 1,454,080
- Train blocks: 718,081
- Eval blocks: 710
- Block size: 2048 tokens
- Packed dtype: uint16
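These figures are internally consistent: each token count equals its block count times the block size, and uint16 is sufficient because every token ID in a 32,768-entry vocabulary is below 65,536. A quick check, using only numbers stated in this card; the on-disk size is derived from them and is an estimate, not a reported figure:

```python
# Numbers copied from this model card.
BLOCK_SIZE = 2048
TRAIN_BLOCKS = 718_081
EVAL_BLOCKS = 710
VOCAB_SIZE = 32_768
BYTES_PER_TOKEN = 2  # uint16

train_tokens = TRAIN_BLOCKS * BLOCK_SIZE
eval_tokens = EVAL_BLOCKS * BLOCK_SIZE

# The largest token ID must fit in an unsigned 16-bit integer.
fits_uint16 = (VOCAB_SIZE - 1) <= 2**16 - 1

# Derived estimate of the packed train split size (not a reported figure).
train_gib = train_tokens * BYTES_PER_TOKEN / 2**30

print(f"train tokens: {train_tokens:,}")   # 1,470,629,888
print(f"eval tokens:  {eval_tokens:,}")    # 1,454,080
print(f"fits uint16:  {fits_uint16}")      # True
print(f"train split:  ~{train_gib:.2f} GiB")
```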
The dataset includes a mixture of:
- public-domain book text
- Gutenberg-style older books
- UTZOO Usenet posts
- BYTE Magazine text
- PC Magazine text
- TIME Magazine text
- Internet Archive Magazine Rack OCR text
- computer and technology magazine text
- general historical magazine text
The dataset emphasizes the 1950s through the late 1980s, with a strong focus on early personal computing, printed magazines, Usenet, and older long-form writing.
Dataset used:
guus4324343/Echo88-Pretrain-1.17B
Intended Use
Echo88-150M-Base is intended for:
- causal language modeling
- retro / historical AI experiments
- small language model research
- continued pretraining
- instruction tuning
- 1980s-style assistant experiments
- computer-history language modeling
- training Echo88-150M-Instruct
Recommended flow:
Echo88-150M-Base
→ supervised fine-tuning on Echo88-Instruct-173K
→ Echo88-150M-Instruct
Not Instruction Tuned
This model is not instruction tuned.
It may not reliably follow commands, answer questions directly, or behave like a chat assistant. It is a base model that continues text.
Expected behavior:
- continues prompts
- completes paragraphs
- imitates old magazine/book/Usenet style
- may produce raw text instead of direct answers
- may hallucinate
- may repeat phrases
- may generate OCR-like artifacts
For chat behavior, use an instruction-tuned version, or create one by fine-tuning on:
guus4324343/Echo88-Instruct-173K
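Because the base model only continues text, a request often works better when it is rewritten as the opening of a document whose natural continuation is the desired content. The helper below is a hypothetical illustration: the function name, the title template, and the default year are assumptions, not a format the model is known to have been trained on. It only builds the prompt string.

```python
def frame_as_article(topic: str, year: int = 1986) -> str:
    """Build a document-style prefix for a base model to continue.

    Hypothetical helper: the template is illustrative only.
    """
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    # End on an open clause with no trailing space, so the model's first
    # generated token can supply its own leading whitespace.
    return (
        f"An Introduction to {topic}\n"
        f"({year})\n\n"
        "Most readers will already know that"
    )


prompt = frame_as_article("modems")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a direct question such as "What is a modem?", which a base model may simply continue with more questions.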
Knowledge Boundary
Echo88 is designed around a historical data mixture ending around the late 1980s.
The model should not be expected to know modern topics such as:
- Wikipedia
- iPhone
- smartphones
- modern social media
- Windows 95 and later software
- COVID-19
- modern AI systems
- 2000s, 2010s, or 2020s events
Because this is a base model, it may still hallucinate if prompted about modern events. The later instruction-tuned model should be trained to respond more carefully to post-1988 topics.
Example Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "guus4324343/Echo88-150M-Base"

# Load the tokenizer and the model weights in bfloat16.
# device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A base model continues text, so the prompt is the opening of a passage.
prompt = "The personal computer revolution of the 1980s"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation without tracking gradients.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=160,
        temperature=0.8,
        top_p=0.95,
        do_sample=True,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
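With a 2048-token context, the prompt and the generated tokens share one budget. A minimal sketch, assuming left-truncation (keeping the most recent tokens) is acceptable for the use case; `fit_prompt` is a hypothetical helper that operates on a plain list of token IDs rather than on a tokenizer output:

```python
CONTEXT_LENGTH = 2048  # context length stated in this card


def fit_prompt(token_ids: list[int], max_new_tokens: int,
               context_length: int = CONTEXT_LENGTH) -> list[int]:
    """Drop the oldest prompt tokens so prompt + generation fits the context."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return token_ids[-budget:]


long_prompt = list(range(3000))  # stand-in for real token IDs
trimmed = fit_prompt(long_prompt, max_new_tokens=160)
print(len(trimmed))  # 1888
```

If the tokenizer prepends a beginning-of-sequence token, left-truncation would drop it, so that case may need separate handling.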
Training Configuration
Echo88-150M-Base was trained as a LLaMA-style decoder-only causal LM.
Main configuration:
vocab_size: 32768
hidden_size: 768
intermediate_size: 2048
num_hidden_layers: 18
num_attention_heads: 12
num_key_value_heads: 4
max_position_embeddings: 2048
activation: SiLU / SwiGLU-style LLaMA MLP
normalization: RMSNorm
position encoding: RoPE
attention: grouped-query attention
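The stated parameter count of 163,606,272 can be reproduced from this configuration under two assumptions that the card does not confirm: linear layers have no bias terms (the usual LLaMA convention), and the output projection is not tied to the input embeddings. A short sketch of the arithmetic:

```python
# Values from the main configuration listed in this card.
vocab_size = 32_768
hidden_size = 768
intermediate_size = 2_048
num_layers = 18
num_heads = 12
num_kv_heads = 4

head_dim = hidden_size // num_heads   # 64
kv_dim = num_kv_heads * head_dim      # 256, shared by groups of 3 query heads

# Attention: q and o are hidden x hidden; k and v are hidden x kv_dim (GQA).
attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
# SwiGLU-style MLP: gate, up, and down projections.
mlp = 3 * hidden_size * intermediate_size
# Two RMSNorm weight vectors per layer.
norms = 2 * hidden_size

per_layer = attn + mlp + norms
embeddings = vocab_size * hidden_size
lm_head = vocab_size * hidden_size    # assumption: not tied to the embeddings
final_norm = hidden_size

total = embeddings + num_layers * per_layer + final_norm + lm_head
print(f"{total:,}")  # 163,606,272
```

About 50M of these parameters sit in the two vocabulary matrices, which leaves roughly 113M in the transformer layers themselves.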
Training setup:
precision: bf16
sequence length: 2048
optimizer: AdamW
scheduler: cosine
weight decay: 0.1
gradient clipping: 1.0
max steps: 5610
training tokens: ~1.47B
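The card does not state the batch size. If the 5,610 steps cover the packed train set roughly once (an assumption, suggested by the matching ~1.47B token figure), the implied batch follows from numbers already given in the Training Data section:

```python
TRAIN_TOKENS = 1_470_629_888  # train tokens stated in this card
MAX_STEPS = 5_610
SEQ_LEN = 2_048

# Assumption: one pass over the packed train set in MAX_STEPS steps.
tokens_per_step = TRAIN_TOKENS / MAX_STEPS
sequences_per_step = tokens_per_step / SEQ_LEN

print(f"tokens per step:    {tokens_per_step:,.0f}")    # 262,144
print(f"sequences per step: {sequences_per_step:.2f}")  # 128.00
```

Under that assumption, each optimizer step would see about 128 sequences of 2048 tokens. How those sequences were split across devices or gradient-accumulation steps cannot be derived from the card.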
Limitations
Echo88-150M-Base is experimental and small.
Known limitations:
- not instruction tuned
- may hallucinate
- may repeat text
- may produce OCR-like artifacts
- may reflect outdated historical language or views
- may struggle with complex reasoning
- may not reliably refuse post-1988 topics
- may produce incomplete or strange continuations
- may mix unrelated historical/computer facts
The model is intended for research, experimentation, and creative retro AI work. It is not intended for high-stakes use.
Bias and Historical Content
The training data includes historical books, magazines, and Usenet text. As a result, the model may reproduce outdated language, assumptions, stereotypes, or viewpoints present in older source material.
Users should review outputs carefully.
Model Family
Planned Echo88 model family:
Echo88-150M-Base
Echo88-150M-Instruct
Echo88-150M-Chat
License
The model weights are released under the Apache 2.0 license.
The training dataset is mixed-source and released separately under a license tagged "other". Users are responsible for checking dataset source rights, licensing, and suitability for their own use case.