
license: apache-2.0
language:
  - en
library_name: transformers
pipeline_tag: text-generation
pretty_name: Echo88 150M Base
tags:
  - text-generation
  - causal-lm
  - base-model
  - decoder-only
  - autoregressive
  - from-scratch
  - llama
  - retro
  - 1980s
  - usenet
  - magazines
  - books
  - computer-history
  - english
datasets:
  - guus4324343/Echo88-Pretrain-1.17B

Echo88-150M-Base

Echo88-150M-Base is a small English decoder-only causal language model trained from scratch on the Echo88 pretraining dataset.

Echo88 is a retro language model: its training text reflects the language, culture, computing, magazines, Usenet discussions, and older books available up to the late 1980s.

This is a base model, not an instruction-tuned chatbot. It is trained for next-token prediction and should be fine-tuned before being used as a helpful assistant.

Model Details

Model name: Echo88-150M-Base
Model type: decoder-only causal language model
Architecture: LLaMA-style transformer
Training type: from scratch
Parameter count: 163,606,272 parameters
Language: English
Context length: 2048 tokens
Tokenizer: custom Echo88 byte-level BPE tokenizer
Vocabulary size: 32,768
Training objective: autoregressive next-token prediction
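
The parameter count above can be reproduced from the configuration values listed under Training Configuration. The sketch below assumes a standard LLaMA layout with no bias terms and an output head that is not tied to the input embeddings. Neither assumption is stated in this card, but they are the ones under which the arithmetic lands on 163,606,272.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads,
                      tie_embeddings=False):
    """Count parameters of a LLaMA-style decoder (assumes no bias terms)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    embed = vocab_size * hidden_size
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # gate_proj, up_proj, and down_proj of the SwiGLU-style MLP.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    per_layer = attn + mlp + norms

    final_norm = hidden_size
    # Assumption: separate output head unless tie_embeddings is set.
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(32768, 768, 2048, 18, 12, 4)
tied = llama_param_count(32768, 768, 2048, 18, 12, 4, tie_embeddings=True)
print(f"{total:,}")  # 163,606,272
print(f"{tied:,}")   # 138,440,448
```

About 50 million of the parameters sit in the input embeddings and output head, which is typical for a small model with a 32,768-entry vocabulary.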

Training Data

Echo88-150M-Base was trained on the Echo88 pretraining dataset.

The packed dataset (train and eval splits) contains:

Train tokens: 1,470,629,888
Eval tokens: 1,454,080
Train blocks: 718,081 blocks
Eval blocks: 710 blocks
Block size: 2048 tokens
Packed dtype: uint16
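
The token and block counts above are consistent with each other: every block holds 2048 tokens, and uint16 is wide enough because the largest token ID (32,767) is below 65,536. A quick check follows; the on-disk size is computed under the assumption that the packed files store nothing but raw token IDs.

```python
BLOCK_SIZE = 2048
VOCAB_SIZE = 32768
BYTES_PER_TOKEN = 2  # uint16

train_blocks, eval_blocks = 718_081, 710

train_tokens = train_blocks * BLOCK_SIZE
eval_tokens = eval_blocks * BLOCK_SIZE

# Every token ID must be representable as an unsigned 16-bit integer.
uint16_fits = (VOCAB_SIZE - 1) <= (2**16 - 1)

# Assumption: no header or padding, only token IDs.
train_bytes = train_tokens * BYTES_PER_TOKEN

print(train_tokens, eval_tokens)      # 1470629888 1454080
print(uint16_fits)                    # True
print(f"{train_bytes / 2**30:.2f}")   # 2.74 (GiB)
```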

The dataset includes a mixture of:

public-domain book text
Gutenberg-style older books
UTZOO Usenet posts
BYTE Magazine text
PC Magazine text
TIME Magazine text
Internet Archive Magazine Rack OCR text
computer and technology magazine text
general historical magazine text

The dataset emphasizes the 1950s through the late 1980s, with a strong focus on early personal computing, printed magazines, Usenet, and older long-form writing.

Dataset used:

guus4324343/Echo88-Pretrain-1.17B

Intended Use

Echo88-150M-Base is intended for:

causal language modeling
retro / historical AI experiments
small language model research
continued pretraining
instruction tuning
1980s-style assistant experiments
computer-history language modeling
training Echo88-150M-Instruct

Recommended flow:

Echo88-150M-Base
→ supervised fine-tuning on Echo88-Instruct-173K
→ Echo88-150M-Instruct

Not Instruction Tuned

This model is not instruction tuned.

It may not reliably follow commands, answer questions directly, or behave like a chat assistant. It is a base model that continues text.

Expected behavior:

continues prompts
completes paragraphs
imitates old magazine/book/Usenet style
may produce raw text instead of direct answers
may hallucinate
may repeat phrases
may generate OCR-like artifacts
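
Since the model continues text rather than answering, one common way to steer a base model is to phrase the request as the opening of a document it might plausibly continue. The helper below is a hypothetical illustration: the function name and the magazine-style template are invented here, are not part of the Echo88 release, and make no claim about which framing works best for this model.

```python
def frame_as_article(topic: str, publication: str = "BYTE", year: int = 1987) -> str:
    """Build a continuation-style prompt that reads like the start of an article.

    The template is an invented example; adjust it to the style you want imitated.
    """
    topic = topic.strip()
    return (
        f"{publication}, {year}\n\n"
        f"{topic.upper()}\n\n"
        f"In this article we look at {topic}."
    )


prompt = frame_as_article("the future of floppy disk storage")
print(prompt)
```

The resulting string can be passed as the prompt to a text-generation call in place of a direct question.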

For chat behavior, use an instruction-tuned version, or create one by fine-tuning on:

guus4324343/Echo88-Instruct-173K

Knowledge Boundary

Echo88 is designed around a historical data mixture that ends in the late 1980s.

The model should not be expected to know modern topics such as:

Google
Wikipedia
iPhone
smartphones
modern social media
Windows 95 and later software
COVID-19
modern AI systems
2000s, 2010s, or 2020s events

Because this is a base model, it may still hallucinate if prompted about modern events. The later instruction-tuned model should be trained to respond more carefully to post-1988 topics.

Example Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "guus4324343/Echo88-150M-Base"

# Load the tokenizer and the weights in bf16.
# device_map="auto" requires the accelerate package to be installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Base model: give it text to continue rather than a question or command.
prompt = "The personal computer revolution of the 1980s"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation; repetition_penalty discourages repeated phrases.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=160,
        temperature=0.8,
        top_p=0.95,
        do_sample=True,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# The decoded output includes the prompt followed by the generated text.
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training Configuration

Echo88-150M-Base was trained as a LLaMA-style decoder-only causal LM.

Main configuration:

vocab_size: 32768
hidden_size: 768
intermediate_size: 2048
num_hidden_layers: 18
num_attention_heads: 12
num_key_value_heads: 4
max_position_embeddings: 2048
activation: SiLU / SwiGLU-style LLaMA MLP
normalization: RMSNorm
position encoding: RoPE
attention: grouped-query attention
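
Grouped-query attention shrinks the key/value cache because keys and values are stored for 4 heads instead of 12. A rough estimate from the values above follows, assuming bf16 cache entries, a single sequence, and a head dimension of hidden_size / num_attention_heads (the usual LLaMA convention, not stated explicitly in this card).

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


head_dim = 768 // 12  # 64, assuming hidden_size / num_attention_heads

gqa = kv_cache_bytes(18, 4, head_dim, 2048)
# Hypothetical comparison: the same model with one KV head per query head.
mha = kv_cache_bytes(18, 12, head_dim, 2048)

print(gqa / 2**20, mha / 2**20)  # 36.0 108.0 (MiB at the full 2048-token context)
```

Under these assumptions the 12:4 head ratio cuts the cache to one third of what full multi-head attention would need.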

Training setup:

precision: bf16
sequence length: 2048
optimizer: AdamW
scheduler: cosine
weight decay: 0.1
gradient clipping: 1.0
max steps: 5610
training tokens: ~1.47B
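
The step count and token count together imply the size of each optimizer step. The batch size is not stated in this card; the sketch below simply divides the two numbers. Reading the result as 128 sequences per step over a single pass of the data is an inference, not a documented setting.

```python
SEQ_LEN = 2048
MAX_STEPS = 5610
TRAIN_TOKENS = 1_470_629_888

tokens_per_step = TRAIN_TOKENS / MAX_STEPS
sequences_per_step = tokens_per_step / SEQ_LEN

print(round(tokens_per_step))     # 262144
print(round(sequences_per_step))  # 128

# Inferred, not documented: 128 sequences per step for 5610 steps
# covers all but one of the 718,081 training blocks.
tokens_seen = MAX_STEPS * 128 * SEQ_LEN
print(tokens_seen)                 # 1470627840
print(TRAIN_TOKENS - tokens_seen)  # 2048
```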

Limitations

Echo88-150M-Base is experimental and small.

Known limitations:

not instruction tuned
may hallucinate
may repeat text
may produce OCR-like artifacts
may reflect outdated historical language or views
may struggle with complex reasoning
may not reliably refuse post-1988 topics
may produce incomplete or strange continuations
may mix unrelated historical/computer facts
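
Repetition in particular is easy to measure on generated text. The function below is a simple illustrative heuristic, not something shipped with Echo88: it reports the fraction of word n-grams in a string that repeat an earlier n-gram, so a score of 0 means no repeated n-grams and a higher score means more looping.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that duplicate an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Each n-gram seen c times contributes c - 1 repeats.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


looping = "the disk drive the disk drive the disk drive"
varied = "the disk drive stores data on magnetic media"

looping_score = repeated_ngram_fraction(looping)
varied_score = repeated_ngram_fraction(varied)
print(round(looping_score, 3), varied_score)  # 0.571 0.0
```

A score like this can be used to compare sampling settings such as repetition_penalty on the same prompt.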

The model is intended for research, experimentation, and creative retro AI work. It is not intended for high-stakes use.

Bias and Historical Content

The training data includes historical books, magazines, and Usenet text. As a result, the model may reproduce outdated language, assumptions, stereotypes, or viewpoints present in older source material.

Users should review outputs carefully.

Model Family

Planned Echo88 model family:

Echo88-150M-Base
Echo88-150M-Instruct
Echo88-150M-Chat

License

The model weights are released under the Apache 2.0 license.

The training dataset is mixed-source and is released separately under a license listed as "other". Users are responsible for checking dataset source rights, licensing, and suitability for their own use case.