guus4324343 committed 18 days ago

Commit a55860f (verified) · 1 parent: e9dadd9

Create README.md

Files changed (1)

README.md +273 -0

README.md ADDED

	@@ -0,0 +1,273 @@

+---
+license: apache-2.0
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+pretty_name: Echo88 150M Instruct
+tags:
+- text-generation
+- causal-lm
+- instruct
+- chat
+- decoder-only
+- autoregressive
+- from-scratch
+- llama
+- retro
+- 1980s
+- usenet
+- magazines
+- books
+- computer-history
+- english
+base_model:
+- guus4324343/Echo88-150M-Base
+datasets:
+- guus4324343/Echo88-150M-Base
+- guus4324343/Echo88-Instruct-173K
+---
+# Echo88-150M-Instruct
+Echo88-150M-Instruct is an experimental small instruction-tuned language model based on **Echo88-150M-Base**.
+Echo88 is designed to feel like a helpful retro computer assistant whose records go up to the end of 1988. The model is focused on older books, magazines, Usenet-style discussion, early personal computing, 1980s culture, and historical computer terminology.
+This is the first public instruction-tuned version of Echo88.
+**Echo88-150M-Instruct v2 is coming soon.**
+## Model Details
+- **Model name:** Echo88-150M-Instruct
+- **Base model:** `guus4324343/Echo88-150M-Base`
+- **Model type:** decoder-only causal language model
+- **Architecture:** LLaMA-style transformer
+- **Training type:** supervised fine-tuning after base pretraining
+- **Parameter count:** 163,606,272 parameters
+- **Language:** English
+- **Context length:** 2048 tokens
+- **Tokenizer:** custom Echo88 byte-level BPE tokenizer
+- **Vocabulary size:** 32,768
+- **Training objective:** autoregressive next-token prediction + supervised instruction tuning
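As a rough sizing aid, the parameter count listed above translates directly into weight storage. The sketch below estimates the weights only (it ignores activations, the KV cache, and framework overhead). The helper name `weight_mib` and the bytes-per-parameter table are illustrative and not part of the Echo88 release.

```python
# Parameter count as listed in Model Details.
PARAMS = 163_606_272

# Bytes per parameter for common dtypes (standard storage sizes).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_mib(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in MiB, for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: {weight_mib(PARAMS, dtype):.1f} MiB")
```

Under these assumptions the bfloat16 weights come to a little over 300 MiB, which is why a model of this size is practical to run on CPU or a small GPU.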
+## Training Data
+The base model was trained from scratch on the Echo88 pretraining dataset.
+Base pretraining data:
+- **Train tokens:** 1,470,629,888
+- **Eval tokens:** 1,454,080
+- **Block size:** 2048 tokens
+- **Dataset:** `Echo88-150M-Base`
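Both token counts above are exact multiples of the block size, which is consistent with the corpus being packed into fixed-length 2048-token sequences (the packing itself is an assumption; the card does not describe it). A quick check using only the figures listed in this card, with illustrative variable names:

```python
# Figures copied from the model card; the names are illustrative.
TRAIN_TOKENS = 1_470_629_888
EVAL_TOKENS = 1_454_080
BLOCK_SIZE = 2048
PARAMS = 163_606_272

# Number of full blocks and leftover tokens in each split.
train_blocks, train_rem = divmod(TRAIN_TOKENS, BLOCK_SIZE)
eval_blocks, eval_rem = divmod(EVAL_TOKENS, BLOCK_SIZE)

# Training tokens seen per parameter, assuming a single pass over the data.
tokens_per_param = TRAIN_TOKENS / PARAMS

print(f"train: {train_blocks} blocks, remainder {train_rem}")
print(f"eval:  {eval_blocks} blocks, remainder {eval_rem}")
print(f"tokens per parameter: {tokens_per_param:.2f}")
```

The ratio comes out at roughly nine tokens per parameter for one pass over the training split; the number of epochs actually used is not stated here.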
+The instruction version was fine-tuned using:
+- `guus4324343/Echo88-Instruct-173K`
+- a small additional set of synthetic repair data covering common pre-1989 facts and post-1988 boundary behavior
+The instruction data includes examples from or based on:
+- UTZOO Usenet
+- BYTE Magazine
+- PC Magazine
+- TIME Magazine
+- Internet Archive Magazine Rack text
+- Gutenberg-style book text
+- synthetic 1988-safe fact repair examples
+- synthetic post-1988 boundary examples
+## Intended Use
+Echo88-150M-Instruct is intended for:
+- retro AI experiments
+- small language model testing
+- 1980s-style assistant behavior
+- computer-history Q&A
+- text generation with a historical / retro flavor
+- experimentation with small from-scratch language models
+Example uses:
+```text
+Ask about early personal computers
+Ask about modems, BASIC, DOS, floppy disks, BBS systems, Usenet
+Generate retro computer-magazine style text
+Experiment with 1980s-limited assistant behavior
+```
+## Chat Format
+Recommended prompt format:
+```text
+<|system|>
+You are Echo88, a helpful computer assistant whose records go up to the end of 1988. Answer clearly. Do not pretend to know events, products, or culture after 1988.
+<|end|>
+<|user|>
+What is a modem?
+<|assistant|>
+```
+The model was trained with these special tokens:
+```text
+<|endoftext|>
+<|pad|>
+<|unk|>
+<|system|>
+<|user|>
+<|assistant|>
+<|end|>
+```
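Before relying on the chat format, it can be worth confirming that a loaded tokenizer maps each listed special token to its own id rather than to the unknown token. The sketch below is a minimal check, assuming a Hugging Face tokenizer object that exposes `convert_tokens_to_ids`, `unk_token`, and `unk_token_id`; the helper name `missing_special_tokens` is hypothetical.

```python
# Special tokens as listed in the model card.
SPECIAL_TOKENS = [
    "<|endoftext|>",
    "<|pad|>",
    "<|unk|>",
    "<|system|>",
    "<|user|>",
    "<|assistant|>",
    "<|end|>",
]


def missing_special_tokens(tokenizer):
    """Return the listed tokens that the tokenizer does not know as single ids."""
    unk_id = tokenizer.unk_token_id
    missing = []
    for token in SPECIAL_TOKENS:
        token_id = tokenizer.convert_tokens_to_ids(token)
        # A token that resolves to the unknown id (and is not the unknown
        # token itself) is not in the vocabulary.
        if token_id is None or (token_id == unk_id and token != tokenizer.unk_token):
            missing.append(token)
    return missing
```

With the real tokenizer this would be called as `missing_special_tokens(AutoTokenizer.from_pretrained("guus4324343/Echo88-150M-Instruct"))`, and an empty list would indicate that every marker is present.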
+## Example Usage
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the tokenizer and weights from the Hub.
+# Note: device_map="auto" requires the accelerate package.
+model_id = "guus4324343/Echo88-150M-Instruct"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+# System prompt from the recommended chat format.
+SYSTEM_PROMPT = (
+    "You are Echo88, a helpful computer assistant whose records go up to the end of 1988. "
+    "Answer clearly. Do not pretend to know events, products, or culture after 1988."
+)
+def ask(question, max_new_tokens=120):
+    # Build the prompt in the recommended chat format.
+    prompt = (
+        "<|system|>\n"
+        + SYSTEM_PROMPT
+        + "\n<|end|>\n"
+        + "<|user|>\n"
+        + question
+        + "\n<|assistant|>\n"
+    )
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    # Sample a reply; the repetition settings are meant to limit loops.
+    with torch.no_grad():
+        output = model.generate(
+            **inputs,
+            max_new_tokens=max_new_tokens,
+            do_sample=True,
+            temperature=0.55,
+            top_p=0.85,
+            repetition_penalty=1.18,
+            no_repeat_ngram_size=4,
+            pad_token_id=tokenizer.pad_token_id,
+            eos_token_id=tokenizer.eos_token_id,
+        )
+    # Keep special tokens so the reply can be cut at the chat markers.
+    text = tokenizer.decode(output[0], skip_special_tokens=False)
+    # Take the text after the last <|assistant|> marker, up to the next <|end|>.
+    answer = text.split("<|assistant|>")[-1].split("<|end|>")[0].strip()
+    return answer
+print(ask("What is a modem?"))
+```
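The example above stops generation on `tokenizer.eos_token_id` and trims the reply at `<|end|>` afterwards. The card does not say whether `<|end|>` is itself configured as the EOS token, so generation may continue past the end of the turn until `max_new_tokens` is reached. One possible refinement, sketched below under that uncertainty, is to collect both ids and pass them together, since `generate` accepts a list for `eos_token_id`. The helper name `stop_token_ids` is hypothetical.

```python
def stop_token_ids(tokenizer, end_token="<|end|>"):
    """Collect the ids that should end generation: the EOS id plus the turn marker."""
    candidates = (
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids(end_token),
    )
    ids = []
    for token_id in candidates:
        # Skip ids that are unset, unknown to the vocabulary, or already collected.
        if token_id is None or token_id == tokenizer.unk_token_id:
            continue
        if token_id not in ids:
            ids.append(token_id)
    return ids
```

In the `ask` function this would replace the last argument of the call with `eos_token_id=stop_token_ids(tokenizer)`; the existing `split("<|end|>")` trimming still works unchanged.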
+## Example Prompts
+```text
+What is a modem?
+What is the IBM PC?
+What is BASIC?
+What is a bulletin board system?
+What is desktop publishing?
+Who is Michael Jackson?
+What is the Cold War?
+What happened at Chernobyl?
+What is Google?
+Who won the World Cup in 1994?
+```
+## Knowledge Boundary
+Echo88 is designed around a knowledge boundary ending at the close of **1988**.
+It should treat topics from after 1988 with caution, such as:
+* Google
+* Facebook
+* iPhone
+* smartphones
+* Wikipedia
+* YouTube
+* Windows 95
+* PlayStation
+* COVID-19
+* 1990s, 2000s, 2010s, and 2020s events
+Because this is a small experimental model, it may still hallucinate or answer incorrectly about later topics.
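When testing boundary behavior by hand, a simple text scan can help triage outputs for manual review. The sketch below flags mentions of the topics listed above and of years after the cutoff. It is a crude heuristic, not an evaluation method from the Echo88 project: the term list, the regular expression, and the function name `boundary_flags` are all illustrative, and a correct refusal that names the topic ("I have no record of Google") is flagged too.

```python
import re

# Terms taken from the list above; matching is case-insensitive substring search.
POST_1988_TERMS = [
    "Google", "Facebook", "iPhone", "smartphone", "Wikipedia",
    "YouTube", "Windows 95", "PlayStation", "COVID-19",
]

# Four-digit years from 1980 to 2099, optionally written as a decade ("1990s").
YEAR_RE = re.compile(r"\b(19[89]\d|20\d\d)s?\b")


def boundary_flags(text: str, cutoff_year: int = 1988) -> list:
    """Return post-cutoff terms and years mentioned in the text."""
    lowered = text.lower()
    flags = [term for term in POST_1988_TERMS if term.lower() in lowered]
    flags += [year for year in YEAR_RE.findall(text) if int(year) > cutoff_year]
    return flags


print(boundary_flags("The IBM PC was introduced in 1981."))
print(boundary_flags("Google was founded in 1998."))
```

An empty result does not prove an answer is boundary-safe; it only means none of the listed terms or later years appeared.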
+## Limitations
+Echo88-150M-Instruct is experimental and small.
+Known limitations:
+* may hallucinate
+* may repeat phrases
+* may confuse people, places, or events
+* may produce incorrect facts
+* may over-refuse some valid pre-1989 topics
+* may fail to refuse some post-1988 topics
+* may produce OCR-like or magazine-like wording
+* may struggle with reasoning
+* may answer with outdated or historically biased language
+This model is not intended for high-stakes use.
+## Current Version
+This is **Echo88-150M-Instruct v0**.
+It is the first instruction-tuned version of Echo88. It can answer some retro computing and general historical questions, but it is not yet reliable.
+A better version is planned.
+## Coming Soon
+**Echo88-150M-Instruct v2 is coming soon.**
+Planned improvements:
+* better factual repair data
+* stronger post-1988 boundary behavior
+* better pop culture and history answers
+* fewer loops and repetitions
+* cleaner chat behavior
+* better answer style
+* improved evaluation prompts
+* possible larger model or expanded pretraining data
+## Related Models and Datasets
+* Base model: `guus4324343/Echo88-150M-Base`
+* Base dataset: `guus4324343/Echo88-Pretrain-1.17B`
+* Instruction dataset: `guus4324343/Echo88-Instruct-173K`
+## Bias and Historical Content
+Echo88 was trained on historical books, magazines, Usenet text, and synthetic instruction data. It may reproduce outdated assumptions, language, stereotypes, or viewpoints from older source material.
+Users should review outputs carefully.
+## License
+The model weights are released under the Apache 2.0 license.
+The training datasets are mixed-source and released separately. Users are responsible for checking dataset source rights, licensing, and suitability for their own use case.