---
license: mit
library_name: transformers
tags:
- gpt2
- onnx
- mechanistic-interpretability
- circuit-ablation
- rti-circuit
base_model: openai-community/gpt2
---

# GPT-2 with RTI Circuit Mean-Ablated

GPT-2 (124M) with the 15-head **Repeated Token Identification (RTI) circuit** removed via mean ablation.

## What was ablated

The RTI circuit consists of 15 attention heads, written `layer.head` below, spanning 4 functional tiers:

| Tier | Heads | Function |
|------|-------|----------|
| Backbone | 0.8, 0.9, 0.11 | Broad token matching via positional/frequency features |
| Detector | 4.11 | Repeated-token detection gate |
| Copier | 4.0, 5.6, 5.7, 7.0, 8.4, 8.7, 9.3, 9.10 | Copy repeated token identity to output |
| Readout | 10.11, 11.9, 11.11 | Route copied information to final logits |
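
For reference, the same circuit written down as a plain data structure. This is a minimal sketch: the name `RTI_HEADS` and the `(layer, head)` tuple encoding are our own, not part of the released artifacts:

```python
# The 15 RTI circuit heads as (layer, head) pairs, grouped by tier.
RTI_HEADS = {
    "backbone": [(0, 8), (0, 9), (0, 11)],
    "detector": [(4, 11)],
    "copier":   [(4, 0), (5, 6), (5, 7), (7, 0), (8, 4), (8, 7), (9, 3), (9, 10)],
    "readout":  [(10, 11), (11, 9), (11, 11)],
}
```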

## Ablation method

**Mean ablation.** For each circuit head:
1. The head's mean output was computed over a dataset of 20 diverse text examples.
2. The head's mean contribution (`W_O @ mean_head_output`) was added to `c_proj.bias`.
3. The rows of `c_proj.weight` corresponding to that head were zeroed (Hugging Face's `Conv1D` stores W_O transposed, so these rows are the head's columns of W_O).

This replaces each head's input-dependent computation with its average output, preserving the head's unconditional contribution while removing its ability to respond to specific inputs.
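
A minimal PyTorch sketch of the procedure, assuming the standard Hugging Face `transformers` GPT-2 implementation. The function names and the one-sentence example corpus are our own; the released model used the 20-example dataset described above:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def head_mean_output(model, tokenizer, layer, head, texts):
    """Step 1: average one head's output over a small text dataset."""
    head_dim = model.config.n_embd // model.config.n_head
    lo, hi = head * head_dim, (head + 1) * head_dim
    total, count = torch.zeros(head_dim), 0

    def grab(module, args):
        # c_proj's input is the concatenation of all head outputs, so
        # slicing lo:hi of the last dimension isolates this head.
        nonlocal count
        merged = args[0]  # (batch, seq, n_embd)
        total.add_(merged[..., lo:hi].reshape(-1, head_dim).sum(dim=0))
        count += merged.shape[0] * merged.shape[1]

    handle = model.transformer.h[layer].attn.c_proj.register_forward_pre_hook(grab)
    with torch.no_grad():
        for text in texts:
            model(**tokenizer(text, return_tensors="pt"))
    handle.remove()
    return total / count

def mean_ablate_head(model, layer, head, mean_out):
    """Steps 2-3: fold the mean contribution into the bias, then cut the head."""
    head_dim = model.config.n_embd // model.config.n_head
    lo, hi = head * head_dim, (head + 1) * head_dim
    c_proj = model.transformer.h[layer].attn.c_proj
    with torch.no_grad():
        # c_proj computes x @ W + b, so rows lo:hi of W carry this head.
        c_proj.bias += mean_out @ c_proj.weight[lo:hi, :]  # step 2
        c_proj.weight[lo:hi, :] = 0.0                      # step 3

# Example: ablate the detector head 4.11 (placeholder corpus, not the real dataset).
model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("openai-community/gpt2")
mean = head_mean_output(model, tokenizer, layer=4, head=11, texts=["Hello world."])
mean_ablate_head(model, layer=4, head=11, mean_out=mean)
```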

## Effect

The ablated model loses the ability to predict repeated tokens:
- **Normal GPT-2**: "The cat sat on the mat. The cat" → " sat on the mat.\n\nThe cat sat"
- **Mean-ablated**: "The cat sat on the mat. The cat" → " was a little bit older than me, but I"
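
If the repository also ships PyTorch weights (this card only documents the Transformers.js/ONNX path, so treat that as an assumption), the comparison can be reproduced with a short greedy-decoding loop:

```python
from transformers import pipeline

prompt = "The cat sat on the mat. The cat"
for model_id in ["openai-community/gpt2", "elliottower2/gpt2-rti-mean-ablated"]:
    generate = pipeline("text-generation", model=model_id)
    out = generate(prompt, max_new_tokens=10, do_sample=False)
    print(model_id, "->", out[0]["generated_text"][len(prompt):])
```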

## Usage with Transformers.js

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from '@huggingface/transformers';

const model = await AutoModelForCausalLM.from_pretrained('elliottower2/gpt2-rti-mean-ablated', {
  dtype: 'fp32',
});
const tokenizer = await AutoTokenizer.from_pretrained('elliottower2/gpt2-rti-mean-ablated');
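
// Sketch of a short greedy generation, following the Transformers.js v3 pattern
// (exact generation options may vary by version).
const inputs = tokenizer('The cat sat on the mat. The cat');
const outputs = await model.generate({ ...inputs, max_new_tokens: 10 });
console.log(tokenizer.batch_decode(outputs, { skip_special_tokens: true })[0]);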
```

## Citation

Part of the factorization-circuits project studying weight-space circuit discovery in transformers.