llama-8b-reflect-sft / README.md

krystal7

Update Reflector SFT model card

6d6e2eb verified 5 days ago


	---
	license: llama3
	library_name: transformers
	pipeline_tag: text-generation
	base_model:
	- dphn/Dolphin3.0-Llama3.1-8B
	tags:
	- llama
	- llama-3
	- llama-3.1
	- safety
	- reflection
	- self-correction
	- sft
	- reflector
	language:
	- en
	---

	# Llama 8B Reflect SFT

	`krystal7/llama-8b-reflect-sft` is the supervised fine-tuning (SFT) release from the Reflector project. It is a Llama 3-family 8B instruction model trained to reflect before producing a final answer, with an emphasis on risk-aware generation and safety-aware self-correction.

	Reflector targets a practical failure mode in safety alignment: a model may recognize surface-level unsafe prompts, but still struggle with indirect harmful requests, multi-step risky reasoning, jailbreak-style framing, or ambiguous dual-use questions. The core idea is to internalize a self-reflection step so the model can inspect intent, identify potential harm, and redirect toward a safe and useful response at reasoning time.

	Paper: [Reflector: Internalizing Self-Reflection into Language Models](https://arxiv.org/abs/2605.20654)

	Code: https://github.com/mjc-ma-01/self-reflection-llm

	## Model Highlights

	- Reflection-based safety alignment for Llama 3-family 8B models.
	- Trained with the Reflector SFT pipeline on harmful and general pattern data.
	- Designed for risk-aware generation, harmful-intent recognition, and safer multi-step reasoning.
	- Stronger handling of indirect harmful requests and jailbreak-style framing than plain instruction-following behavior.
	- Compatible with local deployment through `transformers` and OpenAI-compatible vLLM serving.

	## Intended Use

	This model is intended for research and application prototyping around:

	- safety-aware chat assistants
	- reflective reasoning studies
	- harmful-intent detection and safe redirection
	- local Llama-style deployment experiments
	- evaluation of reasoning-time alignment methods

	It is not a replacement for a full production safety stack. Deployments should still use policy filters, monitoring, rate limits, and domain-specific review.

	## Quick Start with Transformers

	```python
	import os
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "krystal7/llama-8b-reflect-sft"

	# Use a local cache directory unless one is already configured in the environment.
	os.environ.setdefault("HF_HOME", "./hf_cache")
	os.environ.setdefault("HUGGINGFACE_HUB_CACHE", os.path.join(os.environ["HF_HOME"], "hub"))

	tokenizer = AutoTokenizer.from_pretrained(
	    model_id,
	    cache_dir=os.environ["HUGGINGFACE_HUB_CACHE"],
	)
	# bfloat16 on GPU, float32 on CPU; device_map="auto" places the weights automatically.
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    cache_dir=os.environ["HUGGINGFACE_HUB_CACHE"],
	    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
	    device_map="auto",
	)

	messages = [
	    {"role": "system", "content": "You are a helpful and harmless assistant."},
	    {"role": "user", "content": "How can I learn about lock mechanisms for a security class without doing anything illegal?"},
	]

	# Render the conversation with the model's chat template, then tokenize it.
	prompt = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Greedy decoding; temperature=None clears any sampling temperature from the default generation config.
	with torch.no_grad():
	    output_ids = model.generate(
	        **inputs,
	        max_new_tokens=512,
	        do_sample=False,
	        temperature=None,
	        pad_token_id=tokenizer.eos_token_id,
	    )

	# Decode only the newly generated tokens, not the prompt.
	answer = tokenizer.decode(output_ids[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
	print(answer.strip())
	```

	## vLLM Serving

	```bash
	pip install vllm

	export HF_HOME=./hf_cache
	export HUGGINGFACE_HUB_CACHE="$HF_HOME/hub"

	vllm serve krystal7/llama-8b-reflect-sft \
	    --dtype bfloat16 \
	    --max-model-len 4096 \
	    --served-model-name reflector-sft
	```

	OpenAI-compatible client:

	```python
	from openai import OpenAI

	client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

	response = client.chat.completions.create(
	    model="reflector-sft",
	    messages=[
	        {"role": "system", "content": "You are a helpful and harmless assistant."},
	        {"role": "user", "content": "Explain how to handle an ambiguous dual-use safety question responsibly."},
	    ],
	    temperature=0,
	    max_tokens=512,
	)

	print(response.choices[0].message.content)
	```
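
Where the `openai` package is not available, the same request can be sketched with the Python standard library alone. This is a minimal sketch, assuming the server from the previous step is listening on `http://localhost:8000` and was started with `--served-model-name reflector-sft`; the helper names `build_chat_payload` and `chat` are illustrative and are not part of vLLM or the Reflector repository.

```python
import json
import urllib.request


def build_chat_payload(
    user_text,
    system_text="You are a helpful and harmless assistant.",
    model="reflector-sft",
    max_tokens=512,
):
    """Build an OpenAI-style chat-completions request body (temperature 0)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0,
        "max_tokens": max_tokens,
    }


def chat(user_text, base_url="http://localhost:8000/v1", timeout=120):
    """POST the payload to /chat/completions and return the reply text.

    Assumes a running OpenAI-compatible server at base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Explain how to handle an ambiguous dual-use safety question responsibly."))` should print the model's reply. Keeping payload construction separate from the network call makes the request body easy to inspect or log before it is sent.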

	## Training Summary

	This checkpoint was trained with the Reflector SFT pipeline.

	| Item | Description |
	|---|---|
	| Base family | Llama 3-family 8B instruction model |
	| Training stage | Supervised fine-tuning |
	| Data schema | harmful pattern + general pattern |
	| Objective | learn reflection-oriented, safety-aware responses |
	| Output format | standard HuggingFace causal LM checkpoint |

	The repository also includes a GDPO RL pipeline. This SFT model card only describes the released SFT checkpoint.

	## Evaluation

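	The following benchmark export was produced with the current Reflector evaluation pipeline using 50 samples per benchmark. The XSTest sample is reported as two rows: 26 prompts scored for helpfulness and 24 scored for refusal. The harmful pattern training data is not used as a public benchmark, and SimpleQA loading is retained only for future data reuse.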

	| model | benchmark | dataset key | metric | score | n |
	|---|---|---|---|---:|---:|
	| krystal7/llama-8b-reflect-sft | Do-Not-Answer | donot | refusal_rate | 0.7600 | 50 |
	| krystal7/llama-8b-reflect-sft | General Benchmark | general | helpfulness_proxy | 1.0000 | 50 |
	| krystal7/llama-8b-reflect-sft | GSM8K | gsm8k | accuracy | 0.7000 | 50 |
	| krystal7/llama-8b-reflect-sft | StrongREJECT | strongreject | refusal_rate | 1.0000 | 50 |
	| krystal7/llama-8b-reflect-sft | ToxicityPrompts | wildchat | refusal_rate | 0.9400 | 50 |
	| krystal7/llama-8b-reflect-sft | XSTest | xstest | helpfulness_proxy | 0.6154 | 26 |
	| krystal7/llama-8b-reflect-sft | XSTest | xstest | refusal_rate | 1.0000 | 24 |
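
With at most 50 samples per row, each score carries substantial sampling uncertainty. The sketch below converts the scores in the table above back to counts and attaches a 95% Wilson score interval. It assumes every metric is a plain proportion of per-sample pass/fail outcomes, which the table does not state explicitly; `wilson_interval` and `summarize` are illustrative names and are not part of the Reflector evaluation pipeline.

```python
import math

# (benchmark, metric, score, n) copied from the table above.
ROWS = [
    ("Do-Not-Answer", "refusal_rate", 0.7600, 50),
    ("General Benchmark", "helpfulness_proxy", 1.0000, 50),
    ("GSM8K", "accuracy", 0.7000, 50),
    ("StrongREJECT", "refusal_rate", 1.0000, 50),
    ("ToxicityPrompts", "refusal_rate", 0.9400, 50),
    ("XSTest", "helpfulness_proxy", 0.6154, 26),
    ("XSTest", "refusal_rate", 1.0000, 24),
]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def summarize(rows=ROWS):
    """Return (benchmark, metric, successes, n, low, high) for each table row."""
    summary = []
    for benchmark, metric, score, n in rows:
        # Assumption: the score is successes / n, so rounding recovers the count.
        successes = round(score * n)
        low, high = wilson_interval(successes, n)
        summary.append((benchmark, metric, successes, n, low, high))
    return summary


for benchmark, metric, k, n, low, high in summarize():
    print(f"{benchmark:18s} {metric:18s} {k:2d}/{n:<2d}  95% CI [{low:.2f}, {high:.2f}]")
```

Under that assumption, the Do-Not-Answer refusal rate of 0.76 corresponds to 38 of 50 prompts, with an interval of roughly 0.63 to 0.86, and even a perfect 50 of 50 only bounds the true rate at about 0.93 from below.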

	## Limitations

	- The model can still make factual errors or produce incomplete refusals.
	- Safety behavior should be evaluated in the target deployment domain before release.
	- The benchmark table is a lightweight reproducibility export, not a comprehensive safety certification.
	- Reflection-style behavior may vary with decoding settings, system prompts, and prompt formatting.
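
One way to probe the last point is to send the same prompt under several configurations and compare the replies. The sketch below only builds the OpenAI-style request bodies for such a sweep. The particular system prompts and temperatures are illustrative assumptions, the model name `reflector-sft` assumes the vLLM serving command shown earlier, and `build_grid` is a hypothetical helper rather than part of the Reflector code.

```python
from itertools import product

# Illustrative settings to compare; adjust to the deployment being evaluated.
SYSTEM_PROMPTS = {
    "default": "You are a helpful and harmless assistant.",
    "none": None,  # omit the system message entirely
}
TEMPERATURES = [0.0, 0.7]


def build_grid(user_text, model="reflector-sft", max_tokens=512):
    """Return one labeled chat-completions request per (system prompt, temperature)."""
    grid = []
    for (name, system_text), temperature in product(SYSTEM_PROMPTS.items(), TEMPERATURES):
        messages = []
        if system_text is not None:
            messages.append({"role": "system", "content": system_text})
        messages.append({"role": "user", "content": user_text})
        grid.append({
            "label": f"system={name},temperature={temperature}",
            "request": {
                "model": model,
                "messages": messages,
                "temperature": temperature,
                "max_tokens": max_tokens,
            },
        })
    return grid


for item in build_grid("How do lock mechanisms work?"):
    print(item["label"], len(item["request"]["messages"]))
```

Each `request` dictionary can be passed to an OpenAI-compatible client as keyword arguments, and the `label` keeps the replies attributable to the setting that produced them.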

	## Citation

	```bibtex
	@misc{reflector2026,
	  title = {Reflector: Internalizing Self-Reflection into Language Models},
	  year = {2026},
	  eprint = {2605.20654},
	  archivePrefix = {arXiv},
	  primaryClass = {cs.CL}
	}
	```