How to use leapeto/Qwen3-4B-AbstractCoT-warmup with libraries and local apps.

Libraries

How to use leapeto/Qwen3-4B-AbstractCoT-warmup with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="leapeto/Qwen3-4B-AbstractCoT-warmup")

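# Load model directly (AutoModelForCausalLM keeps the language-modeling head needed for text generation)
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("leapeto/Qwen3-4B-AbstractCoT-warmup")
model = AutoModelForCausalLM.from_pretrained("leapeto/Qwen3-4B-AbstractCoT-warmup", dtype="auto")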

Local Apps

vLLM

How to use leapeto/Qwen3-4B-AbstractCoT-warmup with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "leapeto/Qwen3-4B-AbstractCoT-warmup"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "leapeto/Qwen3-4B-AbstractCoT-warmup",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
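
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the default `base_url` assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "leapeto/Qwen3-4B-AbstractCoT-warmup"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Requires a running server, so the call is left commented out:
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should target a server listening on port 30000 instead, since the endpoint path and payload are the same.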

Use Docker

docker model run hf.co/leapeto/Qwen3-4B-AbstractCoT-warmup

SGLang

How to use leapeto/Qwen3-4B-AbstractCoT-warmup with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "leapeto/Qwen3-4B-AbstractCoT-warmup" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "leapeto/Qwen3-4B-AbstractCoT-warmup",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "leapeto/Qwen3-4B-AbstractCoT-warmup" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "leapeto/Qwen3-4B-AbstractCoT-warmup",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner

How to use leapeto/Qwen3-4B-AbstractCoT-warmup with Docker Model Runner:

```
docker model run hf.co/leapeto/Qwen3-4B-AbstractCoT-warmup
```

train_logs/pi3_phaseB.json

Commit e68e718 by leapeto: "Add files using upload-large-folder tool"

	{
	"losses": [
	0.24705615827115252,
	0.19863037412578705,
	0.20961245244543533,
	0.23088445312132536,
	0.20231807196396404,
	0.21607737393860588,
	0.2541877782714437,
	0.23250892728756298,
	0.26798237174225503,
	0.2450843573300517,
	0.26578180299547965,
	0.26796957241640484,
	0.25621661408949875,
	0.30977868895424765,
	0.26767159106384497,
	0.3011666447739117,
	0.2829906645929441,
	0.2978396859412896,
	0.3276536007644609,
	0.3177644655283075,
	0.3392833017860539,
	0.35701846808369736,
	0.3581092601059936,
	0.3342221752507612,
	0.34096981105394664,
	0.4076683118997607,
	0.36950865692924706,
	0.32229581456631423,
	0.37231990886793936,
	0.36889135412639007,
	0.35545787583687344
	],
	"lrs": [
	6.666666666666667e-05,
	9.993008576227247e-05,
	9.937194443381972e-05,
	9.826190093588563e-05,
	9.661236384224129e-05,
	9.444177243274618e-05,
	9.177439057064683e-05,
	8.864003547001915e-05,
	8.507374438531607e-05,
	8.111538294891684e-05,
	7.680919953486048e-05,
	7.220333063028872e-05,
	6.734926274378312e-05,
	6.230125686563068e-05,
	5.7115741913664264e-05,
	5.185068394501791e-05,
	4.6564938185035956e-05,
	4.131759111665349e-05,
	3.616729998467365e-05,
	3.1171637098265064e-05,
	2.638644626136587e-05,
	2.1865218525109495e-05,
	1.7658494240397126e-05,
	1.3813298094746491e-05,
	1.037261344883343e-05,
	7.374901848832683e-06,
	4.853673085668947e-06,
	2.8371106072518195e-06,
	1.3477564710088098e-06,
	4.02259358460233e-07,
	1.1188468644907079e-08
	],
	"wallclock_s": 947,
	"n_examples": 5000,
	"epochs": 1,
	"mode": "distill",
	"lora_rank": 32,
	"total_opt_steps": 156,
	"num_processes": 2
	}
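
The log can be inspected with a few lines of Python. The sketch below computes simple throughput figures from the scalar fields (5000 examples over 156 optimizer steps is about 32 examples per step, and 5000 examples in 947 s is about 5.3 examples per second) and compares the `lrs` list against a linear-warmup plus cosine-decay schedule. The schedule parameters are assumptions fitted to the logged values, not fields in the log: a peak of 1e-4, a warmup of 7.5 steps, and one logged point every 5 optimizer steps (31 points for 156 steps). The function names and the local file path are hypothetical.

```python
import json
import math


def summarize_log(log):
    """Return basic statistics for a training log shaped like pi3_phaseB.json."""
    losses, lrs = log["losses"], log["lrs"]
    return {
        "logged_points": len(losses),
        "first_loss": losses[0],
        "last_loss": losses[-1],
        "peak_lr": max(lrs),
        "examples_per_opt_step": log["n_examples"] / log["total_opt_steps"],
        "examples_per_second": log["n_examples"] / log["wallclock_s"],
    }


def warmup_cosine_lr(step, total_steps=156, warmup_steps=7.5, peak_lr=1e-4):
    """Linear warmup followed by cosine decay.

    warmup_steps and peak_lr are ASSUMED values inferred from the logged
    learning rates; only total_steps (total_opt_steps) appears in the log.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Usage, assuming the log was saved locally as pi3_phaseB.json (hypothetical path):
# with open("pi3_phaseB.json") as f:
#     log = json.load(f)
# print(summarize_log(log))
# # Assumes one logged value every 5 optimizer steps, starting at step 5.
# predicted = [warmup_cosine_lr(5 * (i + 1)) for i in range(len(log["lrs"]))]
# print(max(abs(p - a) for p, a in zip(predicted, log["lrs"])))
```

Under these assumptions the predicted values agree with the logged first, second, and last learning rates to many digits, which suggests but does not confirm that this was the schedule used.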