canada-quant/DeepSeek-V4-Flash-W4A16-FP8

mixture-of-experts

compressed-tensors

DeepSeek-V4-Flash-W4A16-FP8 / generation_config.json

pastapaul

Phase 3b: AWQ-W4A16 quantization (FP8_BLOCK attn + W4A16 routed experts)

2e7ef6a verified 20 days ago

174 Bytes

{
  "_from_model_config": true,
  "bos_token_id": 0,
  "do_sample": true,
  "eos_token_id": 1,
  "temperature": 1.0,
  "top_p": 1.0,
  "transformers_version": "5.8.0.dev0"
}
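
With `do_sample` enabled, `temperature` at 1.0 and `top_p` at 1.0, these defaults should amount to sampling from the model's unmodified softmax distribution. The sketch below illustrates that reading on toy logits using only the Python standard library. The helper names `softmax` and `sampling_distribution` and the toy logits are hypothetical, and the nucleus filter is a simplified stand-in, not the implementation used by transformers.

```python
import json
import math

# Contents of generation_config.json, copied from the file above.
CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "do_sample": true,
  "eos_token_id": 1,
  "temperature": 1.0,
  "top_p": 1.0,
  "transformers_version": "5.8.0.dev0"
}
"""


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sampling_distribution(logits, temperature, top_p):
    """Hypothetical helper: temperature scaling followed by a simplified
    nucleus (top-p) filter. Returns the renormalized token probabilities."""
    probs = softmax([x / temperature for x in logits])
    # Visit tokens from most to least likely, keeping the smallest prefix
    # whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


config = json.loads(CONFIG_TEXT)
toy_logits = [2.0, 1.0, 0.0]  # assumed values, for illustration only

# With the values in this file, no token is filtered or reweighted.
with_config = sampling_distribution(
    toy_logits, config["temperature"], config["top_p"]
)

# For contrast, a tighter nucleus drops the low-probability tail.
tighter = sampling_distribution(toy_logits, config["temperature"], 0.5)
```

Here `with_config` matches `softmax(toy_logits)` up to floating-point rounding, while `tighter` keeps only the most likely token. Callers who want less random output would typically override these values at generation time rather than edit the file.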