	---
	license: apache-2.0
	language:
	- en
	tags:
	- mlx
	- distill
	- cli
	- code
	- compression
	- qwen
	- expert-model
	- domain-specific
	- task-specialized
	pipeline_tag: text-generation
	base_model: Qwen/Qwen3-1.7B
	---

	# distill-1.7B — Expert Language Model for CLI Output

	distill-1.7B is a domain-specific Expert Language Model — not a general-purpose chatbot. It does exactly one thing: compress and classify raw terminal output into structured, actionable summaries.

	Built for the [distill](https://github.com/samuelfaj/distill) engine — an open-source CLI output compression tool.

	## What is distill?

	[distill](https://github.com/samuelfaj/distill) is a tool that takes arbitrary command-line output and reduces it to only what matters. Instead of scrolling through 500 lines of `npm install` logs, you get:

	```
	PASS
	24 packages installed, 0 vulnerabilities
	```

	Instead of parsing a wall of Terraform plan output, you get:

	```json
	{"create": 3, "change": 12, "destroy": 0}
	```

	distill-1.7B is the model behind distill: it reads CLI output and separates signal from noise.
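
Because the Terraform summary above is plain JSON, a script can act on it directly. The following is only an illustrative sketch: `is_destructive` is a hypothetical helper, not part of distill, and it assumes the summary uses the `create`, `change`, and `destroy` keys shown in the example.

```python
import json


def is_destructive(summary_json: str) -> bool:
    """Return True when a terraform_plan-style summary reports destroyed resources."""
    counts = json.loads(summary_json)
    # Assumption: a missing "destroy" key is treated as zero destroyed resources.
    return int(counts.get("destroy", 0)) > 0


summary = '{"create": 3, "change": 12, "destroy": 0}'
print(is_destructive(summary))
```

A check like this could gate a deployment step, for example by refusing to apply a plan automatically when anything would be destroyed.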

	## Why "Expert Language Model"?

	Unlike general-purpose LLMs (ChatGPT, Claude, etc.), which can talk about anything, distill-1.7B is deliberately narrow:

	| Trait | General LLM | distill-1.7B |
	|-------|-------------|--------------|
	| Scope | Any topic | CLI output only |
	| Size | 70-400B params | 1.7B params |
	| Training data | Web crawl (trillions of tokens) | 100k synthetic CLI outputs |
	| Strengths | Conversation, reasoning, code | CLI compression, classification |
	| Weaknesses | - | Can't chat, can't code, can't reason |

	It's an expert in the same way a radiologist is an expert — highly skilled in one narrow domain, not trying to be a general practitioner.

	## 8 Specialized Tasks

	| Task | What it does | Example output |
	|------|--------------|----------------|
	| `pass_fail` | Did the command succeed or fail? | `PASS` / `FAIL Error: ...` |
	| `safe_review` | Is this Terraform plan safe? | `SAFE` / `UNSAFE` / `REVIEW` |
	| `terraform_plan` | Count resources created/changed/destroyed | `{"create":3,"change":12,"destroy":0}` |
	| `json_extraction` | Pull JSON from noisy logs | `[{"name":"app","version":"2.1.0"}]` |
	| `security_audit` | Count vulns by severity | `[{"severity":"high","count":2}]` |
	| `test_result` | Test suite pass/fail? | `PASS\n4 passed, 0 failed` |
	| `typescript_check` | Extract TS compiler errors | `error TS2741: Property 'x' is missing` |
	| `generic` | Free-form summary of any CLI output | `24 packages installed` |
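
Outputs that lead with a status word, as in the `pass_fail` and `test_result` rows, can be split into a boolean and a detail string. The sketch below is a hypothetical parser, not part of distill: `StatusResult` and `parse_status` are names invented here, and it assumes the first whitespace-delimited token is always `PASS` or `FAIL`.

```python
from dataclasses import dataclass


@dataclass
class StatusResult:
    passed: bool
    detail: str


def parse_status(text: str) -> StatusResult:
    """Parse output that starts with PASS or FAIL into a status flag and detail text."""
    # Split off the first token; whatever follows (same line or next) is the detail.
    parts = text.strip().split(maxsplit=1)
    if not parts or parts[0] not in ("PASS", "FAIL"):
        raise ValueError(f"unexpected status output: {text!r}")
    detail = parts[1].strip() if len(parts) > 1 else ""
    return StatusResult(passed=parts[0] == "PASS", detail=detail)


print(parse_status("PASS\n4 passed, 0 failed"))
print(parse_status("FAIL Error: ..."))
```

Raising on anything other than `PASS` or `FAIL` is a deliberate choice in this sketch: an unexpected first token is surfaced to the caller rather than silently treated as a failure.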

	## Performance

	| Metric | Value |
	|--------|-------|
	| Overall accuracy | 95% |
	| Tasks at 100% | 6 of 8 |
	| Base model | Qwen3-1.7B |
	| Training | LoRA rank 32, 4000 iterations |
	| Dataset | 100k synthetic CLI outputs |
	| Training hardware | Apple M4 Max, 128 GB RAM |

	## Available Formats

	| Repo | Format | Size | Platform |
	|------|--------|------|----------|
	| distill-1.7B-MLX | MLX fp16 | 3.2 GB | macOS (Apple Silicon) |
	| [distill-1.7B-4bit-MLX](https://huggingface.co/samuelfaj/distill-1.7B-4bit-MLX) | MLX 4-bit | 1.0 GB | macOS (Apple Silicon) |
	| [distill-1.7B-GGUF](https://huggingface.co/samuelfaj/distill-1.7B-GGUF) | GGUF fp16 | 4.1 GB | Cross-platform |
	| [distill-1.7B-4bit-GGUF](https://huggingface.co/samuelfaj/distill-1.7B-4bit-GGUF) | GGUF Q4_K_M | 1.2 GB | Cross-platform |

	All four formats reach the same 95% overall accuracy, so choose based on your platform and how much disk space and memory you can spare.
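
The choice of format can also be made programmatically. The sketch below is one possible selection rule, not something the project prescribes: `pick_repo` is a hypothetical helper, the repo names come from the table above, and the rule (MLX on Apple Silicon macOS, GGUF everywhere else, 4-bit by default) is an assumption.

```python
import platform

# Repo names are taken from the formats table; the selection rule is an assumption.
REPOS = {
    ("mlx", False): "samuelfaj/distill-1.7B-MLX",
    ("mlx", True): "samuelfaj/distill-1.7B-4bit-MLX",
    ("gguf", False): "samuelfaj/distill-1.7B-GGUF",
    ("gguf", True): "samuelfaj/distill-1.7B-4bit-GGUF",
}


def pick_repo(system: str, machine: str, prefer_small: bool = True) -> str:
    """Choose an MLX repo on Apple Silicon macOS and a GGUF repo elsewhere."""
    on_apple_silicon = system == "Darwin" and machine == "arm64"
    backend = "mlx" if on_apple_silicon else "gguf"
    return REPOS[(backend, prefer_small)]


print(pick_repo(platform.system(), platform.machine()))
```

Passing `system` and `machine` as arguments, rather than reading them inside the function, keeps the rule easy to test on any machine.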

	## Usage

	```python
	from mlx_lm import load, generate

	# Download (or load from cache) the fp16 MLX weights and tokenizer.
	model, tokenizer = load("samuelfaj/distill-1.7B-MLX")

	# The system prompt sets the task; the user turn carries the raw CLI output.
	messages = [
	    {"role": "system", "content": "You are distill. Compress CLI output concisely."},
	    {"role": "user", "content": "Command output:\nnpm test\n4 tests passed, 0 failed"},
	]
	# Render the chat template to a string so generate() can tokenize it.
	prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
	result = generate(model, tokenizer, prompt=prompt, max_tokens=256)
	print(result)
	```
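
In practice the user message would come from a real command rather than a hard-coded string. The sketch below runs a command and wraps its output in the same chat format. `build_messages` is a hypothetical helper, and the layout of the user message (the command on one line, followed by its output) is inferred from the single example above rather than from a documented prompt specification.

```python
import subprocess
import sys

SYSTEM_PROMPT = "You are distill. Compress CLI output concisely."


def build_messages(command: list[str]) -> list[dict[str, str]]:
    """Run a command and wrap its combined output in chat messages."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error text reaches the model too
        text=True,
        check=False,  # a failing command is still worth summarizing
    )
    # Assumed layout: header line, then the command, then its raw output.
    user_content = f"Command output:\n{' '.join(command)}\n{completed.stdout}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


# Example using the running Python interpreter as a stand-in command.
messages = build_messages([sys.executable, "-c", "print('4 tests passed, 0 failed')"])
print(messages[1]["content"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` exactly as in the usage example.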

	## Project

	This model powers [distill](https://github.com/samuelfaj/distill) — a CLI output compression engine. The training code and dataset generation pipeline are available in the repository.

	[Full Distill Collection](https://huggingface.co/collections/samuelfaj/distill-6a0606f9b131c289025659fc)