---
language:
- en
license: apache-2.0
tags:
- text-generation
- technical-documentation
- readme
- qwen
- qlora
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model-index:
- name: Tech-Scribe-v1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: collected_data_external
      type: tech-docs
    metrics:
    - type: loss
      value: 1.1258
---

# Tech Scribe (Qwen 2.5 7B Fine-tune)

**Tech Scribe** is a specialized language model fine-tuned to generate high-quality, structured technical documentation (READMEs, Model Cards) from simple project descriptions. It is built on top of `Qwen/Qwen2.5-Coder-7B-Instruct` using QLoRA.

## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Config for 4-bit loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4"
)

# Load base model
base_model_name = "Qwen/Qwen2.5-Coder-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

# Load the Tech Scribe adapter
adapter_name = "Darmm/tech-scribe-v1"  # Example path
model = PeftModel.from_pretrained(model, adapter_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Generate (do_sample=True is required for temperature to take effect)
project_idea = "A Python library for real-time sentiment analysis using websockets"
prompt = f"### Instruction:\nWrite a high-quality technical README or Model Card for the project \"{project_idea}\".\n\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Response:")[1])
```

## Model Description

- **Developed by:** Darmm Lab
- **Base Model:** `Qwen/Qwen2.5-Coder-7B-Instruct`
- **Fine-tuning Method:** QLoRA (4-bit quantization with LoRA adapters)
- **Task:** Technical Documentation Generation
- **Language:** English
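
As a rough illustration of why QLoRA keeps fine-tuning cheap: LoRA trains a low-rank update instead of the full weight matrix. The rank (`r=16`) and hidden size (3584) below are illustrative assumptions, not the released adapter configuration:

```python
# LoRA replaces a full d_out x d_in weight update with two low-rank
# factors, B (d_out x r) and A (r x d_in), so the trainable parameter
# count drops from d_out * d_in to r * (d_out + d_in).
def lora_params(d_out: int, d_in: int, r: int) -> int:
    return r * (d_out + d_in)

d = 3584  # assumed hidden size for a 7B-class Qwen model
full = d * d                      # full-rank update for one square projection
lora = lora_params(d, d, r=16)    # low-rank update for the same projection
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
```

Combined with the 4-bit quantized base weights, this is what lets a 7B model fit a single-GPU fine-tuning budget.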

## Training (summary)

The model was fine-tuned on a curated dataset of high-quality READMEs from top open-source repositories (e.g., PyTorch, FastAPI, React, HuggingFace Transformers).

- **Epochs:** 1 (prototype run)
- **Batch size:** 1 (gradient accumulation: 8)
- **Learning rate:** 2e-4
- **Optimizer:** AdamW
- **Hardware:** NVIDIA A100 80GB
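
The hyperparameters above could be wired into a `peft`/`transformers` run roughly as follows. This is a sketch: the LoRA rank, alpha, dropout, and target modules are assumptions for illustration, not the exact configuration used for this release:

```python
# Hypothetical QLoRA training setup mirroring the listed hyperparameters.
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter config (rank/alpha/targets are illustrative assumptions)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Hyperparameters taken from the summary above
training_args = TrainingArguments(
    output_dir="tech-scribe-v1",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,   # effective batch size of 8
    learning_rate=2e-4,
    optim="adamw_torch",
    fp16=True,
)
```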

## Metrics

Final logged evaluation (recorded about 0.73 epochs into the run):

```json
{
  "eval_loss": 1.1258,
  "train_loss": 1.2937,
  "epoch": 0.73
}
```

## Intended Use

- Rapidly generating boilerplate documentation for new software projects.
- Converting rough notes into structured Markdown documentation.
- Learning best practices for technical writing structure.

## Limitations

- **Prototype Status:** This model was trained on a small subset of data for demonstration purposes.
- **Hallucination:** Like all LLMs, it may generate plausible-sounding but incorrect installation instructions or API calls. Always verify the generated code.

## Citation

```bibtex
@misc{techscribe2026,
  author       = {Darmm Lab},
  title        = {Tech Scribe: Automated Technical Documentation Generator},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Darmm/tech-scribe-v1}}
}
```