Qwen 3.5 4B Private Analyst Model

This repository contains a merged 16-bit model fine-tuned from unsloth/Qwen3.5-4B on a cleaned private research-report corpus.

Summary

  • Base model: unsloth/Qwen3.5-4B
  • Dataset: qwen35_full_corpus_draft.jsonl
  • Rows: 23974
  • Sequence length: 1024
  • Batch size: 1
  • Gradient accumulation: 4
  • Epochs: 1
  • Final train loss: 1.0765
  • Artifact type: merged 16-bit Hugging Face model
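
The settings above imply the following back-of-envelope training schedule. This is a sketch, assuming the trainer drops the final partial accumulation step; the exact step count logged by your trainer may differ by one or two.

```python
# Training schedule implied by the Summary settings (illustrative arithmetic only).
rows = 23974
batch_size = 1
grad_accum = 4
epochs = 1

effective_batch = batch_size * grad_accum   # 4 sequences per optimizer step
steps_per_epoch = rows // effective_batch   # floor division; partial step assumed dropped
total_steps = steps_per_epoch * epochs

print(effective_batch)  # 4
print(total_steps)      # 5993
```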

Data Notes

  • The source corpus was parsed locally from private financial research documents.
  • Disclaimer and contact sections were removed before chunking.
  • Training rows were generated from the cleaned corpus using a hybrid-review draft workflow.
  • Because the source material is private, this model should remain private unless you have explicitly cleared the underlying data rights.
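
The cleaning step described above can be sketched as follows. This is an illustration only: the section-heading names, the `(heading, body)` data shape, and the character-based chunk size are assumptions, not the actual private pipeline.

```python
# Illustrative sketch of the corpus cleaning described above: drop disclaimer
# and contact sections, then chunk what remains. Heading names and chunk size
# are assumptions, not the real workflow's parameters.

REMOVED_HEADINGS = ("disclaimer", "contact")

def clean_sections(sections):
    """Keep only (heading, body) pairs whose heading is not in the removal list."""
    return [
        (heading, body)
        for heading, body in sections
        if heading.strip().lower() not in REMOVED_HEADINGS
    ]

def chunk_text(text, max_chars=2000):
    """Split cleaned text into fixed-size character chunks."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

sections = [
    ("Overview", "Net interest margin compressed 40bps quarter over quarter..."),
    ("Disclaimer", "This report is provided for informational purposes only..."),
    ("Contact", "research team mailbox"),
]
kept = clean_sections(sections)
print([heading for heading, _ in kept])  # ['Overview']
```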

Intended Use

  • analyst-style financial commentary
  • private research workflows
  • further internal fine-tuning, evaluation, or conversion to deployment formats

Limitations

  • The training set is draft-generated rather than fully human-labeled.
  • This model should be treated as a strong bootstrap artifact, not the final production checkpoint.
  • No public benchmark or held-out human evaluation is included in this repository.

Local Training Command

python finetune/train.py \
  --dataset-path finetune/outputs/datasets/qwen35_full_corpus_draft.jsonl \
  --output-dir finetune/outputs/qwen35_4b_full_corpus_draft23974 \
  --max-seq-length 1024 \
  --batch-size 1 \
  --gradient-accumulation 4 \
  --num-epochs 1 \
  --eval-split 0 \
  --log-steps 100 \
  --save-steps 500 \
  --warmup-steps 100 \
  --save-merged-model \
  --skip-gguf-export \
  --disable-response-only-masking
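
Before launching the command above, it can help to sanity-check the dataset file. The expectation that each row is a JSON object containing a `"messages"` list is an assumption about the draft workflow's output, not a documented schema; adjust `required_key` to match your actual rows.

```python
# Quick pre-flight check for the JSONL dataset. The "messages" key is an
# assumed schema, not confirmed by the training script's documentation.
import json

def validate_jsonl(path, required_key="messages"):
    """Return (parseable_row_count, list_of_bad_line_numbers)."""
    rows, bad = 0, []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)
                continue
            if required_key not in obj:
                bad.append(lineno)
            rows += 1
    return rows, bad
```

For the run documented here, a clean file should report 23974 rows and no bad lines.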

How to Run This Model

Transformers

from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "Mikkkkoooo/qwen35-4b-private-analyst-full-corpus"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native 16-bit dtype where supported
    device_map="auto",    # place weights on GPU if one is available
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Summarize the key margin risks for a consumer lender."}
]

prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(outputs[0], skip_special_tokens=True))

Local GGUF via llama.cpp

The corresponding GGUF exports in the project repo are:

  • Qwen3.5-4B.Q4_K_M.gguf
  • Qwen3.5-4B.BF16-mmproj.gguf

Example:

llama-cli \
  -m Qwen3.5-4B.Q4_K_M.gguf \
  --mmproj Qwen3.5-4B.BF16-mmproj.gguf \
  -cnv \
  -p "Summarize the key margin risks for a consumer lender."

Roadmap

  • replace the draft-generated SFT set with a human-reviewed analyst dataset
  • add a held-out evaluation suite and compare variants quantitatively
  • train a follow-up checkpoint with curated examples and response-only masking when stable
  • test additional retrieval-aware prompting and deployment benchmarks
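
The held-out evaluation split mentioned above could be carved out along these lines. This is a sketch, not a project decision: the 5% ratio and fixed seed are illustrative choices, and the key point is that the shuffle is deterministic so every variant is compared on the same rows.

```python
# Deterministic train/held-out split so model variants are evaluated on
# identical rows. eval_ratio and seed are illustrative, not project settings.
import random

def split_holdout(rows, eval_ratio=0.05, seed=42):
    """Return (train_rows, eval_rows) using a reproducible shuffle."""
    shuffled = rows[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_ratio))
    return shuffled[n_eval:], shuffled[:n_eval]

train, held_out = split_holdout(list(range(23974)))
print(len(train), len(held_out))  # 22776 1198
```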

Additional Notes

See finetune/QWEN35_TRAINING_NOTES.md in the project repo for the full troubleshooting and execution log.
