Qwen3-8B-BMW-Press

Model Description

This is a fine-tuned version of Qwen/Qwen3-8B trained on the BMW Press Releases Dataset (1K). The model has been adapted to understand and generate text in the specific corporate style of BMW Group press releases.

It features multiple variants based on parameter-efficient fine-tuning (LoRA) and experimental architecture modifications (Layer Dropping and Pruning).

  • Developed by: Moonxc
  • Base model: Qwen/Qwen3-8B
  • Library: Unsloth / Transformers / TRL
  • License: Apache 2.0
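
The LoRA variants can be attached to the base checkpoint at load time with PEFT. A minimal sketch, assuming the adapter weights are published under this repository's id (verify the exact repo id and any subfolder on the Hub before use):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-8B"
adapter_id = "Moonxc/Qwen3-8B-bmw-press"  # assumed adapter location

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the LoRA weights
```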

Intended Use

This model is designed for:

  • Corporate Content Generation: Drafting press-release style announcements.
  • Domain Knowledge: Answering questions about BMW's specific vehicle lineup (up to 2025/2026), sustainability goals (Neue Klasse), and financial results.

Results

Training and Evaluation Loss

[Plots: training loss and evaluation loss curves over training steps]

Evaluation Metrics Summary

Model                  Test Perplexity ↓   Mean Entropy ↓
Qwen3-8B (Baseline)    9.49                0.00
original               5.59                0.73
original_lora          11.40               0.51
dropped                6.11                1.14
dropped_lora           15.59               0.50
pruned                 11.60               1.93
pruned_lora            14.96               1.61

Note: Lower perplexity indicates better language modeling. Lower mean entropy typically indicates more deterministic/confident generation. The baseline Qwen3-8B has 0.00 mean entropy because all generated responses fell into single semantic clusters.
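
Both metrics can be reproduced in a few lines. Perplexity is the exponential of the mean token negative log-likelihood; the cluster-entropy function below is an assumption about how the semantic-clustering step in the note was scored, shown only to illustrate why a single cluster yields exactly 0.00:

```python
import math

def perplexity(token_nlls):
    # Perplexity = exp(mean token negative log-likelihood); lower is better.
    return math.exp(sum(token_nlls) / len(token_nlls))

def cluster_entropy(cluster_sizes):
    # Shannon entropy (nats) over semantic-cluster frequencies.
    total = sum(cluster_sizes)
    probs = [c / total for c in cluster_sizes]
    return sum(-p * math.log(p) for p in probs if p > 0)

print(round(perplexity([1.7, 1.8, 1.6]), 2))
print(cluster_entropy([10]))   # all responses in one cluster -> 0.0
print(round(cluster_entropy([5, 3, 2]), 2))
```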

Qualitative Examples

Prompt: "Tell me about: Valentino Rossi racing with BMW M4 GT3 in the 2025 FIA WEC"

Pretrained Qwen3-8B (Baseline) — ❌ Incorrect answer due to its earlier knowledge cutoff

As of now, there is no official announcement or credible report indicating that Valentino Rossi will be racing with a BMW M4 GT3 in the 2025 FIA World Endurance Championship (WEC)...

LoRA fine-tuned Qwen3-8B — ✅ BMW press release style with correct information

Valentino Rossi racing with BMW M4 GT3 in the 2025 FIA WEC

Munich. Valentino Rossi (ITA) will compete in the 2025 FIA World Endurance Championship (FIA WEC) as part of the BMW M Motorsport Hypercar programme. The two‑time World Champion will join the BMW M Team WRT line‑up for the 2025 season. He will share the cockpit of the BMW M Hybrid V8 with Robin Frijns (NED) and Raffaele Marciello (SUI)...

Pruned Qwen3-8B — ❌ Degenerate output (pruning too aggressive)

********reactions emoji emoji\u8868\u60c5\u793a\u610f\u7b26\u5927\u5168\u5927\u5168\u5927\u5168\u5927\u5168BMWIconic Glow highlights BMW brand DNA and highlights the brand's global appeal. BMW M4 GT3: Powerful, super‑efficient, and versatile race car +++ BMW Individual paint finishes in the BMW iX2-60...

Training Details

Training Data

  • Dataset: ~1,000 BMW press releases.
  • Format: ChatML (Instruction Tuning).
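
Each example is rendered in ChatML before tokenization. A minimal sketch of the format (the press-release content below is illustrative, not taken from the dataset):

```python
# A single ChatML-formatted training example (content is illustrative).
sample = (
    "<|im_start|>user\n"
    "Write a press release announcing the BMW M4 GT3 programme.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "Munich. BMW M Motorsport today announced ...<|im_end|>\n"
)
print(sample)
```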

Hyperparameters

  • Optimization: Unsloth (FlashAttention-2, gradient checkpointing).
  • LoRA: Rank=16, Alpha=16.
  • Learning Rate: 2e-4.
  • Warmup: 3% of steps.
  • Weight Decay: 0.01.
  • Packing: Enabled.
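
The LoRA update itself can be sketched numerically: a frozen weight W is augmented by a low-rank product scaled by alpha/r, and because B is initialized to zero the adapter is a no-op before training. The dimensions below are toy values; this run uses rank 16 and alpha 16.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 2.0  # toy sizes; the actual config is r=16, alpha=16

W = rng.standard_normal((d, d))          # frozen base weight
A = rng.standard_normal((r, d)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                     # trainable, zero init

def adapted(x):
    # Effective weight: W + (alpha / r) * B @ A
    return x @ (W + (alpha / r) * B @ A).T

x = rng.standard_normal((1, d))
print(np.allclose(adapted(x), x @ W.T))  # True: zero-init B means no change yet
```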

Usage

Loading with Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Moonxc/Qwen3-8B-bmw-press"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "What is the Neue Klasse?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)  # follow device_map placement

# Generate with parameters matching training/evaluation scripts
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
)

# Decode only the new tokens
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, outputs)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Hosting

This model was trained and uploaded using custom scripts in the llm_assignment repository.

