# Pittsburghese Translator (Qwen2.5 0.5B)
A small fine-tuned language model that rewrites standard American English into playful Pittsburghese while preserving the original meaning.
This repository contains two usable versions of the model:

- Browser-ready ONNX files at the repo root, for use with Transformers.js
- A full merged safetensors model in `full/`, for local Python / Transformers use

An optional LoRA adapter is also included in `adapter/`.
## What it does

The model takes plain English input and rewrites it in a Pittsburgh-flavored style. Typical transformations include:

- you guys / you all → yinz
- clean up → redd up
- wash → worsh
- slippery → slippy
- downtown → dahntahn
- rubber band → gumband
- over-easy egg → dippy eggs
- soda → pop
- jerk / idiot → jagoff
- nosy → nebby
The goal is style transfer, not literal translation into a different language.
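As a point of contrast, the vocabulary above could be approximated by a naive dictionary substitution. Below is a toy Python sketch of that baseline; the fine-tuned model does context-sensitive rewriting, not word swaps, and the mapping table here is an illustrative subset, not the training data.

```python
import re

# Toy rule-based baseline built from the example mappings above.
# This is NOT how the model works; it only illustrates the target vocabulary.
PITTSBURGHESE = {
    "you guys": "yinz",
    "you all": "yinz",
    "clean up": "redd up",
    "wash": "worsh",
    "slippery": "slippy",
    "downtown": "dahntahn",
    "rubber band": "gumband",
    "soda": "pop",
    "nosy": "nebby",
}

def naive_translate(text: str) -> str:
    """Replace whole words/phrases from the mapping table, case-insensitively."""
    for plain, yinzer in PITTSBURGHESE.items():
        text = re.sub(rf"\b{re.escape(plain)}\b", yinzer, text, flags=re.IGNORECASE)
    return text
```

A dictionary pass like this mangles grammar and misses context (e.g. "washing" is left alone, and capitalization is lost), which is exactly why the card frames the task as style transfer rather than literal substitution.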
## Base model

This model is fine-tuned from `Qwen/Qwen2.5-0.5B-Instruct`.
## Files in this repo

### Repo root

Browser-ready ONNX export for client-side inference with Transformers.js.

### `full/`

Merged safetensors checkpoint for Python / Transformers inference.

### `adapter/`

Optional LoRA adapter weights from training.
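If you want to start from the base model and apply the adapter yourself, here is a minimal sketch using the `peft` library (this assumes `peft` and `transformers` are installed; `load_with_adapter` is an illustrative helper, not part of this repo):

```python
def load_with_adapter(adapter_dir: str = "adapter/"):
    """Attach the LoRA adapter weights in adapter/ to the base Qwen2.5 model."""
    # Imports deferred so the sketch can be read without the packages installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
    return PeftModel.from_pretrained(base, adapter_dir)
```

Calling `merge_and_unload()` on the returned model should produce a checkpoint equivalent to the merged one shipped in `full/`.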
## Example

**Input**

> Please clean up the kitchen before the guests arrive. Then we can go downtown and watch the game.

**Output**

> Please redd up the kitchen before the guests get here. Then we can go dahntahn and watch the game, n'at.
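To reproduce this locally with the merged checkpoint in `full/`, a hedged sketch using the `transformers` chat API (the system prompt below is an assumption; this card does not publish the exact prompt used during fine-tuning):

```python
MODEL_PATH = "full/"  # local path to the merged checkpoint

def build_messages(text: str) -> list[dict]:
    """Chat-format messages for the translator (system prompt is illustrative)."""
    return [
        {
            "role": "system",
            "content": "Rewrite the user's text in playful Pittsburghese "
                       "while preserving its meaning.",
        },
        {"role": "user", "content": text},
    ]

def translate(text: str, max_new_tokens: int = 128) -> str:
    # Imports deferred so build_messages() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)
    prompt = tokenizer.apply_chat_template(
        build_messages(text), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Usage: `translate("Please clean up the kitchen before the guests arrive.")`.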
## Intended use
- fun local-dialect rewriting
- educational/demo use
- browser-based local inference
- lightweight experimentation with small fine-tuned chat models
## Limitations
- This is a small model and can still paraphrase too aggressively on some prompts.
- Output quality is best on short-to-medium everyday English.
- It is tuned for a playful Pittsburghese style, not linguistic completeness or historical accuracy.
- Quantized browser inference may be a little weaker than the full merged model.
## Training summary

The model was fine-tuned on a hand-built English → Pittsburghese dataset, expanded with longer and more literal-preserving examples to reduce over-paraphrasing and improve style-transfer consistency.
Training workflow:
- base model: `Qwen/Qwen2.5-0.5B-Instruct`
- fine-tuning method: LoRA
- merged export saved as safetensors
- browser export generated as quantized ONNX for Transformers.js / WASM use
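For illustration, a hypothetical `peft` LoRA configuration consistent with the workflow above. The actual hyperparameters used for this model are not published in this card, so every value here is an assumption:

```python
from peft import LoraConfig

# Hypothetical LoRA hyperparameters; the real values are not published here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# After training, the merged safetensors export in full/ corresponds to:
#   model.merge_and_unload().save_pretrained("full/", safe_serialization=True)
```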
## Browser use
This repo is structured so a browser app can load it directly from the Hugging Face Hub using Transformers.js.
## License
This repository is released under the Apache 2.0 license, consistent with the base model.
## Acknowledgments
Built on top of Qwen2.5 and exported for browser inference with ONNX and Transformers.js.
> **NOTE:** On our local setup, we copy the output to our `pittsburghese-model` repo with:
> `rsync -av --delete --exclude='.git/' --exclude='README.md' --exclude='LICENSE' pittsburghese-web/ ../pittsburghese-model/`
>
> and upload it with:
> `hf upload-large-folder Dev4PGH/pittsburghese-model . --repo-type=model --num-workers=8`