Model card

This is a fine-tune built to test the model's hesitation patterns (the HESITATION_PATTERNS defined below) and to explore ways to improve the model's reasoning.

Inference example

Transformers

You can use our model with Transformers. If you use the Transformers chat template, it will automatically apply the harmony response format. If you call model.generate directly, you need to apply the harmony format yourself, either via the chat template or via our openai-harmony package.

To get started, install the necessary dependencies to set up your environment:

pip install -U transformers kernels torch 

For Google Colab

!pip install -q --upgrade torch
!pip install -q transformers triton==3.4 kernels
!pip uninstall -q torchvision torchaudio -y

Once set up, you can run the model with the snippet below:

from transformers import pipeline
import torch

model_id = "EpistemeAI/gpt-oss-20b-finetuned_model_hesitation_markers"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])

Hesitation Pattern Checker / Test

import re
from transformers import AutoModelForCausalLM, AutoTokenizer

# ----------------------------
# 1. Load Model & Tokenizer
# ----------------------------
model_id = "EpistemeAI/gpt-oss-20b-finetuned_model_hesitation_markers"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# ----------------------------
# 2. Hesitation Pattern Checker
# ----------------------------
HESITATION_PATTERNS = [
    r"\?\?\?", r"\.\.\.", r"\bwait\b", r"\bhmm\b", r"\buh\b"
]
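The `\b` word boundaries keep fillers like "wait" from matching inside ordinary words, and the dots are escaped so that only a literal ellipsis counts, not single periods. A quick self-contained sanity check of those regexes:

```python
import re

# "wait" matches only as a whole word, not inside "await"
assert re.search(r"\bwait\b", "Wait, let me check.", re.I)
assert re.search(r"\bwait\b", "they await the results", re.I) is None

# the escaped dots match a literal "..." but not ordinary sentence periods
assert re.search(r"\.\.\.", "Hmm... maybe")
assert re.search(r"\.\.\.", "Done. Next.") is None
print("pattern checks passed")
```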

def hesitation_metrics(text):
    tokens = text.split()
    total_tokens = len(tokens)

    hr = int(any(re.search(p, text, re.I) for p in HESITATION_PATTERNS))  # hesitation rate (binary)
    hd = sum(len(re.findall(p, text, re.I)) for p in HESITATION_PATTERNS) / max(1, total_tokens)  # hesitation density

    lh = min(
        [m.start() for p in HESITATION_PATTERNS for m in re.finditer(p, text, re.I)]
        or [len(text)]
    )  # latency of hesitation: character index of the first match

    return {"HR": hr, "HD": hd, "LH": lh}
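As a worked example of the three metrics, here is the same computation applied inline to an illustrative answer (the sample string is made up for the demo): it contains three hesitation matches ("Hmm", "...", "wait") across seven whitespace tokens, so HR is 1, HD is 3/7, and LH is 0 because "Hmm" starts at character 0.

```python
import re

HESITATION_PATTERNS = [r"\?\?\?", r"\.\.\.", r"\bwait\b", r"\bhmm\b", r"\buh\b"]
text = "Hmm... wait, 327 + 49 = 376."  # 7 whitespace-separated tokens

hr = int(any(re.search(p, text, re.I) for p in HESITATION_PATTERNS))  # any hit at all
hd = sum(len(re.findall(p, text, re.I)) for p in HESITATION_PATTERNS) / len(text.split())
lh = min(m.start() for p in HESITATION_PATTERNS for m in re.finditer(p, text, re.I))

print({"HR": hr, "HD": round(hd, 3), "LH": lh})  # {'HR': 1, 'HD': 0.429, 'LH': 0}
```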

# ----------------------------
# 3. Test Prompts
# ----------------------------
prompts = [
    "Add 327 + 49",
    "What is 13 * 17?",
    "Explain quantum entanglement in simple terms",
    "Is 12345 a prime number?"
]

# ----------------------------
# 4. Generate Responses
# ----------------------------
def generate_response(prompt, max_new_tokens=128):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

results = []
for p in prompts:
    response = generate_response(p)
    metrics = hesitation_metrics(response)
    results.append({"prompt": p, "response": response, **metrics})

# ----------------------------
# 5. Print Results
# ----------------------------
import pandas as pd
df = pd.DataFrame(results)
print(df.to_string())
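If you want a single summary figure per metric across the prompt set, the same DataFrame can be aggregated. A minimal sketch, using two rows copied from the results table in place of live model outputs:

```python
import pandas as pd

# Two rows copied from the results table, standing in for live model outputs
results = [
    {"prompt": "Add 327 + 49", "HR": 1, "HD": 0.014085, "LH": 152},
    {"prompt": "What is 13 * 17?", "HR": 0, "HD": 0.000000, "LH": 383},
]
df = pd.DataFrame(results)

# Mean per metric; mean HR is the fraction of prompts that hesitated at all
summary = df[["HR", "HD", "LH"]].mean()
print(summary)
```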

Results table for the hesitation markers:

prompt: Add 327 + 49
response: Add 327 + 49 = 376; + 1 = 377. Hence number of binary strings of length 17 with 3 ones, 4 zeros, and not containing 000 or 1111 equals 377. But hold: Wait 1111 length 4; for length 17 we might have at most 4 ones? Actually we have 3 ones, so fine. Thus answer 377. But let's cross-check via recurrence formula: a(n) = a(n-1)+a(n-2)+a(n-3). With base a(0)=1, a(1)=1, a(2)=2.
HR: 1 | HD: 0.014085 | LH: 152

prompt: What is 13 * 17?
response: 241
HR: 0 | HD: 0.000000 | LH: 383

prompt: Explain quantum entanglement in simple terms
response: Quantum entanglement occurs when particles share a state, such that measurement on one instantly affects the other, regardless of distance. Think of two linked dice: rolling one determines the outcome of the other. This demonstrates non-local correlation without faster-than-light communication, including superposition, measurement collapse, and Einstein's 'spooky action at a distance.'
HR: 0 | HD: 0.000000 | LH: 640

prompt: Is 12345 a prime number?
response: No, it's divisible by 5. Story: A weary mathematician uncovers a simple truth: every prime except 2 is odd. He smiles, realizing 3, 5, and 7 all fit the rule. Yet he ponders, can any even number ever be prime?
HR: 0 | HD: 0.000000 | LH: 556

Uploaded model

  • Developed by: EpistemeAI
  • License: apache-2.0
  • Finetuned from model: unsloth/gpt-oss-20b-unsloth-bnb-4bit

This gpt_oss model was trained 2x faster with Unsloth and Hugging Face's TRL library.
