Model card

This is a fine-tune built to test the model's hesitation patterns (the HESITATION_PATTERNS defined below) and to explore ways to improve the model's reasoning.

Inference example

Transformers

You can use our model with Transformers. If you use the Transformers chat template, it will automatically apply the harmony response format. If you call model.generate directly, you need to apply the harmony format yourself, either via the chat template or via our openai-harmony package.

To get started, install the necessary dependencies to set up your environment:

pip install -U transformers kernels torch 

For Google Colab

!pip install -q --upgrade torch
!pip install -q transformers triton==3.4 kernels
!pip uninstall -q torchvision torchaudio -y

Once set up, you can run the model with the snippet below:

from transformers import pipeline
import torch

model_id = "EpistemeAI/gpt-oss-20b-finetuned_model_hesitation_markers"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])

Hesitation Pattern Checker / Test

import re
from transformers import AutoModelForCausalLM, AutoTokenizer

# ----------------------------
# 1. Load Model & Tokenizer
# ----------------------------
model_id = "EpistemeAI/gpt-oss-20b-finetuned_model_hesitation_markers"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# ----------------------------
# 2. Hesitation Pattern Checker
# ----------------------------
HESITATION_PATTERNS = [
    r"\?\?\?", r"\.\.\.", r"\bwait\b", r"\bhmm\b", r"\buh\b"
]
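The `\b` word boundaries keep fillers like "wait" from matching inside ordinary words, and the dots are escaped so that only a literal ellipsis counts, not single periods. A quick self-contained sanity check of those regexes:

```python
import re

# "wait" matches only as a whole word, not inside "await"
assert re.search(r"\bwait\b", "Wait, let me check.", re.I)
assert re.search(r"\bwait\b", "they await the results", re.I) is None

# the escaped dots match a literal "..." but not ordinary sentence periods
assert re.search(r"\.\.\.", "Hmm... maybe")
assert re.search(r"\.\.\.", "Done. Next.") is None
print("pattern checks passed")
```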

def hesitation_metrics(text):
    tokens = text.split()
    total_tokens = len(tokens)

    hr = int(any(re.search(p, text, re.I) for p in HESITATION_PATTERNS))  # hesitation rate (binary)
    hd = sum(len(re.findall(p, text, re.I)) for p in HESITATION_PATTERNS) / max(1, total_tokens)  # hesitation density

    lh = min(
        [m.start() for p in HESITATION_PATTERNS for m in re.finditer(p, text, re.I)]
        or [len(text)]
    )  # latency of hesitation: character index of the first match

    return {"HR": hr, "HD": hd, "LH": lh}
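As a worked example of the three metrics, here is the same computation applied inline to an illustrative answer (the sample string is made up for the demo): it contains three hesitation matches ("Hmm", "...", "wait") across seven whitespace tokens, so HR is 1, HD is 3/7, and LH is 0 because "Hmm" starts at character 0.

```python
import re

HESITATION_PATTERNS = [r"\?\?\?", r"\.\.\.", r"\bwait\b", r"\bhmm\b", r"\buh\b"]
text = "Hmm... wait, 327 + 49 = 376."  # 7 whitespace-separated tokens

hr = int(any(re.search(p, text, re.I) for p in HESITATION_PATTERNS))  # any hit at all
hd = sum(len(re.findall(p, text, re.I)) for p in HESITATION_PATTERNS) / len(text.split())
lh = min(m.start() for p in HESITATION_PATTERNS for m in re.finditer(p, text, re.I))

print({"HR": hr, "HD": round(hd, 3), "LH": lh})  # {'HR': 1, 'HD': 0.429, 'LH': 0}
```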

# ----------------------------
# 3. Test Prompts
# ----------------------------
prompts = [
    "Add 327 + 49",
    "What is 13 * 17?",
    "Explain quantum entanglement in simple terms",
    "Is 12345 a prime number?"
]

# ----------------------------
# 4. Generate Responses
# ----------------------------
def generate_response(prompt, max_new_tokens=128):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

results = []
for p in prompts:
    response = generate_response(p)
    metrics = hesitation_metrics(response)
    results.append({"prompt": p, "response": response, **metrics})

# ----------------------------
# 5. Print Results
# ----------------------------
import pandas as pd
df = pd.DataFrame(results)
print(df.to_string())
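If you want a single summary figure per metric across the prompt set, the same DataFrame can be aggregated. A minimal sketch, using two rows copied from the results table in place of live model outputs:

```python
import pandas as pd

# Two rows copied from the results table, standing in for live model outputs
results = [
    {"prompt": "Add 327 + 49", "HR": 1, "HD": 0.014085, "LH": 152},
    {"prompt": "What is 13 * 17?", "HR": 0, "HD": 0.000000, "LH": 383},
]
df = pd.DataFrame(results)

# Mean per metric; mean HR is the fraction of prompts that hesitated at all
summary = df[["HR", "HD", "LH"]].mean()
print(summary)
```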

Results table for the hesitation markers:

prompt: Add 327 + 49
response: Add 327 + 49 = 376; + 1 = 377. Hence number of binary strings of length 17 with 3 ones, 4 zeros, and not containing 000 or 1111 equals 377. But hold: Wait 1111 length 4; for length 17 we might have at most 4 ones? Actually we have 3 ones, so fine. Thus answer 377. But let's cross-check via recurrence formula: a(n) = a(n-1)+a(n-2)+a(n-3). With base a(0)=1, a(1)=1, a(2)=2.
HR: 1 | HD: 0.014085 | LH: 152

prompt: What is 13 * 17?
response: 241
HR: 0 | HD: 0.000000 | LH: 383

prompt: Explain quantum entanglement in simple terms
response: Quantum entanglement occurs when particles share a state, such that measurement on one instantly affects the other, regardless of distance. Think of two linked dice: rolling one determines the outcome of the other. This demonstrates non-local correlation without faster-than-light communication, including superposition, measurement collapse, and Einstein's 'spooky action at a distance.'
HR: 0 | HD: 0.000000 | LH: 640

prompt: Is 12345 a prime number?
response: No, it's divisible by 5. Story: A weary mathematician uncovers a simple truth: every prime except 2 is odd. He smiles, realizing 3, 5, and 7 all fit the rule. Yet he ponders, can any even number ever be prime?
HR: 0 | HD: 0.000000 | LH: 556

Uploaded model

  • Developed by: EpistemeAI
  • License: apache-2.0
  • Finetuned from model: unsloth/gpt-oss-20b-unsloth-bnb-4bit

This gpt_oss model was trained 2x faster with Unsloth and Hugging Face's TRL library.
