Instructions for using grasgor/jobs-llama3.2-1B-sft with libraries and local apps.

Libraries

How to use grasgor/jobs-llama3.2-1B-sft with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base_model, "grasgor/jobs-llama3.2-1B-sft")
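Loading the adapter only gives you a model object. The following is a minimal sketch of generating text with it, assuming you have access to the gated base model and that `torch`, `transformers`, and `peft` are installed. The helper names `generation_settings` and `generate_with_adapter` are hypothetical, and the sampling values are copied from the getting-started example in this card rather than tuned.

```python
def generation_settings(max_new_tokens=256):
    """Sampling settings mirroring the getting-started example in this card."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": 0.8,
        "top_k": 50,
        "top_p": 0.9,
        "repetition_penalty": 1.2,
    }


def generate_with_adapter(
    prompt,
    base_id="meta-llama/Llama-3.2-1B",
    adapter_id="grasgor/jobs-llama3.2-1B-sft",
):
    """Load the base model, attach the LoRA adapter, and return a continuation."""
    # Heavy imports stay inside the function so importing this module is cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            pad_token_id=tokenizer.eos_token_id,  # Llama has no pad token by default
            **generation_settings(),
        )
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_with_adapter("What makes a great product?")` downloads both repositories on first use, so expect the first call to be slow.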

Transformers

How to use grasgor/jobs-llama3.2-1B-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="grasgor/jobs-llama3.2-1B-sft")

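# Load model directly (AutoModelForCausalLM keeps the language-modeling head needed for generation)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("grasgor/jobs-llama3.2-1B-sft", dtype="auto")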

Local Apps

vLLM

How to use grasgor/jobs-llama3.2-1B-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "grasgor/jobs-llama3.2-1B-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "grasgor/jobs-llama3.2-1B-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
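The same request can be sent from Python. This is a sketch using only the standard library, assuming the server started above is listening on `localhost:8000`. The helper names `build_payload`, `extract_text`, and `complete` are hypothetical. The payload fields match the curl call, and the response is read from `choices[0].text`, as in the OpenAI completions format.

```python
import json
import urllib.request


def build_payload(prompt, model="grasgor/jobs-llama3.2-1B-sft",
                  max_tokens=512, temperature=0.5):
    """Build the same JSON body as the curl example."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def extract_text(response):
    """Return the text of the first choice in a completions response."""
    return response["choices"][0]["text"]


def complete(prompt, base_url="http://localhost:8000"):
    """POST a completion request to an OpenAI-compatible server."""
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return extract_text(json.load(reply))
```

With a server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.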

Use Docker

docker model run hf.co/grasgor/jobs-llama3.2-1B-sft

SGLang

How to use grasgor/jobs-llama3.2-1B-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "grasgor/jobs-llama3.2-1B-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "grasgor/jobs-llama3.2-1B-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "grasgor/jobs-llama3.2-1B-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "grasgor/jobs-llama3.2-1B-sft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use grasgor/jobs-llama3.2-1B-sft with Docker Model Runner:
```
docker model run hf.co/grasgor/jobs-llama3.2-1B-sft
```

Model Card for grasgor/jobs-llama3.2-1B-sft

This model is a fine-tuned version of Llama-3.2-1B, trained on Steve Jobs' interview responses.

Model Details

Model Description

The model was trained using QLoRA. The repository contains only the LoRA adapter weights, not the full model; usage is shown under "How to Get Started with the Model".
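Because only adapter weights are published, tools that expect a full checkpoint may need the adapter folded into the base model first. The following is a hedged sketch of one way to do that with PEFT's `merge_and_unload`, assuming the base model is loaded unquantized and that you have access to it. The function name `merge_adapter` and the `output_dir` argument are hypothetical; this is not part of the author's training or release process.

```python
def merge_adapter(
    output_dir,
    base_id="meta-llama/Llama-3.2-1B",
    adapter_id="grasgor/jobs-llama3.2-1B-sft",
):
    """Fold the LoRA adapter into the base weights and save a standalone model."""
    # Heavy imports stay inside the function so importing this module is cheap.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)

    # merge_and_unload() adds the low-rank update to the base weights and
    # returns a plain transformers model without the PEFT wrappers.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir)

    # Save the base tokenizer alongside so the directory loads on its own.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
    return output_dir
```

The resulting directory can then be loaded with `AutoModelForCausalLM.from_pretrained(output_dir)` without PEFT installed.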

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import pipeline, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

pipe = pipeline("text-generation", model="grasgor/jobs-llama3.2-1B-sft", tokenizer=tokenizer, return_full_text=False)

prompt = "Is there an inevitable break between being an entrepreneur and a businessman? Are the people who get things going different?"

result = pipe(
    prompt,
    max_new_tokens=3072,
    temperature=0.8,
    do_sample=True,
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.2
)
print(result[0]["generated_text"])

Response

The difference is that in business you're trying to make money, not something. You want your company to be successful--not just one or two individuals within it.
And the reason we do this is because these are very personal endeavors for us; they have deep meaning. But if I had been able to go into my basement last night at
midnight with no idea what was about to happen but know exactly where all of our chips were laid out on the table before me, would I take any chances right now?
Of course!

Framework versions

PEFT 0.16.0


Model tree for grasgor/jobs-llama3.2-1B-sft

Base model: meta-llama/Llama-3.2-1B
This model: adapter
