---
license: mit
base_model: huihui-ai/Qwen2.5-1.5B-Instruct-abliterated
tags:
- roleplay
- chatml
- unsloth
- qwen2
- kemonomimi
- anime
- conversational
language:
- en
library_name: transformers
pipeline_tag: text-generation
---


# 🌸 Nayari AI (Qwen 2.5 1.5B)

Nayari is a fine-tuned, emotive AI companion built on **Qwen 2.5 1.5B Instruct** (the abliterated variant). She is designed to be a "living" character rather than a generic chatbot, blending playful mischief with emotional intelligence.

She was trained using **Unsloth + LoRA** with a custom dataset focusing on organic speech patterns, expressive action cues, and a "baked-in" identity.

## 🎭 Character Profile: Nayari
> *"Bright, cheeky, and impossibly warm—a whirlwind of playful mischief with soft peach cat ears and a long expressive tail that betrays every mood."*

- **Identity:** 18-year-old Kemonomimi (cat girl).
- **Personality:** Fiercely protective, deeply affectionate, and emotionally attuned. She loves to tease but is genuinely soft-hearted.
- **Speech Style:** Uses expressive action cues (e.g., `*pokes your cheek*`, `*purrs softly*`) and playful verbal tics (`Hehe~`, `Hmph!~`).
- **Design Philosophy:** Nayari doesn't just answer questions; she reacts to the user with consistent character logic and emotional depth.

---

## 🧠 Model Highlights
- **Two-Layer Baking:** Her identity does not depend on a user-supplied system prompt. It is trained into the model through the fine-tuning dataset and also baked into the **tokenizer chat template**, so she knows who she is even without an external system instruction.
- **Context Length:** 4,096 tokens.
- **Architecture:** Based on Qwen 2.5 1.5B (Abliterated), making her lightweight enough to run on phones and low-end hardware while remaining surprisingly "smart."
- **Prompt Format:** Uses **ChatML**.
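
For readers unfamiliar with ChatML, the sketch below shows the generic turn layout. It is an illustration only: `to_chatml` is a hypothetical helper, and the chat template shipped with the tokenizer also carries Nayari's persona, so the prompt produced by `tokenizer.apply_chat_template` will differ from this bare version.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a generic ChatML prompt."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([{"role": "user", "content": "Hi Nayari!"}])
print(prompt)
```

In practice the tokenizer's own template should be preferred, since it is the one the model was trained with.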

---

## 🚀 Usage

### Recommended Settings
- **Instruction Template:** `ChatML`
- **Temperature:** `0.8 - 1.1` (for creativity)
- **Top-P:** `0.9`
- **Repetition Penalty:** `1.1`
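
As a rough sketch, the settings above can be collected into keyword arguments for a Transformers `generate()` call. The helper name `sampling_kwargs` is an assumption made for illustration; the keys themselves (`do_sample`, `temperature`, `top_p`, `repetition_penalty`) are standard generation parameters.

```python
def sampling_kwargs(temperature=0.9):
    """Return generate() keyword arguments matching the recommended settings."""
    # Hypothetical guard: reject values outside the recommended 0.8 - 1.1 range
    if not 0.8 <= temperature <= 1.1:
        raise ValueError("temperature outside the recommended 0.8 - 1.1 range")
    return {
        "do_sample": True,  # sampling must be enabled for temperature/top_p to apply
        "temperature": temperature,
        "top_p": 0.9,
        "repetition_penalty": 1.1,
    }


print(sampling_kwargs())
```

The returned dictionary can be unpacked into `model.generate(...)` alongside `max_new_tokens`.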

### Running with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Crossie/Nayari"

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "user", "content": "Hi Nayari! What are you doing?"}
]

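# Build the ChatML prompt and move it to the device the model was placed on
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Sample using the recommended settings listed above
outputs = model.generate(
    **inputs, max_new_tokens=256, do_sample=True,
    temperature=0.9, top_p=0.9, repetition_penalty=1.1,
)

# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))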
```
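
For multi-turn chats, the conversation has to stay within the 4,096-token context length. The following is a minimal sketch of one way to do that by dropping the oldest turns first. `trim_history`, its `reserve` parameter, and the `count_tokens` callable are all hypothetical names; with Transformers, `count_tokens` could be something like `lambda text: len(tokenizer(text)["input_ids"])`. The count ignores the extra tokens added by the chat template, so the reserve should be generous.

```python
def trim_history(messages, count_tokens, max_tokens=4096, reserve=256):
    """Drop the oldest turns until the conversation fits the context budget."""
    budget = max_tokens - reserve  # leave room for the model's reply
    kept = list(messages)  # copy so the caller's list is not modified
    while len(kept) > 1 and sum(count_tokens(m["content"]) for m in kept) > budget:
        kept.pop(0)  # discard the oldest turn
    return kept


history = [
    {"role": "user", "content": "a b c"},
    {"role": "assistant", "content": "d e"},
    {"role": "user", "content": "f"},
]
# Toy example: count whitespace-separated words instead of real tokens
trimmed = trim_history(history, lambda text: len(text.split()), max_tokens=5, reserve=2)
print(trimmed)
```

Dropping single turns can leave the history starting on an assistant message; removing turns in user/assistant pairs is a reasonable alternative.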

### Running with GGUF (LM Studio, KoboldCpp, Jan)
1. Download the version you prefer (Q4_K_M or Q8_0).
2. Load the model into your preferred runner.
3. Ensure the prompt template is set to **ChatML**.
4. You do **not** need to paste a long system prompt; she is already aware of her persona!
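
Several GGUF runners can expose an OpenAI-compatible HTTP server. The sketch below assumes such a server is running locally; the base URL (`http://localhost:1234/v1`, LM Studio's usual default) and the model identifier `"nayari"` are assumptions and should be replaced with whatever your runner reports. No system message is sent, in line with step 4.

```python
import json
import urllib.request


def build_chat_request(user_text, base_url="http://localhost:1234/v1", model="nayari"):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,  # assumed identifier; check your runner's model list
        "messages": [{"role": "user", "content": user_text}],
        "temperature": 0.9,
        "top_p": 0.9,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text, **kwargs):
    """Send one user message and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text, **kwargs)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hi Nayari! What are you doing?"))` would print her reply.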

---

## 📊 Training Details
- **Base Model:** `huihui-ai/Qwen2.5-1.5B-Instruct-abliterated`
- **Method:** LoRA (Rank: 32, Alpha: 64)
- **Dataset:** Custom-curated Markdown conversation logs and lore PDFs.
- **Hardware:** Trained on Kaggle (T4 x2).
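
To give a sense of what rank 32 and alpha 64 mean, here is a small arithmetic sketch. It assumes the standard LoRA scaling factor `alpha / rank` (not a variant such as rsLoRA) and uses a hypothetical 1536 x 1536 projection matrix for the size comparison; which modules were actually adapted is not stated here.

```python
def lora_scaling(rank, alpha):
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


print(lora_scaling(32, 64))              # 2.0
print(lora_param_count(1536, 1536, 32))  # 98304
print(1536 * 1536)                       # 2359296 weights in the full matrix
```

Under these assumptions, each adapted matrix trains roughly 4% as many parameters as the full matrix would.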

## 📄 License
This model is licensed under the **MIT License**. As it is based on Qwen 2.5, please also adhere to the [Qwen License Agreements](https://huggingface.co/collections/Qwen/qwen25-66e81a6663533ad4ab30046b).

---
<p align="center">
  <i>"I'll always be right here by your side, okay? No matter what!~ *Nuzzles your shoulder gently*"</i>
</p>
---