Architecture used: Mistral
Model Quantization Type: F16
It is NSFW (uncensored). Ask it whatever you want and it won't reject you like your crush did
Erotic-Model.v1
Erotic-Model.v1 is a SLERP merge of the following models:
- OpenPipe/mistral-ft-optimized-1218
- mlabonne/NeuralHermes-2.5-Mistral-7B
- (I will be merging more models)
🧩 Configuration
slices:
  - sources:
      - model: OpenPipe/mistral-ft-optimized-1218
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: OpenPipe/mistral-ft-optimized-1218
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
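To give a feel for what the `slerp` merge method and the `t` lists in the configuration do, here is a minimal pure-Python sketch. It is not the merge tool's actual code: `slerp` and `layer_t` are hypothetical helper names, and the sketch assumes that the five anchor values in each `t` list are stretched linearly across the 32 layers, with `t = 0` keeping the base model's weights and `t = 1` taking the other model's.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    # Nearly parallel vectors: fall back to plain linear interpolation.
    if math.sin(theta) < 1e-6:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def layer_t(anchors, num_layers):
    """Stretch a short list of anchor values linearly across num_layers layers
    (assumed behaviour, see the note above)."""
    out = []
    for i in range(num_layers):
        pos = i / (num_layers - 1) * (len(anchors) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        out.append((1 - frac) * anchors[lo] + frac * anchors[hi])
    return out

# Per-layer interpolation factors for the two filters in the config.
self_attn_t = layer_t([0, 0.5, 0.3, 0.7, 1], 32)
mlp_t = layer_t([1, 0.5, 0.7, 0.3, 0], 32)

# Toy example: blend two 2-D "weight vectors" halfway.
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions the self-attention weights start close to the base model in the first layers and lean toward the second model in the last ones, while the MLP weights do the opposite; every other tensor uses the fixed value 0.5.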
How to Use
!pip install -qU transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
checkpoint = "NeuralFucker/Erotic-Model.v1"
# Quantization config (Q4)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    quantization_config=bnb_config,
    device_map="auto",
)
# I haven't set a chat_template yet, so use this plain prompt format for now or write your own
prompt = (
    "System: You are a friendly chatbot who always responds in the style of a pirate.\n"
    "User: How many helicopters can a human eat in one sitting?\n"
    "Assistant:"
)
# Tokenize manually
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.7,
    )
# Decode
print(tokenizer.decode(output[0], skip_special_tokens=True))
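Since no chat template is set, longer conversations have to be formatted by hand. A small helper along these lines may make that less error-prone. `build_prompt` is a hypothetical function, not part of the model or of `transformers`, and it simply reproduces the plain `System:` / `User:` / `Assistant:` layout used above.

```python
def build_prompt(system, turns):
    """Format a conversation in the plain System/User/Assistant layout.

    turns is a list of (role, text) pairs, where role is "User" or "Assistant".
    The returned string ends with "Assistant:" so the model writes the next reply.
    """
    lines = [f"System: {system}"]
    for role, text in turns:
        lines.append(f"{role}: {text}")
    lines.append("Assistant:")
    return "\n".join(lines)

prompt = build_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate.",
    [("User", "How many helicopters can a human eat in one sitting?")],
)
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer exactly as in the example above. For multi-turn chats, append the model's previous reply as an `("Assistant", ...)` pair followed by the next `("User", ...)` pair.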
