poolside/Laguna-XS.2

Tags: Text Generation · Transformers · Safetensors · laguna · laguna-xs.2 · vllm · conversational · custom_code · Eval Results

Instructions for using poolside/Laguna-XS.2 with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.

  • Libraries
  • Transformers

    How to use poolside/Laguna-XS.2 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline
    
    pipe = pipeline("text-generation", model="poolside/Laguna-XS.2", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    tokenizer = AutoTokenizer.from_pretrained("poolside/Laguna-XS.2", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("poolside/Laguna-XS.2", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
    	messages,
    	add_generation_prompt=True,
    	tokenize=True,
    	return_dict=True,
    	return_tensors="pt",
    ).to(model.device)
    
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
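    The snippet above decodes the full output only after generation finishes. As an illustrative variant (not part of the model card), transformers' TextStreamer can print tokens to stdout as they are produced; the small strip_prompt helper mirrors the slicing used above and the generation settings are assumptions:

    ```python
    # Streaming variant of the Transformers snippet above.
    # The model download and generation are guarded so importing this
    # file does not trigger them.
    def strip_prompt(output_ids, prompt_len):
        """Drop the prompt tokens from a generated sequence,
        mirroring outputs[0][inputs["input_ids"].shape[-1]:] above."""
        return output_ids[prompt_len:]

    if __name__ == "__main__":
        from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

        model_id = "poolside/Laguna-XS.2"
        tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

        messages = [{"role": "user", "content": "Who are you?"}]
        inputs = tokenizer.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt",
        ).to(model.device)

        # TextStreamer prints decoded tokens as generate() produces them.
        streamer = TextStreamer(tokenizer, skip_prompt=True)
        model.generate(**inputs, max_new_tokens=40, streamer=streamer)
    ```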
  • Notebooks
  • Google Colab
  • Kaggle
  • Local Apps
  • vLLM

    How to use poolside/Laguna-XS.2 with vLLM:

    Install from pip and serve the model
    # Install vLLM from pip:
    pip install vllm
    # Start the vLLM server:
    vllm serve "poolside/Laguna-XS.2"
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "poolside/Laguna-XS.2",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
    Use Docker
    docker model run hf.co/poolside/Laguna-XS.2
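    The curl call above can also be made from Python. As a sketch, the official OpenAI client (pip install openai) works against vLLM's OpenAI-compatible endpoint; the base_url and "EMPTY" api_key match common `vllm serve` defaults and are assumptions, so adjust them to your deployment:

    ```python
    # Query the vLLM server started above from Python. The network call is
    # guarded so this file can be imported without a running server.
    def build_chat_request(model, content):
        """Build the same request body the curl example sends."""
        return {
            "model": model,
            "messages": [{"role": "user", "content": content}],
        }

    if __name__ == "__main__":
        from openai import OpenAI

        # vLLM serves an OpenAI-compatible API; the key is unused by default.
        client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
        req = build_chat_request("poolside/Laguna-XS.2", "What is the capital of France?")
        resp = client.chat.completions.create(**req)
        print(resp.choices[0].message.content)
    ```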
  • SGLang

    How to use poolside/Laguna-XS.2 with SGLang:

    Install from pip and serve the model
    # Install SGLang from pip:
    pip install sglang
    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "poolside/Laguna-XS.2" \
        --host 0.0.0.0 \
        --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "poolside/Laguna-XS.2",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
    Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "poolside/Laguna-XS.2" \
            --host 0.0.0.0 \
            --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "poolside/Laguna-XS.2",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
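    Because SGLang also exposes an OpenAI-compatible API, the same request can be sent with only the Python standard library; this sketch assumes the server from the previous step is listening on port 30000:

    ```python
    # Call the SGLang server above using only the standard library.
    # The network call is guarded so this file can be imported offline.
    import json
    from urllib import request

    def build_payload(model, content):
        """Build the same request body the curl example sends."""
        return {
            "model": model,
            "messages": [{"role": "user", "content": content}],
        }

    if __name__ == "__main__":
        payload = build_payload("poolside/Laguna-XS.2", "What is the capital of France?")
        req = request.Request(
            "http://localhost:30000/v1/chat/completions",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        # Request defaults to POST when a data body is supplied.
        with request.urlopen(req) as resp:
            body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
    ```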
  • Docker Model Runner

    How to use poolside/Laguna-XS.2 with Docker Model Runner:

    docker model run hf.co/poolside/Laguna-XS.2

Update README.md (#4)

by varunrandery · opened 10 days ago
base: refs/heads/main ← from: refs/pr/4
+33 −297
initial commit (76de4272)
Update README.md (8048f76c)
Laguna-XS v1.4 base (step 1207000) (bf162d42f96b)
Update README.md (e2f4e4e7)
Update README.md (#1) (0fb0c2d7)
Update README.md (a4fde8ec)
Upload Laguna-XS.2 checkpoint (4d482976)
Strip Forge-emitted dead keys (quantization_config, use_bidirectional_attention) from config.json (2747fba6)
Update README.md (#2) (26f1402f)
Update README.md (#3) (b056a21e)
Update README.md (cf73f635)
Upload chat_template.jinja (7a9028a1)
Sync bundled HF code with upstream Laguna PR (v5 schema) (94107a26)
Sync bundled HF code with upstream Laguna PR (v5 schema) (825ca3a2)
Update README.md (afed5b90)
Update README.md (9f8fd388)
Hoist original_max_position_embeddings to top of rope_parameters as a workaround for an upstream transformers rope-utils bug that KeyErrors on nested-yarn configs. Per-layer-type rope sub-dicts are unchanged; runtime behavior is unaffected. (26d8c7a0)
Add vLLM and Transformers usage snippets (268ec274)
varunrandery (Poolside org) · 10 days ago
No description provided.
Update README.md (d283c24a)
varunrandery changed pull request status to merged 10 days ago
varunrandery deleted the refs/pr/4 ref 10 days ago

