hybridaione/LFM2.5-1.2B-Text2SQL

@@ -5,98 +5,43 @@ tags:
 - text-to-sql
 - sql
 - fine-tuned
-- mlx
-- lora
-datasets:
-- synthetic
 language:
 - en
 pipeline_tag: text-generation
 ---
 # LFM2.5-1.2B-Text2SQL
-A fine-tuned version of [LiquidAI/LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) optimized for text-to-SQL generation.
-## Model Description
-This model was fine-tuned using LoRA on 2000 synthetic text-to-SQL examples generated via knowledge distillation from DeepSeek V3. The fine-tuning was performed using MLX on Apple Silicon.
-## Performance
-| Metric | Teacher (DeepSeek V3) | Base (LFM2.5 1.2B) | This Model |
-|--------|----------------------|-------------------|------------|
-| **Exact Match** | 60% | 48% | **66%** |
-| **LLM-as-Judge** | 90% | 75% | 87% |
-| **ROUGE-L** | 0.917 | 0.830 | **0.931** |
-| **BLEU** | 0.852 | 0.695 | **0.870** |
-| **Semantic Similarity** | 0.965 | 0.926 | **0.970** |
-The fine-tuned model **beats the teacher on 4 out of 5 metrics** despite being significantly smaller.
-## Training Details
-- **Base Model:** LiquidAI/LFM2.5-1.2B-Instruct
-- **Fine-tuning Method:** LoRA (rank 8)
-- **Training Data:** 2000 synthetic examples
-- **Epochs:** 2 (checkpoint 1800)
-- **Hardware:** Apple Silicon (MLX)
-## Usage
-### With vLLM
 ```python
 from vllm import LLM, SamplingParams
 llm = LLM(model="hybridaione/LFM2.5-1.2B-Text2SQL")
-sampling_params = SamplingParams(temperature=0, max_tokens=512)
-prompt = """<|im_start|>system
-You are an expert SQL writer. Given a database schema and natural language question, write the precise SQL query that answers it. Output only the SQL query with no explanation.<|im_end|>
 <|im_start|>user
 Schema:
-CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
-Question: How many users are there?<|im_end|>
 <|im_start|>assistant
-"""
-output = llm.generate([prompt], sampling_params)
-print(output[0].outputs[0].text)
 ```
-### With Transformers
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("hybridaione/LFM2.5-1.2B-Text2SQL")
-tokenizer = AutoTokenizer.from_pretrained("hybridaione/LFM2.5-1.2B-Text2SQL")
-```
-### With MLX (Apple Silicon)
-```python
-from mlx_lm import load, generate
-model, tokenizer = load("hybridaione/LFM2.5-1.2B-Text2SQL")
-response = generate(model, tokenizer, prompt="...", max_tokens=512)
-```
-## Prompt Format
-```
-<|im_start|>system
-You are an expert SQL writer. Given a database schema and natural language question, write the precise SQL query that answers it. Output only the SQL query with no explanation.<|im_end|>
-<|im_start|>user
-Schema:
-{CREATE TABLE statements}
-Question: {natural language question}<|im_end|>
-<|im_start|>assistant
-```
-## License
-Apache 2.0

 - text-to-sql
 - sql
 - fine-tuned
 language:
 - en
 pipeline_tag: text-generation
+library_name: transformers
 ---
 # LFM2.5-1.2B-Text2SQL
+Fine-tuned [LiquidAI/LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) for text-to-SQL.
+## Performance (vs Teacher: DeepSeek V3)
+| Metric | Base | **Finetuned** | Teacher |
+|--------|------|---------------|---------|
+| Exact Match | 48% | **66%** | 60% |
+| LLM-as-Judge | 75% | **87%** | 90% |
+| ROUGE-L | 0.830 | **0.931** | 0.917 |
+| BLEU | 0.695 | **0.870** | 0.852 |
+## Usage with vLLM
 ```python
 from vllm import LLM, SamplingParams
 llm = LLM(model="hybridaione/LFM2.5-1.2B-Text2SQL")
+prompt = '''<|im_start|>system
+You are an expert SQL writer.<|im_end|>
 <|im_start|>user
 Schema:
+CREATE TABLE users (id INTEGER, name TEXT);
+Question: Count all users<|im_end|>
 <|im_start|>assistant
+'''
+output = llm.generate([prompt], SamplingParams(temperature=0, max_tokens=256))
 ```
+## Other Formats
+- **MLX**: [hybridaione/LFM2.5-1.2B-Text2SQL-MLX](https://huggingface.co/hybridaione/LFM2.5-1.2B-Text2SQL-MLX)
+- **GGUF**: [hybridaione/LFM2.5-1.2B-Text2SQL-GGUF](https://huggingface.co/hybridaione/LFM2.5-1.2B-Text2SQL-GGUF)
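
The prompt in the updated vLLM example follows a fixed ChatML-style layout: system turn, user turn with schema and question, then an open assistant turn. A minimal sketch of building that string programmatically is shown below. `build_prompt` is a hypothetical helper, not part of the model repository or of vLLM, and the default system message is the shortened one from the new README.

```python
# Hypothetical helper (not part of the repo): assembles the same prompt
# string that the updated README passes to vLLM.
def build_prompt(schema: str, question: str,
                 system: str = "You are an expert SQL writer.") -> str:
    """Return a ChatML-style prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\nSchema:\n{schema}\n"
        f"Question: {question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_prompt(
    "CREATE TABLE users (id INTEGER, name TEXT);",
    "Count all users",
)

# With vLLM installed, the string can be used as in the README:
#   output = llm.generate([prompt], SamplingParams(temperature=0, max_tokens=256))
#   print(output[0].outputs[0].text)  # generated SQL, as in the previous README
```

Keeping the template in one function makes it harder for the schema or question to drift out of the layout shown in the README.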

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4cf94c6e920ac14b07c8e8a21db070ad54e588ff52631a932bc8755d16ca89a4
-size 2340697867

 version https://git-lfs.github.com/spec/v1
+oid sha256:e7dd4935411cecb0abf5ac7c7ff34ecdf462cf6d39d77d7454c55b4385531215
+size 2340697904
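
The two pointer versions above differ in both hash and size. As a rough sketch, assuming the three-line `key value` pointer layout shown in this diff, the fields can be parsed and compared as follows. `parse_lfs_pointer` is a hypothetical helper, not a git-lfs API.

```python
# Hypothetical helper: parses the "key value" lines of a Git LFS pointer file.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


OLD = """version https://git-lfs.github.com/spec/v1
oid sha256:4cf94c6e920ac14b07c8e8a21db070ad54e588ff52631a932bc8755d16ca89a4
size 2340697867"""

NEW = """version https://git-lfs.github.com/spec/v1
oid sha256:e7dd4935411cecb0abf5ac7c7ff34ecdf462cf6d39d77d7454c55b4385531215
size 2340697904"""

old, new = parse_lfs_pointer(OLD), parse_lfs_pointer(NEW)
delta = new["size"] - old["size"]          # difference in bytes
weights_changed = old["oid"] != new["oid"]  # a different hash means different file contents
print(f"size delta: {delta} bytes, contents changed: {weights_changed}")
```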