---
license: apache-2.0
base_model: LiquidAI/LFM2.5-1.2B-Instruct
tags:
- text2sql
- sql
- fine-tuned
- lora
- pytorch
datasets:
- synthetic
language:
- en
pipeline_tag: text-generation
---

# LFM2.5-1.2B-Text2SQL (PyTorch)

A fine-tuned version of [LiquidAI/LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) for Text-to-SQL generation.

## Model Description

This model was fine-tuned on 2000 synthetic Text-to-SQL examples generated with a teacher model (DeepSeek V3). Fine-tuning was performed with LoRA adapters using MLX on Apple Silicon, and the trained adapters were then fused into the base model.

### Training Details

- **Base Model**: LiquidAI/LFM2.5-1.2B-Instruct
- **Training Data**: 2000 synthetic examples
- **Training Method**: LoRA fine-tuning (FP16)
- **Iterations**: 5400
- **Hardware**: Apple Silicon (MLX)

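For reproducing a run like this with MLX, `mlx_lm.lora` accepts training data as JSONL with `prompt`/`completion` fields (one of several formats it supports). A minimal sketch of preparing one record in that shape — the schema and query below are illustrative, not taken from the actual training set:

```python
import json

# Illustrative record in the prompt/completion JSONL format accepted by
# mlx_lm.lora; the actual 2000 training examples were teacher-generated.
example = {
    "prompt": (
        "CREATE TABLE employees (id INT, name VARCHAR, salary DECIMAL);\n\n"
        "Question: What are the names of employees earning more than 50000?"
    ),
    "completion": "SELECT name FROM employees WHERE salary > 50000;",
}

# Each line of train.jsonl is one JSON object with these two keys
with open("train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")

with open("train.jsonl") as f:
    record = json.loads(f.readline())
print(sorted(record.keys()))  # ['completion', 'prompt']
```

A run along these lines would then point `mlx_lm.lora --train` at a data directory containing `train.jsonl` (and a matching `valid.jsonl`), before fusing the adapters into the base weights.
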
## Performance

### Model Comparison

| Metric | Teacher (DeepSeek V3) | Base Model | Fine-tuned |
|--------|----------------------|------------|------------|
| Exact Match | 60% | 48% | **72%** |
| LLM-as-Judge | 90% | 75% | 87% |
| ROUGE-L | 92% | 83% | **94%** |
| BLEU | 85% | 70% | **89%** |
| Semantic Similarity | 96% | 93% | **97%** |

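Exact Match is the strictest metric in the table: a prediction only counts when the generated SQL matches the reference query after normalization. A minimal sketch of such a check, assuming normalization of case, whitespace, and trailing semicolons (the evaluation script itself is not part of this card):

```python
import re

def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon."""
    sql = re.sub(r"\s+", " ", sql.strip().lower())
    return sql.rstrip(";").strip()

def exact_match(pred: str, gold: str) -> bool:
    """True when prediction and reference are identical after normalization."""
    return normalize_sql(pred) == normalize_sql(gold)

print(exact_match(
    "SELECT name FROM employees WHERE salary > 50000;",
    "select name\nfrom employees\nwhere salary > 50000",
))  # True
```

Metrics like ROUGE-L and semantic similarity are more forgiving, which is why their absolute numbers run higher than Exact Match for every model.
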
### Training Progression

The model shows consistent improvement across all checkpoints, with no signs of overfitting.

## Usage

### PyTorch / Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "furukama/LFM2.5-1.2B-Text2SQL",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "furukama/LFM2.5-1.2B-Text2SQL", trust_remote_code=True
)

# Provide the schema and the question in a single user message
prompt = '''CREATE TABLE employees (id INT, name VARCHAR, salary DECIMAL);

Question: What are the names of employees earning more than 50000?'''

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

- Trained on synthetic data for a specific database schema
- Best suited to SQL query patterns similar to those seen during training
- May not generalize well to very different database schemas

## License

This model is released under the Apache 2.0 license, following the base model's license.