smitxxiv committed on
Commit b587998 · verified · 1 Parent(s): 6d4df00

Update model card

Files changed (1)
  1. README.md +55 -21
README.md CHANGED
@@ -1,55 +1,89 @@
1
  ---
2
  base_model: Qwen/Qwen3-Reranker-4B
 
3
  library_name: transformers
4
  pipeline_tag: text-classification
5
  tags:
6
  - text-to-sql
 
 
7
  - sql
 
8
  - template-matching
 
 
 
9
  - nli
10
  - paraphrase
11
  - reranker
 
12
  - qwen3
13
  language:
14
  - en
15
  license: apache-2.0
16
  ---
17
 
18
- # Qwen3-Reranker-4B SQL Template Matcher
19
 
20
- Fine-tune of [`Qwen/Qwen3-Reranker-4B`](https://huggingface.co/Qwen/Qwen3-Reranker-4B) as a cross-encoder NLI classifier over pairs of natural-language questions. Given a user's question and a candidate question (with entity values masked), it predicts whether the user question is a paraphrase of the candidate question.
21
 
22
- ### Inputs
23
 
24
- A pair of natural-language questions fed through the tokenizer as a standard cross-encoder input. Order matters — premise must be the masked candidate, hypothesis the raw user question:
25
-
26
- ```
27
- Premise: "Show movies released in _ sorted by popularity desc"
28
- Hypothesis: "What are the top films from 2010 by viewer count?"
29
- ```
30
 
31
- Entity values in the premise are masked with a space-padded underscore `_`. All literal types (numbers, strings, dates) use the same token. Swapping the order or using a different masking convention will degrade performance.
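The masking convention described above can be illustrated with a minimal sketch. This is not TeCoD's actual masking code: the helper name `mask_values` and the idea of passing the literal values explicitly are assumptions made for illustration. It only shows how every literal, whatever its type, collapses to the same space-padded `_` token.

```python
import re

MASK = "_"  # single mask token shared by numbers, strings, and dates


def mask_values(question, values):
    """Replace each literal value in `question` with a space-padded underscore.

    Hypothetical helper: `values` is assumed to be the list of literal
    strings to hide (for example, taken from the gold SQL of a template).
    """
    masked = question
    # Mask longer values first so a value contained in another is not split.
    for value in sorted(values, key=len, reverse=True):
        # Lookarounds keep us from masking a value embedded inside a longer word.
        pattern = r"(?<!\w)" + re.escape(value) + r"(?!\w)"
        masked = re.sub(pattern, f" {MASK} ", masked)
    # Collapse runs of whitespace so each mask is padded by exactly one space.
    return re.sub(r"\s+", " ", masked).strip()


premise = mask_values(
    "Show movies released in 2010 sorted by popularity desc", ["2010"]
)
# premise == "Show movies released in _ sorted by popularity desc"
```

The resulting string would serve as the premise, with the raw user question as the hypothesis.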
32
 
33
- Training used the tokenizer's default max length with `truncation=True`; BIRD question pairs are typically short (~20–40 tokens each). Very long inputs are untested.
34
 
35
- ### Outputs
36
 
37
- Three-class logits with this mapping:
38
 
39
- | id | label | Meaning in this task |
40
- |---:|---|---|
41
- | 0 | `entailment` | the two questions are similar (correspond to the same SQL template) |
42
- | 1 | `neutral` | unused at training time; logit is untrained |
43
- | 2 | `contradiction` | the two questions are not similar |
44
 
45
- Use `softmax(logits)[0]` as the match score (`p(entailment)`).
 
 
 
 
 
46
 
47
  ## References
48
 
 
 
49
  - Base model: <https://huggingface.co/Qwen/Qwen3-Reranker-4B>
50
  - Training Data - BIRD Train Set: <https://bird-bench.github.io/>
51
- - Source repo: <https://github.com/SSLab-CSE-IITB/tecod>
52
 
53
  ## License
54
 
55
- Apache 2.0
 
1
  ---
2
  base_model: Qwen/Qwen3-Reranker-4B
3
+ base_model_relation: finetune
4
  library_name: transformers
5
  pipeline_tag: text-classification
6
  tags:
7
  - text-to-sql
8
+ - text2sql
9
+ - nl2sql
10
  - sql
11
+ - sql-generation
12
  - template-matching
13
+ - template-selection
14
+ - constrained-decoding
15
+ - database
16
  - nli
17
  - paraphrase
18
  - reranker
19
+ - cross-encoder
20
  - qwen3
21
  language:
22
  - en
23
  license: apache-2.0
24
  ---
25
 
26
+ # TeCoD SQL Template Matcher
27
 
28
+ Fine-tune of [`Qwen/Qwen3-Reranker-4B`](https://huggingface.co/Qwen/Qwen3-Reranker-4B) used by [TeCoD](https://github.com/SSLab-CSE-IITB/tecod), a template-guided constrained decoding system for text-to-SQL.
29
 
30
+ This model is the TeCoD template-matching reranker. It scores whether a user question matches a retrieved masked question/template, helping TeCoD select recurring SQL templates before generation.
31
 
32
+ - Project page: <https://sslab-cse-iitb.github.io/tecod/>
33
+ - Source repository: <https://github.com/SSLab-CSE-IITB/tecod>
34
+ - Base model: <https://huggingface.co/Qwen/Qwen3-Reranker-4B>
35
+ - Training data source: [BIRD](https://bird-bench.github.io/) train split.
 
 
36
 
37
+ ## Intended Use
38
 
39
+ This model is intended as an internal component of TeCoD and related template-based text-to-SQL systems. It is not a standalone SQL generator. In TeCoD, it is used after vector retrieval and before SQL generation to rerank candidate SQL templates.
40
 
41
+ ## Training Summary
42
 
43
+ - Base model: `Qwen/Qwen3-Reranker-4B`
44
+ - Architecture: `Qwen3ForSequenceClassification`
45
+ - Data: approximately 1.48M NLI pairs derived from BIRD questions.
46
+ - Positive pairs: template-paired questions, self paraphrases, and partner paraphrases that preserve the SQL template.
47
+ - Negative pairs: hard negatives mined using nearest-neighbor retrieval over masked questions, with both masked and unmasked query variants used during pair construction.
48
+ - Labels: `entailment`, `neutral`, `contradiction`.
49
+ - The `neutral` label is retained for compatibility with a 3-class NLI head but was not used as a training target.
50
 
51
+ ## Limitations
 
 
 
 
52
 
53
+ - Specialized for masked text-to-SQL question/template matching.
54
+ - Not intended for general NLI, semantic similarity, or SQL generation.
55
+ - Assumes the same masking convention and candidate-template construction used by TeCoD.
56
+ - The `neutral` label is untrained; inference should use entailment vs. contradiction or renormalize over labels `{0, 2}`.
57
+ - Very long question pairs and non-English inputs are not validated.
58
+ - The reranking score is one signal in a larger text-to-SQL pipeline; it does not guarantee final SQL correctness.
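The note on the untrained `neutral` label can be made concrete with a small sketch. It assumes the label ids from the earlier version of this card (0 = `entailment`, 1 = `neutral`, 2 = `contradiction`) and takes the three logits as a plain Python list; the function names are illustrative and not part of TeCoD. It compares a full 3-way softmax with a score renormalized over labels `{0, 2}`.

```python
import math

ENTAILMENT, NEUTRAL, CONTRADICTION = 0, 1, 2  # assumed label ids


def softmax(logits):
    """Numerically stable softmax over all logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def renormalized_match_score(logits):
    """p(entailment) computed over {entailment, contradiction} only.

    The neutral logit is ignored, so an arbitrary value there cannot
    distort the match score.
    """
    pair = [logits[ENTAILMENT], logits[CONTRADICTION]]
    return softmax(pair)[0]


# A large, meaningless neutral logit swamps the 3-way softmax ...
logits = [0.0, 10.0, 0.0]
three_way = softmax(logits)[ENTAILMENT]
# ... while the renormalized score depends only on labels 0 and 2.
two_way = renormalized_match_score(logits)  # 0.5
```

With a real model, `logits` would be the single row of classifier outputs for one premise/hypothesis pair, converted to a list of floats.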
59
 
60
  ## References
61
 
62
+ - TeCoD project page: <https://sslab-cse-iitb.github.io/tecod/>
63
+ - TeCoD source repo: <https://github.com/SSLab-CSE-IITB/tecod>
64
  - Base model: <https://huggingface.co/Qwen/Qwen3-Reranker-4B>
65
  - Training Data - BIRD Train Set: <https://bird-bench.github.io/>
66
+
67
+ If you use this model as part of TeCoD, please cite:
68
+
69
+ ```bibtex
70
+ @article{10.1145/3769822,
71
+ author = {Jivani, Smit and Maheshwari, Saravam and Sarawagi, Sunita},
72
+ title = {Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding},
73
+ journal = {Proceedings of the ACM on Management of Data},
74
+ volume = {3},
75
+ number = {6},
76
+ pages = {1--26},
77
+ year = {2025},
78
+ month = dec,
79
+ publisher = {Association for Computing Machinery},
80
+ address = {New York, NY, USA},
81
+ doi = {10.1145/3769822},
82
+ url = {https://doi.org/10.1145/3769822}
83
+ }
84
+ ```
85
+
86
 
87
  ## License
88
 
89
+ Apache 2.0