faisalmumtaz committed
Commit 5a54a35 · verified · 1 Parent(s): f44098a

Upload CodeCompass-Embed v2 — #1 on CSN-Python (NDCG@10=0.979), 12-task CoIR eval

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -189,10 +189,10 @@ For optimal performance, use these instruction prefixes for queries:
 Training followed a two-stage approach:
 
 **Stage 1 — Embedding Conversion** (8.8M samples):
-Converted Qwen2.5-Coder-0.5B from a causal language model to a bidirectional embedding model. Trained on 8.8M samples spanning CoRNStack (Python, Java, JavaScript, Go, Ruby, PHP), CoderPile, StackOverflow, and synthetic SQL data with mined hard negatives.
+Converted Qwen2.5-Coder-0.5B from a causal language model to a bidirectional embedding model. Trained on 8.8M samples spanning CoRNStack (Python, Java, JavaScript, Go, Ruby, PHP), CoderPile, StackOverflow, and synthetic data with mined hard negatives.
 
 **Stage 2 — Hard Negative Refinement** (100K samples):
-Continued fine-tuning on a curated 100K-sample subset with up to 8 hard negatives per sample.
+Continued fine-tuning on a curated 100K-sample subset with hard negatives.
 
 - **Base Model**: [Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B)
 - **Architecture**: Bidirectional attention across all 24 layers, mean pooling, L2 normalization
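
The architecture line in the diff mentions mean pooling and L2 normalization over the encoder's token states. As a minimal NumPy sketch of what that pooling step typically looks like — the `embed` helper, shapes, and mask convention here are illustrative assumptions, not the model's actual API:

```python
import numpy as np

def embed(token_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean-pool token states over non-padding positions, then L2-normalize."""
    mask = attention_mask[:, None].astype(token_states.dtype)  # (seq_len, 1)
    pooled = (token_states * mask).sum(axis=0) / mask.sum()    # masked mean over tokens
    return pooled / np.linalg.norm(pooled)                     # unit-length embedding

# With unit-length vectors, cosine similarity reduces to a plain dot product.
rng = np.random.default_rng(0)
states = rng.standard_normal((5, 8))   # 5 tokens, hidden size 8 (toy values)
mask = np.array([1, 1, 1, 0, 0])       # last two positions are padding
vec = embed(states, mask)
print(round(float(np.linalg.norm(vec)), 6))  # → 1.0
```

Normalizing to unit length is what lets retrieval systems score query–code pairs with a dot product instead of a full cosine computation.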