most-probably-neb committed on
Commit
dfccd40
·
1 Parent(s): d365545

Update model card and samples

Files changed (3)
  1. README.md +51 -8
  2. sample.output +17 -0
  3. sample.prompt +368 -0
README.md CHANGED
@@ -3,19 +3,62 @@ base_model: ByteDance-Seed/Seed-Coder-8B-Base
3
  tags:
4
  - text-generation-inference
5
  - transformers
6
- - unsloth
7
- - llama
8
  license: apache-2.0
9
  language:
10
  - en
11
  ---
12
 
13
- # Uploaded finetuned model
14
 
15
- - **Developed by:** zed-industries
16
- - **License:** apache-2.0
17
- - **Finetuned from model :** ByteDance-Seed/Seed-Coder-8B-Base
18
 
19
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
20
 
21
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
3
  tags:
4
  - text-generation-inference
5
  - transformers
6
+ - edit-prediction
7
+ - next-edit-suggestion
8
  license: apache-2.0
9
  language:
10
  - en
11
  ---
12
 
13
+ # Zeta 2.1
14
 
15
+ Zeta 2.1 is a code edit prediction (also known as next-edit suggestion) model fine-tuned from `ByteDance-Seed/Seed-Coder-8B-Base`.
 
 
16
 
17
+ Given code context, edit history, and an editable region around the cursor, it predicts the rewritten content for that region.
18
 
19
+ - **Developed by:** Zed Industries
20
+ - **License:** Apache-2.0
21
+ - **Fine-tuned from:** ByteDance-Seed/Seed-Coder-8B-Base
22
+ - **Model version:** 0323-multi-region-filtered-r3
23
+
24
+ ## Prompt format
25
+
26
+ The model uses an SPM (suffix-prefix-middle) style prompt, with numbered markers (`<|marker_N|>`) delimiting the editable regions.
27
+
28
+
29
+ Here is a minimal example:
30
+
31
+ ```
32
+ <[fim-suffix]>
33
+ code after editable region
34
+ <[fim-prefix]><filename>related/file.py
35
+ related file content
36
+
37
+ <filename>edit_history
38
+ --- a/some_file.py
39
+ +++ b/some_file.py
40
+ -old
41
+ +new
42
+
43
+ <filename>path/to/target_file.py
44
+ code before editable region
45
+ <|marker_0|>
46
+ code that
47
+ needs to<|user_cursor|>
48
+ be rewritten
49
+ <|marker_1|>
50
+ <[fim-middle]>
51
+ ```
52
+
53
+ Expected output (should be generated by the model, without backticks):
54
+
55
+ ```
56
+ <|marker_0|>
57
+ revised content for
58
+ the editable region
59
+ <|marker_1|>
60
+ ```
61
+
62
+ Here is a real-world example:
63
+ - [Sample prompt input](./sample.prompt)
64
+ - [Sample model output](./sample.output)
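
The minimal example in the README can be assembled programmatically. The following is a rough sketch, not code from the Zeta or Zed repositories: `build_prompt`, its parameter names, and the exact newline placement are assumptions inferred from the minimal example, so compare the result against `sample.prompt` before relying on it.

```python
# Special tokens copied from the README's prompt-format example.
FIM_SUFFIX = "<[fim-suffix]>"
FIM_PREFIX = "<[fim-prefix]>"
FIM_MIDDLE = "<[fim-middle]>"
FILENAME = "<filename>"
CURSOR = "<|user_cursor|>"


def marker(index: int) -> str:
    """Return the numbered region marker, e.g. <|marker_0|>."""
    return f"<|marker_{index}|>"


def build_prompt(related_files, edit_history, target_path, before, segments, after):
    """Assemble an SPM-style prompt (hypothetical helper).

    related_files: list of (path, content) pairs
    edit_history:  unified-diff text describing recent edits
    before/after:  target-file text outside the editable region
    segments:      consecutive chunks of the editable region; one of them
                   is expected to contain CURSOR at the cursor position
    """
    # The suffix comes first, then the prefix section opens.
    parts = [f"{FIM_SUFFIX}\n{after}\n{FIM_PREFIX}"]
    for path, content in related_files:
        parts.append(f"{FILENAME}{path}\n{content}\n\n")
    parts.append(f"{FILENAME}edit_history\n{edit_history}\n\n")
    parts.append(f"{FILENAME}{target_path}\n{before}\n")
    # Markers are numbered from 0 and placed on their own lines (assumption).
    for index, segment in enumerate(segments):
        parts.append(f"{marker(index)}\n{segment}\n")
    parts.append(f"{marker(len(segments))}\n{FIM_MIDDLE}")
    return "".join(parts)


prompt = build_prompt(
    related_files=[("related/file.py", "related file content")],
    edit_history="--- a/some_file.py\n+++ b/some_file.py\n-old\n+new",
    target_path="path/to/target_file.py",
    before="code before editable region",
    segments=[f"code that\nneeds to{CURSOR}\nbe rewritten"],
    after="code after editable region",
)
print(prompt)
```

Passing more than one entry in `segments` yields additional markers (`<|marker_2|>`, and so on), which is how this sketch models the multi-region case.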
sample.output ADDED
@@ -0,0 +1,17 @@
 
1
+ <|marker_1|> return;
2
+ };
3
+ let path = project.read(cx).path_for_entry(*active_entry_id, cx);
4
+ if let Some(path) = path {
5
+ if let Some(ix) = project_state
6
+ .recent_paths
7
+ .iter()
8
+ .position(|probe| probe == &path)
9
+ {
10
+ project_state.recent_paths.remove(ix);
11
+ }
12
+ project_state.recent_paths.push_front(path);
13
+ }
14
+ }
15
+ project::Event::DiskBasedDiagnosticsFinished<|user_cursor|> { .. } => {
16
+ if cx.has_flag::<EditPredictionJumpsFeatureFlag>() {
17
+ <|marker_2|>
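
The sample output above opens with `<|marker_1|>` and closes with `<|marker_2|>`, and it still contains `<|user_cursor|>`. One plausible reading, which is an assumption rather than documented behavior, is that the model names the marked span it rewrites and reports where the cursor should end up. Under that assumption, a consumer might parse and apply the output roughly as follows; `Prediction`, `parse_output`, and `apply_prediction` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

MARKER_RE = re.compile(r"<\|marker_(\d+)\|>")
CURSOR = "<|user_cursor|>"


@dataclass
class Prediction:
    start: int                    # number of the opening marker
    end: int                      # number of the closing marker
    text: str                     # rewritten content, cursor token removed
    cursor_offset: Optional[int]  # cursor position within text, if present


def parse_output(output: str) -> Prediction:
    """Split model output into its marker numbers and rewritten text."""
    matches = list(MARKER_RE.finditer(output))
    if len(matches) < 2:
        raise ValueError("expected an opening and a closing marker")
    first, last = matches[0], matches[-1]
    body = output[first.end():last.start()]
    offset = body.find(CURSOR)
    return Prediction(
        start=int(first.group(1)),
        end=int(last.group(1)),
        text=body.replace(CURSOR, ""),
        cursor_offset=offset if offset >= 0 else None,
    )


def apply_prediction(marked_region: str, prediction: Prediction) -> str:
    """Replace the text between the two named markers, keeping the markers."""
    start_tag = f"<|marker_{prediction.start}|>"
    end_tag = f"<|marker_{prediction.end}|>"
    start = marked_region.index(start_tag) + len(start_tag)
    end = marked_region.index(end_tag, start)
    return marked_region[:start] + prediction.text + marked_region[end:]


region = "<|marker_0|>a\n<|marker_1|>b\n<|marker_2|>c\n<|marker_3|>"
prediction = parse_output("<|marker_1|>B<|user_cursor|>\n<|marker_2|>")
print(apply_prediction(region, prediction))
```

Keeping the markers in the returned string makes it possible to apply a further prediction to the same region; a real integration would strip them before writing back to the buffer.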
sample.prompt ADDED
@@ -0,0 +1,368 @@
 
1
+ <[fim-suffix]>
2
+ <[fim-prefix]><filename>zed/crates/project/src/project.rs
3
+ pub struct Project {
4
+ active_entry: Option<ProjectEntryId>,
5
+ buffer_ordered_messages_tx: mpsc::UnboundedSender<BufferOrderedMessage>,
6
+ languages: Arc<LanguageRegistry>,
7
+ dap_store: Entity<DapStore>,
8
+ agent_server_store: Entity<AgentServerStore>,
9
+
10
+ breakpoint_store: Entity<BreakpointStore>,
11
+ collab_client: Arc<client::Client>,
12
+ join_project_response_message_id: u32,
13
+ task_store: Entity<TaskStore>,
14
+ user_store: Entity<UserStore>,
15
+ fs: Arc<dyn Fs>,
16
+ remote_client: Option<Entity<RemoteClient>>,
17
+ // todo lw explain the client_state x remote_client matrix, its super confusing
18
+ client_state: ProjectClientState,
19
+ git_store: Entity<GitStore>,
20
+ collaborators: HashMap<proto::PeerId, Collaborator>,
21
+ client_subscriptions: Vec<client::Subscription>,
22
+ worktree_store: Entity<WorktreeStore>,
23
+ buffer_store: Entity<BufferStore>,
24
+ context_server_store: Entity<ContextServerStore>,
25
+ image_store: Entity<ImageStore>,
26
+ lsp_store: Entity<LspStore>,
27
+ _subscriptions: Vec<gpui::Subscription>,
28
+ buffers_needing_diff: HashSet<WeakEntity<Buffer>>,
29
+ git_diff_debouncer: DebouncedDelay<Self>,
30
+ remotely_created_models: Arc<Mutex<RemotelyCreatedModels>>,
31
+ terminals: Terminals,
32
+ node: Option<NodeRuntime>,
33
+ search_history: SearchHistory,
34
+ search_included_history: SearchHistory,
35
+ search_excluded_history: SearchHistory,
36
+ snippets: Entity<SnippetProvider>,
37
+ environment: Entity<ProjectEnvironment>,
38
+ settings_observer: Entity<SettingsObserver>,
39
+ toolchain_store: Option<Entity<ToolchainStore>>,
40
+ agent_location: Option<AgentLocation>,
41
+ downloading_files: Arc<Mutex<HashMap<(WorktreeId, String), DownloadingFile>>>,
42
+ }
43
+ ...
44
+ pub enum Event {
45
+ LanguageServerAdded(LanguageServerId, LanguageServerName, Option<WorktreeId>),
46
+ LanguageServerRemoved(LanguageServerId),
47
+ LanguageServerLog(LanguageServerId, LanguageServerLogType, String),
48
+ ...
49
+ LanguageServerBufferRegistered {
50
+ server_id: LanguageServerId,
51
+ ...
52
+ },
53
+ ToggleLspLogs {
54
+ server_id: LanguageServerId,
55
+ ...
56
+ },
57
+ Toast {
58
+ notification_id: SharedString,
59
+ ...
60
+ },
61
+ HideToast {
62
+ notification_id: SharedString,
63
+ },
64
+ LanguageServerPrompt(LanguageServerPromptRequest),
65
+ LanguageNotFound(Entity<Buffer>),
66
+ ActiveEntryChanged(Option<ProjectEntryId>),
67
+ ActivateProjectPanel,
68
+ WorktreeAdded(WorktreeId),
69
+ WorktreeOrderChanged,
70
+ WorktreeRemoved(WorktreeId),
71
+ WorktreeUpdatedEntries(WorktreeId, UpdatedEntriesSet),
72
+ DiskBasedDiagnosticsStarted {
73
+ language_server_id: LanguageServerId,
74
+ },
75
+ DiskBasedDiagnosticsFinished {
76
+ language_server_id: LanguageServerId,
77
+ },
78
+ DiagnosticsUpdated {
79
+ paths: Vec<ProjectPath>,
80
+ ...
81
+ },
82
+ RemoteIdChanged(Option<u64>),
83
+ DisconnectedFromHost,
84
+ DisconnectedFromRemote {
85
+ server_not_running: bool,
86
+ },
87
+ Closed,
88
+ DeletedEntry(WorktreeId, ProjectEntryId),
89
+ CollaboratorUpdated {
90
+ old_peer_id: proto::PeerId,
91
+ ...
92
+ },
93
+ CollaboratorJoined(proto::PeerId),
94
+ CollaboratorLeft(proto::PeerId),
95
+ HostReshared,
96
+ Reshared,
97
+ Rejoined,
98
+ RefreshInlayHints {
99
+ server_id: LanguageServerId,
100
+ ...
101
+ },
102
+ RefreshSemanticTokens {
103
+ server_id: LanguageServerId,
104
+ ...
105
+ },
106
+ RefreshCodeLens,
107
+ RevealInProjectPanel(ProjectEntryId),
108
+ SnippetEdit(BufferId, Vec<(lsp::Range, Snippet)>),
109
+ ExpandedAllForEntry(WorktreeId, ProjectEntryId),
110
+ EntryRenamed(ProjectTransaction, ProjectPath, PathBuf),
111
+ WorkspaceEditApplied(ProjectTransaction),
112
+ AgentLocationChanged,
113
+ BufferEdited,
114
+ }
115
+ ...
116
+ pub struct ProjectPath {
117
+ pub worktree_id: WorktreeId,
118
+ pub path: Arc<RelPath>,
119
+ }
120
+ ...
121
+ <filename>zed/crates/edit_prediction/src/edit_prediction.rs
122
+ pub struct EditPredictionJumpsFeatureFlag;
123
+ ...
124
+ pub struct EditPredictionStore {
125
+ client: Arc<Client>,
126
+ user_store: Entity<UserStore>,
127
+ llm_token: LlmApiToken,
128
+ _llm_token_subscription: Subscription,
129
+ projects: HashMap<EntityId, ProjectState>,
130
+ update_required: bool,
131
+ edit_prediction_model: EditPredictionModel,
132
+ zeta2_raw_config: Option<Zeta2RawConfig>,
133
+ pub sweep_ai: SweepAi,
134
+ pub mercury: Mercury,
135
+ data_collection_choice: DataCollectionChoice,
136
+ reject_predictions_tx: mpsc::UnboundedSender<EditPredictionRejection>,
137
+ shown_predictions: VecDeque<EditPrediction>,
138
+ rated_predictions: HashSet<EditPredictionId>,
139
+ }
140
+ ...
141
+ struct ProjectState {
142
+ events: VecDeque<StoredEvent>,
143
+ last_event: Option<LastEvent>,
144
+ recent_paths: VecDeque<ProjectPath>,
145
+ registered_buffers: HashMap<gpui::EntityId, RegisteredBuffer>,
146
+ current_prediction: Option<CurrentEditPrediction>,
147
+ next_pending_prediction_id: usize,
148
+ pending_predictions: ArrayVec<PendingPrediction, 2>,
149
+ debug_tx: Option<mpsc::UnboundedSender<DebugEvent>>,
150
+ last_edit_prediction_refresh: Option<(EntityId, Instant)>,
151
+ last_jump_prediction_refresh: Option<(EntityId, Instant)>,
152
+ cancelled_predictions: HashSet<usize>,
153
+ context: Entity<RelatedExcerptStore>,
154
+ license_detection_watchers: HashMap<WorktreeId, Rc<LicenseDetectionWatcher>>,
155
+ user_actions: VecDeque<UserActionRecord>,
156
+ _subscriptions: [gpui::Subscription; 2],
157
+ copilot: Option<Entity<Copilot>>,
158
+ }
159
+ ...
160
+ impl EditPredictionStore {
161
+ pub fn try_global(cx: &App) -> Option<Entity<Self>> {
162
+ ...
163
+ }
164
+
165
+ pub fn global(
166
+ client: &Arc<Client>,
167
+ user_store: &Entity<UserStore>,
168
+ cx: &mut App,
169
+ ) -> Entity<Self> {
170
+ ...
171
+ }
172
+
173
+ pub fn new(client: Arc<Client>, user_store: Entity<UserStore>, cx: &mut Context<Self>) -> Self {
174
+ ...
175
+ }
176
+
177
+ fn zeta2_raw_config_from_env() -> Option<Zeta2RawConfig> {
178
+ ...
179
+ }
180
+
181
+ pub fn set_edit_prediction_model(&mut self, model: EditPredictionModel) {
182
+ self.edit_prediction_model = model;
183
+ }
184
+
185
+ pub fn set_zeta2_raw_config(&mut self, config: Zeta2RawConfig) {
186
+ self.zeta2_raw_config = Some(config);
187
+ }
188
+
189
+ pub fn zeta2_raw_config(&self) -> Option<&Zeta2RawConfig> {
190
+ self.zeta2_raw_config.as_ref()
191
+ }
192
+
193
+ pub fn icons(&self, cx: &App) -> edit_prediction_types::EditPredictionIconSet {
194
+ ...
195
+ }
196
+
197
+ pub fn has_sweep_api_token(&self, cx: &App) -> bool {
198
+ self.sweep_ai.api_token.read(cx).has_key()
199
+ }
200
+
201
+ pub fn has_mercury_api_token(&self, cx: &App) -> bool {
202
+ self.mercury.api_token.read(cx).has_key()
203
+ }
204
+
205
+ pub fn clear_history(&mut self) {
206
+ ...
207
+ }
208
+
209
+ pub fn clear_history_for_project(&mut self, project: &Entity<Project>) {
210
+ ...
211
+ }
212
+
213
+ pub fn edit_history_for_project(
214
+ &self,
215
+ project: &Entity<Project>,
216
+ cx: &App,
217
+ ) -> Vec<StoredEvent> {
218
+ ...
219
+ }
220
+
221
+ pub fn context_for_project<'a>(
222
+ &'a self,
223
+ project: &Entity<Project>,
224
+ cx: &'a mut App,
225
+ ) -> Vec<RelatedFile> {
226
+ ...
227
+ }
228
+
229
+ pub fn copilot_for_project(&self, project: &Entity<Project>) -> Option<Entity<Copilot>> {
230
+ ...
231
+ }
232
+
233
+ pub fn start_copilot_for_project(
234
+ &mut self,
235
+ project: &Entity<Project>,
236
+ cx: &mut Context<Self>,
237
+ ) -> Option<Entity<Copilot>> {
238
+ ...
239
+ }
240
+
241
+ pub fn context_for_project_with_buffers<'a>(
242
+ &'a self,
243
+ project: &Entity<Project>,
244
+ cx: &'a mut App,
245
+ ) -> Vec<(RelatedFile, Entity<Buffer>)> {
246
+ ...
247
+ }
248
+
249
+ fn handle_project_event(
250
+ &mut self,
251
+ project: Entity<Project>,
252
+ event: &project::Event,
253
+ cx: &mut Context<Self>,
254
+ ) {
255
+ ...
256
+ let Some(project_state) = self.projects.get_mut(&project.entity_id()) else {
257
+ ...
258
+ if let Some(path) = path {
259
+ ...
260
+ }
261
+
262
+ fn register_buffer_impl<'a>(
263
+ project_state: &'a mut ProjectState,
264
+ buffer: &Entity<Buffer>,
265
+ project: &Entity<Project>,
266
+ cx: &mut Context<Self>,
267
+ ) -> &'a mut RegisteredBuffer {
268
+ ...
269
+ }
270
+
271
+ pub fn refresh_prediction_from_diagnostics(
272
+ &mut self,
273
+ project: Entity<Project>,
274
+ scope: DiagnosticSearchScope,
275
+ cx: &mut Context<Self>,
276
+ ) {
277
+ ...
278
+ }
279
+
280
+ fn predictions_enabled_at(
281
+ snapshot: &BufferSnapshot,
282
+ position: Option<language::Anchor>,
283
+ cx: &App,
284
+ ) -> bool {
285
+ ...
286
+ <filename>zed/crates/gpui/src/app/context.rs
287
+ pub struct Context<'a, T> {
288
+ app: &'a mut App,
289
+ entity_state: WeakEntity<T>,
290
+ }
291
+ ...
292
+ <filename>zed/crates/feature_flags/src/feature_flags.rs
293
+ impl FeatureFlagAppExt for App {
294
+ ...
295
+ fn has_flag<T: FeatureFlag>(&self) -> bool {
296
+ self.try_global::<FeatureFlags>()
297
+ .map(|flags| flags.has_flag::<T>())
298
+ .unwrap_or_else(|| {
299
+ (cfg!(debug_assertions) && T::enabled_for_staff() && !*ZED_DISABLE_STAFF)
300
+ || T::enabled_for_all()
301
+ })
302
+ }
303
+ ...
304
+ }
305
+ ...
306
+ <filename>zed/crates/gpui/src/app/entity_map.rs
307
+ pub struct Entity<T> {
308
+ #[deref]
309
+ #[deref_mut]
310
+ pub(crate) any_entity: AnyEntity,
311
+ pub(crate) entity_type: PhantomData<fn(T) -> T>,
312
+ }
313
+ ...
314
+
315
+ <filename>edit_history
316
+ --- a/zed/crates/edit_prediction/src/edit_prediction.rs
317
+ +++ b/zed/crates/edit_prediction/src/edit_prediction.rs
318
+ @@ -1035,7 +1035,7 @@
319
+ project_state.recent_paths.push_front(path);
320
+ }
321
+ }
322
+ - project::Event::DiagnosticsUpdated { .. } => {
323
+ + project::Event::Disk { .. } => {
324
+ if cx.has_flag::<EditPredictionJumpsFeatureFlag>() {
325
+ self.refresh_prediction_from_diagnostics(
326
+ project,
327
+
328
+ <filename>crates/edit_prediction/src/edit_prediction.rs
329
+ ) {
330
+ if !is_ep_store_provider(all_language_settings(None, cx).edit_predictions.provider) {
331
+ return;
332
+ }
333
+ // TODO [zeta2] init with recent paths
334
+ match event {
335
+ project::Event::ActiveEntryChanged(Some(active_entry_id)) => {
336
+ let Some(project_state) = self.projects.get_mut(&project.entity_id()) else {
337
+ <|marker_1|> return;
338
+ };
339
+ let path = project.read(cx).path_for_entry(*active_entry_id, cx);
340
+ if let Some(path) = path {
341
+ if let Some(ix) = project_state
342
+ .recent_paths
343
+ .iter()
344
+ .position(|probe| probe == &path)
345
+ {
346
+ project_state.recent_paths.remove(ix);
347
+ }
348
+ project_state.recent_paths.push_front(path);
349
+ }
350
+ }
351
+ project::Event::Disk<|user_cursor|> { .. } => {
352
+ if cx.has_flag::<EditPredictionJumpsFeatureFlag>() {
353
+ <|marker_2|> self.refresh_prediction_from_diagnostics(
354
+ project,
355
+ DiagnosticSearchScope::Global,
356
+ cx,
357
+ );
358
+ }
359
+ }
360
+ _ => (),
361
+ }
362
+ }
363
+
364
+ fn register_buffer_impl<'a>(
365
+ project_state: &'a mut ProjectState,
366
+ buffer: &Entity<Buffer>,
367
+ project: &Entity<Project>,<|marker_3|>
368
+ <[fim-middle]>