tritesh committed · Commit 7aca493 · verified · 1 parent: 0433390

Update ML Intern artifact metadata

Files changed (1): README.md (+24, −0)
README.md CHANGED
@@ -1,3 +1,7 @@
+---
+tags:
+- ml-intern
+---
 # DFlash-MLX-M2ProMax-96GB: Block Diffusion Speculative Decoding for MLX on Apple Silicon
 
 > **Tested on M2 Pro Max (96GB Unified Memory)** — Apple Silicon optimized implementation of DFlash speculative decoding for MLX.
@@ -345,3 +349,23 @@ MIT License — same as the original DFlash project.
 **Get 6× faster LLM inference on your M2 Pro Max (96GB) today!** 🚀
 
 > *Tested on M2 Pro Max, 38 GPU cores, 96GB unified memory, macOS 15+.*
+
+<!-- ml-intern-provenance -->
+## Generated by ML Intern
+
+This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
+
+- Try ML Intern: https://smolagents-ml-intern.hf.space
+- Source code: https://github.com/huggingface/ml-intern
+
+## Usage
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = 'tritesh/dflash-mlx-universal'
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id)
+```
+
+For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.
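
The last line of the added README suggests swapping in the generic `AutoModel` class for non-causal checkpoints. A minimal sketch of that pattern, using a tiny random encoder-only test checkpoint (`hf-internal-testing/tiny-random-bert`, an assumption chosen only as a stand-in; this repository's actual model id may not be non-causal):

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical stand-in for a non-causal (encoder-only) checkpoint;
# substitute the real repository id of your non-causal model.
model_id = 'hf-internal-testing/tiny-random-bert'

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)  # generic class, no LM head

# Encoder models return hidden states rather than next-token logits.
inputs = tokenizer('hello world', return_tensors='pt')
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```

`AutoModel.from_pretrained` resolves the concrete architecture from the checkpoint's `config.json`, so the same two lines work across encoder families without naming the class explicitly.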