Artem Darius Weber

milkomeda22

AI & ML interests

None yet

Recent Activity

liked a model about 10 hours ago

microsoft/MediPhi

liked a model 3 days ago

Jackrong/Gemopus-4-26B-A4B-it-GGUF

reacted to SeaWolf-AI's post with 👀 5 days ago

Why This Matters: David Defeats Goliath

MODEL: https://huggingface.co/FINAL-Bench/Darwin-4B-David

SPACE: https://huggingface.co/spaces/FINAL-Bench/Darwin-4B-david

We're releasing Darwin-4B-David, the first second-generation model in the Darwin Opus family. By evolving an already-evolved model, it achieves 85.0% on GPQA Diamond, surpassing its 58.6% original ancestor and even gemma-4-31B (84.3%), with just 4.5B parameters.

Second-Generation Evolution

Most merges start from a base model and produce a single offspring. Darwin-4B-David breaks this pattern. The Father (Darwin-4B-Opus) was already evolved from gemma-4-E4B-it with Claude Opus reasoning distillation, making it a Gen-1 model. The Mother (DavidAU's DECKARD-Expresso-Universe) brings Unsloth deep tuning across 5 in-house datasets with thinking mode by default. Crossbreeding these two produced the first Gen-2 Darwin model.

Darwin V6's Model MRI scanned both parents across all 42 layers, assigning independent optimal ratios per layer. The Mother's creativity and Korean language hotspot (layers 22-25, weight 0.95) was maximally absorbed, while the Father's reasoning core (layers 30-40, weight 0.48) was preserved. This is "Merge = Evolve" applied recursively: evolution of evolution.

Benchmarks

Darwin-4B-David scores 85.0% on GPQA Diamond (+26.4 percentage points over the original 58.6%), evaluated generatively with maj@8 (8 generations per question, majority vote), Epoch AI prompt format, thinking mode enabled, 50 sampled questions. On ARC-Challenge (25-shot, loglikelihood), both score 64.93%, which is expected, as loglikelihood doesn't capture thinking-mode reasoning differences.

Why This Matters

gemma-4-31B (30.7B) scores 84.3%. Darwin-4B-David surpasses it at 1/7th the size: no training, no RL, just 45 minutes of MRI-guided DARE-TIES on one H100. The name "David" honors Mother creator DavidAU and evokes David vs. Goliath.
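The post above reports its GPQA Diamond score with maj@8, meaning 8 generations per question with a majority vote over the answers. A minimal sketch of that scoring rule follows. It is not the evaluation code used for the post; the function names `majority_vote` and `maj_at_k_accuracy`, the tie-breaking rule, and the sample answers are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among the sampled generations.

    Assumption: ties are broken in favour of the answer seen first,
    which is how Counter.most_common orders equal counts.
    """
    return Counter(answers).most_common(1)[0][0]


def maj_at_k_accuracy(samples_per_question, gold_answers):
    """Fraction of questions whose majority-voted answer matches the gold answer."""
    correct = sum(
        majority_vote(samples) == gold
        for samples, gold in zip(samples_per_question, gold_answers)
    )
    return correct / len(gold_answers)


# Hypothetical data: two multiple-choice questions, 8 sampled answers each.
samples = [
    ["A", "A", "B", "A", "C", "A", "A", "D"],  # majority is "A"
    ["B", "C", "C", "B", "C", "D", "C", "C"],  # majority is "C"
]
gold = ["A", "B"]

print(maj_at_k_accuracy(samples, gold))  # first question right, second wrong
```

With only 50 sampled questions, as in the post, each question moves the reported accuracy by 2 percentage points, so small differences between models at this scale are worth reading with that granularity in mind.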

upvoted a collection 12 days ago

GUI-Owl-1.5

GUI-Owl-1.5 • 6 items • Updated Mar 6 • 9

upvoted a collection about 1 month ago

Qwen3-VL-Reranker

2 items • Updated Jan 8 • 42

upvoted a collection 2 months ago

InternVL3.5

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 107

upvoted a paper 5 months ago

SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking

Paper • 2511.16618 • Published Nov 20, 2025 • 9

upvoted an article 5 months ago

Design Patterns for Building Agentic Workflows

Article • Jul 14, 2025 • 9

upvoted a collection 8 months ago

tiny ramdom models

96 items • Updated 6 days ago • 8

upvoted 6 papers 9 months ago

Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15, 2025 • 15

Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory

Paper • 2507.16713 • Published Jul 22, 2025 • 21

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

Paper • 2507.16815 • Published Jul 22, 2025 • 42

Pixels, Patterns, but No Poetry: To See The World like Humans

Paper • 2507.16863 • Published Jul 21, 2025 • 69

∇NABLA: Neighborhood Adaptive Block-Level Attention

Paper • 2507.13546 • Published Jul 17, 2025 • 126

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 161

upvoted a collection about 1 year ago

Llama 4

Llama 4 release • 13 items • Updated Apr 29, 2025 • 727

upvoted a paper about 1 year ago

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17, 2025 • 20

upvoted an article about 1 year ago

GaLore: Advancing Large Model Training on Consumer-grade Hardware

Article • Mar 20, 2024 • 32

upvoted 5 papers over 2 years ago

FLM-101B: An Open LLM and How to Train It with $100K Budget

Paper • 2309.03852 • Published Sep 7, 2023 • 45

CityDreamer: Compositional Generative Model of Unbounded 3D Cities

Paper • 2309.00610 • Published Sep 1, 2023 • 21

Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior

Paper • 2309.00359 • Published Sep 1, 2023 • 23

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Paper • 2309.00267 • Published Sep 1, 2023 • 53

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

Paper • 2309.00986 • Published Sep 2, 2023 • 22