Do y'all plan to release the 200B Pro Model too?


That would shake the AI industry to its core.

Training a transformer takes about 6PT FLOPs (P parameters × T tokens). Assuming that rule applies here, and each 16×16-pixel patch = 1 token, a 2,048 × 2,048 image comes out to (2048/16)² = 128² = 16,384 tokens per input image during training. Now assume you need roughly one training example per parameter (a rough rule of thumb, with exceptions like classifiers): 200,000,000,000 examples × 16,384 tokens/example = 3.2768e+15 tokens.
That's 3,276,800,000,000,000 tokens. The compute would be 6 × 200,000,000,000 × 3,276,800,000,000,000:

3,932,160,000,000,000,000,000,000,000 FLOPs...
3,932,160,000,000,000,000,000,000 kFLOPs
3,932,160,000,000,000,000,000 MFLOPs
3,932,160,000,000,000,000 GFLOPs
3,932,160,000,000,000 TFLOPs
3,932,160,000,000 PFLOPs
3,932,160,000 EFLOPs
3,932,160 ZFLOPs...
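
If you want to sanity-check that arithmetic, here's a minimal sketch. The 2048×2048 patching and one-example-per-parameter assumptions are from the estimate above, not anything HiDream has published:

```python
# Back-of-envelope training compute for a 200B pixel-level transformer.
# Assumptions (from the estimate above, not official numbers): 2048x2048
# images, 16x16 patches, ~1 training example per parameter, C = 6*P*T FLOPs.

params = 200e9                              # P: 200B parameters
tokens_per_image = (2048 // 16) ** 2        # 128^2 = 16,384 patch tokens/image
dataset_tokens = params * tokens_per_image  # T = 3.2768e15 tokens

train_flops = 6 * params * dataset_tokens   # C = 6*P*T = 3.93216e27 FLOPs
print(f"tokens/image:   {tokens_per_image:,}")
print(f"dataset tokens: {dataset_tokens:.4e}")
print(f"training FLOPs: {train_flops:.5e}")
```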

Oracle's GPU cluster (65,000 H200s) is only ~260 EFLOPS...
That's ~20,164,923 seconds to train (assuming 75% utilization), about 233 days.
It would cost ~$1.8B to rent at $5/hr/GPU.
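
And the time/cost step, under the same assumptions (260 EFLOPS cluster, 75% utilization, $5/hr/GPU, all taken from the figures above):

```python
# Training time and rental cost on a ~260 EFLOPS cluster of ~65,000 H200s,
# using the utilization and price assumptions from the post above.

train_flops = 3.93216e27            # C from the estimate above
cluster_flops = 260e18              # 260 EFLOPS peak throughput
utilization = 0.75
gpus, usd_per_gpu_hr = 65_000, 5.0

seconds = train_flops / (cluster_flops * utilization)  # ~2.0165e7 s
days = seconds / 86_400                                # ~233 days
cost = gpus * usd_per_gpu_hr * (seconds / 3_600)       # ~$1.8B

print(f"{seconds:,.0f} s = {days:.0f} days, rent = ${cost / 1e9:.2f}B")
```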

And as evidence that it's a transformer, from the model card:

"HiDream-O1-Image is a natively unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) without external VAEs or disjoint text encoders, which natively encodes raw pixels, text, and task-specific conditions in a single shared token space β€” supporting text-to-image, image editing, and subject-driven personalization at up to 2,048 Γ— 2,048."

Correct me if I'm wrong.

Nice girl math x wall of text, but in the README they clearly mention the 200B model and benchmarks for it.

I skimmed it... I just looked at it very briefly. Also, if they did have a 200B model, why would they open-release it? That's probably to build hype before they release a paid version.

"if they did have a 200B model, why would they open-release it"

because communism
