Oussema Harbi
Harbous
AI & ML interests
None yet
Recent Activity
reacted to philipp-zettl's post 3 days ago
I've been cooking something neat over the past weeks 👨‍🍳
We all know that training LLMs requires a lot of resources, especially a lot of compute in the form of GPUs, and it's super slow and inefficient when done on CPUs.
The big players use giant clusters of Nvidia H100s.
But if I look at the profiles of my fellow home brewers, all we can get our hands on are those pesky consumer RTXs. If you're lucky, you got yourself a 5080 with 16 GB VRAM or something.
To be frank, I don't have that 1.3k in disposable cash lying around ¯\_(ツ)_/¯
But I can write Rust, and I like building ML libraries.
So I asked myself the question(s):
- Can I train SLMs at home on my hardware?
- How hard can it be to build an ML library that can stream data between RAM and VRAM on demand, like llama.cpp's unified memory feature [^1]?
- How hard can it be to implement bf16 support?
The answers are wild, trust me!
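For context on what "bf16 support" involves at the bit level (this is not the post's actual code, just a sketch): bf16 is simply the top 16 bits of an IEEE 754 f32, so a conversion is a truncation with round-to-nearest-even.

```rust
/// Convert an f32 to bf16 (stored as a u16) by keeping the sign, the full
/// 8-bit exponent, and the top 7 mantissa bits, rounding to nearest-even.
/// (NaN payloads are not specially handled in this sketch.)
fn f32_to_bf16(x: f32) -> u16 {
    let bits = x.to_bits();
    // Rounding bias: 0x7FFF plus the lowest kept bit implements ties-to-even.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    ((bits + rounding_bias) >> 16) as u16
}

/// Widening back is free: a bf16 is just the high half of an f32.
fn bf16_to_f32(h: u16) -> f32 {
    f32::from_bits((h as u32) << 16)
}

fn main() {
    let x = 3.1415927_f32;
    let h = f32_to_bf16(x);
    let y = bf16_to_f32(h);
    println!("{x} -> {h:#06x} -> {y}");
    // bf16 keeps roughly 3 decimal digits of mantissa precision.
    assert!((x - y).abs() < 0.01);
}
```

The appeal for home training is that bf16 keeps the full f32 exponent range, so it halves memory traffic without the loss-scaling gymnastics fp16 needs.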
Image 1: Metrics from last night's build on my "tiny" RTX 2060 (6 GB VRAM)
Image 2: Metrics from my most recent build on my RTX 4070 Laptop (8 GB VRAM)
The majority of my time went into the shared memory, but it's stable and I'm very excited!
Here are some debug logs, à la "trust me bro":
```
----
Currently available: 1112735744, attempting to reclaim: 1073741824
--- VRAM STATE [backward pass] ---
Driver Used: 6744 MB / 7805 MB
Data on GPU: 1641 MB
Grads on GPU: 3459 MB
CPU Offloaded: 18230 MB
---------------------------------
Currently available: 1079181312, attempting to reclaim: 1073741824
--- VRAM STATE [backward pass] ---
Driver Used: 6776 MB / 7805 MB
Data on GPU: 1561 MB
Grads on GPU: 3279 MB
CPU Offloaded: 18590 MB
-----------------------------
```
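The "Currently available: …, attempting to reclaim: …" lines in the log suggest an eviction loop that spills tensors to host RAM when a VRAM allocation won't fit. A minimal sketch of what such a pool might look like (all names here — `VramPool`, `Block`, `reclaim` — are hypothetical, not the actual library's API, and the VRAM→RAM copy is stubbed out):

```rust
use std::collections::VecDeque;

/// A hypothetical tensor block tracked by the pool.
#[derive(Debug, Clone)]
struct Block { id: u32, bytes: usize }

/// Sketch of a VRAM pool that evicts least-recently-used blocks
/// to host RAM until a requested amount of memory is free.
struct VramPool {
    capacity: usize,
    used: usize,
    resident: VecDeque<Block>, // front = least recently used
    offloaded: Vec<Block>,     // blocks spilled to CPU RAM
}

impl VramPool {
    fn new(capacity: usize) -> Self {
        Self { capacity, used: 0, resident: VecDeque::new(), offloaded: Vec::new() }
    }

    fn available(&self) -> usize { self.capacity - self.used }

    /// Evict LRU blocks until at least `needed` bytes are available.
    /// Returns false if even a full eviction cannot satisfy the request.
    fn reclaim(&mut self, needed: usize) -> bool {
        if needed > self.capacity { return false; }
        while self.available() < needed {
            match self.resident.pop_front() {
                Some(block) => {
                    self.used -= block.bytes;
                    self.offloaded.push(block); // stand-in for a VRAM->RAM copy
                }
                None => return false,
            }
        }
        true
    }

    /// Allocate a block, reclaiming space first if necessary.
    fn alloc(&mut self, id: u32, bytes: usize) -> bool {
        if !self.reclaim(bytes) { return false; }
        self.used += bytes;
        self.resident.push_back(Block { id, bytes });
        true
    }
}

fn main() {
    // A 6 GiB card, like the RTX 2060 in the post.
    let gib = 1usize << 30;
    let mut pool = VramPool::new(6 * gib);
    for id in 0..5 {
        assert!(pool.alloc(id, gib)); // fills 5 GiB
    }
    // A 2 GiB request forces one LRU block out to host RAM.
    assert!(pool.alloc(99, 2 * gib));
    println!("resident: {}, offloaded: {}", pool.resident.len(), pool.offloaded.len());
}
```

A real implementation also has to overlap the host/device copies with compute (pinned memory plus async transfers), which is presumably where the bulk of that shared-memory work went.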
Final models get exported in `safetensors` format and are compatible with PyTorch and `transformers`, for accessibility.
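The `safetensors` container is simple enough that its layout fits in a short std-only sketch: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/byte offsets, then the raw tensor data. (`serialize_safetensors` is a hypothetical helper for illustration; a real exporter should use the `safetensors` crate.)

```rust
/// Build a minimal safetensors file in memory from (name, shape, f32 data)
/// triples: [u64 header length][JSON header][raw little-endian tensor bytes].
fn serialize_safetensors(tensors: &[(&str, Vec<usize>, Vec<f32>)]) -> Vec<u8> {
    let mut entries = Vec::new();
    let mut data = Vec::new();
    for (name, shape, values) in tensors {
        let start = data.len();
        for v in values {
            data.extend_from_slice(&v.to_le_bytes());
        }
        let dims: Vec<String> = shape.iter().map(|d| d.to_string()).collect();
        entries.push(format!(
            "\"{}\":{{\"dtype\":\"F32\",\"shape\":[{}],\"data_offsets\":[{},{}]}}",
            name, dims.join(","), start, data.len()
        ));
    }
    let header = format!("{{{}}}", entries.join(","));
    let mut out = Vec::new();
    out.extend_from_slice(&(header.len() as u64).to_le_bytes());
    out.extend_from_slice(header.as_bytes());
    out.extend_from_slice(&data);
    out
}

fn main() {
    let file = serialize_safetensors(&[("weight", vec![2, 2], vec![1.0, 2.0, 3.0, 4.0])]);
    let header_len = u64::from_le_bytes(file[..8].try_into().unwrap()) as usize;
    // Print the JSON header that transformers/PyTorch would parse.
    println!("{}", std::str::from_utf8(&file[8..8 + header_len]).unwrap());
}
```

Because the header is plain JSON and the data is raw contiguous bytes, anything that speaks the format — PyTorch, `transformers`, or a Rust trainer — can read the same file, which is the accessibility win.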
[^1]: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#unified-memory

reacted to RakshitAralimatti's post 2 months ago
Just built my entire AI Engineer portfolio by pasting 2 links (GitHub and LinkedIn) into Kimi 2.5 (https://huggingface.co/moonshotai).
That's it. That's the workflow.
Zero coding. Zero iteration. Zero "make the button bigger."
See for yourself: https://rakshit2020.github.io/rakshitaralimatti.github.io/
The model:
✅ Scraped my GitHub repos automatically
✅ Pulled my experience from LinkedIn
✅ Designed an Aurora Glass theme
✅ Mapped every skill to projects
✅ Added animations I'd never code myself
reacted to kanaria007's post 3 months ago
New Article: *Post-Transformer Decision Cores* (v0.1)
Title:
Post-Transformer Decision Cores: Goal-Native Engines Beyond LLMs
https://huggingface.co/blog/kanaria007/post-tranformer-decision-cores
---
Summary:
Transformers are powerful, but in SI-Core they're *not the essence of intelligence*. A *Decision Core* is anything that satisfies the *Jump contracts* (OBS/ETH/MEM/ID/EVAL + RML), and those contracts don't require next-token prediction.
This article sketches what "post-Transformer" looks like in practice: *goal-native, structure-aware controllers* that may use LLMs as tools, but don't depend on them as the runtime brain.
> Don't relax the contracts.
> Replace the engine behind them.
---
Why It Matters:
• Makes LLMs *optional*: shift them to "genesis / exploration / explanation," while routine high-stakes Jumps run on structured cores
• Improves boring-but-critical properties: *determinism (CAS), fewer inconsistencies (SCI), fewer ETH violations (EAI), better rollback (RBL/RIR)*
• Enables gradual adoption via *pluggable Jump engines* and domain-by-domain "primary vs fallback" switching
---
Whatâs Inside:
• The architectural inversion: *World → OBS → SIM/SIS → Jump (Decision Core) → RML → Effects* (the LLM is just one engine)
• Three compatible post-Transformer directions:
1. *World-model + search controllers* (MPC/MCTS/anytime search with explicit GCS + ETH constraints)
2. *Genius-distilled specialized controllers* (distill structure from GeniusTraces; the LLM becomes a "genesis tool")
3. *SIL-compiled Decision Programs* (typed Jump entrypoints, compiler-checked invariants, DPIR/GSPU targeting)
• A realistic migration path: LLM-wrapped → Genius library → shadow dual-run → flip primary by domain → SIL-compiled cores
• How this connects to "reproducing genius": GRP provides trace selection/format; this article provides the engine architectures
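The "pluggable Jump engines" and "primary vs fallback" ideas above can be sketched as a common trait with per-domain routing. Everything here is hypothetical scaffolding inferred from the summary (the article's actual SI-Core interfaces are not shown in this post): the structured core abstains when it lacks explicit structure, and the router falls back to an LLM-backed engine.

```rust
/// Hypothetical contract: any engine that can attempt a Jump for an
/// observation, returning None to abstain rather than guess.
trait DecisionCore {
    fn jump(&self, observation: &str) -> Option<String>;
}

/// Stand-in for a goal-native structured controller (world-model + search).
struct StructuredCore;
impl DecisionCore for StructuredCore {
    fn jump(&self, observation: &str) -> Option<String> {
        // Only handles observations it has explicit structure for.
        if observation.starts_with("known:") {
            Some(format!("plan for {}", &observation[6..]))
        } else {
            None // abstain instead of hallucinating a plan
        }
    }
}

/// Stand-in for an LLM-backed engine, demoted to fallback duty.
struct LlmCore;
impl DecisionCore for LlmCore {
    fn jump(&self, observation: &str) -> Option<String> {
        Some(format!("llm guess for {observation}"))
    }
}

/// Primary-vs-fallback routing: try the structured core first,
/// fall back to the LLM only when it abstains.
fn route(primary: &dyn DecisionCore, fallback: &dyn DecisionCore, obs: &str) -> String {
    primary.jump(obs).or_else(|| fallback.jump(obs)).unwrap_or_default()
}

fn main() {
    let (s, l) = (StructuredCore, LlmCore);
    println!("{}", route(&s, &l, "known:restock")); // handled by the structured core
    println!("{}", route(&s, &l, "novel:outage")); // falls back to the LLM
}
```

Flipping "primary" per domain is then just a matter of which engine is passed first, which matches the shadow-dual-run migration path described above.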
---
Structured Intelligence Engineering Series

Organizations
None yet