Llama-3.3 models blocked
Hi, I work at OCI and don't really know who to get in touch with. I do a lot of model evaluation for internal and external users at Oracle Cloud Infrastructure. I've been granted access to previous iterations of the Llama models (3, 3.1, and 3.2), but for some reason I was blocked from this one, and now I don't know how to "try again". Any help, or rationale as to why I was blocked, would be appreciated.
Same issue here!
Same issue here. Any idea what to do now?
No clue, and what's even stranger is that they granted this same account access to the Llama 4 models, which I requested after this. This is the only generation of the Llama models I've been denied. Super weird.
If you're looking for an easy way to access this model via API, you can use Crazyrouter — it provides an OpenAI-compatible endpoint for 600+ models including this one. Just pip install openai and change the base URL.
The blocking of Llama-3.3-70B-Instruct is worth unpacking because the reasons tend to vary significantly depending on whether it's a gating issue, a regional restriction, or an org-level policy on HuggingFace's end. If you're hitting a 403 when trying to pull weights, the first thing to check is whether you've accepted the Meta license agreement on the model card — this repo requires explicit consent before the download token will work, and it's easy to miss if you're pulling programmatically via the Hub API rather than through the web UI.
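If you're debugging this programmatically, the status code you get back is the quickest way to tell these cases apart. Here's a minimal sketch of a diagnostic helper (the function name and messages are my own, not part of `huggingface_hub`; the status-code meanings follow standard HTTP semantics as they commonly surface for gated Hub repos):

```python
# Hypothetical helper: map the HTTP status code from a gated Hugging Face
# repo download to its most likely cause. The repo URL is just the model
# card from this thread, used for the error message.

REPO_URL = "https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct"

def diagnose_gated_download(status_code: int) -> str:
    """Return a human-readable guess at why a gated download failed."""
    if status_code == 401:
        # No token at all, or a token the Hub does not recognize.
        return "unauthenticated: missing or invalid token (set HF_TOKEN or pass a token)"
    if status_code == 403:
        # Valid token, but the account has not accepted the license on the
        # model card, or access was explicitly denied.
        return f"forbidden: accept the license at {REPO_URL}, or access was denied"
    if status_code == 404:
        # Gated repos can also surface as 404 to accounts without access.
        return "not found: check the repo id; gated repos may 404 without access"
    if status_code == 200:
        return "ok"
    return f"unexpected status {status_code}"

print(diagnose_gated_download(403))
```

The useful distinction in practice: a 401 means fix authentication itself, while a 403 usually means the token is fine but the license acceptance (or an explicit denial, as in this thread) is the blocker.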
For teams running this model in agentic pipelines specifically, there's a secondary layer of friction that often gets overlooked: when multiple agents or services are making authenticated requests to gated repos, token management becomes a real problem. A single user token propagated across agents means you lose auditability of which component actually pulled what, and if that token gets rotated or the user's access is revoked, everything breaks silently. This is something we've been dealing with directly at AgentGraph — when you have agents acting on behalf of users or other agents, you need identity that's scoped and verifiable at the agent level, not just the human level. The recent work around autonomous agent purchasing via open protocols (like what's been demoed with UCP) makes this even more pressing, since agents are increasingly initiating resource access independently.
Could you share more about where exactly the block is occurring — is it at the HTTP layer, the huggingface_hub client, or somewhere in an inference stack like vLLM or TGI? That would help narrow down whether this is an auth/token scope issue, a model card acceptance problem, or something specific to how Llama-3.3-70B-Instruct's gating is configured compared to earlier Llama-3 variants.
Without seeing the full thread, I'll assume this is about access restrictions or gating on Llama-3.3-70B-Instruct — which is a real friction point worth addressing properly.
The blocking behavior on Llama-3.3-70B-Instruct is almost certainly tied to Meta's gated access policy requiring explicit license agreement on the model page before the Hub API will serve weights. This is different from a hard takedown — the model is still available, but your token needs to be associated with an account that has accepted the terms at meta-llama/Llama-3.3-70B-Instruct. If you're hitting 401s or 403s in automated pipelines, the common failure mode is that the token was generated before the license was accepted, or you're using an org token where the accepting user doesn't have the right scopes propagated. Re-accepting the license and regenerating a fine-grained token with read permissions on gated repos usually resolves it.
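A cheap way to verify whether a regenerated token actually has gated access, before wiring it into a pipeline, is to issue a HEAD request against the Hub's resolve endpoint for a small file in the repo. A sketch using only the standard library (the filename and token are placeholders; a 200 means the token can see the gated repo, a 401/403 means re-accept the license and/or regenerate the token):

```python
import urllib.request

# Repo from this thread; config.json is a small file assumed to exist there.
REPO = "meta-llama/Llama-3.3-70B-Instruct"
FILENAME = "config.json"

def build_check_request(token: str) -> urllib.request.Request:
    """Build a HEAD request against the Hub resolve endpoint.

    A 200 response means the token can actually see the gated weights;
    401/403 means authentication or license acceptance is the problem.
    """
    url = f"https://huggingface.co/{REPO}/resolve/main/{FILENAME}"
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )

# To actually run the check (makes a network call):
# try:
#     resp = urllib.request.urlopen(build_check_request("hf_your_token_here"))
#     print(resp.status)   # 200 -> token has gated access
# except urllib.error.HTTPError as e:
#     print(e.code)        # 401/403 -> fix the token / accept the license
```

Running this once per credential at deploy time catches the "token generated before the license was accepted" failure mode before it silently breaks an automated pipeline.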
One thing worth flagging for teams running this model in agentic pipelines: if you're orchestrating multiple agents that call Llama-3.3-70B-Instruct as a backbone, token management becomes a real operational problem at scale. We ran into this building AgentGraph: when agents are spawning sub-agents or delegating tasks, you need a clear model for which identity is making the Hub request and whether that identity has the right entitlements. The UCP protocol work being discussed on HN right now (autonomous agent purchases via open protocol) makes this even more pressing, since you can't assume a human is in the loop to re-authenticate. Worth thinking about credential propagation and trust scope before you hit this in production rather than after.
