TerraMind-1.0-Tokenizer-S1GRD
This repository provides the S1GRD tokenizer checkpoint from TerraMind 1.0.
Model details
- Modality: S1GRD
- Input channels: 2
- Image size: 256x256
- Tokenizer backbone: vit_b_enc
- Quantization: FSQ(codebook_size=8-8-8-6-5, latent_dim=5)
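The FSQ setting above implies a fixed, implicit codebook. As a minimal sketch, assuming the standard finite scalar quantization scheme in which each of the 5 latent dimensions is rounded to one of the listed levels, the vocabulary size is the product of those levels. The helper names below are illustrative and not part of this repository, and the mixed-radix index layout is one common convention that may differ from the ordering this checkpoint uses.

```python
from math import prod

# Levels per latent dimension, taken from codebook_size=8-8-8-6-5.
LEVELS = [8, 8, 8, 6, 5]


def codebook_size(levels):
    """Number of distinct codes: the product of the per-dimension levels."""
    return prod(levels)


def code_to_index(digits, levels):
    """Map per-dimension level indices to one token id (mixed radix).

    Assumption: the first dimension is the least-significant digit.
    """
    index, base = 0, 1
    for digit, n_levels in zip(digits, levels):
        if not 0 <= digit < n_levels:
            raise ValueError(f"digit {digit} out of range for {n_levels} levels")
        index += digit * base
        base *= n_levels
    return index


def index_to_code(index, levels):
    """Inverse of code_to_index under the same digit-order assumption."""
    digits = []
    for n_levels in levels:
        digits.append(index % n_levels)
        index //= n_levels
    return digits


print(codebook_size(LEVELS))                   # 15360
print(code_to_index([7, 7, 7, 5, 4], LEVELS))  # 15359
```

Under these assumptions, token ids fall in the range 0 to 15359.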
Quick use (diffusers-style)
The tokenizer uses native diffusers patterns: ModelMixin, ConfigMixin, from_pretrained, and from_config.
from huggingface_hub import snapshot_download
import torch
import sys
# Download model repository
model_dir = snapshot_download("BiliSakura/TerraMind-1.0-Tokenizer-S1GRD")
# Expose local module, then load (diffusers-style)
sys.path.insert(0, model_dir)
from terramind_tokenizer import TerraMindTokenizer
# Load from path or Hub ID
tokenizer = TerraMindTokenizer.from_pretrained(
    model_dir,  # or "BiliSakura/TerraMind-1.0-Tokenizer-S1GRD"
    torch_dtype=torch.float32,
    device="cpu",
)
# S1GRD input: [B, 2, 256, 256]
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
print(tokens.shape) # [1, 16, 16]
# Encode returns (quant, code_loss, tokens)
quant, code_loss, tokens = tokenizer.encode(x)
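The shapes in the snippet above imply simple bookkeeping: a 256x256 input mapped to a 16x16 token grid means each token covers a 16x16 pixel patch. The sketch below only does that arithmetic in plain Python. The patch size is inferred from the shapes quoted above rather than read from the model config, and `token_grid` is an illustrative helper, not part of the tokenizer API.

```python
def token_grid(image_size, grid_size):
    """Return (patch_size, tokens_per_image) for a square image and token grid."""
    if image_size % grid_size != 0:
        raise ValueError("image size must be divisible by the token grid size")
    patch_size = image_size // grid_size
    return patch_size, grid_size * grid_size


patch_size, tokens_per_image = token_grid(256, 16)
print(patch_size)        # 16
print(tokens_per_image)  # 256

# Flatten a [16, 16] grid of token ids into a row-major sequence of length 256.
# The ids here are dummy values standing in for real tokenizer output.
grid = [[row * 16 + col for col in range(16)] for row in range(16)]
sequence = [tok for row in grid for tok in row]
print(len(sequence))  # 256
```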
Load via AutoModel or TerraMindTokenizer (trust_remote_code)
You can load the tokenizer through diffusers AutoModel with trust_remote_code=True, or through the TerraMindTokenizer class directly:
from diffusers import AutoModel
import torch
# Option 1: AutoModel (auto-detects from config)
tokenizer = AutoModel.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1GRD",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device="cpu",
)
# Option 2: TerraMindTokenizer (explicit class)
# Requires terramind_tokenizer to be importable (see the sys.path step in Quick use)
from terramind_tokenizer import TerraMindTokenizer
tokenizer = TerraMindTokenizer.from_pretrained(
    "BiliSakura/TerraMind-1.0-Tokenizer-S1GRD",
    torch_dtype=torch.float32,
    device="cpu",
)
# Same API: tokenize(), encode()
x = torch.randn(1, 2, 256, 256)
tokens = tokenizer.tokenize(x)
Security:
trust_remote_code=True runs code from the Hub. Only use it with repositories you trust. For production, pin a specific revision, for example revision="abc123def456" (a placeholder; use the commit hash after your changes).
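One way to enforce pinning is to reject anything that is not a full commit hash before loading. The sketch below is a minimal, hypothetical helper (`pinned_kwargs` is not part of diffusers or this repository). It assumes you want to pin 40-character hexadecimal Git commit hashes, and the hash shown is a made-up placeholder.

```python
import re

# A full Git commit hash: exactly 40 lowercase hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_kwargs(revision):
    """Return loading kwargs only if `revision` is a full commit hash."""
    if not _FULL_SHA.fullmatch(revision):
        raise ValueError("pin a full 40-character commit hash, not a branch or tag")
    return {"revision": revision, "trust_remote_code": True}


# Placeholder hash for illustration only; replace it with a commit you have reviewed.
kwargs = pinned_kwargs("0123456789abcdef0123456789abcdef01234567")
print(kwargs["revision"])

# Intended use (not executed here):
# AutoModel.from_pretrained("BiliSakura/TerraMind-1.0-Tokenizer-S1GRD", **kwargs)
```

Rejecting branch names such as main means a later push to the repository cannot silently change the code that gets executed.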
Notes
- Uses diffusers ModelMixin and ConfigMixin for standard from_pretrained / from_config / save_pretrained.
- Supports both model.safetensors and diffusion_pytorch_model.safetensors weights.
- Tokenizer-focused API: tokenize(), encode(), and forward().
- Please follow TerraMind and TerraTorch licenses/usage terms from the upstream project.
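To check which of the two weight filenames a downloaded snapshot contains, a small helper like the one below can be used. It is only a sketch: `find_weights` is a hypothetical name, and the preference order between the two filenames is an assumption rather than the loader's documented behavior.

```python
from pathlib import Path

# The two weight filenames mentioned in the notes above.
WEIGHT_NAMES = ("model.safetensors", "diffusion_pytorch_model.safetensors")


def find_weights(model_dir):
    """Return the first known weight file found in model_dir, or None."""
    for name in WEIGHT_NAMES:
        candidate = Path(model_dir) / name
        if candidate.is_file():
            return candidate
    return None


# Example: inspect the current directory (prints None if neither file is present).
print(find_weights("."))
```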
Credits
- Original TerraMind project: https://github.com/IBM/terramind
- Original TerraMind model code (TerraTorch): https://github.com/terrastackai/terratorch/tree/main/terratorch/models/backbones/terramind
- This repository adapts tokenizer checkpoints for convenient Hugging Face usage.
Citation
If you use TerraMind in your research, please cite:
@article{jakubik2025terramind,
  title={TerraMind: Large-Scale Generative Multimodality for Earth Observation},
  author={Jakubik, Johannes and Yang, Felix and Blumenstiel, Benedikt and Scheurer, Erik and Sedona, Rocco and Maurogiovanni, Stefano and Bosmans, Jente and Dionelis, Nikolaos and Marsocci, Valerio and Kopp, Niklas and others},
  journal={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}