---
license: apache-2.0
library_name: transformers
tags:
- robotics
- haptics
- embeddings
- multimodal
- encoder
pipeline_tag: feature-extraction
---
# Motoko Embedding 1B
Motoko Embedding 1B is a foundation embedding model for haptic signal representation in robotics.
It encodes raw force, torque, pressure, and vibration signals into fixed-dimension vector embeddings for retrieval, similarity search, and cross-modal fusion.
## Model Summary
- Model type: Encoder-only Transformer
- Parameters: 1B
- Input: Force, torque, pressure, vibration sequences
- Output: Fixed-dimension embedding vectors
- License: Apache 2.0
## Intended Uses
- Semantic search over haptic datasets
- Cross-modal alignment with vision and language
- Haptic RAG pipelines for robotic agents
- Dataset indexing and similarity clustering
- Downstream fine-tuning with LoRA adapters
## Architecture
Motoko Embedding 1B uses a signal-aware preprocessing stack followed by an encoder-only Transformer.
Multichannel sensor streams are windowed, normalized, projected into token embeddings, and aggregated into a single fixed-size embedding representation.
Key design points:
- Temporal patching over multiaxis haptic sequences
- Rotary position embeddings for long-context signal modeling
- Mean pooling over the final hidden states for embedding extraction
- Optional projection head for cross-modal alignment
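
The masked mean-pooling step above can be sketched in NumPy. This is a simplified illustration, not the model's internal code; the batch, sequence, and hidden dimensions are made up for the example:

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average final hidden states over valid (unmasked) timesteps.

    hidden_states: (batch, seq_len, hidden_dim)
    attention_mask: (batch, seq_len), 1 for real positions, 0 for padding
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)  # (batch, seq_len, 1)
    summed = (hidden_states * mask).sum(axis=1)                   # (batch, hidden_dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                # avoid divide-by-zero
    return summed / counts

# Toy check: pooling ignores padded positions entirely
h = np.ones((1, 4, 8), dtype="float32")
h[:, 2:] = 100.0                   # junk values in the padded region
m = np.array([[1, 1, 0, 0]])
emb = mean_pool(h, m)              # averages only the first two timesteps
```

Masking before averaging matters for variable-length haptic sequences: without it, padding would dilute the embedding.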
## Input Format
The model expects synchronized haptic sequences containing one or more of the following modalities:
- Force
- Torque
- Pressure
- Vibration
Default sensor assumptions are defined in [`configs/sensor_config.yaml`](./configs/sensor_config.yaml).
Signal normalization and windowing parameters are defined in [`preprocessor/preprocessor_config.json`](./preprocessor/preprocessor_config.json).
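
As a rough illustration of the normalization and windowing described above (the real parameters live in the preprocessor config; the window size, hop size, and channel count here are invented defaults):

```python
import numpy as np

def zscore(signal: np.ndarray) -> np.ndarray:
    """Per-channel z-score normalization of a (timesteps, channels) stream."""
    mean = signal.mean(axis=0, keepdims=True)
    std = signal.std(axis=0, keepdims=True) + 1e-8  # guard against flat channels
    return (signal - mean) / std

def window_signal(signal: np.ndarray, window: int = 256, hop: int = 128) -> np.ndarray:
    """Split a (timesteps, channels) stream into overlapping temporal patches.

    Returns an array of shape (num_windows, window, channels).
    """
    starts = range(0, signal.shape[0] - window + 1, hop)
    return np.stack([signal[s : s + window] for s in starts])

# Synthetic stream: 1024 timesteps across 12 synchronized sensor axes
stream = np.random.randn(1024, 12).astype("float32")
patches = window_signal(zscore(stream))
print(patches.shape)  # (7, 256, 12): (1024 - 256) / 128 + 1 = 7 windows
```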
## Repository Layout
```text
.
β”œβ”€β”€ README.md
β”œβ”€β”€ config.json
β”œβ”€β”€ tokenizer_config.json
β”œβ”€β”€ tokenizer.json
β”œβ”€β”€ model/
β”‚   β”œβ”€β”€ model.safetensors
β”‚   └── model.safetensors.index.json
β”œβ”€β”€ preprocessor/
β”‚   β”œβ”€β”€ preprocessor_config.json
β”‚   └── feature_extractor.py
β”œβ”€β”€ configs/
β”‚   β”œβ”€β”€ training_config.yaml
β”‚   └── sensor_config.yaml
β”œβ”€β”€ examples/
β”‚   β”œβ”€β”€ inference.py
β”‚   β”œβ”€β”€ embedding_search.py
β”‚   └── cross_modal.py
└── .gitattributes
```
## Key Files
| File | Purpose |
| --- | --- |
| `config.json` | Encoder architecture: layers, heads, hidden size, projection dimensions |
| `configs/sensor_config.yaml` | Sensor input specs: axes, sequence length, sampling rate |
| `preprocessor/preprocessor_config.json` | Signal normalization, windowing, padding behavior |
| `preprocessor/feature_extractor.py` | Converts raw haptic arrays into encoder-ready tensors |
| `examples/embedding_search.py` | Vector similarity search over haptic embeddings |
| `examples/cross_modal.py` | Aligns haptic embeddings with vision or language vectors |
## Usage
### Load the processor
```python
from preprocessor.feature_extractor import HapticFeatureExtractor
extractor = HapticFeatureExtractor.from_pretrained(".")
```
### Preprocess a haptic sample
```python
import numpy as np
from preprocessor.feature_extractor import HapticFeatureExtractor

extractor = HapticFeatureExtractor.from_pretrained(".")

# Synthetic sample: 1024 timesteps across 12 sensor channels
sample = np.random.randn(1024, 12).astype("float32")
features = extractor(sample)

print(features["input_values"].shape)
print(features["attention_mask"].shape)
```
See [`examples/inference.py`](./examples/inference.py) for a complete example.
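A minimal similarity search over precomputed embeddings might look like the sketch below. This is an illustration of the idea, not the actual `examples/embedding_search.py`; the 768-dimensional embeddings and the index size are assumptions:

```python
import numpy as np

def cosine_search(query: np.ndarray, index: np.ndarray, top_k: int = 3) -> np.ndarray:
    """Return indices of the top_k rows of `index` most similar to `query`."""
    q = query / np.linalg.norm(query)
    idx = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = idx @ q                 # cosine similarity against every stored embedding
    return np.argsort(-scores)[:top_k]

rng = np.random.default_rng(0)
index = rng.standard_normal((100, 768)).astype("float32")  # 100 stored haptic embeddings
query = index[42] + 0.01 * rng.standard_normal(768).astype("float32")
hits = cosine_search(query, index)
print(hits[0])  # 42: the lightly perturbed source embedding is the nearest neighbor
```

For large indexes, the brute-force matrix product would typically be replaced by an approximate nearest-neighbor library.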
## Training
Baseline training parameters are provided in [`configs/training_config.yaml`](./configs/training_config.yaml).
These values are intended as a starting point for pretraining or continued domain adaptation, not as a claim of the exact recipe used for a released checkpoint.
## Limitations
- Performance depends heavily on sensor calibration and synchronization quality.
- Out-of-distribution hardware setups may require updated preprocessing statistics.
- Cross-modal alignment quality depends on the paired supervision used during training.
- This repository scaffold does not include production weights.
## Citation
```bibtex
@misc{motoko_embedding_1b,
  title        = {Motoko Embedding 1B},
  author       = {Motoko},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/}}
}
```