---
license: apache-2.0
library_name: transformers
tags:
- robotics
- haptics
- embeddings
- multimodal
- encoder
pipeline_tag: feature-extraction
---

# Motoko Embedding 1B

Motoko Embedding 1B is a foundation embedding model for haptic signal representation in robotics. It encodes raw force, torque, pressure, and vibration signals into fixed-dimension vector embeddings for retrieval, search, and cross-modal fusion.

## Model Summary

- Model type: Encoder-only Transformer
- Parameters: 1B
- Input: Force, torque, pressure, vibration sequences
- Output: Fixed-dimension embedding vectors
- License: Apache 2.0

## Intended Uses

- Semantic search over haptic datasets
- Cross-modal alignment with vision and language
- Haptic RAG pipelines for robotic agents
- Dataset indexing and similarity clustering
- Downstream fine-tuning with LoRA adapters

## Architecture

Motoko Embedding 1B uses a signal-aware preprocessing stack followed by an encoder-only Transformer. Multichannel sensor streams are windowed, normalized, projected into token embeddings, and aggregated into a single fixed-size embedding.

Key design points:

- Temporal patching over multiaxis haptic sequences
- Rotary position embeddings for long-context signal modeling
- Mean pooling over the final hidden states for embedding extraction
- Optional projection head for cross-modal alignment

## Input Format

The model expects synchronized haptic sequences containing one or more of the following modalities:

- Force
- Torque
- Pressure
- Vibration

Default sensor assumptions are defined in [`configs/sensor_config.yaml`](./configs/sensor_config.yaml). Signal normalization and windowing parameters are defined in [`preprocessor/preprocessor_config.json`](./preprocessor/preprocessor_config.json).

## Repository Layout

```text
.
├── README.md
├── config.json
├── tokenizer_config.json
├── tokenizer.json
├── model/
│   ├── model.safetensors
│   └── model.safetensors.index.json
├── preprocessor/
│   ├── preprocessor_config.json
│   └── feature_extractor.py
├── configs/
│   ├── training_config.yaml
│   └── sensor_config.yaml
├── examples/
│   ├── inference.py
│   ├── embedding_search.py
│   └── cross_modal.py
└── .gitattributes
```

## Key Files

| File | Purpose |
| --- | --- |
| `config.json` | Encoder architecture: layers, heads, hidden size, projection dimensions |
| `configs/sensor_config.yaml` | Sensor input specs: axes, sequence length, sampling rate |
| `preprocessor/preprocessor_config.json` | Signal normalization, windowing, padding behavior |
| `preprocessor/feature_extractor.py` | Converts raw haptic arrays into encoder-ready tensors |
| `examples/embedding_search.py` | Vector similarity search over haptic embeddings |
| `examples/cross_modal.py` | Aligns haptic embeddings with vision or language vectors |

## Usage

### Load the processor

```python
from preprocessor.feature_extractor import HapticFeatureExtractor

extractor = HapticFeatureExtractor.from_pretrained(".")
```

### Basic embedding inference

```python
import numpy as np

from preprocessor.feature_extractor import HapticFeatureExtractor

extractor = HapticFeatureExtractor.from_pretrained(".")

# 1024 timesteps across 12 sensor channels (e.g. force, torque, pressure axes)
sample = np.random.randn(1024, 12).astype("float32")
features = extractor(sample)

print(features["input_values"].shape)
print(features["attention_mask"].shape)
```

See [`examples/inference.py`](./examples/inference.py) for a complete example.

## Training

Baseline training parameters are provided in [`configs/training_config.yaml`](./configs/training_config.yaml). These values are intended as a starting point for pretraining or continued domain adaptation, not as a claim of the exact recipe used for a released checkpoint.

## Limitations

- Performance depends heavily on sensor calibration and synchronization quality.
- Out-of-distribution hardware setups may require updated preprocessing statistics.
- Cross-modal alignment quality depends on the paired supervision used during training.
- This repository scaffold does not include production weights.

## Citation

```bibtex
@misc{motoko_embedding_1b,
  title        = {Motoko Embedding 1B},
  author       = {Motoko},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/}}
}
```
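The masked mean pooling described under Architecture can be sketched as below. This is an illustrative sketch only: the `mean_pool` function, and the assumed `(batch, seq_len, hidden_size)` / `(batch, seq_len)` shapes, are not part of the model's actual API.

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling over the time axis.

    hidden_states: (batch, seq_len, hidden_size) final encoder states.
    attention_mask: (batch, seq_len), 1 for real timesteps, 0 for padding.
    Returns one (hidden_size,) embedding per sequence.
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)  # (B, T, 1)
    summed = (hidden_states * mask).sum(axis=1)                   # (B, H)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                # avoid div-by-zero
    return summed / counts

# Example: two sequences; the second is padded after 3 timesteps.
hs = np.ones((2, 5, 4), dtype="float32")
mask = np.array([[1, 1, 1, 1, 1], [1, 1, 1, 0, 0]])
emb = mean_pool(hs, mask)
print(emb.shape)  # (2, 4)
```

Masking before averaging matters: padded timesteps would otherwise drag every embedding toward zero in proportion to the padding length.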
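Similarity search over the resulting embeddings, in the spirit of `examples/embedding_search.py`, can be sketched with plain cosine similarity. The `top_k_similar` function here is illustrative and is not the script's actual API.

```python
import numpy as np

def top_k_similar(query: np.ndarray, index: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k index rows most cosine-similar to query.

    query: (hidden_size,) embedding; index: (n, hidden_size) embedding matrix.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    m = index / (np.linalg.norm(index, axis=1, keepdims=True) + 1e-12)
    sims = m @ q                       # cosine similarity per row
    return np.argsort(-sims)[:k]       # highest similarity first

# Toy index of 100 hypothetical haptic embeddings.
rng = np.random.default_rng(0)
db = rng.standard_normal((100, 16)).astype("float32")
query = db[42] + 0.01 * rng.standard_normal(16).astype("float32")
print(top_k_similar(query, db, k=3))  # row 42 ranks first
```

For large indexes, the same embeddings can be dropped into an approximate-nearest-neighbor library instead of this brute-force scan.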