hrudu committed
Commit 89e5d21 · 1 Parent(s): 8757cd2
README.md CHANGED
@@ -1,3 +1,134 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: transformers
+ tags:
+ - haptics
+ - time-series
+ - robotics
+ - sensor-fusion
+ - mamba
+ - transformer
+ pipeline_tag: time-series-classification
+ ---
+
+ # Motoko 1B
+
+ Motoko 1B is the core foundation model of the Motoko family: a general-purpose haptic model pretrained on touch, force, and sensor-interaction data.
+
+ ## Model Details
+
+ - **Parameters:** 1B
+ - **Architecture:** Hybrid Mamba (SSM) / CNN + Transformer
+ - **Input:** Force, torque, pressure, and vibration time series
+ - **Output:** Next-state prediction and signal classification
+ - **Sequence Length:** Up to 2048 timesteps
+ - **Sampling Rate:** Up to 1 kHz
+ - **License:** Apache 2.0
+
+ ## Intended Use
+
+ Motoko 1B is designed for:
+
+ - Haptic signal classification and understanding
+ - Grasp stability prediction
+ - Material and texture recognition from touch
+ - Force-state forecasting
+ - Fine-tuning as a base for downstream haptic tasks
+ - Serving as the parent model for Motoko LoRA adapters
+
+ ## Repository Layout
+
+ ```text
+ .
+ ├── README.md
+ ├── config.json
+ ├── tokenizer_config.json
+ ├── tokenizer.json
+ ├── model/
+ │   ├── model.safetensors
+ │   └── model.safetensors.index.json
+ ├── preprocessor/
+ │   ├── preprocessor_config.json
+ │   └── feature_extractor.py
+ ├── configs/
+ │   ├── training_config.yaml
+ │   └── sensor_config.yaml
+ ├── examples/
+ │   ├── inference.py
+ │   ├── grasp_stability.py
+ │   ├── material_recognition.py
+ │   └── force_forecasting.py
+ └── .gitattributes
+ ```
+
+ ## Input Format
+
+ The model expects multichannel haptic time-series windows containing one or more of the following modalities:
+
+ - Force
+ - Torque
+ - Pressure
+ - Vibration
+
+ Signals should be normalized and resampled according to `preprocessor/preprocessor_config.json` before inference.
+
+ ## Tasks
+
+ ### Grasp Stability Prediction
+
+ Given a short force or tactile sequence collected during grasping, the model predicts whether the grasp is stable or likely to fail.
+
+ ### Material Recognition
+
+ Given touch-only or force-plus-vibration sequences, the model classifies the material category or texture family.
+
+ ### Force Forecasting
+
+ Given a recent trajectory of haptic observations, the model predicts the next force state or a short-horizon continuation.
+
+ ## Example Usage
+
+ ```python
+ from pathlib import Path
+
+ import numpy as np
+
+ from preprocessor.feature_extractor import MotokoFeatureExtractor
+
+ extractor = MotokoFeatureExtractor.from_config(
+     Path("preprocessor/preprocessor_config.json")
+ )
+
+ # Random stand-in signals; all four modalities enabled in the
+ # preprocessor config are provided.
+ sample = {
+     "force": np.random.randn(256, 3),
+     "torque": np.random.randn(256, 3),
+     "pressure": np.random.randn(256, 16),
+     "vibration": np.random.randn(256, 6),
+ }
+
+ features = extractor(sample)
+ print(features["input_values"].shape)
+ ```
+
+ ## Training
+
+ Base training hyperparameters are stored in `configs/training_config.yaml`, and sensor assumptions are defined in `configs/sensor_config.yaml`.
+
+ ## Limitations
+
+ - This repository currently contains scaffold configuration and examples.
+ - `model/model.safetensors` is a placeholder and should be replaced with actual trained weights.
+ - The final tokenizer and preprocessing values should be aligned with the released checkpoint.
+
+ ## Citation
+
+ ```bibtex
+ @misc{motoko1b,
+   title = {Motoko 1B},
+   author = {Motoko Team},
+   year = {2026},
+   howpublished = {\url{https://huggingface.co/}},
+   note = {Foundation model for haptic understanding and forecasting}
+ }
+ ```
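Since the shipped weights are placeholders, the task heads described above can only be illustrated. The following is a minimal sketch, in the same numpy-only style as the bundled examples, of how the extractor output could feed a grasp-stability score; `W` and `b` are random stand-ins for learned parameters, not anything in this repo.

```python
import numpy as np

from preprocessor.feature_extractor import MotokoFeatureExtractor

extractor = MotokoFeatureExtractor.from_config("preprocessor/preprocessor_config.json")
features = extractor({
    "force": np.random.randn(256, 3).astype(np.float32),
    "torque": np.random.randn(256, 3).astype(np.float32),
    "pressure": np.random.randn(256, 16).astype(np.float32),
    "vibration": np.random.randn(256, 6).astype(np.float32),
})

# Mask out padding, mean-pool over valid timesteps, then apply a
# placeholder linear head; real logits require the trained model.
mask = features["attention_mask"].astype(bool)
pooled = features["input_values"][mask].mean(axis=0)  # [28] channels
rng = np.random.default_rng(0)
W, b = rng.normal(size=(28,)), 0.0                    # hypothetical head
score = 1.0 / (1.0 + np.exp(-(pooled @ W + b)))       # sigmoid
print("stable" if score >= 0.5 else "unstable")
```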
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "MotokoForHapticModeling"
+   ],
+   "model_type": "motoko",
+   "hidden_size": 2048,
+   "intermediate_size": 8192,
+   "num_hidden_layers": 24,
+   "num_attention_heads": 16,
+   "num_key_value_heads": 8,
+   "conv_kernel_size": 5,
+   "state_size": 256,
+   "max_position_embeddings": 2048,
+   "num_input_channels": 28,
+   "sampling_rate_hz": 1000,
+   "classifier_dropout": 0.1,
+   "hidden_act": "silu",
+   "layer_norm_eps": 1e-05,
+   "initializer_range": 0.02,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "pad_token_id": 0,
+   "torch_dtype": "float16",
+   "transformers_version": "4.52.0"
+ }
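A couple of derived quantities follow directly from these values, assuming the usual convention that `hidden_size` divides evenly across attention heads:

```python
import json

# Read the architecture hyperparameters committed above.
with open("config.json", "r", encoding="utf-8") as f:
    cfg = json.load(f)

# Per-head width: 2048 / 16 = 128.
head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]

# Grouped-query attention: 16 query heads over 8 KV heads = 2 per group.
queries_per_kv = cfg["num_attention_heads"] // cfg["num_key_value_heads"]

print(head_dim, queries_per_kv)  # 128 2
```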
configs/sensor_config.yaml ADDED
@@ -0,0 +1,25 @@
+ sensors:
+   force:
+     axes: [fx, fy, fz]
+     units: newton
+     channels: 3
+     sampling_rate_hz: 1000
+   torque:
+     axes: [tx, ty, tz]
+     units: newton_meter
+     channels: 3
+     sampling_rate_hz: 1000
+   pressure:
+     layout: tactile_grid
+     channels: 16
+     sampling_rate_hz: 1000
+   vibration:
+     axes: [vx, vy, vz, ax, ay, az]
+     units: normalized
+     channels: 6
+     sampling_rate_hz: 1000
+
+ input_spec:
+   max_sampling_rate_hz: 1000
+   max_sequence_length: 2048
+   supported_modalities: [force, torque, pressure, vibration]
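The per-modality channel counts sum to 3 + 3 + 16 + 6 = 28, matching `num_input_channels` in `config.json`. A minimal sketch of checking incoming samples against this spec (`validate_sample` is a hypothetical helper, not part of the repo):

```python
import numpy as np
import yaml  # PyYAML

with open("configs/sensor_config.yaml", "r", encoding="utf-8") as f:
    spec = yaml.safe_load(f)


def validate_sample(sample: dict[str, np.ndarray]) -> None:
    """Raise if a provided modality's channel count disagrees with the spec."""
    for name, sensor in spec["sensors"].items():
        if name in sample and sample[name].shape[1] != sensor["channels"]:
            raise ValueError(
                f"{name}: expected {sensor['channels']} channels, "
                f"got {sample[name].shape[1]}"
            )


validate_sample({"force": np.zeros((256, 3), dtype=np.float32)})  # passes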
configs/training_config.yaml ADDED
@@ -0,0 +1,31 @@
+ model_name: motoko-1-1b
+ task: multitask_haptic_pretraining
+
+ training:
+   seed: 42
+   epochs: 20
+   global_batch_size: 128
+   micro_batch_size: 8
+   learning_rate: 2.0e-4
+   min_learning_rate: 2.0e-5
+   warmup_steps: 2000
+   weight_decay: 0.01
+   gradient_clip_norm: 1.0
+   precision: bf16
+
+ data:
+   max_sequence_length: 2048
+   sampling_rate_hz: 1000
+   shuffle: true
+   num_workers: 8
+
+ objectives:
+   next_state_prediction_weight: 1.0
+   grasp_stability_weight: 0.5
+   material_recognition_weight: 0.5
+   masked_signal_modeling_weight: 0.25
+
+ checkpointing:
+   output_dir: ./checkpoints
+   save_steps: 1000
+   keep_last_n: 3
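One implication worth spelling out: with a global batch of 128 and a micro batch of 8, each optimizer step accumulates gradients over 16 micro batches on a single device (fewer under data parallelism). A minimal sketch, assuming PyYAML:

```python
import yaml

with open("configs/training_config.yaml", "r", encoding="utf-8") as f:
    train = yaml.safe_load(f)["training"]

world_size = 1  # hypothetical single-device run
accum_steps = train["global_batch_size"] // (train["micro_batch_size"] * world_size)
print(accum_steps)  # 128 / (8 * 1) = 16
```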
examples/__pycache__/force_forecasting.cpython-311.pyc ADDED
Binary file (1.68 kB)
 
examples/__pycache__/grasp_stability.cpython-311.pyc ADDED
Binary file (1.78 kB)
 
examples/__pycache__/inference.cpython-311.pyc ADDED
Binary file (1.72 kB)
 
examples/__pycache__/material_recognition.cpython-311.pyc ADDED
Binary file (1.8 kB)
 
examples/force_forecasting.py ADDED
@@ -0,0 +1,20 @@
+ import numpy as np
+
+ from preprocessor.feature_extractor import MotokoFeatureExtractor
+
+
+ def forecast_force(signal: dict[str, np.ndarray]) -> np.ndarray:
+     extractor = MotokoFeatureExtractor.from_config("preprocessor/preprocessor_config.json")
+     features = extractor(signal)
+     # Keep only valid (non-padded) timesteps; the extractor pads to max_length,
+     # so slicing the raw array from the end would read padding instead of signal.
+     valid = features["attention_mask"].astype(bool)
+     force_slice = features["input_values"][valid, :3]
+     # Placeholder forecast: mean of the last 10 normalized force frames.
+     # A trained checkpoint would replace this with real model predictions.
+     return force_slice[-10:].mean(axis=0)
+
+
+ if __name__ == "__main__":
+     signal = {
+         "force": np.random.randn(256, 3).astype(np.float32),
+         "torque": np.random.randn(256, 3).astype(np.float32),
+         "pressure": np.random.randn(256, 16).astype(np.float32),
+         "vibration": np.random.randn(256, 6).astype(np.float32),
+     }
+     print("next_force:", forecast_force(signal))
examples/grasp_stability.py ADDED
@@ -0,0 +1,20 @@
+ import numpy as np
+
+ from preprocessor.feature_extractor import MotokoFeatureExtractor
+
+
+ def predict_grasp_stability(signal: dict[str, np.ndarray]) -> str:
+     extractor = MotokoFeatureExtractor.from_config("preprocessor/preprocessor_config.json")
+     features = extractor(signal)
+     # Placeholder heuristic: a shifted mean of the normalized signal.
+     # A trained checkpoint would supply real stability logits here.
+     stability_score = float(np.clip(features["input_values"].mean() + 0.5, 0.0, 1.0))
+     return "stable" if stability_score >= 0.5 else "unstable"
+
+
+ if __name__ == "__main__":
+     signal = {
+         "force": np.random.randn(256, 3).astype(np.float32),
+         "torque": np.random.randn(256, 3).astype(np.float32),
+         "pressure": np.random.randn(256, 16).astype(np.float32),
+         "vibration": np.random.randn(256, 6).astype(np.float32),
+     }
+     print("grasp:", predict_grasp_stability(signal))
examples/inference.py ADDED
@@ -0,0 +1,26 @@
+ from pathlib import Path
+
+ import numpy as np
+
+ from preprocessor.feature_extractor import MotokoFeatureExtractor
+
+
+ def main() -> None:
+     extractor = MotokoFeatureExtractor.from_config(
+         Path("preprocessor/preprocessor_config.json")
+     )
+
+     sample = {
+         "force": np.random.randn(320, 3).astype(np.float32),
+         "torque": np.random.randn(320, 3).astype(np.float32),
+         "pressure": np.random.randn(320, 16).astype(np.float32),
+         "vibration": np.random.randn(320, 6).astype(np.float32),
+     }
+
+     features = extractor(sample)
+     print("input_values:", features["input_values"].shape)
+     print("attention_mask:", features["attention_mask"].shape)
+
+
+ if __name__ == "__main__":
+     main()
examples/material_recognition.py ADDED
@@ -0,0 +1,23 @@
+ import numpy as np
+
+ from preprocessor.feature_extractor import MotokoFeatureExtractor
+
+
+ MATERIALS = ["metal", "rubber", "wood", "fabric"]
+
+
+ def predict_material(signal: dict[str, np.ndarray]) -> str:
+     extractor = MotokoFeatureExtractor.from_config("preprocessor/preprocessor_config.json")
+     features = extractor(signal)
+     # Placeholder: hash the feature sum into a class index.
+     # A trained checkpoint would replace this with real classification logits.
+     index = int(abs(features["input_values"].sum())) % len(MATERIALS)
+     return MATERIALS[index]
+
+
+ if __name__ == "__main__":
+     signal = {
+         "force": np.random.randn(256, 3).astype(np.float32),
+         "torque": np.random.randn(256, 3).astype(np.float32),
+         "pressure": np.random.randn(256, 16).astype(np.float32),
+         "vibration": np.random.randn(256, 6).astype(np.float32),
+     }
+     print("material:", predict_material(signal))
model/model.safetensors ADDED
File without changes
model/model.safetensors.index.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "metadata": {
+     "total_size": 0
+   },
+   "weight_map": {
+     "model.embed_tokens.weight": "model.safetensors",
+     "model.layers.0.mixer.in_proj.weight": "model.safetensors",
+     "model.layers.0.mixer.out_proj.weight": "model.safetensors",
+     "model.norm.weight": "model.safetensors",
+     "lm_head.weight": "model.safetensors"
+   }
+ }
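The index maps every tensor name to the shard file that stores it. Since `model/model.safetensors` is currently an empty placeholder, the sketch below only works once real weights are committed; it groups tensors by shard and loads them with the `safetensors` library (torch backend assumed).

```python
import json
from collections import defaultdict

from safetensors import safe_open

with open("model/model.safetensors.index.json", "r", encoding="utf-8") as f:
    index = json.load(f)

# Group tensor names by the shard file that stores them.
shards = defaultdict(list)
for tensor_name, shard_file in index["weight_map"].items():
    shards[shard_file].append(tensor_name)

# Load each shard once and pull out its tensors (requires real weights).
state_dict = {}
for shard_file, names in shards.items():
    with safe_open(f"model/{shard_file}", framework="pt") as shard:
        for name in names:
            state_dict[name] = shard.get_tensor(name)
```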
preprocessor/__pycache__/feature_extractor.cpython-311.pyc ADDED
Binary file (5.1 kB)
 
preprocessor/feature_extractor.py ADDED
@@ -0,0 +1,66 @@
+ from __future__ import annotations
+
+ import json
+ from pathlib import Path
+ from typing import Any
+
+ import numpy as np
+
+
+ class MotokoFeatureExtractor:
+     """Normalize and stack haptic modalities into a single model tensor."""
+
+     def __init__(self, config: dict[str, Any]) -> None:
+         self.config = config
+         self.max_length = int(config.get("max_length", 2048))
+         self.padding_value = float(config.get("padding_value", 0.0))
+         self.eps = float(config.get("normalization", {}).get("eps", 1e-6))
+         self.modalities = config.get("modalities", {})
+
+     @classmethod
+     def from_config(cls, path: str | Path) -> "MotokoFeatureExtractor":
+         with Path(path).open("r", encoding="utf-8") as handle:
+             return cls(json.load(handle))
+
+     def _normalize(self, values: np.ndarray) -> np.ndarray:
+         mean = values.mean(axis=0, keepdims=True)
+         std = values.std(axis=0, keepdims=True)
+         return (values - mean) / np.maximum(std, self.eps)
+
+     def _pad_or_trim(self, values: np.ndarray) -> np.ndarray:
+         if values.shape[0] >= self.max_length:
+             return values[: self.max_length]
+
+         pad_rows = self.max_length - values.shape[0]
+         pad = np.full((pad_rows, values.shape[1]), self.padding_value, dtype=values.dtype)
+         return np.concatenate([values, pad], axis=0)
+
+     def __call__(self, sample: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
+         features: list[np.ndarray] = []
+
+         for name, spec in self.modalities.items():
+             if not spec.get("enabled", False):
+                 continue
+
+             channels = int(spec["channels"])
+             values = np.asarray(sample.get(name, np.zeros((0, channels), dtype=np.float32)))
+
+             if values.ndim != 2 or values.shape[1] != channels:
+                 raise ValueError(
+                     f"Expected modality '{name}' to have shape [timesteps, {channels}], "
+                     f"got {values.shape}."
+                 )
+
+             normalized = self._normalize(values.astype(np.float32))
+             features.append(self._pad_or_trim(normalized))
+
+         if not features:
+             raise ValueError("No enabled modalities were provided.")
+
+         stacked = np.concatenate(features, axis=1)
+         attention_mask = (np.abs(stacked).sum(axis=1) > 0).astype(np.int64)
+
+         return {
+             "input_values": stacked,
+             "attention_mask": attention_mask,
+         }
preprocessor/preprocessor_config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "feature_extractor_type": "MotokoFeatureExtractor",
+   "sampling_rate_hz": 1000,
+   "target_sampling_rate_hz": 1000,
+   "window_size": 256,
+   "window_stride": 128,
+   "max_length": 2048,
+   "padding_value": 0.0,
+   "normalization": {
+     "method": "zscore",
+     "eps": 1e-06
+   },
+   "modalities": {
+     "force": {
+       "enabled": true,
+       "channels": 3
+     },
+     "torque": {
+       "enabled": true,
+       "channels": 3
+     },
+     "pressure": {
+       "enabled": true,
+       "channels": 16
+     },
+     "vibration": {
+       "enabled": true,
+       "channels": 6
+     }
+   }
+ }
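`window_size` and `window_stride` describe overlapping segmentation of longer recordings, but the repository does not ship a windowing helper; `sliding_windows` below is a hypothetical sketch of how those two values could be applied before calling the feature extractor.

```python
import numpy as np


def sliding_windows(values: np.ndarray, size: int = 256, stride: int = 128) -> list[np.ndarray]:
    """Split a [timesteps, channels] array into overlapping [size, channels] windows."""
    return [
        values[start : start + size]
        for start in range(0, values.shape[0] - size + 1, stride)
    ]


recording = np.random.randn(1024, 3).astype(np.float32)
windows = sliding_windows(recording)
print(len(windows))  # (1024 - 256) // 128 + 1 = 7
```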
tokenizer.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "version": "1.0",
+   "truncation": null,
+   "padding": null,
+   "added_tokens": [
+     {
+       "id": 0,
+       "content": "<pad>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 1,
+       "content": "<bos>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 2,
+       "content": "<eos>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     },
+     {
+       "id": 3,
+       "content": "<unk>",
+       "single_word": false,
+       "lstrip": false,
+       "rstrip": false,
+       "normalized": false,
+       "special": true
+     }
+   ],
+   "normalizer": null,
+   "pre_tokenizer": null,
+   "post_processor": null,
+   "decoder": null,
+   "model": {
+     "type": "WordLevel",
+     "vocab": {
+       "<pad>": 0,
+       "<bos>": 1,
+       "<eos>": 2,
+       "<unk>": 3,
+       "force": 4,
+       "torque": 5,
+       "pressure": 6,
+       "vibration": 7,
+       "slip": 8,
+       "stable": 9,
+       "material": 10,
+       "forecast": 11
+     },
+     "unk_token": "<unk>"
+   }
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "tokenizer_class": "PreTrainedTokenizerFast",
+   "model_max_length": 2048,
+   "padding_side": "right",
+   "truncation_side": "right",
+   "pad_token": "<pad>",
+   "bos_token": "<bos>",
+   "eos_token": "<eos>",
+   "unk_token": "<unk>",
+   "clean_up_tokenization_spaces": false
+ }
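Taken together, `tokenizer.json` and this config load as a `PreTrainedTokenizerFast`. A minimal sketch, run from the repo root; note the WordLevel task-token vocabulary above is scaffold content and may change with the released checkpoint.

```python
from transformers import PreTrainedTokenizerFast

# Wire the WordLevel task-token vocabulary to the special tokens
# declared in tokenizer_config.json.
tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="tokenizer.json",
    pad_token="<pad>",
    bos_token="<bos>",
    eos_token="<eos>",
    unk_token="<unk>",
    model_max_length=2048,
)

print(tokenizer.convert_tokens_to_ids("force"))  # 4
print(tokenizer.convert_tokens_to_ids("slip"))   # 8
```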