Georg committed on
Commit e636f34 · 1 parent: 7362afb

Enhance mujoco_server.py and documentation with camera and scene updates


- Updated the UR5 overlay camera configuration to include a new look target and adjusted offsets for improved visualization.
- Enhanced the simulation loop to support dynamic camera positioning based on site IDs and offsets.
- Added a new endpoint for scheduling robot and scene switches, allowing for automated testing and integration.
- Updated README.md to reflect changes in camera functionality and added instructions for the new switch endpoint.
- Introduced tests for API endpoints, camera streams, and WebSocket interactions to ensure reliability and performance.

README.md CHANGED
@@ -247,7 +247,7 @@ docker run --gpus all -p 3004:3004 \
 
 ### UR5 Scene & Camera Hints
 - The UI now selects between exactly two UR5 options: the gripper-ready scene and the T-push scene (both enumerated via `/nova-sim/api/v1/metadata`). That metadata entry also returns `overlay_camera_presets`, so the trainer CLI can build dashboards based on the same stream list the UI exposes.
-- UR5 T-push streams three auxiliary MJPEG tiles (`aux_top`, `aux_side`, and the new `aux_flange` first-person camera) via `/nova-sim/api/v1/camera/<name>/video_feed`. Those overlays appear below the state panel only when the scene actually provides them, and the flange view mirrors a tool-mounted perspective so a trainer-side visualization can know what the robot “sees” without interfering with your primary viewport.
+- UR5 T-push streams three auxiliary MJPEG tiles (`aux_top`, `aux_side`, and the new `aux_flange` first-person camera) via `/nova-sim/api/v1/camera/<name>/video_feed`. Those overlays appear below the state panel only when the scene actually provides them; the flange camera now sits below the tool, offset slightly along the flange +X axis, and always looks toward the stick tip so the trainer knows what the robot is actually pressing even if the stick partially hides the table view.
 - The T-shape target stays anchored at its configured pose across resets, which keeps the training objective consistent even when you hit Reset from the UI.
 
 ## Architecture
@@ -552,16 +552,27 @@ The server responds with `gym_reset`, `gym_step`, `gym_spaces`, or `gym_configur
 |----------|--------|-------------|
 | `/nova-sim/api/v1` | GET | Web interface (HTML/JS) |
 | `/nova-sim/api/v1/video_feed` | GET | MJPEG video stream |
+| `/nova-sim/api/v1/switch` | POST | Schedule a robot/scene switch without the UI |
 
-**Video stream usage:**
-```html
-<img src="http://localhost:3004/nova-sim/api/v1/video_feed" />
-```
+**Video stream usage:**
+```html
+<img src="http://localhost:3004/nova-sim/api/v1/video_feed" />
+```
+
+**Robot switching (HTTP)**
+```
+POST /nova-sim/api/v1/switch
+Content-Type: application/json
+
+{"robot": "ur5_t_push", "scene": "scene_t_push"}
+```
+Returns `202 Accepted` while the server asynchronously performs the switch. This endpoint enables tests or automation frameworks to select robots without opening the browser.
 
 ### Metadata & Camera Feeds
 
 - `GET /nova-sim/api/v1/metadata` returns JSON describing every available robot/scene pair, the configured defaults (UR5 T-push), the supported commands (including teleop commands), and camera feed definitions so RL trainers can build dynamic dashboards.
 - `GET /nova-sim/api/v1/camera/<name>/video_feed` streams MJPEG for the primary viewport (`main`) or additional overlays such as `aux_top` / `aux_side`.
+- `pytest tests/` exercises the HTTP metadata/video endpoints, the `/ws` control socket, and every camera feed. Keep Nova-Sim running at `http://localhost:3004` when you run it so the suite can talk to the live server.
 - The metadata payload also exposes `overlay_camera_presets`, which lists any auxiliary video streams (e.g., `aux_top`, `aux_side`, `aux_flange`) that the UI will render below the state panel. Those same stream names are the ones your RL trainer should subscribe to whenever they are present.
 
 ## Wandelbots Nova API Integration
@@ -786,6 +797,18 @@ The Nova API integration is implemented in:
 - [robots/ur5/nova_api.py](robots/ur5/nova_api.py) - API client and configuration
 - [robots/ur5/ur5_env.py](robots/ur5/ur5_env.py) - Environment integration (lines 123-203, 499-520)
 
+## Testing
+
+1. Start the Nova Sim server (e.g. `python nova-sim/mujoco_server.py` or via `docker-compose`).
+2. Keep it running at `http://localhost:3004` so the HTTP/websocket endpoints stay reachable.
+3. Run `pytest nova-sim/tests` to exercise:
+   - API endpoints (`/metadata`, `/camera/<name>/video_feed`, `/video_feed`)
+   - WebSocket control (`/ws`)
+   - Gym-style websocket (`/gym/ws`)
+   - Auxiliary MJPEG overlays after switching to the T-push UR5 scene
+
+The tests assume the server is accessible via `http://localhost:3004/nova-sim/api/v1` and will skip automatically if the API is unreachable.
+
 ## License
 
 This project uses models from:
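The metadata-driven dashboard idea from the README section above can be sketched as follows. The payload literal is a made-up minimal example shaped like the documented fields (`camera_feeds`, `overlay_camera_presets`); real responses carry more keys, and `stream_urls` is a hypothetical client helper, not part of the server.

```python
API_BASE = "http://localhost:3004/nova-sim/api/v1"

# Minimal, made-up metadata payload shaped like the documented fields.
metadata = {
    "camera_feeds": [{"name": "main"}, {"name": "aux_top"},
                     {"name": "aux_side"}, {"name": "aux_flange"}],
    "overlay_camera_presets": {
        "ur5_t_push": [{"name": "aux_top"}, {"name": "aux_side"}, {"name": "aux_flange"}],
    },
}

def stream_urls(meta: dict, robot: str) -> list[str]:
    # Subscribe to every overlay preset for the active robot, plus `main`.
    names = ["main"] + [p["name"] for p in meta.get("overlay_camera_presets", {}).get(robot, [])]
    return [f"{API_BASE}/camera/{name}/video_feed" for name in names]

urls = stream_urls(metadata, "ur5_t_push")
print(urls)
```

Each URL resolves to an MJPEG stream the trainer can render alongside the main viewport.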
mujoco_server.py CHANGED
@@ -166,11 +166,13 @@ UR5_T_PUSH_OVERLAY_PRESETS = [
         "name": "aux_flange",
         "label": "Flange View",
         "follow_site": "ee_site",
+        "look_target_site": "stick_tip",
         "track_orientation": True,
-        "offset": [0.0, 0.0, 0.02],
-        "distance": 0.2,
-        "azimuth": 0,
-        "elevation": -5,
+        "offset": [0.03, 0.0, -0.04],
+        "forward_offset": 0.03,
+        "distance": 0.08,
+        "azimuth": 10,
+        "elevation": 20,
     },
 ]
 
@@ -183,7 +185,7 @@ CAMERA_FEEDS = [
     {"name": "main", "label": "Main", "description": "Primary viewport"},
     {"name": "aux_top", "label": "Aux Top", "description": "Additional top perspective"},
     {"name": "aux_side", "label": "Aux Side", "description": "Side perspective"},
-    {"name": "aux_flange", "label": "Flange", "description": "Tool-mounted first-person view"},
+    {"name": "aux_flange", "label": "Flange", "description": "Tool view mounted below the flange"},
 ]
 
 overlay_camera_states: dict[str, dict[str, Any]] = {}
@@ -201,15 +203,25 @@ def _make_mjv_camera(config: dict[str, float]) -> mujoco.MjvCamera:
 
 def _get_site_forward(env_obj, site_id: int) -> np.ndarray:
     """Compute the forward (X axis) vector for a site."""
+    default = np.array([0.0, 0.0, 1.0], dtype=np.float32)
     if env_obj is None or site_id < 0:
-        return np.array([0.0, 0.0, 1.0], dtype=np.float32)
-    xmat = env_obj.data.site_xmat
+        return default
+    if site_id >= getattr(env_obj.model, "nsite", 0):
+        return default
+    xmat = getattr(env_obj.data, "site_xmat", None)
+    if xmat is None or xmat.size < (site_id + 1) * 9:
+        return default
     start = site_id * 9
     mat = xmat[start:start + 9]
-    forward = np.array([mat[0], mat[3], mat[6]], dtype=np.float32)
+    if mat.size < 9:
+        return default
+    try:
+        forward = np.array([mat[0], mat[3], mat[6]], dtype=np.float32)
+    except IndexError:
+        return default
     norm = np.linalg.norm(forward)
     if norm < 1e-6:
-        return np.array([0.0, 0.0, 1.0], dtype=np.float32)
+        return default
     return forward / norm
 
 def _close_overlay_renderers():
@@ -250,8 +262,15 @@ def prepare_overlay_renderers(robot_type: str, scene_name: str | None):
         "follow_site": follow_site,
         "site_id": site_id,
         "offset": offset,
+        "forward_offset": float(config.get("forward_offset", 0.0)),
         "track_orientation": bool(config.get("track_orientation")),
+        "look_target_site": config.get("look_target_site"),
+        "look_target_id": -1,
     }
+    look_target_site = config.get("look_target_site")
+    if look_target_site and env is not None:
+        look_target_id = mujoco.mj_name2id(env.model, mujoco.mjtObj.mjOBJ_SITE, look_target_site)
+        overlay_camera_states[config["name"]]["look_target_id"] = look_target_id
     with overlay_frame_lock:
         overlay_frames[config["name"]] = None
 
@@ -598,10 +617,24 @@ def simulation_loop():
                 continue
             if state.get("follow_site") and env is not None:
                 site_id = state.get("site_id", -1)
-                if site_id >= 0:
-                    lookat = env.data.site_xpos[site_id]
+                if site_id >= 0 and site_id < getattr(env.model, "nsite", 0):
+                    site_xpos = getattr(env.data, "site_xpos", None)
+                    if site_xpos is not None and site_xpos.size >= (site_id + 1) * 3:
+                        lookat_point = site_xpos[site_id]
+                    else:
+                        continue
                     offset = state.get("offset")
-                    cam_obj.lookat = lookat + offset if offset is not None else lookat
+                    if offset is not None:
+                        lookat_point = lookat_point + offset
+                    forward_offset = state.get("forward_offset", 0.0)
+                    if forward_offset:
+                        forward = _get_site_forward(env, site_id)
+                        lookat_point = lookat_point + forward * forward_offset
+                    look_target_id = state.get("look_target_id", -1)
+                    target_point = None
+                    if look_target_id >= 0 and site_xpos.size >= (look_target_id + 1) * 3:
+                        target_point = site_xpos[look_target_id]
+                    cam_obj.lookat = target_point if target_point is not None else lookat_point
                 if state.get("track_orientation"):
                     forward = _get_site_forward(env, site_id)
                     yaw = math.degrees(math.atan2(forward[1], forward[0]))
@@ -1225,12 +1258,12 @@ def index():
         .state-panel {
             position: absolute; top: 20px; right: 20px;
             background: rgba(33, 28, 68, 0.85);
-            backdrop-filter: blur(15px);
-            padding: 15px 20px; border-radius: 10px;
-            box-shadow: 0 4px 20px rgba(1, 4, 15, 0.5);
+            backdrop-filter: blur(12px);
+            padding: 8px 12px; border-radius: 8px;
+            box-shadow: 0 3px 14px rgba(1, 4, 15, 0.35);
             color: var(--wb-secondary); border: 1px solid rgba(188, 190, 236, 0.15);
-            z-index: 100; min-width: 200px;
-            font-size: 0.8em; line-height: 1.5;
+            z-index: 100; min-width: 180px; max-width: 220px;
+            font-size: 0.75em; line-height: 1.3;
         }
         .state-panel strong { color: #fff; }
         /* Camera controls - bottom left */
@@ -1687,8 +1720,8 @@ def index():
         'KeyW', 'KeyA', 'KeyS', 'KeyD', 'KeyQ', 'KeyE',
         'ArrowUp', 'ArrowDown', 'ArrowLeft', 'ArrowRight'
     ]);
-    const TELEOP_STEP = 0.015;
-    const TELEOP_Z_STEP = 0.01;
+    let teleopTranslationStep = 0.05; // meters per keyboard nudge
+    let teleopVerticalStep = 0.01;
     let lastTeleopCommand = {dx: 0, dy: 0, dz: 0};
 
     function humanizeScene(scene) {
@@ -2194,6 +2227,7 @@ def index():
     function updateTransVelocity() {
        transVelocity = parseFloat(document.getElementById('trans_velocity').value);
        document.getElementById('trans_vel_val').innerText = transVelocity.toFixed(0);
+       teleopTranslationStep = transVelocity / 1000;
    }
 
    function updateRotVelocity() {
@@ -2285,12 +2319,12 @@ def index():
 
    function updateArmTeleopFromKeys() {
        let dx = 0, dy = 0, dz = 0;
-       if (keysPressed.has('KeyW')) dx += TELEOP_STEP;
-       if (keysPressed.has('KeyS')) dx -= TELEOP_STEP;
-       if (keysPressed.has('KeyA')) dy += TELEOP_STEP;
-       if (keysPressed.has('KeyD')) dy -= TELEOP_STEP;
-       if (keysPressed.has('KeyR')) dz += TELEOP_Z_STEP;
-       if (keysPressed.has('KeyF')) dz -= TELEOP_Z_STEP;
+       if (keysPressed.has('KeyW')) dx += teleopTranslationStep;
+       if (keysPressed.has('KeyS')) dx -= teleopTranslationStep;
+       if (keysPressed.has('KeyA')) dy += teleopTranslationStep;
+       if (keysPressed.has('KeyD')) dy -= teleopTranslationStep;
+       if (keysPressed.has('KeyR')) dz += teleopVerticalStep;
+       if (keysPressed.has('KeyF')) dz -= teleopVerticalStep;
 
        const unchanged =
            Math.abs(dx - lastTeleopCommand.dx) < 1e-6 &&
@@ -2465,5 +2499,17 @@ def metadata():
     })
 
 
+@app.route(f'{API_PREFIX}/switch', methods=['POST'])
+def switch_robot_endpoint():
+    payload = request.get_json(silent=True) or {}
+    robot = payload.get('robot')
+    scene = payload.get('scene')
+    if not robot:
+        return jsonify({"error": "robot is required"}), 400
+    global needs_robot_switch
+    needs_robot_switch = {"robot": robot, "scene": scene}
+    return jsonify({"status": "pending", "robot": robot, "scene": scene}), 202
+
+
 if __name__ == '__main__':
     app.run(host='0.0.0.0', port=3004, debug=False, threaded=True)
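The camera-orientation math in `_get_site_forward` and the simulation loop boils down to one fact: MuJoCo stores each site's world rotation as a row-major flattened 3x3 matrix, so elements 0, 3, 6 form the first column, i.e. the site's local +X (forward) axis in world coordinates, from which yaw follows via `atan2`. A standalone sketch of that math (no MuJoCo required; `forward_from_xmat` and `yaw_degrees` are illustrative names):

```python
import math

def forward_from_xmat(xmat9):
    # Elements 0, 3, 6 of the flattened row-major matrix are the first column:
    # the site's local +X axis expressed in world coordinates.
    fx, fy, fz = xmat9[0], xmat9[3], xmat9[6]
    norm = math.sqrt(fx * fx + fy * fy + fz * fz)
    if norm < 1e-6:
        return (0.0, 0.0, 1.0)  # degenerate fallback, as in the server
    return (fx / norm, fy / norm, fz / norm)

def yaw_degrees(forward):
    return math.degrees(math.atan2(forward[1], forward[0]))

# Site rotated 90° about world Z: local +X now points along world +Y.
c, s = 0.0, 1.0
xmat = [c, -s, 0.0,
        s,  c, 0.0,
        0.0, 0.0, 1.0]
f = forward_from_xmat(xmat)
print(f, yaw_degrees(f))
```

This is why the overlay camera's azimuth can track the flange: the yaw of the forward vector rotates with the site.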
requirements.txt CHANGED
@@ -5,3 +5,7 @@ flask-sock>=0.7.0
 opencv-python>=4.8.0
 torch>=2.0.0
 numpy>=1.24.0
+pytest>=7.4.0
+pytest-asyncio>=0.23.0
+requests>=2.30.0
+websockets>=11.0
robots/ur5/model/scene_t_push.xml CHANGED
@@ -15,7 +15,7 @@
     <global azimuth="120" elevation="-20"/>
   </visual>
 
-  <asset>
+  <asset>
     <!-- Wandelbots gradient skybox - deep purple to near black -->
     <texture type="skybox" builtin="gradient" rgb1="0.13 0.11 0.27" rgb2="0.004 0.016 0.059" width="512" height="3072"/>
     <!-- Ground with Wandelbots purple accent -->
@@ -31,6 +31,9 @@
     <material name="t_target_mat" rgba="0.2 0.7 0.35 0.25" specular="0.2" shininess="0.2"/>
     <material name="t_object_mat" rgba="0.55 0.65 0.98 1" specular="0.3" shininess="0.2"/>
     <material name="stick_mat" rgba="0.6 0.6 0.62 1" specular="0.4" shininess="0.3"/>
+    <material name="axis_x" rgba="1 0.2 0.2 1" specular="0.4" shininess="0.3"/>
+    <material name="axis_y" rgba="0.2 1 0.1 1" specular="0.4" shininess="0.3"/>
+    <material name="axis_z" rgba="0.25 0.4 1 1" specular="0.4" shininess="0.3"/>
 
     <!-- UR5e materials - with Wandelbots accent colors -->
     <material name="black" rgba="0.02 0.02 0.04 1" specular="0.5" shininess="0.25"/>
@@ -105,7 +108,7 @@
     </body>
 
     <!-- Target visualization sphere (for IK target) -->
-    <body name="target" pos="0.4 0.0 0.6" mocap="true">
+    <body name="target" pos="0.4 0.0 0.72" mocap="true">
      <geom name="target_vis" type="sphere" size="0.03" material="target_mat" contype="0" conaffinity="0"/>
     </body>
 
@@ -167,6 +170,7 @@
      <!-- In flange frame: +Z points away from robot (downward when wrist is horizontal) -->
      <geom name="push_stick" type="cylinder" fromto="0 0 0 0 0 0.18" size="0.008"
            material="stick_mat" mass="0.02" friction="1 0.01 0.01"/>
+     <site name="stick_tip" pos="0 0 0.18" type="sphere" size="0.005" rgba="0.6 0.9 0.3 0.2"/>
     </body>
 
     <!-- End-effector site for IK (at flange) -->
@@ -180,17 +184,27 @@
     </body>
 
     <!-- T-shaped target (visual marker) -->
-    <body name="t_target" pos="0.62 -0.18 0.425">
+    <body name="t_target" pos="0.62 -0.18 0.42">
      <geom name="t_target_stem" type="box" pos="0 -0.05 0" size="0.02 0.07 0.002" material="t_target_mat" contype="0" conaffinity="0"/>
      <geom name="t_target_cap" type="box" pos="0 0.03 0" size="0.08 0.02 0.002" material="t_target_mat" contype="0" conaffinity="0"/>
     </body>
 
     <!-- Movable T-shaped object to push into target -->
-    <body name="t_object" pos="0.45 0.2 0.43">
+    <body name="t_object" pos="0.45 0.25 0.46">
      <freejoint name="t_object_joint"/>
      <geom name="t_object_stem" type="box" pos="0 -0.05 0" size="0.02 0.07 0.03" material="t_object_mat" mass="3.0" friction="0.3 0.005 0.005"/>
      <geom name="t_object_cap" type="box" pos="0 0.03 0" size="0.08 0.02 0.03" material="t_object_mat" mass="2.0" friction="0.3 0.005 0.005"/>
     </body>
+
+    <!-- Axis arrows (XYZ) for orientation -->
+    <body name="axis_arrows" pos="0.35 0.6 0.43">
+     <geom type="capsule" fromto="0 0 0 0.12 0 0" size="0.006" material="axis_x" contype="0" conaffinity="0"/>
+     <geom type="sphere" pos="0.13 0 0" size="0.008" material="axis_x" contype="0" conaffinity="0"/>
+     <geom type="capsule" fromto="0 0 0 0 0.12 0" size="0.006" material="axis_y" contype="0" conaffinity="0"/>
+     <geom type="sphere" pos="0 0.13 0" size="0.008" material="axis_y" contype="0" conaffinity="0"/>
+     <geom type="capsule" fromto="0 0 0 0 0 0.12" size="0.006" material="axis_z" contype="0" conaffinity="0"/>
+     <geom type="sphere" pos="0 0 0.13" size="0.008" material="axis_z" contype="0" conaffinity="0"/>
+    </body>
   </worldbody>
 
   <actuator>
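To see why the new `stick_tip` site gives the flange camera a stable look target: a site at local (0, 0, 0.18) on the flange body lands in the world at p_world = p_body + R_body · p_local. A minimal sketch of that transform under the assumption of a flange pointing straight down (`local_to_world` is a hypothetical helper, not part of the scene code):

```python
def local_to_world(p_body, R, p_local):
    # R is a row-major 3x3 rotation matrix (list of rows);
    # this computes p_body + R @ p_local for plain tuples.
    return tuple(p_body[i] + sum(R[i][j] * p_local[j] for j in range(3))
                 for i in range(3))

# Flange rotated 180° about its X axis, so local +Z maps to world -Z
# (the "stick points down" posture the scene comments describe).
R_down = [[1.0, 0.0, 0.0],
          [0.0, -1.0, 0.0],
          [0.0, 0.0, -1.0]]
tip = local_to_world((0.45, 0.2, 0.6), R_down, (0.0, 0.0, 0.18))
print(tip)  # stick tip 0.18 m below the assumed flange position
```

Because the site rides on the flange body, the overlay camera's look target stays glued to the stick tip however the wrist moves.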
tests/conftest.py ADDED
@@ -0,0 +1,27 @@
+import os
+import requests
+import pytest
+
+API_BASE = os.getenv("NOVA_SIM_API_URL", "http://localhost:3004/nova-sim/api/v1")
+WS_BASE = os.getenv("NOVA_SIM_WS_URL", "ws://localhost:3004/nova-sim/api/v1/ws")
+
+
+def _ping_api() -> None:
+    try:
+        resp = requests.get(API_BASE, timeout=3)
+        resp.raise_for_status()
+    except (requests.RequestException, ValueError) as exc:
+        pytest.skip(f"Nova-Sim API not reachable at {API_BASE}: {exc}")
+
+
+@pytest.fixture(scope="session")
+def api_base():
+    _ping_api()
+    return API_BASE
+
+
+@pytest.fixture(scope="session")
+def ws_url():
+    # Ensure API is accessible before touching the websocket
+    _ping_api()
+    return WS_BASE
tests/test_api.py ADDED
@@ -0,0 +1,44 @@
+import requests
+from requests import Response
+
+def test_root_ui(api_base: str):
+    resp = requests.get(api_base, timeout=5)
+    assert resp.status_code == 200
+    assert '<title>Nova Sim' in resp.text
+
+
+def test_metadata(api_base: str):
+    resp = requests.get(f"{api_base}/metadata", timeout=5)
+    assert resp.status_code == 200
+    payload = resp.json()
+    assert 'robots' in payload and isinstance(payload['robots'], dict)
+    assert 'commands' in payload
+    assert 'camera_feeds' in payload
+
+
+def test_camera_feeds_structure(api_base: str):
+    resp = requests.get(f"{api_base}/metadata", timeout=5)
+    data = resp.json()
+    feeds = data.get('camera_feeds') or []
+    assert any(feed.get('name') == 'main' for feed in feeds)
+    assert any(feed.get('name') == 'aux_flange' for feed in feeds)
+
+
+def test_overlay_camera_presets(api_base: str):
+    resp = requests.get(f"{api_base}/metadata", timeout=5)
+    data = resp.json()
+    presets = data.get('overlay_camera_presets', {})
+    assert 'ur5_t_push' in presets or 'scene_t_push' in presets
+    target_presets = presets.get('ur5_t_push') or presets.get('scene_t_push') or []
+    names = {item.get('name') for item in target_presets}
+    assert {'aux_top', 'aux_side', 'aux_flange'}.issubset(names)
+
+
+def test_switch_endpoint(api_base: str):
+    # Missing robot is rejected
+    resp = requests.post(f"{api_base}/switch", json={"scene": "scene_t_push"}, timeout=5)
+    assert resp.status_code == 400
+
+    # Valid request schedules the switch
+    resp = requests.post(f"{api_base}/switch", json={"robot": "ur5_t_push", "scene": "scene_t_push"}, timeout=5)
+    assert resp.status_code == 202
tests/test_video.py ADDED
@@ -0,0 +1,34 @@
+import time
+import requests
+import pytest
+
+def _assert_mjpeg_header(resp: requests.Response) -> None:
+    content_type = resp.headers.get('Content-Type', '')
+    assert 'multipart/x-mixed-replace' in content_type
+
+
+def test_main_video_stream(api_base: str):
+    with requests.get(f"{api_base}/video_feed", stream=True, timeout=5) as resp:
+        assert resp.status_code == 200
+        _assert_mjpeg_header(resp)
+        chunks = resp.iter_content(chunk_size=1024)
+        first_chunk = next(chunks)
+        assert first_chunk, 'Expected non-empty MJPEG chunk'
+
+
+def _validate_mjpeg_stream(api_base: str, camera_name: str) -> None:
+    with requests.get(f"{api_base}/camera/{camera_name}/video_feed", stream=True, timeout=5) as resp:
+        assert resp.status_code in (200, 304), f'Stream {camera_name} returned {resp.status_code}'
+        _assert_mjpeg_header(resp)
+        chunk = next(resp.iter_content(chunk_size=1024))
+        assert chunk, f'MJPEG stream {camera_name} yielded empty chunk'
+
+
+def test_aux_camera_streams_after_switch(api_base: str):
+    resp = requests.post(f"{api_base}/switch", json={"robot": "ur5_t_push", "scene": "scene_t_push"}, timeout=5)
+    assert resp.status_code == 202, f"Switch request failed: {resp.status_code}"
+    time.sleep(3)
+
+    camera_names = ['aux_top', 'aux_side', 'aux_flange']
+    for name in camera_names:
+        _validate_mjpeg_stream(api_base, name)
tests/test_ws.py ADDED
@@ -0,0 +1,17 @@
+import asyncio
+import json
+import pytest
+import websockets
+
+@pytest.mark.asyncio
+async def test_ws_state_message(ws_url: str):
+    async with websockets.connect(ws_url, ping_interval=None) as ws:
+        raw = await asyncio.wait_for(ws.recv(), timeout=5)
+        payload = json.loads(raw)
+        assert payload.get('type') == 'state'
+        data = payload.get('data', {})
+        assert 'robot' in data
+        # send a no-op command to prove WS accepts messages
+        await ws.send(json.dumps({'type': 'command', 'data': {'vx': 0.0, 'vy': 0.0, 'vyaw': 0.0}}))
+        # read an acknowledgement (state) again
+        await asyncio.wait_for(ws.recv(), timeout=5)