Reachy Mini documentation
Media
Media Manager
class reachy_mini.media.media_manager.MediaManager
< source >( backend: MediaBackend = <MediaBackend.LOCAL: 'local'> log_level: str = 'INFO' signalling_host: str = 'localhost' camera_specs: Optional[CameraSpecs] = None daemon_url: str = '' )
Media Manager for handling camera and audio devices.
This class provides a unified interface for managing both camera and audio devices across different backends. It handles initialization, configuration, and cleanup of media resources.
Close the media manager and release resources.
Get the Direction of Arrival (DoA) from the microphone array.
Get an audio sample from the audio device.
Get a frame from the camera.
Get the input samplerate of the audio device.
Get the number of input channels of the audio device.
Get the output samplerate of the audio device.
Get the number of output channels of the audio device.
Play a sound file.
Note: If the audio backend is not initialised, a warning is logged and the call is silently ignored.
push_audio_sample
< source >( data: npt.NDArray[np.float32] )
Push audio data to the output device.
Start playing audio.
Start recording audio.
Stop playing audio.
Stop recording audio.
Audio
Audio implementation using GStreamer.
Extends AudioBase with two GStreamer-specific helpers:
- clear_output_buffer(): flush queued playback data without stopping the pipeline (no-op by default; useful before refilling the buffer).
- clear_player(): flush the playback appsrc immediately via GStreamer flush events, dropping any queued audio.
Release all resources (pipelines, USB devices).
Flush queued playback data so it is not played.
A low set_max_output_buffers value may make this unnecessary
for most use-cases.
Flush the player’s appsrc to drop any queued audio immediately.
No-op for the local backend.
No-op for the local backend.
play_sound
< source >( sound_file: str )
Play a sound file through the Reachy Mini Audio card.
The file is played via a GStreamer playbin routed to the same
audio sink used by the push-based playback pipeline.
push_audio_sample
< source >( data: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float32]] )
Push audio data to the speaker.
Start the playback pipeline so push_audio_sample can feed data.
No-op for the local backend — the file is already accessible.
Audio Utils Functions
reachy_mini.media.audio_utils.get_respeaker_card_number
< source >( ) → int
Returns
int
The card number of the detected ReSpeaker/Reachy Mini Audio device. Returns 0 if no specific device is found (uses default sound card), or -1 if there’s an error running the detection command.
Return the card number of the ReSpeaker sound card, or 0 if not found.
Note: This function runs ‘arecord -l’ to list available audio capture devices and processes the output to find Reachy Mini Audio or ReSpeaker devices. It’s primarily used on Linux systems with ALSA audio configuration.
The function returns:
- Positive integer: Card number of detected Reachy Mini Audio device
- 0: No Reachy Mini Audio device found, using default sound card
- -1: Error occurred while trying to detect audio devices
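The detection step described above can be sketched as a small parser over the text that 'arecord -l' prints. This is an illustration only: parse_card_number and DEVICE_NAMES are hypothetical names, the matched strings are assumptions, and the -1 error path (a failure while running the command) is left out because no subprocess is started here.

```python
import re

# Assumed device names; the real function may match different strings.
DEVICE_NAMES = ("reachy mini audio", "respeaker")


def parse_card_number(arecord_output: str) -> int:
    """Return the ALSA card number of the first matching line, or 0 if none match."""
    for line in arecord_output.splitlines():
        match = re.match(r"card (\d+):", line)
        if match and any(name in line.lower() for name in DEVICE_NAMES):
            return int(match.group(1))
    return 0  # no matching device: fall back to the default sound card


# Sample text in the usual `arecord -l` layout (device names are made up).
sample = (
    "**** List of CAPTURE Hardware Devices ****\n"
    "card 0: PCH [HDA Intel PCH], device 0: ALC3246 Analog [ALC3246 Analog]\n"
    "card 2: Audio [Reachy Mini Audio], device 0: USB Audio [USB Audio]\n"
)
print(parse_card_number(sample))
```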
reachy_mini.media.audio_utils.has_reachymini_asoundrc
< source >( ) → bool
Returns
bool
True if ~/.asoundrc exists and contains the required Reachy Mini audio configuration entries, False otherwise.
Check if ~/.asoundrc exists and contains both reachymini_audio_sink and reachymini_audio_src.
Note: This function checks for the presence of the ALSA configuration file ~/.asoundrc and verifies that it contains the necessary configuration entries for Reachy Mini audio devices (reachymini_audio_sink and reachymini_audio_src). These entries are required for proper audio routing and device management.
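A minimal sketch of such a check, assuming a plain substring search is enough. The helper name asoundrc_has_entries is hypothetical and the real implementation may parse the file differently.

```python
from pathlib import Path

# Entry names taken from the description above.
REQUIRED_ENTRIES = ("reachymini_audio_sink", "reachymini_audio_src")


def asoundrc_has_entries(path: Path) -> bool:
    """Return True only if the file exists and mentions every required entry."""
    if not path.is_file():
        return False
    content = path.read_text(errors="ignore")
    return all(entry in content for entry in REQUIRED_ENTRIES)


print(asoundrc_has_entries(Path.home() / ".asoundrc"))
```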
Check if ~/.asoundrc exists and is correctly configured for Reachy Mini Audio.
Write the .asoundrc file with Reachy Mini audio configuration to the user’s home directory.
This function creates an ALSA configuration file (.asoundrc) in the user’s home directory that configures the ReSpeaker sound card for proper audio routing and multi-client support. The configuration enables simultaneous audio input and output access, which is essential for the Reachy Mini Wireless version’s audio functionality.
The generated configuration includes:
- Default audio device settings pointing to the ReSpeaker sound card
- dmix plugin for multi-client audio output (reachymini_audio_sink)
- dsnoop plugin for multi-client audio input (reachymini_audio_src)
- Proper buffer and sample rate settings for optimal performance
Note: This function automatically detects the ReSpeaker card number and creates a configuration tailored to the detected hardware. It is primarily used for the Reachy Mini Wireless version.
The configuration file will be created at ~/.asoundrc and will overwrite any existing file with the same name. Existing audio configurations should be backed up before calling this function.
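Because the existing file is overwritten, a backup step along these lines can be run first. This is a sketch with a hypothetical helper name (backup_file) and an assumed ".bak" suffix; the demo works on a temporary directory so that it does not touch a real configuration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def backup_file(path: Path, suffix: str = ".bak") -> Optional[Path]:
    """Copy `path` next to itself with `suffix` appended; return None if it is absent."""
    if not path.is_file():
        return None
    backup = path.with_name(path.name + suffix)
    shutil.copy2(path, backup)  # copy2 also preserves timestamps
    return backup


# Demonstration on a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / ".asoundrc"
    config.write_text("# existing ALSA configuration\n")
    print(backup_file(config).name)
```

In practice the helper would be called with Path.home() / ".asoundrc" before the configuration is regenerated.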
Audio Control Utils Functions
Class to interface with the ReSpeaker XVF3800 USB device.
Close the interface.
Read data from a specified parameter on the ReSpeaker device.
Write data to a specified parameter on the ReSpeaker device.
reachy_mini.media.audio_control_utils.find
< source >( vid: int = 10374 pid: int = 26 ) → ReSpeaker | None
Find and return the ReSpeaker USB device with the given Vendor ID and Product ID.
Note: This function searches for USB devices with the specified Vendor ID and Product ID using libusb backend. The default values target XMOS XVF3800 devices used in ReSpeaker microphone arrays.
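The signature shows the default IDs in decimal, while USB tools usually print them in hexadecimal. The short sketch below converts between the two; format_usb_id is a hypothetical helper, not part of the library.

```python
DEFAULT_VID = 10374  # default `vid` from the signature above
DEFAULT_PID = 26     # default `pid` from the signature above


def format_usb_id(vid: int, pid: int) -> str:
    """Format a vendor/product pair as four-digit hex, e.g. '2886:001a'."""
    return f"{vid:04x}:{pid:04x}"


print(format_usb_id(DEFAULT_VID, DEFAULT_PID))  # 2886:001a
```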
reachy_mini.media.audio_control_utils.init_respeaker_usb
< source >( ) → Optional[ReSpeaker]
Returns
Optional[ReSpeaker]
A ReSpeaker object if a compatible device is found, None otherwise.
Initialize the ReSpeaker USB device. Looks for both new and beta device IDs.
Note: This function attempts to initialize a ReSpeaker microphone array by searching for USB devices with known Vendor and Product IDs. It tries:
- New Reachy Mini Audio firmware (0x38FB:0x1001) - preferred
- Old ReSpeaker firmware (0x2886:0x001A) - with warning to update
The function handles USB backend errors gracefully and returns None if no compatible device is found or if initialization fails.
Example:
from reachy_mini.media.audio_control_utils import init_respeaker_usb

# Initialize ReSpeaker device
respeaker = init_respeaker_usb()
if respeaker is not None:
    print("ReSpeaker initialized successfully")
    # Use the device...
    doa = respeaker.read("DOA_VALUE_RADIANS")
    respeaker.close()
else:
    print("No ReSpeaker device found")
Camera
class reachy_mini.media.camera_gstreamer.GStreamerCamera
< source >( log_level: str = 'INFO' camera_specs: typing.Optional[reachy_mini.media.camera_constants.CameraSpecs] = None )
Camera that reads BGR frames from the daemon’s local IPC endpoint.
The WebRTC daemon exposes BGR camera frames via a local IPC mechanism:
- Linux / macOS: unixfdsink/unixfdsrc (Unix domain socket)
- Windows: win32ipcvideosink/win32ipcvideosrc (shared memory)
Since the daemon’s IPC branch already converts to BGR, the reader
pipeline is simply source → queue → appsink with no extra
conversion.
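The platform split and the source → queue → appsink layout can be sketched as a pipeline description string. The function names are hypothetical, and the element properties the real reader needs (for example the socket or pipe location) are omitted because they are not documented here.

```python
import sys


def ipc_source_element(platform: str = sys.platform) -> str:
    """Pick the IPC source element name for a given sys.platform value."""
    return "win32ipcvideosrc" if platform.startswith("win") else "unixfdsrc"


def reader_pipeline(platform: str = sys.platform) -> str:
    """Build a gst-launch style description: source -> queue -> appsink."""
    return f"{ipc_source_element(platform)} ! queue ! appsink"


print(reader_pipeline())
```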
Stop the pipeline and release resources.
Start the GStreamer pipeline and begin receiving frames.
Pull the latest BGR frame from the IPC endpoint.
Camera Utils Functions
reachy_mini.media.camera_utils.undistort_points
< source >( u: float v: float K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] max_iterations: int = 20 epsilon: float = 0.01 ) → Tuple (x_n, y_n)
Parameters
- u — Horizontal pixel coordinate.
- v — Vertical pixel coordinate.
- K — 3x3 camera intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
- D — Distortion coefficients array. Supports lengths 0, 4, 5, 8, 12, or 14. Unused positions default to 0.
- max_iterations — Maximum number of iterations (default 20).
- epsilon — Convergence threshold in pixel reprojection error (default 0.01).
Returns
Tuple (x_n, y_n)
Normalized undistorted coordinates (on the z=1 plane).
Undistort a single pixel coordinate to normalized camera coordinates.
Pure numpy equivalent of cv2.undistortPoints(). Supports the OpenCV distortion model with up to 12 coefficients (rational model + thin prism): D = (k1, k2, p1, p2, k3, k4, k5, k6, s1, s2, s3, s4)
Also works with 5-coefficient models (k1, k2, p1, p2, k3) and zero-distortion.
The algorithm matches OpenCV’s cvUndistortPointsInternal:
- Remove camera intrinsics to get normalized distorted coordinates.
- Iteratively solve for undistorted coordinates using a damped fixed-point method with adaptive step size.
Reference: OpenCV distortion model and undistortPoints algorithm:
- https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html
- https://github.com/opencv/opencv/blob/4.x/modules/calib3d/src/undistort.dispatch.cpp
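The two steps can be illustrated with a simplified pure-Python sketch. It is not the library's implementation: it handles only the 5-coefficient model (k1, k2, p1, p2, k3), uses a plain fixed-point iteration without damping or adaptive step size, and stops on the change in normalized coordinates rather than on pixel reprojection error. The names distort and undistort_pixel are hypothetical.

```python
from typing import Sequence, Tuple


def distort(x: float, y: float, d: Sequence[float]) -> Tuple[float, float]:
    """Apply the (k1, k2, p1, p2, k3) model to normalized coordinates."""
    k1, k2, p1, p2, k3 = d
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d


def undistort_pixel(
    u: float, v: float,
    fx: float, fy: float, cx: float, cy: float,
    d: Sequence[float],
    max_iterations: int = 20,
    epsilon: float = 1e-9,
) -> Tuple[float, float]:
    """Return normalized undistorted coordinates (x_n, y_n) on the z=1 plane."""
    k1, k2, p1, p2, k3 = d
    # Step 1: remove the intrinsics to get normalized distorted coordinates.
    x0 = (u - cx) / fx
    y0 = (v - cy) / fy
    # Step 2: fixed-point iteration, starting from the distorted point.
    x, y = x0, y0
    for _ in range(max_iterations):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x_new = (x0 - dx) / radial
        y_new = (y0 - dy) / radial
        converged = abs(x_new - x) < epsilon and abs(y_new - y) < epsilon
        x, y = x_new, y_new
        if converged:
            break
    return x, y


# Round trip: distort a known point, convert to pixels, then undistort it.
D = (-0.1, 0.01, 0.001, -0.001, 0.0)
x_d, y_d = distort(0.2, -0.1, D)
print(undistort_pixel(800.0 * x_d + 640.0, 800.0 * y_d + 360.0,
                      800.0, 800.0, 640.0, 360.0, D))
```

The round trip should return values close to the starting point (0.2, -0.1) for mild distortion like this; strong distortion far from the image center is where damping and an adaptive step become useful.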
reachy_mini.media.camera_utils.scale_intrinsics
< source >( K_original: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] original_size: typing.Tuple[int, int] target_size: typing.Tuple[int, int] crop_scale: float ) → K_scaled
Scale camera intrinsics for a different resolution with cropping.
Camera Constants
class reachy_mini.media.camera_constants.CameraResolution
< source >( *values )
Parameters
- R1536x864at40fps — 1536x864 resolution at 40 fps
- R1280x720at60fps — 1280x720 resolution at 60 fps (HD)
- R1280x720at30fps — 1280x720 resolution at 30 fps (HD)
- R1920x1080at30fps — 1920x1080 resolution at 30 fps (Full HD)
- R1920x1080at60fps — 1920x1080 resolution at 60 fps (Full HD)
- R2304x1296at30fps — 2304x1296 resolution at 30 fps
- R1600x1200at30fps — 1600x1200 resolution at 30 fps
- R3264x2448at30fps — 3264x2448 resolution at 30 fps
- R3264x2448at10fps — 3264x2448 resolution at 10 fps
- R3840x2592at30fps — 3840x2592 resolution at 30 fps
- R3840x2592at10fps — 3840x2592 resolution at 10 fps
- R3840x2160at30fps — 3840x2160 resolution at 30 fps (4K UHD)
- R3840x2160at10fps — 3840x2160 resolution at 10 fps (4K UHD)
- R3072x1728at10fps — 3072x1728 resolution at 10 fps
- R4608x2592at10fps — 4608x2592 resolution at 10 fps
Base class for camera resolutions.
Enumeration of standardized camera resolutions and frame rates supported by Reachy Mini cameras. Each enum value contains a tuple of (width, height, fps, crop_factor).
Note: The enum values are tuples containing (width, height, frames_per_second, crop_factor). Not all resolutions are supported by all camera models - check the specific camera specifications for available resolutions.
Example:
from reachy_mini.media.camera_constants import CameraResolution

# Get resolution information
res = CameraResolution.R1280x720at30fps
width, height, fps, crop_factor = res.value
print(f"Resolution: {width}x{height}@{fps}fps")

# Check if a resolution is supported by a camera
from reachy_mini.media.camera_constants import ReachyMiniLiteCamSpecs

res = CameraResolution.R1920x1080at60fps
if res in ReachyMiniLiteCamSpecs.available_resolutions:
    print("This resolution is supported")
class reachy_mini.media.camera_constants.CameraSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Parameters
- name (str) — Human-readable name of the camera model.
- available_resolutions (List[CameraResolution]) — List of supported resolutions and frame rates for this camera model.
- default_resolution (CameraResolution) — Default resolution used when the camera is initialized.
- vid (int) — USB Vendor ID for identifying this camera model.
- pid (int) — USB Product ID for identifying this camera model.
- K (npt.NDArray[np.float64]) — 3x3 camera intrinsic matrix containing focal lengths and principal point coordinates.
- D (npt.NDArray[np.float64]) — 5-element array containing distortion coefficients (k1, k2, p1, p2, k3) for radial and tangential distortion.
Base camera specifications.
Dataclass containing specifications for a camera model, including supported resolutions, calibration parameters, and USB identification information.
Note: The intrinsic matrix K has the format: [[fx, 0, cx], [ 0, fy, cy], [ 0, 0, 1]]
Where fx, fy are focal lengths in pixels, and cx, cy are the principal point coordinates (typically near the image center).
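To make the role of each entry concrete, the sketch below projects a 3D point in the camera frame to pixel coordinates with the pinhole model, ignoring distortion. It uses a nested list in place of a numpy array so that it has no dependencies; project is a hypothetical helper and the K values are illustrative.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project(point: Tuple[float, float, float], K: Matrix) -> Tuple[float, float]:
    """Project a camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    fx, fy = K[0][0], K[1][1]  # focal lengths in pixels
    cx, cy = K[0][2], K[1][2]  # principal point
    return fx * X / Z + cx, fy * Y / Z + cy


K = [[800.0, 0.0, 640.0], [0.0, 800.0, 360.0], [0.0, 0.0, 1.0]]
# A point on the optical axis lands on the principal point (cx, cy).
print(project((0.0, 0.0, 1.0), K))  # (640.0, 360.0)
```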
Example:
import numpy as np

from reachy_mini.media.camera_constants import CameraResolution, CameraSpecs

# Create a custom camera specification.
# vid and pid are omitted: the constructor signature above does not accept them.
custom_specs = CameraSpecs(
    name="custom_camera",
    available_resolutions=[CameraResolution.R1280x720at30fps],
    default_resolution=CameraResolution.R1280x720at30fps,
    K=np.array([[800.0, 0.0, 640.0], [0.0, 800.0, 360.0], [0.0, 0.0, 1.0]]),
    D=np.zeros(5),
)
class reachy_mini.media.camera_constants.ArducamSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Arducam camera specifications.
class reachy_mini.media.camera_constants.ReachyMiniLiteCamSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Reachy Mini Lite camera specifications.
class reachy_mini.media.camera_constants.ReachyMiniWirelessCamSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Reachy Mini Wireless camera specifications.
class reachy_mini.media.camera_constants.OlderRPiCamSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Older Raspberry Pi camera specifications. Keeping for compatibility.
class reachy_mini.media.camera_constants.MujocoCameraSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Mujoco simulated camera specifications.
class reachy_mini.media.camera_constants.GenericWebcamSpecs
< source >( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )
Generic webcam specifications (fallback for any webcam).
WebRTC
class reachy_mini.media.webrtc_client_gstreamer.GstWebRTCClient
< source >( log_level: str = 'INFO' peer_id: str = '' signaling_host: str = '' signaling_port: int = 8443 camera_specs: typing.Optional[reachy_mini.media.camera_constants.CameraSpecs] = None )
WebRTC client that provides both camera frames and audio.
Implements the same public API surface as GStreamerCamera (for
video) and GStreamerAudio (for audio) so that MediaManager
can assign the same instance to both its camera and audio
slots.
Release all resources.
No-op (WebRTC send chain does not buffer significantly).
Stop the WebRTC pipeline.
delete_sound
< source >( filename: str )
Delete a sound file from the daemon’s temporary sound directory.
Get the Direction of Arrival from the ReSpeaker.
List sound files in the daemon’s temporary sound directory.
Start the WebRTC pipeline (both video and audio).
play_sound
< source >( sound_file: str )
Play a sound file on the robot’s speaker via the daemon REST API.
If sound_file is a local path that exists on this machine the file is automatically uploaded to the daemon’s temporary sound directory (skipping the upload when a file with the same name is already present). Otherwise the filename is sent as-is and the daemon resolves it from its built-in assets or filesystem.
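The decision described above can be sketched as a pure function that performs no network calls. The name plan_play_sound is hypothetical, and sending the base name of an uploaded file is an assumption rather than documented behaviour.

```python
import os
from typing import Iterable, Tuple


def plan_play_sound(sound_file: str, remote_files: Iterable[str]) -> Tuple[str, bool]:
    """Return (name_to_send, needs_upload) for a play_sound request."""
    if os.path.isfile(sound_file):
        # Local file: upload it unless the daemon already has one with that name.
        name = os.path.basename(sound_file)
        return name, name not in set(remote_files)
    # Not a local file: send the name as-is and let the daemon resolve it.
    return sound_file, False


# "wake_up.wav" is an illustrative name; the result depends on whether it exists locally.
print(plan_play_sound("wake_up.wav", []))
```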
push_audio_sample
< source >( data: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float32]] )
Push audio data to the remote peer via WebRTC.
Pull the latest BGR video frame.
No-op — audio send chain is set up automatically on WebRTC connection.
No-op — recording starts automatically with open().
Reset the PTS counter for the send chain and stop daemon-side sound.
No-op — managed by close().
upload_sound
< source >( sound_file: str )
Upload a local sound file to the daemon’s temporary directory.
class reachy_mini.media.media_server.GstMediaServer
< source >( log_level: str = 'INFO' use_sim: bool = False )
Daemon-side GStreamer media server.
Owns the camera and audio hardware and distributes media to consumers:
- IPC branch — raw BGR frames via
unixfdsink/win32ipcvideosink for on-device applications (GStreamerCamera reads from this).
- WebRTC branch — encoded video + audio via webrtcsink for remote clients (GstWebRTCClient connects to this).
- Sound playback — playbin for playing WAV files on the speaker.
Release GStreamer resources (MainLoop, bus watch).
play_sound
< source >( sound_file: str )
Play a sound file on the robot’s speaker.
Uses GStreamer’s playbin element with a platform-aware audio sink. This is used for daemon-side sounds (wake-up, sleep, etc.).
send_data_message
< source >( peer_id: typing.Optional[str] message: str )
Send a message to connected peers via data channel.
set_message_handler
< source >( handler: typing.Callable[[str, str], NoneType] )
Set a callback for incoming data channel messages.
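A handler compatible with the Callable[[str, str], None] type could look like the sketch below. The meaning of the two string arguments is not documented here; treating them as (peer_id, message), by analogy with send_data_message, is an assumption.

```python
from typing import Callable, List, Tuple

MessageHandler = Callable[[str, str], None]

received: List[Tuple[str, str]] = []


def on_message(peer_id: str, message: str) -> None:
    """Store each incoming message together with the peer that sent it (assumed order)."""
    received.append((peer_id, message))


# The callback would be registered with set_message_handler(on_message);
# here it is called directly to show the expected shape.
handler: MessageHandler = on_message
handler("peer-1", "hello")
print(received)
```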
Rebuild the pipeline from scratch and start it.
Rebuilding ensures a clean state after stop() released all hardware.
Stop the pipeline and release all hardware (camera, audio).
Stop the currently playing sound file.
If no sound is currently playing this is a no-op.