CLI & API reference
Overview
The waymux binary is the single control surface for both humans and
agents. Session lifecycle, input injection, capture, recording, and viewer
control are all verbs on this one binary. The daemon (waymuxd) speaks
the same underlying protocol; the CLI is a thin client that talks to it over a Unix
socket (or over HTTPS with --remote).
Add --json to any verb to get a stable, machine-readable envelope on
stdout. Scripts and agents should always use --json rather than parsing
human text output, which is unversioned.
# Quick taste: create a session, launch an app, screenshot it, clean up.
waymux new demo --size 1280x720
waymux spawn demo -- foot
waymux wait demo --app-id foot
waymux screenshot-desktop demo -o out.png
waymux rm demo
JSON envelope
Every verb that accepts --json emits exactly one JSON object on stdout
when it exits (streaming verbs emit newline-delimited JSON lines instead). The
envelope shape is fixed:
| Field | Type | Present when | Description |
|---|---|---|---|
| ok | bool | always | true on success, false on error |
| verb | string | always | The CLI verb that produced this envelope (e.g. "new", "record start") |
| data | object | ok: true | Verb-specific payload (see each verb below) |
| error | object | ok: false | Structured error: { "code": "E_NOT_FOUND", "message": "..." } |
# Success
{ "ok": true, "verb": "new", "data": { "name": "demo", "size": "1920x1080" } }
# Error
{ "ok": false, "verb": "info", "error": { "code": "E_NOT_FOUND", "message": "session 'x' not found" } }
# Screenshot: data.png_b64 is a standard base64-encoded PNG string
{ "ok": true, "verb": "screenshot", "data": { "window_id": 3, "png_b64": "iVBOR..." } }
# Streaming verbs (events, logs) emit NDJSON: one envelope per line
{ "ok": true, "verb": "events", "data": { "topic": "windows", "kind": "created", ... } }
{ "ok": true, "verb": "events", "data": { "topic": "windows", "kind": "destroyed", ... } }
The exit code mirrors ok: exit 0 on success, non-zero on error. Both
the exit code and the envelope are stable. Human-readable text output (without
--json) is unversioned and may change.
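A script consuming the envelope might look like the following minimal Python sketch. It relies only on the fields listed in the table above; the WaymuxError class and the parse_envelope helper are illustrative names, not part of waymux.

```python
import json


class WaymuxError(Exception):
    """Raised when an envelope reports ok: false (illustrative helper type)."""

    def __init__(self, verb, code, message):
        super().__init__(f"{verb}: {code}: {message}")
        self.verb = verb
        self.code = code
        self.message = message


def parse_envelope(stdout_text):
    """Parse one --json envelope and return its data payload.

    Raises WaymuxError when the envelope carries ok: false.
    """
    envelope = json.loads(stdout_text)
    if envelope["ok"]:
        return envelope["data"]
    err = envelope["error"]
    raise WaymuxError(envelope["verb"], err["code"], err["message"])


# Sample envelopes copied from the examples in this section.
success = '{ "ok": true, "verb": "new", "data": { "name": "demo", "size": "1920x1080" } }'
failure = ('{ "ok": false, "verb": "info", "error": '
           '{ "code": "E_NOT_FOUND", "message": "session \'x\' not found" } }')

print(parse_envelope(success)["name"])
try:
    parse_envelope(failure)
except WaymuxError as exc:
    print(exc.code)
```

In a real script the text would come from the stdout of a waymux invocation run with --json; branching on the error code rather than the message keeps the script independent of wording changes.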
Global flags
These flags apply to every verb and are accepted before the verb name.
| Flag | Env var | Description |
|---|---|---|
| --socket <PATH> | WAYMUX_SOCKET | Control socket path. Defaults to $XDG_RUNTIME_DIR/waymux.sock |
| --remote | WAYMUX_REMOTE | Route through HTTPS to the credentialed remote endpoint instead of the local Unix-socket daemon |
| --base-url <URL> | | Override the base URL when --remote is set (defaults to the value persisted by login) |
| --json | | Emit the stable JSON envelope on stdout instead of human-readable text |
Sessions
A session is one wayland-server compositor process with its own virtual
output. Sessions are identified by name. Most session verbs work both locally (via
Unix socket) and remotely (via --remote).
new
Create a session.
waymux new <NAME> [OPTIONS]
waymux new demo --size 1920x1080 --scale 1 --json
| Flag | Default | Description |
|---|---|---|
| --size <WxH> | 1920x1080 | Virtual output dimensions in pixels |
| --scale <N> | 1 | Output scale factor (HiDPI) |
| --share-audio | off | Share host PulseAudio/PipeWire sockets so apps inside the session can play or record audio. Local-only. |
| --mem-cap-mb <MiB> | | Aggregate memory cap via cgroup-v2 |
| --cpu-cap-pct <pct> | | Aggregate CPU cap as percent of one core (200 = two cores) via cgroup-v2 |
| --disk-quota-mb <MiB> | | Per-session tmpfs quota for the runtime dir (requires daemon CAP_SYS_ADMIN) |
| --fd-limit <N> | | File-descriptor cap (RLIMIT_NOFILE) inherited by everything spawned inside the session |
| --api-key-id <UUID> | WAYMUX_API_KEY_ID | API-key UUID to attribute usage to (embedded in usage-event JSONL) |
# --json output
{ "ok": true, "verb": "new", "data": { "name": "demo", "size": "1920x1080", "scale": 1 } }
ls
List all sessions.
waymux ls [--json]
# --json output
{ "ok": true, "verb": "ls", "data": { "sessions": [ { "name": "demo", "size": "1920x1080" } ] } }
info
Show details for a single session.
waymux info <NAME> [--json]
# --json output
{ "ok": true, "verb": "info", "data": { "name": "demo", "size": "1920x1080", "scale": 1, ... } }
rm
Destroy a session and all resources associated with it.
waymux rm <NAME> [--json]
# --json output
{ "ok": true, "verb": "rm", "data": { "name": "demo" } }
resize
Resize a session's virtual output. Local-only. The compositor propagates a
wl_output mode-change event and sends a toplevel configure to all
surfaces.
waymux resize <NAME> <WxH> [--json]
waymux resize demo 1280x720
# --json output
{ "ok": true, "verb": "resize", "data": { "name": "demo", "size": "1280x720" } }
Windows & tags
A window is a toplevel Wayland surface inside a session. Windows are identified by integer id (stable for the lifetime of the surface). Tags are free-form string labels you apply to a window so you can select on them later.
windows
List windows in a session, optionally filtered by tag.
waymux windows <NAME> [--tag <TAG>] [--json]
waymux windows demo --tag browser
| Flag | Description |
|---|---|
| --tag <TAG> | Only return windows whose tag set contains this tag (client-side filter) |
# --json output
{ "ok": true, "verb": "windows", "data": { "windows": [
  { "id": 3, "app_id": "foot", "title": "foot", "tags": ["term"], "pid": 12345 }
] } }
tag
Replace the tag set of a window with one or more free-form labels. At least one tag is required. The previous tag set is discarded.
waymux tag <NAME> <WINDOW_ID> <TAG>... [--json]
waymux tag demo 3 browser main
# --json output
{ "ok": true, "verb": "tag", "data": { "window_id": 3, "tags": ["browser","main"] } }
wait
Block until a window matching a selector appears in the session, or until the timeout expires. Local-only (uses the events stream internally).
waymux wait <NAME> [SELECTOR...] [--timeout-ms N] [--json]
waymux wait demo --app-id chromium --timeout-ms 10000
| Flag | Default | Description |
|---|---|---|
| --app-id <ID> | | Match windows with this Wayland app-id |
| --title <TITLE> | | Match windows with this exact title |
| --tag <TAG> | | Match windows whose tag set contains this tag |
| --pid <PID> | | Match windows owned by this pid |
| --nth <N> | | Select the nth (0-based) matching window |
| --timeout-ms <N> | 5000 | Maximum wait in milliseconds before error exit |
# --json output on success
{ "ok": true, "verb": "wait", "data": { "window_id": 5, "app_id": "chromium", "title": "New Tab" } }
Input
waymux supports three input verbs. inject is the general-purpose verb
and works both locally and remotely. key and click are
convenience wrappers that are local-only.
inject
Inject one or more input operations as a JSON array. The array is passed on the
command line as a single string argument and parsed as JSON; waymux never runs it
through a shell. This is the only input verb available over --remote.
waymux inject <NAME> --ops '<JSON-ARRAY>' [--json]
# Press and release the 'a' key (keycode 30)
waymux inject demo --ops '[{"type":"key","keycode":30},{"type":"key","keycode":30,"release":true}]'
# Move pointer to (100, 200) and left-click (button 272)
waymux inject demo --ops '[{"type":"pointer_motion","x":100,"y":200},{"type":"button","button":272}]'
| Flag | Description |
|---|---|
| --ops <JSON> (required) | JSON array of inject ops. Each op has a "type" field; see the protocol crate for the full InjectOp schema. |
# --json output
{ "ok": true, "verb": "inject", "data": {} }
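When ops are generated programmatically, building the array as data and serializing it once avoids hand-written JSON quoting. The sketch below is a minimal Python illustration that uses only the op shapes shown in the examples above (key, pointer_motion, button); the helper names are hypothetical, and the full InjectOp schema lives in the protocol crate.

```python
import json

BTN_LEFT = 272  # Linux input button code for the left button


def key_tap(keycode):
    """Press then release one key, mirroring the first inject example."""
    return [
        {"type": "key", "keycode": keycode},
        {"type": "key", "keycode": keycode, "release": True},
    ]


def move_and_press(x, y, button=BTN_LEFT):
    """Move the pointer and press a button, mirroring the second inject example."""
    return [
        {"type": "pointer_motion", "x": x, "y": y},
        {"type": "button", "button": button},
    ]


def inject_argv(session, ops):
    """Build an argv list for `waymux inject`; the ops travel as one argv element."""
    ops_json = json.dumps(ops, separators=(",", ":"))
    return ["waymux", "inject", session, "--ops", ops_json, "--json"]


argv = inject_argv("demo", key_tap(30) + move_and_press(100, 200))
print(argv[4])
```

Passing the resulting list to subprocess.run without shell=True keeps the JSON as a single argument, so no quoting layer is involved.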
key
Send a synthetic key event. Local-only convenience wrapper around inject.
For remote use, send a key op via inject --ops.
waymux key <NAME> <KEYCODE> [--release] [--modifiers N] [--json]
waymux key demo 30 # press 'a' (keycode 30)
waymux key demo 30 --release # release 'a'
| Flag | Default | Description |
|---|---|---|
| --release | off | Send a key release instead of a press |
| --modifiers <N> | 0 | Modifier bitmask (e.g. Shift=1, Ctrl=4) |
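Because --modifiers takes a bitmask, combined modifiers are the bitwise OR of the individual values. The following Python sketch covers only the two values named in the table (Shift=1, Ctrl=4); the Mod enum and key_argv helper are illustrative, and any other modifier bits are not documented here.

```python
from enum import IntFlag


class Mod(IntFlag):
    # Only the two values named in the flag table above.
    SHIFT = 1
    CTRL = 4


def key_argv(session, keycode, mods=Mod(0), release=False):
    """Build an argv list for `waymux key` from a keycode and modifier set."""
    argv = ["waymux", "key", session, str(keycode)]
    if release:
        argv.append("--release")
    if mods:
        argv += ["--modifiers", str(int(mods))]
    return argv


# Ctrl+Shift+'a': 1 | 4 = 5
print(key_argv("demo", 30, Mod.SHIFT | Mod.CTRL))
```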
click
Move the pointer and optionally press a button. Local-only convenience wrapper. For
remote use, send pointer_motion and button ops via
inject --ops.
waymux click <NAME> <X> <Y> [--button N] [--json]
waymux click demo 640 400 # move pointer, no click
waymux click demo 640 400 --button 272 # move and left-click
| Flag | Default | Description |
|---|---|---|
| --button <N> | 0 | Linux input button code. 0 moves without clicking. 272 = left, 273 = right, 274 = middle. |
Capture
screenshot
Capture a single window's current buffer as a PNG.
waymux screenshot <NAME> <WINDOW_ID> [-o <PATH>] [--json]
waymux screenshot demo 3 -o win3.png
waymux screenshot demo 3 --json # PNG in data.png_b64, no file written
| Flag | Description |
|---|---|
| -o / --output <PATH> | Output PNG path, or - for stdout. Required without --json. Optional under --json (PNG is returned as data.png_b64; if provided the file is also written). |
# --json output
{ "ok": true, "verb": "screenshot", "data": { "window_id": 3, "png_b64": "iVBOR..." } }
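Under --json the image arrives as base64 text, so a consumer has to decode it before writing or inspecting it. A minimal Python sketch, assuming only the data.png_b64 field shown above; the helper name is hypothetical and the sample envelope carries just the 8-byte PNG signature rather than a real image.

```python
import base64
import json

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_bytes_from_envelope(stdout_text):
    """Decode data.png_b64 from a screenshot envelope and sanity-check it."""
    envelope = json.loads(stdout_text)
    if not envelope["ok"]:
        raise RuntimeError(envelope["error"]["message"])
    png = base64.b64decode(envelope["data"]["png_b64"], validate=True)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not start with the PNG signature")
    return png


# Stand-in envelope: the payload is only the PNG signature, not a real image.
fake = json.dumps({
    "ok": True,
    "verb": "screenshot",
    "data": {"window_id": 3,
             "png_b64": base64.b64encode(PNG_SIGNATURE).decode("ascii")},
})
print(len(png_bytes_from_envelope(fake)))
```

The same helper would apply to screenshot-desktop, whose envelope carries the PNG in the same field.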
screenshot-desktop
Composite every window in the session into a single full-desktop PNG.
waymux screenshot-desktop <NAME> [-o <PATH>] [--json]
waymux screenshot-desktop demo -o desktop.png
waymux screenshot-desktop demo --json # PNG in data.png_b64
| Flag | Description |
|---|---|
| -o / --output <PATH> | Output PNG path, or - for stdout. Required without --json. Optional under --json. |
# --json output
{ "ok": true, "verb": "screenshot-desktop", "data": { "png_b64": "iVBOR..." } }
Recording
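Record a session's composited output to an MKV file. The default codec (ffv1) is
lossless; lossy codecs are also available (see the codec table under record start).
Recording is local-only. The record group has three subcommands:
start, stop, and status.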
record start
Start recording to an MKV file. One of two capture paths is chosen automatically based on the client's buffer type: a fast dmabuf zero-copy path for GPU-rendered clients, or a software composite fallback for SHM-only or multi-surface scenes.
waymux record start <NAME> [OUTPUT] [OPTIONS]
waymux record start demo
waymux record start demo /tmp/demo.mkv --codec ffv1 --mode whole-desktop
| Flag | Default | Description |
|---|---|---|
| OUTPUT | auto | Output MKV path inside ~/.local/share/waymux/recordings/. Defaults to an auto-named <session>-<ts>.mkv. |
| --codec <CODEC> | ffv1 | Video encoder (see table below) |
| --secondary-codec <CODEC> | | Optional second encoder running in parallel from the same frame source, writing <output>.secondary.mkv |
| --mode <MODE> | focused-window | focused-window: capture only the focused window's surface tree (one memcpy per commit). whole-desktop: composite all surfaces together (multi-window flows or Plasma desktop recordings). |
| --min-fps <N> | | Minimum fps pacing. Without this flag recording is commit-driven (idle = 0 fps). With e.g. --min-fps 60, the last frame is re-encoded at 60 Hz when no new commit arrives. |
Available codecs:
| Codec | Lossless | Requires | Notes |
|---|---|---|---|
| ffv1 | yes | CPU / llvmpipe | Default. ~70 MB/min at 1080p. Preferred for CI visual-regression artifacts. |
| h264-nvenc | no | NVIDIA GPU | ~5 MB/min. Preferred for marketing screencasts. |
| h264-vaapi | no | AMD/Intel GPU + libva | ~5 MB/min. |
| h264-vulkan | no | Vulkan video encode (NVIDIA 535+, AMD Mesa 25+, Intel Mesa 24.3+) | Zero ffmpeg subprocess, zero-copy on capable hardware. |
| ffv1-vulkan | yes | Vulkan compute (ffmpeg ffv1_vulkan) | GPU zero-copy lossless. Slow on integrated GPUs; expected real-time at 4K 60 fps on discrete cards. |
| h264-vulkan-lossless | yes | Vulkan video encode, Hi444PP profile | Broken on NVIDIA drivers 560+580. Use hevc-vulkan-lossless instead for NVIDIA lossless. |
| hevc-vulkan-lossless | yes | ffmpeg 8.0 + Vulkan video encode | H.265 4:4:4 lossless. Validated on NVIDIA RTX A6000 at 1080p 60 fps. |
# --json output
{ "ok": true, "verb": "record start", "data": { "name": "demo", "output": "/home/user/.local/share/waymux/recordings/demo-20260101.mkv" } }
record stop
Stop a session's active recording and flush the MKV to disk.
waymux record stop <NAME> [--json]
# --json output
{ "ok": true, "verb": "record stop", "data": { "name": "demo" } }
record status
Report whether a session is currently recording, plus the output path and codec when active.
waymux record status <NAME> [--json]
# --json output (recording active)
{ "ok": true, "verb": "record status", "data": { "recording": true, "output": "/home/user/.local/share/waymux/recordings/demo-20260101.mkv", "codec": "ffv1" } }
# --json output (not recording)
{ "ok": true, "verb": "record status", "data": { "recording": false } }
Viewer
The viewer provides a browser-based WebRTC stream of a session. Local-only. The
viewer group has four subcommands: start,
stop, status, and token.
viewer start
waymux viewer start <NAME> [--bind ADDR] [--port N] [--json]
waymux viewer start demo # loopback, ephemeral port
waymux viewer start demo --bind 0.0.0.0 --port 8090 # LAN-accessible
| Flag | Default | Description |
|---|---|---|
| --bind <ADDR> | 127.0.0.1 | Bind address for the bridge's HTTP, WS, and WebRTC endpoints. Note: SSH local-forwarding does not pass WebRTC's UDP media plane; use a WireGuard IP or 0.0.0.0 for LAN access instead. |
| --port <N> | ephemeral | Explicit TCP port for the bridge. Defaults to a random free port chosen at start. |
# --json output
{ "ok": true, "verb": "viewer start", "data": { "url": "http://127.0.0.1:39201" } }
viewer stop
waymux viewer stop <NAME> [--json]
viewer status
waymux viewer status <NAME> [--json]
# --json output (viewer active)
{ "ok": true, "verb": "viewer status", "data": { "url": "http://127.0.0.1:39201" } }
# --json output (no viewer)
{ "ok": true, "verb": "viewer status", "data": { "url": null } }
viewer token
Mint an ephemeral EdDSA viewer JWT for the local/dev laptop-viewer path. Prints the token, the public key the session must trust, and the expiry.
waymux viewer token [--json]
Idle / events / logs
idle
Block until the session has been quiescent (no new compositor frames) for at least
--quiet-ms. Useful in scripts to wait for an animation or page-load to
settle before taking a screenshot.
waymux idle <NAME> [--quiet-ms N] [--timeout-ms N] [--json]
waymux idle demo --quiet-ms 500 --timeout-ms 10000
| Flag | Default | Description |
|---|---|---|
| --quiet-ms <N> | 500 | Required quiet window: how long the session must have no new frames before idle returns |
| --timeout-ms <N> | 10000 | Overall timeout; exits with a non-zero status if the session never reaches the quiet window |
events
Stream session events as newline-delimited JSON. Local-only. Each line is one
envelope. Runs until interrupted (Ctrl-C). The --topic flag selects
which event topics to include; defaults to sessions windows.
waymux events <NAME> [--topic TOPIC...] [--json]
waymux events demo
waymux events demo --topic windows
| Flag | Default | Description |
|---|---|---|
| --topic <TOPIC> | sessions windows | Event topics to subscribe to. May be repeated. Known topics: sessions, windows. |
events and logs are streaming verbs. Under
--json they emit NDJSON (one envelope per line) rather than a single
terminal envelope. They are local-only and are excluded from the MCP tool surface.
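Consuming an NDJSON stream means parsing one envelope per line instead of waiting for the process to exit. The Python sketch below shows one way to do that over any iterable of lines; the function name is hypothetical, and the sample payloads contain only the topic and kind fields that appear in the envelope examples earlier in this reference.

```python
import json


def iter_event_payloads(lines):
    """Yield the data payload of each NDJSON envelope, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        envelope = json.loads(line)
        if not envelope.get("ok"):
            # Surface the structured error and stop consuming the stream.
            raise RuntimeError(envelope.get("error"))
        yield envelope["data"]


# Stand-in for the stdout of `waymux events demo --json`.
sample_stream = [
    '{ "ok": true, "verb": "events", "data": { "topic": "windows", "kind": "created" } }\n',
    "\n",
    '{ "ok": true, "verb": "events", "data": { "topic": "windows", "kind": "destroyed" } }\n',
]

for payload in iter_event_payloads(sample_stream):
    print(payload["topic"], payload["kind"])
```

With a live daemon, the iterable could be the stdout of a subprocess.Popen(..., stdout=subprocess.PIPE, text=True) handle, which yields lines as they arrive.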
logs
Tail a session's stdout/stderr. Local-only. Add -f to follow (stream
new lines as they arrive).
waymux logs <NAME> [-f] [--settle-ms N] [--json]
waymux logs demo -f
| Flag | Default | Description |
|---|---|---|
| -f / --follow | off | Stream new log lines as they arrive instead of printing the buffer and exiting |
| --settle-ms <N> | 200 | Settle delay before printing in non-follow mode |
Attach / detach
attach
Print the path of the attach Wayland socket for a session. A Wayland client launched
with WAYLAND_DISPLAY set to this path (or socket) sees the session as
its compositor. Local-only.
waymux attach <NAME> [--json]
# --json output
{ "ok": true, "verb": "attach", "data": { "socket": "/run/user/1000/waymux/demo/wayland-0" } }
detach
Mark a session as detached. Local-only.
waymux detach <NAME> [--json]
Daemon
serve
Run the waymuxd daemon in the foreground. This is a supervised
foreground daemon; it does not background itself. Use it under a process supervisor
(systemd, s6, runit) or in a container entrypoint. It resolves the waymuxd
binary via: $WAYMUXD_BIN, then a sibling of the waymux
binary, then $PATH. It replaces the current process via execv.
waymux serve [--socket PATH]
You usually do not need to run this explicitly. When any local verb finds no daemon
socket, the CLI auto-spawns a background waymuxd and retries. Set
WAYMUX_NO_AUTOSPAWN=1 to disable auto-spawn and manage the daemon
yourself. An auto-spawned daemon outlives the CLI invocation; stop it with a signal
(kill or pkill waymuxd).
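The lookup order that serve uses for the waymuxd binary ($WAYMUXD_BIN, then a sibling of the waymux binary, then $PATH) can be mirrored in a wrapper script, for example to log which daemon would be launched. The following Python sketch only illustrates that order under those stated assumptions; it is not the CLI's actual implementation, and resolve_waymuxd is a hypothetical helper.

```python
import os
import shutil


def resolve_waymuxd(cli_path, environ=None):
    """Return the waymuxd path that the documented lookup order would pick.

    cli_path is the location of the waymux binary; returns None if nothing matches.
    """
    environ = os.environ if environ is None else environ

    # 1. Explicit override via $WAYMUXD_BIN.
    override = environ.get("WAYMUXD_BIN")
    if override:
        return override

    # 2. A waymuxd executable sitting next to the waymux binary.
    sibling = os.path.join(os.path.dirname(os.path.abspath(cli_path)), "waymuxd")
    if os.path.isfile(sibling) and os.access(sibling, os.X_OK):
        return sibling

    # 3. Fall back to a $PATH search.
    return shutil.which("waymuxd", path=environ.get("PATH"))


print(resolve_waymuxd("/usr/local/bin/waymux", {"WAYMUXD_BIN": "/opt/waymux/waymuxd"}))
```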
Login
Authenticate against a hosted waymux-api endpoint and persist the
credentials to disk. The persisted credentials are used by --remote
verbs. No production host is baked into the OSS build; point
--base-url at your own deployment.
waymux login --api-key <KEY> --base-url <URL>
waymux login --api-key sk-... --base-url https://api.example.com
| Flag | Default | Description |
|---|---|---|
| --api-key <KEY> | | API key string |
| --base-url <URL> | http://localhost:8080 | Base URL of the remote endpoint. Persisted alongside the key and used by subsequent --remote invocations. |
login writes credentials to disk and is excluded from the MCP tool surface.
Spawn
Launch a Wayland client (or a nested compositor) inside a session. Local-only. The
command and its arguments are passed after a -- separator; no shell is
involved so there is no shell-injection surface. The session's
WAYLAND_DISPLAY is set automatically.
waymux spawn <NAME> [--compositor] -- <ARGV...>
waymux spawn demo -- foot
waymux spawn demo -- chromium --ozone-platform=wayland --app=https://example.com
waymux spawn demo --compositor -- kwin_wayland --no-global-shortcuts
| Flag | Description |
|---|---|
| --compositor | Hint that the spawned process is a nested compositor |
MCP server
waymux-mcp is a Model Context Protocol server that exposes the waymux
verb set to agents and LLM tool-use systems. It speaks MCP JSON-RPC 2.0 over stdio
and fulfills tool calls by exec-ing the waymux CLI. Every value is
passed as a discrete argv element; the CLI is never invoked via a shell, so there
is no shell-injection surface.
Pointing an MCP client at waymux-mcp
# In your MCP client config (e.g. Claude Desktop / Claude Code mcp_servers):
{
  "waymux": {
    "command": "/usr/local/bin/waymux-mcp",
    "args": []
  }
}
# The binary resolves waymux via:
# 1. $WAYMUX_BIN env var
# 2. A `waymux` sibling next to the waymux-mcp executable
# 3. `waymux` on $PATH
Tool naming
Every tool is named waymux_<verb> where the verb's non-alphanumeric
characters are replaced by underscores. The two-word subcommands follow the same
scheme: record start becomes waymux_record_start,
viewer status becomes waymux_viewer_status.
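The naming rule is mechanical enough to express in a few lines. A minimal Python sketch of the mapping as described above (the function name is illustrative):

```python
import re


def mcp_tool_name(verb):
    """Map a CLI verb to its MCP tool name: prefix plus non-alphanumerics as underscores."""
    return "waymux_" + re.sub(r"[^A-Za-z0-9]", "_", verb)


for verb in ("ls", "record start", "viewer status", "screenshot-desktop"):
    print(verb, "->", mcp_tool_name(verb))
```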
Exposed tools (23 total)
The streaming verbs (events, logs) and login
are intentionally excluded. All other discrete request/response verbs are exposed.
| MCP tool name | CLI verb | Local-only | Description |
|---|---|---|---|
| waymux_ls | ls | | List all sessions |
| waymux_new | new | | Create a session |
| waymux_rm | rm | | Destroy a session |
| waymux_info | info | | Show details for a session |
| waymux_spawn | spawn | yes | Launch a client inside a session |
| waymux_windows | windows | | List windows, optionally filtered by tag |
| waymux_tag | tag | | Replace a window's tag set |
| waymux_resize | resize | yes | Resize a session's virtual output |
| waymux_screenshot | screenshot | | Capture a window as PNG (returned base64-encoded) |
| waymux_screenshot_desktop | screenshot-desktop | | Composite all windows into a desktop PNG |
| waymux_idle | idle | | Wait for the session to go quiescent |
| waymux_wait | wait | yes | Block until a matching window appears |
| waymux_key | key | yes | Send a synthetic key event |
| waymux_click | click | yes | Move pointer and optionally click |
| waymux_inject | inject | | Inject a JSON array of input ops |
| waymux_attach | attach | yes | Return the attach Wayland socket path |
| waymux_detach | detach | yes | Mark a session as detached |
| waymux_record_start | record start | yes | Start recording to MKV |
| waymux_record_stop | record stop | yes | Stop recording |
| waymux_record_status | record status | yes | Report recording state |
| waymux_viewer_start | viewer start | yes | Start a WebRTC viewer and return its URL |
| waymux_viewer_stop | viewer stop | yes | Stop the viewer |
| waymux_viewer_status | viewer status | yes | Return the viewer URL if active |
Tool call and response shape
Tools that return image data (screenshot, screenshot-desktop) return the PNG
as a base64-encoded string in the data.png_b64 field of the CLI
envelope. The MCP server passes the envelope JSON through verbatim as the tool
result content.
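On the client side, getting back to the CLI envelope means parsing the text content item of the tool result as JSON a second time. The Python sketch below assumes the response shape illustrated in this section (a single text content item holding the envelope); the helper name is hypothetical, and the png_b64 value in the sample is the truncated placeholder used throughout this reference, not real image data.

```python
import json


def envelope_from_tool_result(response):
    """Extract the CLI envelope from a tools/call JSON-RPC response dict."""
    for item in response["result"]["content"]:
        if item.get("type") == "text":
            return json.loads(item["text"])
    raise ValueError("no text content in tool result")


# Sample response in the shape this section describes.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{
            "type": "text",
            "text": "{\"ok\":true,\"verb\":\"screenshot-desktop\",\"data\":{\"png_b64\":\"iVBOR...\"}}",
        }]
    },
}

envelope = envelope_from_tool_result(response)
print(envelope["ok"], envelope["verb"])
```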
# MCP tool call (JSON-RPC 2.0 over stdio)
{ "jsonrpc": "2.0", "id": 1, "method": "tools/call",
  "params": { "name": "waymux_screenshot_desktop",
              "arguments": { "name": "demo" } } }
# MCP response (CLI envelope as content)
{ "jsonrpc": "2.0", "id": 1, "result": {
    "content": [{ "type": "text", "text":
      "{\"ok\":true,\"verb\":\"screenshot-desktop\",\"data\":{\"png_b64\":\"iVBOR...\"}}" }] } }