computer-use
OpenClaw plugin that registers 7 vm_* tools for VM-based macOS computer interaction via Lume VMs and cua-computer-server. Enables computer-use agents for macOS GUI, Xcode, and iOS workflows without sacrificing Docker sandboxing for the main agent.
Architecture
Main Agent (Docker) --sessions_send--> Worker Agent --vm_*--> WebSocket --> Lume VM (cua-computer-server)
The main agent stays Docker-sandboxed while delegating GUI tasks to a worker agent via sessions_send. The worker agent controls the Lume VM through WebSocket-connected vm_* tools.
Prerequisites
- Apple Silicon Mac: Lume requires Apple Virtualization.framework (Apple Silicon only)
- Lume installed: brew install --cask lume
- cua-computer-server running inside the VM: pip install cua-computer
- OpenClaw 2026.2.1+: for before_tool_call hook support
Setup
1. Create and prepare the Lume VM
# Create VM (see Phase 8 for recommended CPU/memory/disk settings)
lume create openclaw-vm --os macos --ipsw latest
# Start and SSH in
lume run openclaw-vm --no-display
lume ssh openclaw-vm
# Inside the VM: install cua-computer-server
pip install cua-computer # provides cua-computer-server binary
2. Enable Lume HTTP server
The plugin uses the Lume HTTP API to look up VM IP addresses. Enable it with a LaunchAgent on the host:
# Verify Lume HTTP server is running (default port 7777)
curl -s http://localhost:7777/lume/vms | jq .
3. Install the plugin
cd extensions/computer-use
npm install
openclaw plugins install -l ./extensions/computer-use
4. Enable in openclaw.json
{
  plugins: {
    entries: {
      "computer-use": {
        enabled: true,
        config: {
          vmName: "openclaw-vm",
          lumeApiUrl: "http://localhost:7777",
          serverPort: 5000,
          connectTimeoutMs: 30000,
          commandTimeoutMs: 60000,
          screenshotScale: 0.5,
          logVerbose: false,
          maxScreenshotBytes: 10485760
        }
      }
    }
  }
}
Restart the gateway. The plugin connects to the VM lazily on the first vm_* tool call.
Config
| Key | Default | Description |
|---|---|---|
| vmName | "openclaw-vm" | Lume VM name for IP lookup |
| lumeApiUrl | "http://localhost:7777" | Lume HTTP server URL |
| serverPort | 5000 | cua-computer-server WebSocket port inside VM |
| connectTimeoutMs | 30000 | Max ms for WebSocket connect + Lume HTTP call |
| commandTimeoutMs | 60000 | Max ms per command execution |
| screenshotScale | 0.5 | Informational only (no server-side scaling in MVP) |
| logVerbose | false | Extra protocol debug logs (never logs screenshots) |
| maxScreenshotBytes | 10485760 | Max screenshot size in bytes (10 MB) |
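The defaults in the table above could be modeled as a typed object merged with user overrides. This is only a sketch: `ComputerUseConfig`, `DEFAULT_CONFIG`, and `resolveConfig` are hypothetical names, and the validation rules are assumptions rather than documented plugin behavior. Only the keys and default values come from the table.

```typescript
// Hypothetical config type mirroring the keys in the table above.
interface ComputerUseConfig {
  vmName: string;
  lumeApiUrl: string;
  serverPort: number;
  connectTimeoutMs: number;
  commandTimeoutMs: number;
  screenshotScale: number;
  logVerbose: boolean;
  maxScreenshotBytes: number;
}

// Defaults copied from the table above.
const DEFAULT_CONFIG: ComputerUseConfig = {
  vmName: "openclaw-vm",
  lumeApiUrl: "http://localhost:7777",
  serverPort: 5000,
  connectTimeoutMs: 30000,
  commandTimeoutMs: 60000,
  screenshotScale: 0.5,
  logVerbose: false,
  maxScreenshotBytes: 10 * 1024 * 1024, // 10485760 bytes
};

// Merge user overrides over the defaults and reject obviously invalid values.
// The checks below are assumed for illustration, not taken from the plugin.
function resolveConfig(overrides: Partial<ComputerUseConfig> = {}): ComputerUseConfig {
  const cfg: ComputerUseConfig = { ...DEFAULT_CONFIG, ...overrides };
  if (!Number.isInteger(cfg.serverPort) || cfg.serverPort < 1 || cfg.serverPort > 65535) {
    throw new Error(`serverPort must be an integer in 1..65535, got ${cfg.serverPort}`);
  }
  if (cfg.connectTimeoutMs <= 0 || cfg.commandTimeoutMs <= 0) {
    throw new Error("connectTimeoutMs and commandTimeoutMs must be positive");
  }
  return cfg;
}
```

With this shape, an entry that sets only `vmName` would still pick up every other default.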
Tools
| Tool | Parameters | Returns |
|---|---|---|
| vm_screenshot | (none) | PNG image content block |
| vm_exec | command (string, required) | stdout/stderr text |
| vm_click | x, y (number, required), button? ("left", "right", or "double"; default "left") | Confirmation text |
| vm_type | text (string, required) | Confirmation text |
| vm_key | keys (string, required; e.g. "escape", "command+s") | Confirmation text |
| vm_launch | app (string, required; e.g. "Xcode", "Safari"), args? (string[]) | Confirmation text |
| vm_scroll | direction ("up" or "down", required), clicks? (number, default 5) | Confirmation text |
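For readers who prefer types to tables, the parameter shapes above could be written as a discriminated union. This is an illustrative sketch: `VmToolCall` and `withDefaults` are hypothetical names, and the plugin's real schema definitions may look different. Only the parameter names, allowed values, and defaults are taken from the table.

```typescript
// Hypothetical discriminated union derived from the tool table above.
type VmToolCall =
  | { tool: "vm_screenshot" }
  | { tool: "vm_exec"; command: string }
  | { tool: "vm_click"; x: number; y: number; button?: "left" | "right" | "double" }
  | { tool: "vm_type"; text: string }
  | { tool: "vm_key"; keys: string }
  | { tool: "vm_launch"; app: string; args?: string[] }
  | { tool: "vm_scroll"; direction: "up" | "down"; clicks?: number };

// Fill in the documented defaults: button "left" for vm_click, clicks 5 for vm_scroll.
function withDefaults(call: VmToolCall): VmToolCall {
  switch (call.tool) {
    case "vm_click":
      return { ...call, button: call.button ?? "left" };
    case "vm_scroll":
      return { ...call, clicks: call.clicks ?? 5 };
    default:
      return call;
  }
}

// Example calls a worker agent might issue (argument values are illustrative).
const calls: VmToolCall[] = [
  { tool: "vm_launch", app: "Xcode" },
  { tool: "vm_key", keys: "command+s" },
  { tool: "vm_click", x: 640, y: 360 },
  { tool: "vm_scroll", direction: "down" },
];
const normalized: VmToolCall[] = calls.map(withDefaults);
```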
How it works
Lazy connection: the WebSocket to cua-computer-server is not created until the first vm_* tool call. On connection, the plugin fetches the VM’s IP from the Lume HTTP API (GET /lume/vms/{vmName}), verifies the VM is running, then connects via ws://{vm-ip}:{serverPort}.
Command serialization: all tool calls are serialized through a mutex (promise queue). The WebSocket protocol uses request/response pairs without correlation IDs, so concurrent calls would mismatch responses.
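A promise-queue mutex of the kind described above can be sketched in a few lines. This is a minimal illustration, not the plugin's actual code: `CommandQueue` and `run` are hypothetical names, and letting the queue continue after a failed command is an assumed design choice.

```typescript
// Minimal promise-queue mutex: each task starts only after the previous one settles.
class CommandQueue {
  // Tail of the chain; always settles successfully so the queue never gets stuck.
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start the task once the previous one has finished, whether it succeeded or failed.
    const result = this.tail.then(task, task);
    // Swallow errors on the tail only; the caller still sees them through `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Because only one task is in flight at a time, a response arriving on the socket can be attributed to the single outstanding request, which is what makes the lack of correlation IDs workable.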
Reconnection: if the WebSocket closes (VM restart, server crash), the connection singleton is reset. The next tool call triggers a fresh IP lookup and reconnect.
VM health: before connecting, the plugin checks VM status via the Lume HTTP API. If the VM is not running, the tool returns an actionable error with startup instructions.
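The lazy-connect, health-check, and reset-on-close behavior can be sketched together. Everything here is an assumption made for illustration: the `VmInfo` field names (`status`, `ipAddress`) are guesses at the Lume response shape, and `LazyVmConnection`, `Deps`, and `VmSocket` are hypothetical names. The HTTP and WebSocket calls are injected so the sketch runs without a VM.

```typescript
// Assumed shape of the Lume VM lookup result; the field names are hypothetical.
interface VmInfo {
  status: string;
  ipAddress?: string;
}

// Minimal connection handle: this sketch only needs a close notification.
interface VmSocket {
  onClose(listener: () => void): void;
}

// Injected dependencies, so the sketch runs without a real VM or network.
interface Deps {
  lookupVm(vmName: string): Promise<VmInfo>; // e.g. wraps GET /lume/vms/{vmName}
  openSocket(url: string): Promise<VmSocket>; // e.g. wraps a WebSocket client
}

class LazyVmConnection {
  private current: Promise<VmSocket> | null = null;
  private readonly vmName: string;
  private readonly serverPort: number;
  private readonly deps: Deps;

  constructor(vmName: string, serverPort: number, deps: Deps) {
    this.vmName = vmName;
    this.serverPort = serverPort;
    this.deps = deps;
  }

  // Called by every vm_* tool; connects only on first use or after a reset.
  get(): Promise<VmSocket> {
    if (this.current !== null) return this.current;
    const attempt = this.connect();
    this.current = attempt;
    // A failed attempt must not stay cached, or every later call would fail too.
    attempt.catch(() => {
      if (this.current === attempt) this.current = null;
    });
    return attempt;
  }

  private async connect(): Promise<VmSocket> {
    // Health check first: look up the VM and refuse to connect unless it is running.
    const vm = await this.deps.lookupVm(this.vmName);
    if (vm.status !== "running" || !vm.ipAddress) {
      throw new Error(
        `VM "${this.vmName}" is not running (status: ${vm.status}). ` +
          `Start it with: lume run ${this.vmName} --no-display`,
      );
    }
    const socket = await this.deps.openSocket(`ws://${vm.ipAddress}:${this.serverPort}`);
    // Reset the singleton on close so the next call does a fresh IP lookup.
    socket.onClose(() => {
      this.current = null;
    });
    return socket;
  }
}
```

Caching the in-flight promise rather than the resolved socket means two tool calls that arrive before the first connect finishes share one lookup and one socket.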
Security notes
vm_exec command injection: the tool intentionally provides shell access inside the VM. Do NOT pass unsanitized user input directly to vm_exec. The VM isolation boundary contains command injection; a compromised command runs inside the VM, not on the host.
Shared directory trust boundary: files exchanged via the shared directory (workspace/vm-shared/ on host, /Volumes/My Shared Files/ in VM) are bidirectional. Treat files from either side as untrusted input.
VM network egress: vm_exec enables network access from the VM. If the VM has unrestricted egress, a compromised worker agent can exfiltrate data. Recommend firewall rules or egress allowlisting on the VM (see Phase 8).
WebSocket unencrypted: the connection uses ws:// (not wss://). Acceptable for localhost/VM-local network. Consider TLS if the VM is on a different network segment.
Plugin runs in gateway process: the plugin makes HTTP/WebSocket calls from the gateway process, bypassing agent-level network restrictions. This is by design: sandboxed agents can’t make network calls, but plugin tools can.
sessions_send delegation risk: inter-agent messages bypass per-agent tool restrictions. A compromised worker agent can delegate arbitrary operations to the main agent. The main agent’s AGENTS.md is the last line of defense.
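When a vm_exec command has to include a user-supplied value, one common mitigation is to quote that value for the shell instead of interpolating it raw. The sketch below assumes the VM runs commands through a POSIX-style shell (zsh or bash on macOS). `shellQuote` and `buildListCommand` are hypothetical helpers, not part of the plugin, and they complement the VM isolation boundary rather than replace it.

```typescript
// Quote a string for a POSIX shell by wrapping it in single quotes.
// Nothing is expanded inside single quotes; an embedded ' is written as '\''.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build a vm_exec command from a fixed template plus a quoted argument.
// The "--" stops option parsing so a value starting with "-" is treated as a path.
function buildListCommand(userPath: string): string {
  return `ls -l -- ${shellQuote(userPath)}`;
}
```

With this approach an input such as `a; rm -rf ~` reaches `ls` as a single literal argument instead of being interpreted as a second command.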
Testing
cd extensions/computer-use
npm install
npm test
Unit tests use mocked WebSocket and Lume HTTP responses. No real VM needed for unit tests.
Integration tests (in .openclaw-test/) verify plugin loading and tool registration in a running gateway.
Limitations
- Lume 2 macOS VM limit: Lume free tier supports max 2 concurrent macOS VMs (Apple’s Virtualization.framework limit)
- No rate limiting: no sustained rate limit between commands (only per-command timeout). Runaway tool loops are possible
- English-only keyboard: key input assumes US English keyboard layout (macOS input source limitation)
- Screenshot size: full-resolution Retina screenshots may exceed the 10 MB default limit. screenshotScale is informational only (no server-side scaling in MVP)
- One tool at a time: WebSocket serializes all commands per worker agent. No concurrent vm_* tool calls
- WebSocket stale after idle: no keepalive/heartbeat. Long-idle connections may go stale; the plugin reconnects on the next call
- VM state edge cases: VM suspend/resume, snapshots, and multiple gateways connecting to the same VM produce undefined behavior
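Since the plugin itself does not rate-limit, a caller worried about runaway tool loops could put a small call budget in front of the vm_* tools. The sliding-window limiter below is a hypothetical workaround, not something the plugin provides. `CallBudget` and its parameters are illustrative names, and the clock is injectable so the behavior can be checked without waiting.

```typescript
// Sliding-window call budget: allows at most maxCalls within any windowMs interval.
class CallBudget {
  private timestamps: number[] = [];
  private readonly maxCalls: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxCalls: number, windowMs: number, now: () => number = () => Date.now()) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the call if the budget allows it, false otherwise.
  tryAcquire(): boolean {
    const t = this.now();
    // Drop calls that have aged out of the window.
    this.timestamps = this.timestamps.filter((ts) => t - ts < this.windowMs);
    if (this.timestamps.length >= this.maxCalls) return false;
    this.timestamps.push(t);
    return true;
  }
}
```

A worker-side wrapper could call `tryAcquire()` before each tool invocation and return an error to the agent when it yields false, which turns a silent loop into a visible failure.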
SDK migration path
The plugin uses a direct WebSocket client to cua-computer-server (no SDK dependency). When @trycua/computer adds a local Lume provider for TypeScript (currently only available in the Python SDK), migration to the official SDK will simplify the connection layer. Watch trycua/cua for updates.