grouped runtime reads and waits selector modes (#5)

- grouped runtime reads and waits
selector modes
- Fix wait command client timeouts and test failures
Hari 2026-03-25 21:11:30 -04:00 committed by GitHub
parent cc8f8e548a
commit a4cf9e32dd
12 changed files with 1323 additions and 77 deletions


@@ -11,8 +11,9 @@ Desktop control CLI for AI agents on Linux X11. Provides a unified interface for
## Core Workflow
1. **Snapshot** to see the desktop and get window refs
-2. **Act** using refs or coordinates (click, type, focus)
-3. **Repeat** as needed
+2. **Query / wait** using grouped `get` and `wait` commands
+3. **Act** using refs, explicit selectors, or coordinates
+4. **Repeat** as needed
## Quick Reference
@@ -24,6 +25,12 @@ deskctl snapshot --annotate # Screenshot with bounding boxes and labels
deskctl snapshot --json # Structured JSON output
deskctl list-windows # Window tree without screenshot
deskctl screenshot /tmp/s.png # Screenshot only (no window tree)
+deskctl get active-window # Currently focused window
+deskctl get monitors # Monitor geometry
+deskctl get version # deskctl version + backend
+deskctl get systeminfo # Runtime-scoped diagnostics
+deskctl wait window --selector 'title=Firefox' --timeout 10
+deskctl wait focus --selector 'class=firefox' --timeout 5
```
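The `wait` and `focus` commands above can be chained so that an agent only acts once the target window exists. The following is a minimal sketch, not part of deskctl: `wait_then_focus` is a hypothetical helper, and it assumes that `deskctl wait` exits with a non-zero status when the timeout expires, which this change does not state explicitly.

```bash
# Hypothetical helper: wait for a window, then focus it with the same selector.
# Assumption: `deskctl wait` returns non-zero on timeout or failure.
wait_then_focus() {
  selector=$1
  timeout=${2:-10}   # default mirrors the --timeout 10 example above
  if deskctl wait window --selector "$selector" --timeout "$timeout"; then
    deskctl focus "$selector"
  else
    echo "no window matched before timeout: $selector" >&2
    return 1
  fi
}
```

A call such as `wait_then_focus 'class=firefox' 5` would reuse one explicit selector for both steps, so the wait and the focus target the same window.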
### Click and Type
@@ -51,7 +58,9 @@ deskctl mouse drag 100 100 500 500 # Drag from (100,100) to (500,500)
```bash
deskctl focus @w2 # Focus window by ref
-deskctl focus "firefox" # Focus window by name (substring match)
+deskctl focus 'title=Firefox' # Focus by explicit title selector
+deskctl focus 'class=firefox' # Focus by explicit class selector
+deskctl focus "firefox" # Fuzzy substring match (fails on ambiguity)
deskctl close @w3 # Close window gracefully
deskctl move-window @w1 100 200 # Move window to position
deskctl resize-window @w1 800 600 # Resize window
@@ -89,14 +98,29 @@ After `snapshot` or `list-windows`, windows are assigned short refs:
- Refs reset on each `snapshot` call
- Use `--json` to see stable `window_id` values for programmatic tracking within the current daemon session
+## Selector Contract
+Prefer explicit selectors when an agent needs deterministic targeting:
+```bash
+ref=w1
+id=win1
+title=Firefox
+class=firefox
+focused
+```
+Bare selectors such as `firefox` still work as fuzzy substring matches, but they now fail with candidate windows if multiple matches exist.
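To make the distinction between explicit and bare selectors concrete, the sketch below classifies a selector string by the forms listed above. `selector_mode` is a hypothetical helper for illustration only; deskctl's real parser may behave differently, and treating `@w1` as a ref is an assumption based on the `deskctl focus @w2` usage.

```bash
# Hypothetical helper: report which selector mode a string falls under.
# Anything without an explicit prefix is treated as a fuzzy substring match.
selector_mode() {
  case "$1" in
    ref=*|@*) echo "ref" ;;      # ref=w1, or the @w1 shorthand (assumed)
    id=*)     echo "id" ;;       # id=win1
    title=*)  echo "title" ;;    # title=Firefox
    class=*)  echo "class" ;;    # class=firefox
    focused)  echo "focused" ;;  # the currently focused window
    *)        echo "fuzzy" ;;    # bare text such as firefox
  esac
}

selector_mode 'class=firefox'   # prints: class
selector_mode 'firefox'         # prints: fuzzy
```

An agent could use a check like this to reject `fuzzy` selectors up front when it needs deterministic targeting.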
## Example Agent Workflow
```bash
# 1. See what's on screen
deskctl snapshot --annotate
-# 2. Focus the browser
-deskctl focus "firefox"
+# 2. Wait for the browser and focus it deterministically
+deskctl wait window --selector 'class=firefox' --timeout 10
+deskctl focus 'class=firefox'
# 3. Navigate to a URL
deskctl hotkey ctrl l