mirror of
https://github.com/harivansh-afk/deskctl.git
synced 2026-04-15 05:02:08 +00:00
align docs and contract
parent
c37589ccf4
commit
14c8956321
10 changed files with 590 additions and 657 deletions
268
README.md
|
|
@ -1,266 +1,68 @@
|
|||
# deskctl
|
||||
|
||||
Desktop control CLI for AI agents on Linux X11.
|
||||
[npm](https://www.npmjs.com/package/deskctl-cli)
|
||||
[release](https://github.com/harivansh-afk/deskctl/releases)
|
||||
[runtime: Linux X11](#support-boundary)
|
||||
[skill](skills/deskctl)
|
||||
|
||||
Non-interactive desktop control for AI agents on Linux X11.
|
||||
|
||||
## Install
|
||||
|
||||
### Cargo
|
||||
|
||||
```bash
|
||||
cargo install deskctl
|
||||
```
|
||||
|
||||
Source builds on Linux require:
|
||||
|
||||
- Rust 1.75+
|
||||
- `pkg-config`
|
||||
- X11 development libraries for input and windowing, typically `libx11-dev` and `libxtst-dev` on Debian/Ubuntu
|
||||
|
||||
### npm
|
||||
|
||||
```bash
|
||||
npm install -g deskctl-cli
|
||||
deskctl --help
|
||||
deskctl doctor
|
||||
deskctl snapshot --annotate
|
||||
```
|
||||
|
||||
One-shot execution is also supported:
|
||||
One-shot execution also works:
|
||||
|
||||
```bash
|
||||
npx deskctl-cli --help
|
||||
```
|
||||
|
||||
`deskctl-cli` currently supports `linux-x64` and installs the `deskctl` command by downloading the matching GitHub Release asset.
|
||||
`deskctl-cli` installs the `deskctl` command by downloading the matching GitHub Release asset for the supported runtime target.
|
||||
|
||||
### Installable skill
|
||||
|
||||
For `skills.sh` / agent skill ecosystems:
|
||||
## Installable skill
|
||||
|
||||
```bash
|
||||
npx skills add harivansh-afk/deskctl -s deskctl
|
||||
```
|
||||
|
||||
The installable skill lives under [`skills/deskctl`](skills/deskctl) and is designed for X11 sandboxes, VMs, and sandbox-agent desktop sessions. It points agents to the npm install path first so they can get `deskctl` without Cargo.
|
||||
The installable skill lives in [`skills/deskctl`](skills/deskctl) and is built around the same observe -> wait -> act -> verify loop as the CLI.
|
||||
|
||||
### Nix
|
||||
## Quick example
|
||||
|
||||
```bash
|
||||
deskctl doctor
|
||||
deskctl snapshot --annotate
|
||||
deskctl wait window --selector 'title=Firefox' --timeout 10
|
||||
deskctl focus 'title=Firefox'
|
||||
deskctl type "hello world"
|
||||
```
|
||||
|
||||
## Docs
|
||||
|
||||
- runtime contract: [docs/runtime-contract.md](docs/runtime-contract.md)
|
||||
- release flow: [docs/releasing.md](docs/releasing.md)
|
||||
- installable skill: [skills/deskctl](skills/deskctl)
|
||||
- contributor workflow: [CONTRIBUTING.md](CONTRIBUTING.md)
|
||||
|
||||
## Other install paths
|
||||
|
||||
Nix:
|
||||
|
||||
```bash
|
||||
nix run github:harivansh-afk/deskctl -- --help
|
||||
nix profile install github:harivansh-afk/deskctl
|
||||
```
|
||||
|
||||
The repo flake is the supported Nix install surface in this phase.
|
||||
|
||||
### Docker Convenience
|
||||
|
||||
Build a Linux binary locally with Docker:
|
||||
|
||||
```bash
|
||||
docker compose -f docker/docker-compose.yml run --rm build
|
||||
```
|
||||
|
||||
This writes `dist/deskctl-linux-x86_64`.
|
||||
|
||||
Copy it to an SSH machine where `scp` is unavailable:
|
||||
|
||||
```bash
|
||||
ssh -p 443 deskctl@ssh.agentcomputer.ai 'cat > ~/deskctl && chmod +x ~/deskctl' < dist/deskctl-linux-x86_64
|
||||
```
|
||||
|
||||
Run it on an X11 session:
|
||||
|
||||
```bash
|
||||
DISPLAY=:1 XDG_SESSION_TYPE=x11 ~/deskctl --json snapshot --annotate
|
||||
```
|
||||
|
||||
### Local Source Build
|
||||
Source build:
|
||||
|
||||
```bash
|
||||
cargo build
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
## Support boundary
|
||||
|
||||
```bash
|
||||
# Diagnose the environment first
|
||||
deskctl doctor
|
||||
|
||||
# See the desktop
|
||||
deskctl snapshot
|
||||
|
||||
# Query focused runtime state
|
||||
deskctl get active-window
|
||||
deskctl get monitors
|
||||
|
||||
# Click a window
|
||||
deskctl click @w1
|
||||
|
||||
# Type text
|
||||
deskctl type "hello world"
|
||||
|
||||
# Wait for a window or focus transition
|
||||
deskctl wait window --selector 'title=Firefox' --timeout 10
|
||||
deskctl wait focus --selector 'class=firefox' --timeout 5
|
||||
|
||||
# Focus by explicit selector
|
||||
deskctl focus 'title=Firefox'
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
Client-daemon architecture over Unix sockets (NDJSON wire protocol).
|
||||
The daemon starts automatically on first command and keeps the X11 connection alive for fast repeated calls.
|
||||
|
||||
Source layout:
|
||||
|
||||
- `src/lib.rs` exposes the shared library target
|
||||
- `src/main.rs` is the thin CLI wrapper
|
||||
- `src/` contains production code and unit tests
|
||||
- `tests/` contains Linux/X11 integration tests
|
||||
- `tests/support/` contains shared integration helpers
|
||||
|
||||
## Runtime Requirements
|
||||
|
||||
- Linux with X11 session
|
||||
- Rust 1.75+ plus the source-build dependencies above when building from source
|
||||
|
||||
The binary itself only links the standard glibc runtime on Linux (`libc`, `libm`, `libgcc_s`).
|
||||
|
||||
For deskctl to be fully functional on a fresh VM you still need:
|
||||
|
||||
- an X11 server and an active `DISPLAY`
|
||||
- `XDG_SESSION_TYPE=x11` or an equivalent X11 session environment
|
||||
- a window manager or desktop environment that exposes standard EWMH properties such as `_NET_CLIENT_LIST_STACKING` and `_NET_ACTIVE_WINDOW`
|
||||
- an X server with the extensions needed for input simulation and screen metadata, which is standard on normal desktop X11 setups
|
||||
|
||||
If setup fails, run:
|
||||
|
||||
```bash
|
||||
deskctl doctor
|
||||
```
|
||||
|
||||
## Contract Notes
|
||||
|
||||
- `@wN` refs are short-lived handles assigned by `snapshot` and `list-windows`
|
||||
- `--json` output includes a stable `window_id` for programmatic targeting within the current daemon session
|
||||
- `list-windows` is a cheap read-only operation and does not capture or write a screenshot
|
||||
- the stable runtime JSON/error contract is documented in [docs/runtime-contract.md](docs/runtime-contract.md)
|
||||
|
||||
## Read and Wait Surface
|
||||
|
||||
The grouped runtime reads are:
|
||||
|
||||
```bash
|
||||
deskctl get active-window
|
||||
deskctl get monitors
|
||||
deskctl get version
|
||||
deskctl get systeminfo
|
||||
```
|
||||
|
||||
The grouped runtime waits are:
|
||||
|
||||
```bash
|
||||
deskctl wait window --selector 'title=Firefox' --timeout 10
|
||||
deskctl wait focus --selector 'id=win3' --timeout 5
|
||||
```
|
||||
|
||||
Successful `get active-window`, `wait window`, and `wait focus` responses return a `window` payload with:
|
||||
- `ref_id`
|
||||
- `window_id`
|
||||
- `title`
|
||||
- `app_name`
|
||||
- geometry (`x`, `y`, `width`, `height`)
|
||||
- state flags (`focused`, `minimized`)
|
||||
|
||||
`get monitors` returns:
|
||||
- `count`
|
||||
- `monitors[]` with geometry and primary/automatic flags
|
||||
|
||||
`get version` returns:
|
||||
- `version`
|
||||
- `backend`
|
||||
|
||||
`get systeminfo` stays runtime-scoped and returns:
|
||||
- `backend`
|
||||
- `display`
|
||||
- `session_type`
|
||||
- `session`
|
||||
- `socket_path`
|
||||
- `screen`
|
||||
- `monitor_count`
|
||||
- `monitors`
|
||||
|
||||
Wait timeout and selector failures are structured in `--json` mode so agents can recover without string parsing.
|
||||
|
||||
## Output Policy
|
||||
|
||||
Text mode is compact and follow-up-oriented, but JSON is the parsing contract.
|
||||
|
||||
- use `--json` when an agent needs strict parsing
|
||||
- rely on `window_id`, selector-related fields, grouped read payloads, and structured error `kind` values for stable automation
|
||||
- treat monitor naming, incidental whitespace, and default screenshot file names as best-effort
|
||||
|
||||
See [docs/runtime-contract.md](docs/runtime-contract.md) for the exact stable-vs-best-effort breakdown.
|
||||
|
||||
## Distribution
|
||||
|
||||
- GitHub Releases are the canonical binary source
|
||||
- crates.io package: `deskctl`
|
||||
- npm package: `deskctl-cli`
|
||||
- installed command on every channel: `deskctl`
|
||||
- repo-owned Nix install path: `flake.nix`
|
||||
|
||||
For maintainer publishing and release steps, see [docs/releasing.md](docs/releasing.md).
|
||||
|
||||
## Selector Contract
|
||||
|
||||
Explicit selector modes:
|
||||
|
||||
```bash
|
||||
ref=w1
|
||||
id=win1
|
||||
title=Firefox
|
||||
class=firefox
|
||||
focused
|
||||
```
|
||||
|
||||
Legacy refs remain supported:
|
||||
|
||||
```bash
|
||||
@w1
|
||||
w1
|
||||
win1
|
||||
```
|
||||
|
||||
Bare selectors such as `firefox` are still supported as fuzzy substring matches, but they now fail on ambiguity and return candidate windows instead of silently picking the first match.
|
||||
|
||||
## Support Boundary
|
||||
|
||||
`deskctl` supports Linux X11 in this phase. Wayland and Hyprland are explicitly out of scope for the current runtime contract.
|
||||
|
||||
## Workflow
|
||||
|
||||
Local validation uses the root `Makefile`:
|
||||
|
||||
```bash
|
||||
make fmt-check
|
||||
make lint
|
||||
make test-unit
|
||||
make test-integration
|
||||
make site-format-check
|
||||
make validate
|
||||
```
|
||||
|
||||
`make validate` is the full repo-quality check and requires Linux with `xvfb-run` plus `pnpm --dir site install`.
|
||||
|
||||
The repository standardizes on `pre-commit` for fast commit-time checks:
|
||||
|
||||
```bash
|
||||
pre-commit install
|
||||
pre-commit run --all-files
|
||||
```
|
||||
|
||||
See [CONTRIBUTING.md](CONTRIBUTING.md) for the full contributor guide.
|
||||
|
||||
## Acknowledgements
|
||||
|
||||
- [@barrettruth](https://github.com/barrettruth) - I stole the website from [vimdoc](https://github.com/barrettruth/vimdoc-language-server)
|
||||
`deskctl` currently supports Linux X11. Use `--json` for stable machine parsing, use `window_id` for programmatic targeting inside a live session, and use `deskctl doctor` first when the runtime looks broken.
|
||||
|
|
|
|||
|
|
@ -1,19 +1,6 @@
|
|||
# Runtime Output Contract
|
||||
# deskctl runtime contract
|
||||
|
||||
This document defines the current output contract for `deskctl`.
|
||||
|
||||
It is intentionally scoped to the current Linux X11 runtime surface.
|
||||
It does not promise stability for future Wayland or window-manager-specific features.
|
||||
|
||||
## Goals
|
||||
|
||||
- Keep `deskctl` fully non-interactive
|
||||
- Make text output actionable for quick terminal and agent loops
|
||||
- Make `--json` safe for agent consumption without depending on incidental formatting
|
||||
|
||||
## JSON Envelope
|
||||
|
||||
Every runtime command uses the same top-level JSON envelope:
|
||||
All commands support `--json` and use the same top-level envelope:
|
||||
|
||||
```json
|
||||
{
|
||||
|
|
@ -23,22 +10,11 @@ Every runtime command uses the same top-level JSON envelope:
|
|||
}
|
||||
```
|
||||
|
||||
Stable top-level fields:
|
||||
Use `--json` whenever you need to parse output programmatically.
|
||||
|
||||
- `success`
|
||||
- `data`
|
||||
- `error`
|
||||
## Stable window fields
|
||||
|
||||
`success` is always the authoritative success/failure bit.
|
||||
When `success` is `false`, the CLI exits non-zero in both text mode and `--json` mode.
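
As a rough illustration of consuming this envelope from an agent, the sketch below parses one `--json` response and treats `success` as the authoritative bit. The names `parse_envelope` and `DeskctlError` are hypothetical helpers for this example, not part of `deskctl`; only the `success`, `data`, and `error` fields come from the contract.

```python
import json


class DeskctlError(Exception):
    """Hypothetical wrapper for a failed envelope (not a deskctl API)."""

    def __init__(self, error, data):
        super().__init__(str(error))
        self.error = error
        self.data = data


def parse_envelope(raw):
    """Parse one --json response and return its data payload.

    Only the documented top-level fields (success, data, error) are read.
    """
    envelope = json.loads(raw)
    if envelope.get("success") is True:
        return envelope.get("data")
    # success is the authoritative bit: anything else counts as a failure.
    raise DeskctlError(envelope.get("error"), envelope.get("data"))
```

Raising on failure keeps the happy path free of `success` checks, while the exception still carries `data` for any structured failure details.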
|
||||
|
||||
## Stable Fields
|
||||
|
||||
These fields are stable for agent consumption in the current Phase 1 runtime contract.
|
||||
|
||||
### Window Identity
|
||||
|
||||
Whenever a runtime response includes a window payload, these fields are stable:
|
||||
Whenever a response includes a window payload, these fields are stable:
|
||||
|
||||
- `ref_id`
|
||||
- `window_id`
|
||||
|
|
@ -51,128 +27,46 @@ Whenever a runtime response includes a window payload, these fields are stable:
|
|||
- `focused`
|
||||
- `minimized`
|
||||
|
||||
`window_id` is the stable public identifier for a live daemon session.
|
||||
`ref_id` is a short-lived convenience handle for the current window snapshot/ref map.
|
||||
Use `window_id` for stable targeting inside a live daemon session. Use
|
||||
`ref_id` or `@wN` for short-lived follow-up actions after `snapshot` or
|
||||
`list-windows`.
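
A minimal sketch of how a client might apply that split, assuming the `window_id` value is what the explicit `id=` selector mode expects and that `ref_id` may or may not carry the leading `@`. Both helper names are hypothetical.

```python
def stable_selector(window):
    """Build an explicit id= selector from a window payload.

    Assumes window_id is the value the id= selector mode expects,
    for example "win3" -> "id=win3".
    """
    window_id = window.get("window_id")
    if not window_id:
        raise ValueError("window payload has no window_id")
    return f"id={window_id}"


def short_lived_ref(window):
    """Return the @wN form of ref_id for an immediate follow-up action."""
    ref_id = window["ref_id"]
    return ref_id if ref_id.startswith("@") else f"@{ref_id}"
```

An agent would keep the `stable_selector` result across several commands in one daemon session and use `short_lived_ref` only right after the snapshot that produced it.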
|
||||
|
||||
### Grouped Reads
|
||||
## Stable grouped reads
|
||||
|
||||
`deskctl get active-window`
|
||||
- `deskctl get active-window` -> `data.window`
|
||||
- `deskctl get monitors` -> `data.count`, `data.monitors`
|
||||
- `deskctl get version` -> `data.version`, `data.backend`
|
||||
- `deskctl get systeminfo` -> runtime-scoped diagnostic fields such as
|
||||
`backend`, `display`, `session_type`, `session`, `socket_path`, `screen`,
|
||||
`monitor_count`, and `monitors`
|
||||
|
||||
- stable: `data.window`
|
||||
## Stable waits
|
||||
|
||||
`deskctl get monitors`
|
||||
- `deskctl wait window` -> `data.wait`, `data.selector`, `data.elapsed_ms`,
|
||||
`data.window`
|
||||
- `deskctl wait focus` -> `data.wait`, `data.selector`, `data.elapsed_ms`,
|
||||
`data.window`
|
||||
|
||||
- stable: `data.count`
|
||||
- stable: `data.monitors`
|
||||
- stable per monitor:
|
||||
- `name`
|
||||
- `x`
|
||||
- `y`
|
||||
- `width`
|
||||
- `height`
|
||||
- `width_mm`
|
||||
- `height_mm`
|
||||
- `primary`
|
||||
- `automatic`
|
||||
## Stable structured error kinds
|
||||
|
||||
`deskctl get version`
|
||||
|
||||
- stable: `data.version`
|
||||
- stable: `data.backend`
|
||||
|
||||
`deskctl get systeminfo`
|
||||
|
||||
- stable: `data.backend`
|
||||
- stable: `data.display`
|
||||
- stable: `data.session_type`
|
||||
- stable: `data.session`
|
||||
- stable: `data.socket_path`
|
||||
- stable: `data.screen`
|
||||
- stable: `data.monitor_count`
|
||||
- stable: `data.monitors`
|
||||
|
||||
### Waits
|
||||
|
||||
`deskctl wait window`
|
||||
`deskctl wait focus`
|
||||
|
||||
- stable: `data.wait`
|
||||
- stable: `data.selector`
|
||||
- stable: `data.elapsed_ms`
|
||||
- stable: `data.window`
|
||||
|
||||
### Selector-Driven Action Success
|
||||
|
||||
For selector-driven action commands that resolve a window target, these identifiers are stable when present:
|
||||
|
||||
- `data.ref_id`
|
||||
- `data.window_id`
|
||||
- `data.title`
|
||||
- `data.selector`
|
||||
|
||||
This applies to:
|
||||
|
||||
- `click`
|
||||
- `dblclick`
|
||||
- `focus`
|
||||
- `close`
|
||||
- `move-window`
|
||||
- `resize-window`
|
||||
|
||||
The exact human-readable text rendering of those commands is not part of the JSON contract.
|
||||
|
||||
### Artifact-Producing Commands
|
||||
|
||||
`snapshot`
|
||||
`screenshot`
|
||||
|
||||
- stable: `data.screenshot`
|
||||
|
||||
When the command also returns windows, `data.windows` uses the stable window payload documented above.
|
||||
|
||||
## Stable Structured Error Kinds
|
||||
|
||||
When a runtime command returns structured JSON failure data, these error kinds are stable:
|
||||
When a command fails with structured JSON data, these `kind` values are stable:
|
||||
|
||||
- `selector_not_found`
|
||||
- `selector_ambiguous`
|
||||
- `selector_invalid`
|
||||
- `timeout`
|
||||
- `not_found`
|
||||
- `window_not_focused` as `data.last_observation.kind` or equivalent observation payload
|
||||
|
||||
Stable structured failure fields include:
|
||||
Wait failures may also include `window_not_focused` in the last observation
|
||||
payload.
|
||||
|
||||
- `data.kind`
|
||||
- `data.selector` when selector-related
|
||||
- `data.mode` when selector-related
|
||||
- `data.candidates` for ambiguous selector failures
|
||||
- `data.message` for invalid selector failures
|
||||
- `data.wait`
|
||||
- `data.timeout_ms`
|
||||
- `data.poll_ms`
|
||||
- `data.last_observation`
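
One way an agent could branch on these kinds is sketched below. The recovery actions (`disambiguate`, `retry_wait`, and so on) are this example's own vocabulary rather than anything `deskctl` emits, and the shape of each candidate (a window payload carrying `window_id`) is an assumption.

```python
STABLE_KINDS = {
    "selector_not_found",
    "selector_ambiguous",
    "selector_invalid",
    "timeout",
    "not_found",
}


def plan_recovery(data):
    """Map a structured failure payload to a hypothetical next step.

    Returns an (action, detail) tuple.
    """
    kind = (data or {}).get("kind")
    if kind not in STABLE_KINDS:
        return ("abort", f"unrecognized kind: {kind!r}")
    if kind == "selector_ambiguous":
        # Assumes each candidate is a window payload carrying window_id.
        ids = [c.get("window_id") for c in data.get("candidates", [])]
        return ("disambiguate", ids)
    if kind == "selector_invalid":
        return ("fix_selector", data.get("message"))
    if kind == "timeout":
        return ("retry_wait", data.get("last_observation"))
    # selector_not_found / not_found: re-observe before acting again.
    return ("resnapshot", data.get("selector"))
```

Branching on `kind` rather than on message text is the point: the wording of errors is best-effort, the `kind` values are not.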
|
||||
## Best-effort fields
|
||||
|
||||
## Best-Effort Fields
|
||||
Treat these as useful but non-contractual:
|
||||
|
||||
These values are useful but environment-dependent and should be treated as best-effort:
|
||||
- exact monitor names
|
||||
- incidental text formatting in non-JSON mode
|
||||
- default screenshot file names when no explicit path was provided
|
||||
- environment-dependent ordering details from the window manager
|
||||
|
||||
- exact monitor naming conventions
|
||||
- EWMH/window-manager-dependent window ordering details
|
||||
- cosmetic text formatting in non-JSON mode
|
||||
- screenshot file names when the caller did not provide an explicit path
|
||||
- command stderr wording outside the structured `kind` classifications above
|
||||
|
||||
## Text Mode Expectations
|
||||
|
||||
Text mode is intended to stay compact and follow-up-useful.
|
||||
|
||||
The exact whitespace/alignment of text output is not stable.
|
||||
The following expectations are stable at the behavioral level:
|
||||
|
||||
- important runtime reads print actionable identifiers or geometry
|
||||
- selector failures print enough detail to recover without `--json`
|
||||
- artifact-producing commands print the artifact path
|
||||
- window listings print both `@wN` refs and `window_id` values
|
||||
|
||||
If an agent needs strict parsing, it should use `--json`.
|
||||
For the full repo copy, see `docs/runtime-contract.md`.
|
||||
|
|
|
|||
|
|
@ -6,73 +6,93 @@ toc: true
|
|||
|
||||
# Architecture
|
||||
|
||||
## Client-daemon model
|
||||
## Public model
|
||||
|
||||
deskctl uses a client-daemon architecture over Unix sockets. The daemon starts automatically on the first command and keeps the X11 connection alive so repeated calls skip the connection setup overhead.
|
||||
`deskctl` is a thin, non-interactive X11 control primitive for agent loops.
|
||||
The public flow is:
|
||||
|
||||
Each command opens a new connection to the daemon, sends a single NDJSON request, reads one NDJSON response, and exits.
|
||||
- diagnose with `deskctl doctor`
|
||||
- observe with `snapshot`, `list-windows`, and grouped `get` commands
|
||||
- wait with grouped `wait` commands instead of shell `sleep`
|
||||
- act with explicit selectors or coordinates
|
||||
- verify with another read or snapshot
|
||||
|
||||
## Wire protocol
|
||||
The tool stays intentionally narrow. It does not try to be a full desktop shell
|
||||
or a speculative Wayland abstraction.
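
The flow above can be sketched as a small driver. This is an illustration under assumptions: `run` is a caller-supplied function (hypothetical) that executes an argv list and returns the parsed JSON envelope as a dict, and the specific sequence of commands is just one plausible pass through the loop.

```python
def observe_wait_act_verify(run, selector, text, timeout_s=10):
    """Drive one diagnose -> observe -> wait -> act -> verify pass.

    Returns (failed_argv, envelope); failed_argv is None when every
    step reported success.
    """
    steps = [
        ["deskctl", "--json", "doctor"],                      # diagnose
        ["deskctl", "--json", "snapshot", "--annotate"],      # observe
        ["deskctl", "--json", "wait", "window",
         "--selector", selector, "--timeout", str(timeout_s)],  # wait
        ["deskctl", "--json", "focus", selector],             # act
        ["deskctl", "--json", "type", text],                  # act
        ["deskctl", "--json", "get", "active-window"],        # verify
    ]
    last = None
    for argv in steps:
        last = run(argv)
        if not last.get("success"):
            # Stop at the first failed step so the caller can recover.
            return argv, last
    return None, last
```

A real `run` might wrap `subprocess.run(argv, capture_output=True, text=True)` and pass its stdout to `json.loads`; injecting it keeps the loop itself testable without an X11 session.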
|
||||
|
||||
## Client-daemon architecture
|
||||
|
||||
The CLI talks to an auto-managed daemon over a Unix socket. The daemon keeps
|
||||
the X11 connection alive so repeated commands stay fast and share the same
|
||||
session-scoped window identity map.
|
||||
|
||||
Each CLI invocation sends one request, reads one response, and exits.
|
||||
|
||||
## Runtime contract
|
||||
|
||||
Requests and responses are newline-delimited JSON (NDJSON) over a Unix socket.
|
||||
|
||||
**Request:**
|
||||
All commands share the same JSON envelope:
|
||||
|
||||
```json
|
||||
{ "id": "r123456", "action": "snapshot", "annotate": true }
|
||||
{
|
||||
"success": true,
|
||||
"data": {},
|
||||
"error": null
|
||||
}
|
||||
```
|
||||
|
||||
**Response:**
|
||||
For window payloads, the public identity is `window_id`, not an X11 handle.
|
||||
That keeps the contract backend-neutral even though the current support
|
||||
boundary is X11-only.
|
||||
|
||||
```json
|
||||
{"success": true, "data": {"screenshot": "/tmp/deskctl-1234567890.png", "windows": [...]}}
|
||||
```
|
||||
The complete stable-vs-best-effort policy lives on the
|
||||
[runtime contract](/runtime-contract) page.
|
||||
|
||||
Error responses include an `error` field:
|
||||
## Sessions and sockets
|
||||
|
||||
```json
|
||||
{ "success": false, "error": "window not found: @w99" }
|
||||
```
|
||||
Each session gets its own socket path, PID file, and live window mapping.
|
||||
|
||||
## Socket location
|
||||
Public socket resolution order:
|
||||
|
||||
The daemon socket is resolved in this order:
|
||||
|
||||
1. `--socket` flag (highest priority)
|
||||
2. `$DESKCTL_SOCKET_DIR/{session}.sock`
|
||||
3. `$XDG_RUNTIME_DIR/deskctl/{session}.sock`
|
||||
1. `--socket`
|
||||
2. `DESKCTL_SOCKET_DIR/{session}.sock`
|
||||
3. `XDG_RUNTIME_DIR/deskctl/{session}.sock`
|
||||
4. `~/.deskctl/{session}.sock`
|
||||
|
||||
PID files are stored alongside the socket.
|
||||
Most users should let `deskctl` manage this automatically. `--session` is the
|
||||
main public knob when you need isolated daemon instances.
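
For readers who want the resolution order spelled out, here is a minimal sketch. It is not the daemon's actual code: the function name is hypothetical, and treating an empty environment variable as unset is an assumption.

```python
import os
import posixpath


def resolve_socket(session="default", socket_flag=None, env=None):
    """Return the socket path a client would use, in documented order."""
    env = os.environ if env is None else env
    # 1. An explicit --socket value wins.
    if socket_flag:
        return socket_flag
    # 2. DESKCTL_SOCKET_DIR/{session}.sock
    socket_dir = env.get("DESKCTL_SOCKET_DIR")
    if socket_dir:
        return posixpath.join(socket_dir, f"{session}.sock")
    # 3. XDG_RUNTIME_DIR/deskctl/{session}.sock
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return posixpath.join(runtime_dir, "deskctl", f"{session}.sock")
    # 4. ~/.deskctl/{session}.sock
    home = env.get("HOME") or posixpath.expanduser("~")
    return posixpath.join(home, ".deskctl", f"{session}.sock")
```

Because the session name is part of the file name at every fallback level, two sessions never share a socket unless `--socket` points them at the same path.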
|
||||
|
||||
## Sessions
|
||||
## Diagnostics and failure handling
|
||||
|
||||
Multiple isolated daemon instances can run simultaneously using the `--session` flag:
|
||||
`deskctl doctor` runs before daemon startup and checks:
|
||||
|
||||
```sh
|
||||
deskctl --session workspace1 snapshot
|
||||
deskctl --session workspace2 snapshot
|
||||
```
|
||||
- display/session setup
|
||||
- X11 connectivity
|
||||
- basic window enumeration
|
||||
- screenshot viability
|
||||
- socket directory and stale-socket health
|
||||
|
||||
Each session has its own socket, PID file, and window ref map.
|
||||
Selector and wait failures are structured in `--json` mode so clients can
|
||||
recover without scraping text.
|
||||
|
||||
## Backend design
|
||||
## Backend notes
|
||||
|
||||
The core is built around a `DesktopBackend` trait. The current implementation uses `x11rb` for X11 protocol operations and `enigo` for input simulation.
|
||||
The backend is built around a `DesktopBackend` trait and currently ships with
|
||||
an X11 implementation backed by `x11rb`.
|
||||
|
||||
The trait-based design means adding Wayland support is a single trait implementation with no changes to the core, CLI, or daemon code.
|
||||
The important public guarantee is not "portable desktop automation." The
|
||||
important guarantee is "a correct and unsurprising Linux X11 runtime contract."
|
||||
|
||||
## X11 integration
|
||||
## X11 support boundary
|
||||
|
||||
Window detection uses EWMH properties:
|
||||
This phase supports Linux X11 only.
|
||||
|
||||
| Property | Purpose |
|
||||
| --------------------------- | ------------------------ |
|
||||
| `_NET_CLIENT_LIST_STACKING` | Window stacking order |
|
||||
| `_NET_ACTIVE_WINDOW` | Currently focused window |
|
||||
| `_NET_WM_NAME` | Window title (UTF-8) |
|
||||
| `_NET_WM_STATE_HIDDEN` | Minimized state |
|
||||
| `_NET_CLOSE_WINDOW` | Graceful close |
|
||||
| `WM_CLASS` | Application class/name |
|
||||
That means:
|
||||
|
||||
Falls back to `XQueryTree` if `_NET_CLIENT_LIST_STACKING` is unavailable.
|
||||
- EWMH/window-manager properties matter
|
||||
- monitor naming and some ordering details are best-effort
|
||||
- Wayland and Hyprland are out of scope for the current contract
|
||||
|
||||
The runtime documents those boundaries explicitly instead of pretending the
|
||||
surface is broader than it is.
|
||||
|
|
|
|||
|
|
@ -6,167 +6,101 @@ toc: true
|
|||
|
||||
# Commands
|
||||
|
||||
## Snapshot
|
||||
|
||||
Capture a screenshot and get the window tree:
|
||||
## Observe
|
||||
|
||||
```sh
|
||||
deskctl doctor
|
||||
deskctl snapshot
|
||||
deskctl snapshot --annotate
|
||||
```
|
||||
|
||||
With `--annotate`, colored bounding boxes and `@wN` labels are drawn on the screenshot. Each window gets a unique color from an 8-color palette. Minimized windows are skipped.
|
||||
|
||||
The screenshot is saved to `/tmp/deskctl-{timestamp}.png`.
|
||||
|
||||
## Click
|
||||
|
||||
Click the center of a window by ref, or click exact coordinates:
|
||||
|
||||
```sh
|
||||
deskctl click @w1
|
||||
deskctl click 960,540
|
||||
```
|
||||
|
||||
## Double click
|
||||
|
||||
```sh
|
||||
deskctl dblclick @w1
|
||||
deskctl dblclick 500,300
|
||||
```
|
||||
|
||||
## Type
|
||||
|
||||
Type a string into the focused window:
|
||||
|
||||
```sh
|
||||
deskctl type "hello world"
|
||||
```
|
||||
|
||||
## Press
|
||||
|
||||
Press a single key:
|
||||
|
||||
```sh
|
||||
deskctl press enter
|
||||
deskctl press tab
|
||||
deskctl press escape
|
||||
```
|
||||
|
||||
Supported key names: `enter`, `tab`, `escape`, `backspace`, `delete`, `space`, `up`, `down`, `left`, `right`, `home`, `end`, `pageup`, `pagedown`, `f1`-`f12`, or any single character.
|
||||
|
||||
## Hotkey
|
||||
|
||||
Send a key combination. List modifier keys first, then the target key:
|
||||
|
||||
```sh
|
||||
deskctl hotkey ctrl c
|
||||
deskctl hotkey ctrl shift t
|
||||
deskctl hotkey alt f4
|
||||
```
|
||||
|
||||
Modifier names: `ctrl`, `alt`, `shift`, `super` (also `meta` or `win`).
|
||||
|
||||
## Mouse move
|
||||
|
||||
Move the cursor to absolute coordinates:
|
||||
|
||||
```sh
|
||||
deskctl mouse move 100 200
|
||||
```
|
||||
|
||||
## Mouse scroll
|
||||
|
||||
Scroll the mouse wheel. Positive values scroll down, negative scroll up:
|
||||
|
||||
```sh
|
||||
deskctl mouse scroll 3
|
||||
deskctl mouse scroll -5
|
||||
deskctl mouse scroll 3 --axis horizontal
|
||||
```
|
||||
|
||||
## Mouse drag
|
||||
|
||||
Drag from one position to another:
|
||||
|
||||
```sh
|
||||
deskctl mouse drag 100 200 500 600
|
||||
```
|
||||
|
||||
## Focus
|
||||
|
||||
Focus a window by ref or by name (case-insensitive substring match):
|
||||
|
||||
```sh
|
||||
deskctl focus @w1
|
||||
deskctl focus "firefox"
|
||||
```
|
||||
|
||||
## Close
|
||||
|
||||
Close a window gracefully:
|
||||
|
||||
```sh
|
||||
deskctl close @w2
|
||||
deskctl close "terminal"
|
||||
```
|
||||
|
||||
## Move window
|
||||
|
||||
Move a window to an absolute position:
|
||||
|
||||
```sh
|
||||
deskctl move-window @w1 0 0
|
||||
deskctl move-window "firefox" 100 100
|
||||
```
|
||||
|
||||
## Resize window
|
||||
|
||||
Resize a window:
|
||||
|
||||
```sh
|
||||
deskctl resize-window @w1 1280 720
|
||||
```
|
||||
|
||||
## List windows
|
||||
|
||||
List all windows without taking a screenshot:
|
||||
|
||||
```sh
|
||||
deskctl list-windows
|
||||
```
|
||||
|
||||
## Get screen size
|
||||
|
||||
```sh
|
||||
deskctl screenshot
|
||||
deskctl screenshot /tmp/screen.png
|
||||
deskctl get active-window
|
||||
deskctl get monitors
|
||||
deskctl get version
|
||||
deskctl get systeminfo
|
||||
deskctl get-screen-size
|
||||
```
|
||||
|
||||
## Get mouse position
|
||||
|
||||
```sh
|
||||
deskctl get-mouse-position
|
||||
```
|
||||
|
||||
## Screenshot
|
||||
`doctor` checks the runtime before daemon startup. `snapshot` produces a
|
||||
screenshot plus window refs. `list-windows` is the same window tree without the
|
||||
side effect of writing a screenshot.
|
||||
|
||||
Take a screenshot without the window tree. Optionally specify a save path:
|
||||
## Wait
|
||||
|
||||
```sh
|
||||
deskctl screenshot
|
||||
deskctl screenshot /tmp/my-screenshot.png
|
||||
deskctl screenshot --annotate
|
||||
deskctl wait window --selector 'title=Firefox' --timeout 10
|
||||
deskctl wait focus --selector 'id=win3' --timeout 5
|
||||
deskctl --json wait window --selector 'class=firefox' --poll-ms 100
|
||||
```
|
||||
|
||||
## Launch
|
||||
Wait commands return the matched window payload on success. In `--json` mode,
|
||||
timeouts and selector failures expose structured `kind` values.
|
||||
|
||||
Launch an application:
|
||||
## Act on a window
|
||||
|
||||
```sh
|
||||
deskctl launch firefox
|
||||
deskctl launch code --args /path/to/project
|
||||
deskctl focus @w1
|
||||
deskctl focus 'title=Firefox'
|
||||
deskctl click @w1
|
||||
deskctl click 960,540
|
||||
deskctl dblclick @w2
|
||||
deskctl close @w3
|
||||
deskctl move-window @w1 100 120
|
||||
deskctl resize-window @w1 1280 720
|
||||
```
|
||||
|
||||
Selector-driven actions accept refs, explicit selector modes, or absolute
|
||||
coordinates where appropriate.
|
||||
|
||||
## Input and mouse
|
||||
|
||||
```sh
|
||||
deskctl type "hello world"
|
||||
deskctl press enter
|
||||
deskctl hotkey ctrl shift t
|
||||
deskctl mouse move 100 200
|
||||
deskctl mouse scroll 3
|
||||
deskctl mouse scroll 3 --axis horizontal
|
||||
deskctl mouse drag 100 200 500 600
|
||||
```
|
||||
|
||||
Supported key names include `enter`, `tab`, `escape`, `backspace`, `delete`,
|
||||
`space`, arrow keys, paging keys, `f1` through `f12`, and any single
|
||||
character.
|
||||
|
||||
## Launch
|
||||
|
||||
```sh
|
||||
deskctl launch firefox
|
||||
deskctl launch code -- --new-window
|
||||
```
|
||||
|
||||
## Selectors
|
||||
|
||||
Prefer explicit selectors when the target matters:
|
||||
|
||||
```sh
|
||||
ref=w1
|
||||
id=win1
|
||||
title=Firefox
|
||||
class=firefox
|
||||
focused
|
||||
```
|
||||
|
||||
Legacy shorthand is still supported:
|
||||
|
||||
```sh
|
||||
@w1
|
||||
w1
|
||||
win1
|
||||
```
|
||||
|
||||
Bare strings like `firefox` are fuzzy matches. They resolve when there is one
|
||||
match and fail with candidate windows when there are multiple matches.
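
To make the selector forms concrete, the sketch below classifies a selector string on the client side. It is an illustration only, not `deskctl`'s parser: the function name is hypothetical, and the mapping of legacy `w1` to a ref and `win1` to a window id is inferred from the explicit `ref=w1` and `id=win1` examples.

```python
import re

EXPLICIT_MODES = ("ref", "id", "title", "class")


def classify_selector(selector):
    """Return a (mode, value) pair for a selector string."""
    if selector == "focused":
        return ("focused", None)
    mode, sep, value = selector.partition("=")
    if sep and mode in EXPLICIT_MODES:
        return (mode, value)
    # Legacy shorthand: @w1 and w1 are refs, win1 is a window id (assumed).
    if re.fullmatch(r"@?w\d+", selector):
        return ("ref", selector.lstrip("@"))
    if re.fullmatch(r"win\d+", selector):
        return ("id", selector)
    # Anything else is a bare string, which deskctl treats as a fuzzy match.
    return ("fuzzy", selector)
```

An agent could use the `fuzzy` result as a signal to switch to an explicit mode before acting, since fuzzy matches fail when more than one window qualifies.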
|
||||
|
||||
## Global options
|
||||
|
||||
| Flag | Env | Description |
|
||||
|
|
@ -174,3 +108,6 @@ deskctl launch code --args /path/to/project
|
|||
| `--json` | | Output as JSON |
|
||||
| `--socket <path>` | `DESKCTL_SOCKET` | Path to daemon Unix socket |
|
||||
| `--session <name>` | | Session name for multiple daemons (default: `default`) |
|
||||
|
||||
`deskctl` manages the daemon automatically. Most users never need to think
|
||||
about it beyond `--session` and `--socket`.
|
||||
|
|
|
|||
|
|
@ -8,17 +8,49 @@ import DocLayout from "../layouts/DocLayout.astro";
|
|||
<img src="/favicon.svg" alt="" width="40" height="40" />
|
||||
</header>
|
||||
|
||||
<p>
|
||||
Desktop control CLI for AI agents on Linux X11. Compact JSON output for
|
||||
agent loops. Screenshot, click, type, scroll, drag, and manage windows
|
||||
through a fast client-daemon architecture. 100% native Rust.
|
||||
<p class="tagline">non-interactive desktop control for AI agents</p>
|
||||
|
||||
<div class="badges" aria-label="package and runtime badges">
|
||||
<a href="https://www.npmjs.com/package/deskctl-cli">
|
||||
<img
|
||||
src="https://img.shields.io/npm/v/deskctl-cli?label=npm"
|
||||
alt="npm version badge"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://github.com/harivansh-afk/deskctl/releases">
|
||||
<img
|
||||
src="https://img.shields.io/github/v/release/harivansh-afk/deskctl?label=release"
|
||||
alt="github release badge"
|
||||
/>
|
||||
</a>
|
||||
<img
|
||||
src="https://img.shields.io/badge/runtime-linux--x11-111827"
|
||||
alt="linux x11 runtime badge"
|
||||
/>
|
||||
<a href="https://www.npmjs.com/package/deskctl-cli">
|
||||
<img
|
||||
src="https://img.shields.io/badge/install-npm%20i%20--g%20deskctl--cli-111827"
|
||||
alt="npm install command badge"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
<p class="lede">
|
||||
<code>deskctl</code> is a thin X11 control primitive for agent loops: diagnose
|
||||
the runtime, observe the desktop, wait for state transitions, act deterministically,
|
||||
then verify.
|
||||
</p>
|
||||
|
||||
<h2>Getting started</h2>
|
||||
<pre><code>npm install -g deskctl-cli
|
||||
deskctl doctor
|
||||
deskctl snapshot --annotate</code></pre>
|
||||
|
||||
<h2>Start here</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="/installation">Installation</a></li>
|
||||
<li><a href="/quick-start">Quick start</a></li>
|
||||
<li><a href="/runtime-contract">Runtime contract</a></li>
|
||||
</ul>
|
||||
|
||||
<h2>Reference</h2>
|
||||
|
|
@ -28,14 +60,27 @@ import DocLayout from "../layouts/DocLayout.astro";
|
|||
<li><a href="/architecture">Architecture</a></li>
|
||||
</ul>
|
||||
|
||||
<h2>Agent skill</h2>
|
||||
|
||||
<p>
|
||||
There is also an installable skill for <code>skills.sh</code>-style agent runtimes:
|
||||
</p>
|
||||
|
||||
<pre><code>npx skills add harivansh-afk/deskctl -s deskctl</code></pre>
|
||||
|
||||
<h2>Links</h2>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
<a href="https://www.npmjs.com/package/deskctl-cli">npm package</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://github.com/harivansh-afk/deskctl">GitHub</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="https://crates.io/crates/deskctl">crates.io</a>
|
||||
<a href="https://github.com/harivansh-afk/deskctl/releases">
|
||||
GitHub releases
|
||||
</a>
|
||||
</li>
|
||||
</ul>
|
||||
</DocLayout>
|
||||
|
|
|
|||
|
|
@ -6,43 +6,68 @@ toc: true
|
|||
|
||||
# Installation
|
||||
|
||||
## Cargo
|
||||
## Default install
|
||||
|
||||
```sh
|
||||
cargo install deskctl
|
||||
npm install -g deskctl-cli
|
||||
deskctl --help
|
||||
```
|
||||
|
||||
## From source
|
||||
`deskctl-cli` is the default install path. It installs the `deskctl` command by
|
||||
downloading the matching GitHub Release asset for the supported runtime target.
|
||||
|
||||
## One-shot usage
|
||||
|
||||
```sh
|
||||
npx deskctl-cli --help
|
||||
```
|
||||
|
||||
## Agent skill
|
||||
|
||||
For `skills.sh`-style runtimes:
|
||||
|
||||
```sh
|
||||
npx skills add harivansh-afk/deskctl -s deskctl
|
||||
```
|
||||
|
||||
The repo skill lives under `skills/deskctl` and is designed around the same
|
||||
observe -> wait -> act -> verify loop as the CLI.
|
||||
|
||||
## Other install paths
|
||||
|
||||
### Nix
|
||||
|
||||
```sh
|
||||
nix run github:harivansh-afk/deskctl -- --help
|
||||
nix profile install github:harivansh-afk/deskctl
|
||||
```
|
||||
|
||||
### Build from source
|
||||
|
||||
```sh
|
||||
git clone https://github.com/harivansh-afk/deskctl
|
||||
cd deskctl
|
||||
cargo build --release
|
||||
cargo build
|
||||
```
|
||||
|
||||
## Docker (cross-compile for Linux)
|
||||
Source builds on Linux require:
|
||||
|
||||
Build a static Linux binary from any platform:
|
||||
- Rust 1.75+
|
||||
- `pkg-config`
|
||||
- X11 development libraries such as `libx11-dev` and `libxtst-dev`
|
||||
|
||||
```sh
|
||||
docker compose -f docker/docker-compose.yml run --rm build
|
||||
```
|
||||
|
||||
This writes `dist/deskctl-linux-x86_64`.
|
||||
|
||||
## Deploy to a remote machine
|
||||
|
||||
Copy the binary over SSH when `scp` is not available:
|
||||
|
||||
```sh
|
||||
ssh -p 443 user@host 'cat > ~/deskctl && chmod +x ~/deskctl' < dist/deskctl-linux-x86_64
|
||||
```
|
||||
|
||||
## Requirements
|
||||
## Runtime requirements
|
||||
|
||||
- Linux with an active X11 session
|
||||
- `DISPLAY` environment variable set (e.g. `DISPLAY=:1`)
|
||||
- `XDG_SESSION_TYPE=x11`
|
||||
- A window manager that exposes EWMH properties (`_NET_CLIENT_LIST_STACKING`, `_NET_ACTIVE_WINDOW`)
|
||||
- `DISPLAY` set to a usable X11 display, such as `DISPLAY=:1`
|
||||
- `XDG_SESSION_TYPE=x11` or an equivalent X11 session environment
|
||||
- a window manager or desktop environment that exposes standard EWMH properties
|
||||
such as `_NET_CLIENT_LIST_STACKING` and `_NET_ACTIVE_WINDOW`
|
||||
|
||||
No extra native libraries are needed beyond the standard glibc runtime (`libc`, `libm`, `libgcc_s`).
|
||||
The binary itself only depends on the standard Linux glibc runtime.
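The environment requirements above can be checked before any desktop command runs. The following is a minimal sketch, not part of `deskctl`: the `x11_preflight` helper is a hypothetical name, and it only inspects environment variables, whereas `deskctl doctor` remains the authoritative check.

```python
import os
from typing import List, Mapping


def x11_preflight(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems; an empty list means the basics look fine.

    Hypothetical helper for illustration. It does not talk to the X server.
    """
    problems = []
    # DISPLAY must point at a usable X11 display, e.g. ":1".
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    # An unset XDG_SESSION_TYPE is tolerated because an equivalent X11
    # session environment is acceptable; only a conflicting value is flagged.
    session = env.get("XDG_SESSION_TYPE", "")
    if session and session != "x11":
        problems.append(f"XDG_SESSION_TYPE is {session!r}, expected 'x11'")
    return problems


if __name__ == "__main__":
    for problem in x11_preflight():
        print(problem)
```

A check like this cannot see window-manager EWMH support, so it is best treated as a fast first filter.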
|
||||
|
||||
If setup fails, run:
|
||||
|
||||
```sh
|
||||
deskctl doctor
|
||||
```
|
||||
|
|
|
|||
|
|
@ -6,50 +6,72 @@ toc: true
|
|||
|
||||
# Quick start
|
||||
|
||||
## Core workflow
|
||||
|
||||
The typical agent loop is: snapshot the desktop, interpret the result, act on it.
|
||||
## Install and diagnose
|
||||
|
||||
```sh
|
||||
# 1. see the desktop
|
||||
deskctl --json snapshot --annotate
|
||||
npm install -g deskctl-cli
|
||||
deskctl doctor
|
||||
```
|
||||
|
||||
# 2. click a window by its ref
|
||||
deskctl click @w1
|
||||
Use `deskctl doctor` first. It checks X11 connectivity, basic enumeration,
|
||||
screenshot viability, and socket health before you start driving the desktop.
|
||||
|
||||
# 3. type into the focused window
|
||||
deskctl type "hello world"
|
||||
## Observe
|
||||
|
||||
# 4. press a key
|
||||
```sh
|
||||
deskctl snapshot --annotate
|
||||
deskctl list-windows
|
||||
deskctl get active-window
|
||||
deskctl get monitors
|
||||
```
|
||||
|
||||
Use `snapshot` when you want a screenshot artifact plus window refs. Use
|
||||
`list-windows` when you only need the current window tree without writing a
|
||||
screenshot.
|
||||
|
||||
## Target windows cleanly
|
||||
|
||||
Prefer explicit selectors when you need deterministic targeting:
|
||||
|
||||
```sh
|
||||
ref=w1
|
||||
id=win1
|
||||
title=Firefox
|
||||
class=firefox
|
||||
focused
|
||||
```
|
||||
|
||||
Legacy refs such as `@w1` still work after `snapshot` or `list-windows`. Bare
|
||||
strings like `firefox` are fuzzy matches and now fail on ambiguity.
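An agent that builds selectors programmatically may want to reject bare strings before they reach the CLI. The sketch below classifies a selector string using only the forms listed on this page and in the command reference. `classify_selector` and its return labels are hypothetical, and this is not how `deskctl` itself parses selectors.

```python
import re

# Explicit selector keys documented above.
EXPLICIT_KEYS = ("ref", "id", "title", "class")


def classify_selector(selector: str) -> str:
    """Label a selector string; illustrative only, not deskctl's parser."""
    if selector == "focused":
        return "focused"
    key, sep, value = selector.partition("=")
    if sep and key in EXPLICIT_KEYS and value:
        return f"explicit:{key}"
    # Legacy shorthand for snapshot refs: "@w1" or "w1".
    if re.fullmatch(r"@?w\d+", selector):
        return "legacy-ref"
    # Legacy shorthand for window IDs: "win1".
    if re.fullmatch(r"win\d+", selector):
        return "legacy-id"
    # Anything else is a bare string, which matches fuzzily and can be ambiguous.
    return "bare-fuzzy"
```

A caller could refuse anything labeled `bare-fuzzy` when deterministic targeting matters.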
|
||||
|
||||
## Wait, act, verify
|
||||
|
||||
The core loop is:
|
||||
|
||||
```sh
|
||||
# observe
|
||||
deskctl snapshot --annotate
|
||||
|
||||
# wait
|
||||
deskctl wait window --selector 'title=Firefox' --timeout 10
|
||||
|
||||
# act
|
||||
deskctl focus 'title=Firefox'
|
||||
deskctl hotkey ctrl l
|
||||
deskctl type "https://example.com"
|
||||
deskctl press enter
|
||||
|
||||
# verify
|
||||
deskctl wait focus --selector 'title=Firefox' --timeout 5
|
||||
deskctl snapshot
|
||||
```
|
||||
|
||||
The `--annotate` flag draws colored bounding boxes and `@wN` labels on the screenshot so agents can visually identify windows.
|
||||
The wait commands return the matched window payload on success, so they compose
|
||||
cleanly into the next action.
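The same loop can be driven from a script. This is a minimal sketch under stated assumptions: `deskctl_json` and `open_url` are hypothetical helper names, the JSON envelope is assumed to arrive on stdout even when the command exits non-zero, and `--json` is passed before the subcommand.

```python
import json
import subprocess
from typing import Callable, Sequence


def deskctl_json(args: Sequence[str]) -> dict:
    """Run `deskctl --json <args>` and return the parsed envelope.

    Assumption: the envelope is printed to stdout on success and on failure.
    """
    proc = subprocess.run(
        ["deskctl", "--json", *args], capture_output=True, text=True
    )
    return json.loads(proc.stdout)


def open_url(url: str, call: Callable[[Sequence[str]], dict] = deskctl_json) -> dict:
    """Wait for Firefox, act on it, then verify focus; stop at the first failure."""
    # wait: the matched window payload comes back under data.window
    waited = call(["wait", "window", "--selector", "title=Firefox", "--timeout", "10"])
    if not waited["success"]:
        return waited
    # Reuse the session-scoped window_id so later steps target the same window.
    target = "id=" + waited["data"]["window"]["window_id"]
    # act
    steps = [["focus", target], ["hotkey", "ctrl", "l"], ["type", url], ["press", "enter"]]
    for step in steps:
        result = call(step)
        if not result["success"]:
            return result
    # verify
    return call(["wait", "focus", "--selector", target, "--timeout", "5"])
```

Passing the command runner in as `call` keeps the loop testable without a live X11 session.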
|
||||
|
||||
## Window refs
|
||||
## Use `--json` when parsing matters
|
||||
|
||||
Every `snapshot` assigns refs like `@w1`, `@w2`, etc. to each visible window, ordered top-to-bottom by stacking order. Use these refs anywhere a selector is expected:
|
||||
|
||||
```sh
|
||||
deskctl click @w1
|
||||
deskctl focus @w3
|
||||
deskctl close @w2
|
||||
```
|
||||
|
||||
You can also select windows by name (case-insensitive substring match):
|
||||
|
||||
```sh
|
||||
deskctl focus "firefox"
|
||||
deskctl close "terminal"
|
||||
```
|
||||
|
||||
## JSON output
|
||||
|
||||
Pass `--json` for machine-readable output. This is the primary mode for agent integrations:
|
||||
|
||||
```sh
|
||||
deskctl --json snapshot
|
||||
```
|
||||
Every command supports `--json` and uses the same top-level envelope:
|
||||
|
||||
```json
|
||||
{
|
||||
|
|
@ -59,7 +81,7 @@ deskctl --json snapshot
|
|||
"windows": [
|
||||
{
|
||||
"ref_id": "w1",
|
||||
"xcb_id": 12345678,
|
||||
"window_id": "win1",
|
||||
"title": "Firefox",
|
||||
"app_name": "firefox",
|
||||
"x": 0,
|
||||
|
|
@ -74,14 +96,8 @@ deskctl --json snapshot
|
|||
}
|
||||
```
|
||||
|
||||
## Daemon lifecycle
|
||||
Use `window_id` for stable targeting inside a live daemon session. The exact
|
||||
text formatting is intentionally compact, but JSON is the parsing contract.
|
||||
|
||||
The daemon starts automatically on the first command. It keeps the X11 connection alive so repeated calls are fast. You do not need to manage it manually.
|
||||
|
||||
```sh
|
||||
# check if the daemon is running
|
||||
deskctl daemon status
|
||||
|
||||
# stop it explicitly
|
||||
deskctl daemon stop
|
||||
```
|
||||
The full stable-vs-best-effort contract lives on the
|
||||
[runtime contract](/runtime-contract) page.
|
||||
|
|
|
|||
177
site/src/pages/runtime-contract.mdx
Normal file
|
|
@ -0,0 +1,177 @@
|
|||
---
|
||||
layout: ../layouts/DocLayout.astro
|
||||
title: Runtime contract
|
||||
toc: true
|
||||
---
|
||||
|
||||
# Runtime contract
|
||||
|
||||
This page defines the current public output contract for `deskctl`.
|
||||
|
||||
It is intentionally scoped to the current Linux X11 runtime surface. It does
|
||||
not promise stability for future Wayland or window-manager-specific features.
|
||||
|
||||
## JSON envelope
|
||||
|
||||
Every command supports `--json` and uses the same top-level envelope:
|
||||
|
||||
```json
|
||||
{
|
||||
"success": true,
|
||||
"data": {},
|
||||
"error": null
|
||||
}
|
||||
```
|
||||
|
||||
Stable top-level fields:
|
||||
|
||||
- `success`
|
||||
- `data`
|
||||
- `error`
|
||||
|
||||
If `success` is `false`, the command exits non-zero in both text mode and JSON
|
||||
mode.
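A consumer can model the envelope directly. The sketch below is illustrative: the `Envelope` class and `parse_envelope` function are hypothetical names, and the type of `error` is left open because this page does not specify it.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Envelope:
    success: bool
    data: Any
    error: Any  # shape not specified by the contract; kept opaque here


def parse_envelope(raw: str) -> Envelope:
    """Parse one `--json` response into the three stable top-level fields."""
    doc = json.loads(raw)
    if "success" not in doc:
        raise ValueError("not a deskctl envelope: missing 'success'")
    # data and error are read leniently in case a field is omitted.
    return Envelope(bool(doc["success"]), doc.get("data"), doc.get("error"))
```

Because a failed command also exits non-zero, a caller can check either the exit status or `success`; this sketch relies on `success` alone.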
|
||||
|
||||
## Stable window fields
|
||||
|
||||
Whenever a response includes a window payload, these fields are stable:
|
||||
|
||||
- `ref_id`
|
||||
- `window_id`
|
||||
- `title`
|
||||
- `app_name`
|
||||
- `x`
|
||||
- `y`
|
||||
- `width`
|
||||
- `height`
|
||||
- `focused`
|
||||
- `minimized`
|
||||
|
||||
`window_id` is the public session-scoped identifier for programmatic targeting.
|
||||
`ref_id` is a short-lived convenience handle from the current ref map.
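One way to consume the window payload is to copy only the stable fields and ignore everything else. The sketch below assumes integer geometry and boolean `focused` and `minimized` values, which this page does not state explicitly; the `Window` class is a hypothetical name.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Window:
    # Field names come from the stable list above; the types are assumptions.
    ref_id: str
    window_id: str
    title: str
    app_name: str
    x: int
    y: int
    width: int
    height: int
    focused: bool
    minimized: bool

    @classmethod
    def from_payload(cls, payload: dict) -> "Window":
        """Build a Window from a payload dict, dropping non-stable keys."""
        return cls(**{f.name: payload[f.name] for f in fields(cls)})

    def selector(self) -> str:
        """Session-scoped selector, preferred over the short-lived ref."""
        return f"id={self.window_id}"
```

Dropping unknown keys means a new best-effort field in the payload does not break the consumer.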
|
||||
|
||||
## Stable grouped reads
|
||||
|
||||
`deskctl get active-window`
|
||||
|
||||
- stable: `data.window`
|
||||
|
||||
`deskctl get monitors`
|
||||
|
||||
- stable: `data.count`
|
||||
- stable: `data.monitors`
|
||||
|
||||
Stable per-monitor fields:
|
||||
|
||||
- `name`
|
||||
- `x`
|
||||
- `y`
|
||||
- `width`
|
||||
- `height`
|
||||
- `width_mm`
|
||||
- `height_mm`
|
||||
- `primary`
|
||||
- `automatic`
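As an illustration of reading `data.monitors`, the sketch below picks the primary monitor and falls back to the first entry. It assumes `primary` is a boolean and that the list may be empty; `primary_monitor` is a hypothetical helper name.

```python
from typing import Optional


def primary_monitor(envelope: dict) -> Optional[dict]:
    """Return the monitor flagged primary, else the first one, else None."""
    monitors = envelope["data"]["monitors"]
    for monitor in monitors:
        if monitor.get("primary"):
            return monitor
    # No monitor is flagged primary: fall back to list order, which is
    # an arbitrary choice made by this sketch.
    return monitors[0] if monitors else None
```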
|
||||
|
||||
`deskctl get version`
|
||||
|
||||
- stable: `data.version`
|
||||
- stable: `data.backend`
|
||||
|
||||
`deskctl get systeminfo`
|
||||
|
||||
- stable: `data.backend`
|
||||
- stable: `data.display`
|
||||
- stable: `data.session_type`
|
||||
- stable: `data.session`
|
||||
- stable: `data.socket_path`
|
||||
- stable: `data.screen`
|
||||
- stable: `data.monitor_count`
|
||||
- stable: `data.monitors`
|
||||
|
||||
## Stable waits
|
||||
|
||||
`deskctl wait window`
|
||||
`deskctl wait focus`
|
||||
|
||||
- stable: `data.wait`
|
||||
- stable: `data.selector`
|
||||
- stable: `data.elapsed_ms`
|
||||
- stable: `data.window`
|
||||
|
||||
## Stable selector-driven action fields
|
||||
|
||||
When selector-driven actions return resolved window data, these fields are
|
||||
stable when present:
|
||||
|
||||
- `data.ref_id`
|
||||
- `data.window_id`
|
||||
- `data.title`
|
||||
- `data.selector`
|
||||
|
||||
This applies to:
|
||||
|
||||
- `click`
|
||||
- `dblclick`
|
||||
- `focus`
|
||||
- `close`
|
||||
- `move-window`
|
||||
- `resize-window`
|
||||
|
||||
## Stable artifact fields
|
||||
|
||||
For `snapshot` and `screenshot`:
|
||||
|
||||
- stable: `data.screenshot`
|
||||
|
||||
When a command also returns windows, `data.windows` uses the stable window
|
||||
payload documented above.
|
||||
|
||||
## Stable structured error kinds
|
||||
|
||||
When a command fails with structured JSON data, these error kinds are stable:
|
||||
|
||||
- `selector_not_found`
|
||||
- `selector_ambiguous`
|
||||
- `selector_invalid`
|
||||
- `timeout`
|
||||
- `not_found`
|
||||
- `window_not_focused` in `data.last_observation.kind` or an equivalent wait
|
||||
observation payload
|
||||
|
||||
Stable structured failure fields include:
|
||||
|
||||
- `data.kind`
|
||||
- `data.selector`
|
||||
- `data.mode`
|
||||
- `data.candidates`
|
||||
- `data.message`
|
||||
- `data.wait`
|
||||
- `data.timeout_ms`
|
||||
- `data.poll_ms`
|
||||
- `data.last_observation`
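Because the `kind` values are stable, an agent can branch on them instead of matching stderr text. The following sketch maps each documented kind to a recovery step. The step names and the `next_step` function are hypothetical, and the policy is one possible choice rather than a recommendation from `deskctl`.

```python
def next_step(envelope: dict) -> str:
    """Choose a recovery step from a parsed `--json` envelope."""
    if envelope.get("success"):
        return "continue"
    data = envelope.get("data") or {}
    kind = data.get("kind")
    if kind == "selector_ambiguous":
        # data.candidates lists the competing matches; pick a narrower selector.
        return "narrow-selector"
    if kind in ("selector_not_found", "not_found"):
        # The target may not exist yet, or the ref map may be out of date.
        return "re-observe"
    if kind == "selector_invalid":
        return "fix-selector"
    if kind == "timeout":
        return "retry-or-abort"
    # Unknown or unstructured failure: do not guess.
    return "abort"
```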
|
||||
|
||||
## Best-effort fields
|
||||
|
||||
These values are useful but environment-dependent and should not be treated as
|
||||
strict parsing guarantees:
|
||||
|
||||
- exact monitor naming conventions
|
||||
- EWMH/window-manager-dependent ordering details
|
||||
- cosmetic text formatting in non-JSON mode
|
||||
- default screenshot file names when no explicit path was provided
|
||||
- stderr wording outside the structured `kind` classifications above
|
||||
|
||||
## Text mode expectations
|
||||
|
||||
Text mode is intended to stay compact and follow-up-useful.
|
||||
|
||||
The exact whitespace and alignment are not stable. The stable behavioral
|
||||
expectations are:
|
||||
|
||||
- important reads print actionable identifiers or geometry
|
||||
- selector failures print enough detail to recover without `--json`
|
||||
- artifact-producing commands print the artifact path
|
||||
- window listings print both `@wN` refs and `window_id` values
|
||||
|
||||
If you need strict parsing, use `--json`.
|
||||
|
|
@ -65,6 +65,23 @@ main {
|
|||
font-style: italic;
|
||||
}
|
||||
|
||||
.lede {
|
||||
font-size: 1.05rem;
|
||||
max-width: 42rem;
|
||||
}
|
||||
|
||||
.badges {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.6rem;
|
||||
margin-bottom: 1.25rem;
|
||||
}
|
||||
|
||||
.badges a,
|
||||
.badges img {
|
||||
display: block;
|
||||
}
|
||||
|
||||
header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
|
|
@ -117,6 +134,10 @@ a:hover {
|
|||
text-decoration-thickness: 2px;
|
||||
}
|
||||
|
||||
img {
|
||||
max-width: 100%;
|
||||
}
|
||||
|
||||
ul,
|
||||
ol {
|
||||
padding-left: 1.25em;
|
||||
|
|
|
|||
|
|
@ -1,21 +1,22 @@
|
|||
# deskctl commands
|
||||
|
||||
All commands support `--json` for machine-parseable output following the runtime contract.
|
||||
All commands support `--json` for machine-parseable output following the
|
||||
runtime contract.
|
||||
|
||||
## Observe
|
||||
|
||||
```bash
|
||||
deskctl doctor # check X11 runtime and daemon health
|
||||
deskctl snapshot # screenshot + window list
|
||||
deskctl snapshot --annotate # screenshot with @wN labels overlaid
|
||||
deskctl list-windows # window list only (no screenshot)
|
||||
deskctl screenshot /tmp/screen.png # screenshot to explicit path
|
||||
deskctl get active-window # focused window info
|
||||
deskctl get monitors # monitor geometry
|
||||
deskctl get version # version and backend
|
||||
deskctl get systeminfo # full runtime diagnostics
|
||||
deskctl get-screen-size # screen resolution
|
||||
deskctl get-mouse-position # cursor coordinates
|
||||
deskctl doctor
|
||||
deskctl snapshot
|
||||
deskctl snapshot --annotate
|
||||
deskctl list-windows
|
||||
deskctl screenshot /tmp/screen.png
|
||||
deskctl get active-window
|
||||
deskctl get monitors
|
||||
deskctl get version
|
||||
deskctl get systeminfo
|
||||
deskctl get-screen-size
|
||||
deskctl get-mouse-position
|
||||
```
|
||||
|
||||
## Wait
|
||||
|
|
@ -25,19 +26,21 @@ deskctl wait window --selector 'title=Firefox' --timeout 10
|
|||
deskctl wait focus --selector 'class=firefox' --timeout 5
|
||||
```
|
||||
|
||||
Returns the matched window payload on success. Failures include structured `kind` values in `--json` mode.
|
||||
Returns the matched window payload on success. Failures include structured
|
||||
`kind` values in `--json` mode.
|
||||
|
||||
## Selectors
|
||||
|
||||
```bash
|
||||
ref=w1 # snapshot ref (short-lived, from last snapshot)
|
||||
id=win1 # stable window ID (session-scoped)
|
||||
title=Firefox # match by window title
|
||||
class=firefox # match by WM class
|
||||
focused # currently focused window
|
||||
ref=w1
|
||||
id=win1
|
||||
title=Firefox
|
||||
class=firefox
|
||||
focused
|
||||
```
|
||||
|
||||
Legacy shorthand: `@w1`, `w1`, `win1`. Bare strings do fuzzy matching but fail on ambiguity.
|
||||
Legacy shorthand: `@w1`, `w1`, `win1`. Bare strings do fuzzy matching but fail
|
||||
on ambiguity.
|
||||
|
||||
## Act
|
||||
|
||||
|
|
@ -58,12 +61,5 @@ deskctl close @w3
|
|||
deskctl launch firefox
|
||||
```
|
||||
|
||||
## Daemon
|
||||
|
||||
```bash
|
||||
deskctl daemon start
|
||||
deskctl daemon stop
|
||||
deskctl daemon status
|
||||
```
|
||||
|
||||
The daemon starts automatically on first command. Manual control is rarely needed.
|
||||
The daemon starts automatically on first command. In normal usage you should
|
||||
not need to manage it directly.
|
||||
|
|
|
|||