deskctl/site/src/pages/quick-start.mdx
2026-03-26 11:27:12 -04:00

---
layout: ../layouts/DocLayout.astro
title: Quick start
toc: true
---
# Quick start
The fastest way to use `deskctl` is to follow the same four-step loop every time: observe, wait, act, verify.
## 1. Install and diagnose
```sh
npm install -g deskctl
deskctl doctor
```
Run `deskctl doctor` first. It checks X11 connectivity, basic enumeration,
screenshot viability, and socket health before you start driving the desktop.
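A script that drives the desktop can gate on the diagnosis before doing anything else. The sketch below assumes `deskctl doctor` exits non-zero when a check fails, which this page does not state; `require_doctor` is a hypothetical helper name.

```sh
# Hypothetical helper: stop early if the diagnostics fail.
# Assumption: `deskctl doctor` exits non-zero when any check fails.
require_doctor() {
  if ! deskctl doctor; then
    echo "deskctl doctor reported a problem; fix it before automating" >&2
    return 1
  fi
}
```

Calling `require_doctor || exit 1` at the top of an automation script keeps later failures from being mistaken for selector or timing problems.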
## 2. Observe the desktop
```sh
deskctl snapshot --annotate
deskctl list-windows
deskctl get active-window
deskctl get monitors
```
Use `snapshot` when you want a screenshot artifact plus window refs. Use
`list-windows` when you only need the current window tree without writing a
screenshot.
## 3. Pick selectors that stay readable
Prefer explicit selectors when you need deterministic targeting:
```text
ref=w1
id=win1
title=Firefox
class=firefox
focused
```
Legacy refs such as `@w1` still work after `snapshot` or `list-windows`. Bare
strings like `firefox` are fuzzy matches and now fail when they match more than
one window.
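Window titles often contain spaces, so a selector should reach `deskctl` as a single shell word. A minimal sketch, where `focus_title` is a hypothetical wrapper and the title text is assumed to be accepted as written after `title=`:

```sh
# Hypothetical wrapper: build an explicit title selector and pass it
# to `deskctl focus` as one argument, even when the title has spaces.
focus_title() {
  deskctl focus "title=$1"
}
```

With this in place, `focus_title 'Mozilla Firefox'` sends `title=Mozilla Firefox` as one argument instead of splitting it at the space.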
## 4. Wait, act, verify
The core loop is:
```sh
# observe
deskctl snapshot --annotate
# wait
deskctl wait window --selector 'title=Firefox' --timeout 10
# act
deskctl focus 'title=Firefox'
deskctl hotkey ctrl l
deskctl type "https://example.com"
deskctl press enter
# verify
deskctl wait focus --selector 'title=Firefox' --timeout 5
deskctl snapshot
```
The wait commands return the matched window payload on success, so they compose
cleanly into the next action.
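In a script, the same loop can stop at the first step that fails instead of typing into the wrong window. This is a sketch under the assumption that each `deskctl` command exits non-zero on failure or timeout; `open_url` is a hypothetical helper, not part of `deskctl`.

```sh
# Hypothetical helper: wait for a window, act on it, then verify focus.
# Assumption: every deskctl command exits non-zero on failure or timeout.
open_url() {
  selector=$1
  url=$2
  # wait: give up before acting if the window never appears
  deskctl wait window --selector "$selector" --timeout 10 || return 1
  # act: focus the window, select the address bar, type the URL
  deskctl focus "$selector" || return 1
  deskctl hotkey ctrl l || return 1
  deskctl type "$url" || return 1
  deskctl press enter || return 1
  # verify: the function's exit status is the result of this check
  deskctl wait focus --selector "$selector" --timeout 5
}
```

For example, `open_url 'title=Firefox' 'https://example.com'` runs the sequence shown earlier and returns non-zero as soon as one step fails.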
## 5. Use `--json` when parsing matters
Every command supports `--json` and uses the same top-level envelope:
```json
{
  "success": true,
  "data": {
    "screenshot": "/tmp/deskctl-1234567890.png",
    "windows": [
      {
        "ref_id": "w1",
        "window_id": "win1",
        "title": "Firefox",
        "app_name": "firefox",
        "x": 0,
        "y": 0,
        "width": 1920,
        "height": 1080,
        "focused": true,
        "minimized": false
      }
    ]
  }
}
```
Use `window_id` for stable targeting inside a live daemon session. The
plain-text output is intentionally compact; when a script needs to parse
results, JSON is the contract.
The full stable-vs-best-effort contract lives on the
[runtime contract](/runtime-contract) page.
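Because JSON is the parsing contract, a script can pull fields out of the envelope rather than scraping text. The sketch below assumes `jq` is installed (it is not part of `deskctl`) and that the envelope arrives on standard input in the shape shown above; `focused_window_id` is a hypothetical helper.

```sh
# Hypothetical helper: print the window_id of the focused window.
# Reads a JSON envelope on stdin. Exits non-zero if "success" is not
# true or if no window is focused. Requires jq (an assumption).
focused_window_id() {
  jq -er 'if .success then .data.windows[] | select(.focused) | .window_id else empty end'
}
```

One way to use it is `wid=$(deskctl snapshot --json | focused_window_id)` followed by `deskctl focus "id=$wid"`, which targets the window by its `window_id` through the `id=` selector form.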