Improve docs structure and navigation (#12)

* Improve docs structure and navigation

Co-authored-by: Codex <noreply@openai.com>

* rm

* handwrite docs

---------

Co-authored-by: Codex <noreply@openai.com>
Hari 2026-03-26 11:27:35 -04:00 committed by GitHub
parent 844f2f2bc6
commit 2b02513d6e
8 changed files with 69 additions and 142 deletions

@ -30,7 +30,7 @@ function formatTocText(text: string): string {
<body> <body>
{ {
!isIndex && ( !isIndex && (
<nav> <nav class="breadcrumbs">
<a class="title" href="/"> <a class="title" href="/">
deskctl deskctl
</a> </a>

@ -1,98 +0,0 @@
---
layout: ../layouts/DocLayout.astro
title: Architecture
toc: true
---
# Architecture
## Public model
`deskctl` is a thin, non-interactive X11 control primitive for agent loops.
The public flow is:
- diagnose with `deskctl doctor`
- observe with `snapshot`, `list-windows`, and grouped `get` commands
- wait with grouped `wait` commands instead of shell `sleep`
- act with explicit selectors or coordinates
- verify with another read or snapshot
The tool stays intentionally narrow. It does not try to be a full desktop shell
or a speculative Wayland abstraction.
## Client-daemon architecture
The CLI talks to an auto-managed daemon over a Unix socket. The daemon keeps
the X11 connection alive so repeated commands stay fast and share the same
session-scoped window identity map.
Each CLI invocation sends one request, reads one response, and exits.
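The one-request, one-response exchange can be sketched from the client side. This is a minimal Python sketch rather than deskctl's actual client: it assumes each message is a single JSON object terminated by a newline, and both the `roundtrip` helper and any request shape passed to it are hypothetical.

```python
import json
import socket


def roundtrip(sock: socket.socket, request: dict) -> dict:
    """Send one newline-terminated JSON request and read one JSON response line.

    `request` is whatever the daemon expects; its schema is not specified
    here, so any concrete shape a caller passes in is an assumption.
    """
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    # makefile() gives buffered, line-oriented reads on top of the socket.
    with sock.makefile("rb") as reader:
        line = reader.readline()
    if not line:
        raise ConnectionError("daemon closed the connection without responding")
    return json.loads(line)
```

A real client would first connect a `socket.AF_UNIX` stream socket to the session's socket path, call `roundtrip` once, and exit.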
## Runtime contract
Requests and responses are newline-delimited JSON (NDJSON) over a Unix socket.
All commands share the same JSON envelope:
```json
{
"success": true,
"data": {},
"error": null
}
```
For window payloads, the public identity is `window_id`, not an X11 handle.
That keeps the contract backend-neutral even though the current support
boundary is X11-only.
The complete stable-vs-best-effort policy lives on the
[runtime contract](/runtime-contract) page.
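Because every command shares that envelope, a client can handle success and failure in one place. The following is a sketch only: `unwrap` and `DeskctlError` are hypothetical names, and the structure of the `error` field is not specified here, so it is passed through untouched.

```python
import json


class DeskctlError(Exception):
    """Raised when an envelope reports success == false (hypothetical name)."""


def unwrap(line: str) -> object:
    """Parse one response line and return its data, or raise on failure."""
    envelope = json.loads(line)
    if envelope.get("success"):
        return envelope.get("data")
    # The shape of "error" is not assumed; it is handed to the caller as-is.
    raise DeskctlError(envelope.get("error"))
```

Keeping the check in one helper means callers never need to scrape text output to detect a failed command.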
## Sessions and sockets
Each session gets its own socket path, PID file, and live window mapping.
Public socket resolution order:
1. `--socket`
2. `DESKCTL_SOCKET_DIR/{session}.sock`
3. `XDG_RUNTIME_DIR/deskctl/{session}.sock`
4. `~/.deskctl/{session}.sock`
Most users should let `deskctl` manage this automatically. `--session` is the
main public knob when you need isolated daemon instances.
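The resolution order above can be expressed as a small pure function. This is an illustrative sketch, not the daemon's implementation: `resolve_socket_path` is a hypothetical name, and treating an empty environment variable as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_socket_path(
    session: str,
    socket_flag: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the socket path for a session, following the documented order."""
    env = os.environ if env is None else env
    if socket_flag:
        # 1. An explicit --socket value wins.
        return Path(socket_flag)
    if env.get("DESKCTL_SOCKET_DIR"):
        # 2. DESKCTL_SOCKET_DIR/{session}.sock
        return Path(env["DESKCTL_SOCKET_DIR"]) / f"{session}.sock"
    if env.get("XDG_RUNTIME_DIR"):
        # 3. XDG_RUNTIME_DIR/deskctl/{session}.sock
        return Path(env["XDG_RUNTIME_DIR"]) / "deskctl" / f"{session}.sock"
    # 4. Fall back to ~/.deskctl/{session}.sock
    return Path.home() / ".deskctl" / f"{session}.sock"
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.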
## Diagnostics and failure handling
`deskctl doctor` runs before daemon startup and checks:
- display/session setup
- X11 connectivity
- basic window enumeration
- screenshot viability
- socket directory and stale-socket health
Selector and wait failures are structured in `--json` mode so clients can
recover without scraping text.
## Backend notes
The backend is built around a `DesktopBackend` trait and currently ships with
an X11 implementation backed by `x11rb`.
The important public guarantee is not "portable desktop automation." The
important guarantee is "a correct and unsurprising Linux X11 runtime contract."
## X11 support boundary
This phase supports Linux X11 only.
That means:
- EWMH/window-manager properties matter
- monitor naming and some ordering details are best-effort
- Wayland and Hyprland are out of scope for the current contract
The runtime documents those boundaries explicitly instead of pretending the
surface is broader than it is.

@ -6,7 +6,10 @@ toc: true
# Commands # Commands
## Observe The public CLI is intentionally small. Most workflows boil down to grouped
reads, grouped waits, selector-driven actions, and a few input primitives.
## Observe and inspect
```sh ```sh
deskctl doctor deskctl doctor
@ -25,9 +28,10 @@ deskctl get-mouse-position
`doctor` checks the runtime before daemon startup. `snapshot` produces a `doctor` checks the runtime before daemon startup. `snapshot` produces a
screenshot plus window refs. `list-windows` is the same window tree without the screenshot plus window refs. `list-windows` is the same window tree without the
side effect of writing a screenshot. side effect of writing a screenshot. The grouped `get` commands are the
preferred read surface for focused state queries.
## Wait ## Wait for state transitions
```sh ```sh
deskctl wait window --selector 'title=Firefox' --timeout 10 deskctl wait window --selector 'title=Firefox' --timeout 10
@ -38,7 +42,7 @@ deskctl --json wait window --selector 'class=firefox' --poll-ms 100
Wait commands return the matched window payload on success. In `--json` mode, Wait commands return the matched window payload on success. In `--json` mode,
timeouts and selector failures expose structured `kind` values. timeouts and selector failures expose structured `kind` values.
## Act on a window ## Act on windows
```sh ```sh
deskctl launch firefox deskctl launch firefox
@ -55,7 +59,7 @@ deskctl resize-window @w1 1280 720
Selector-driven actions accept refs, explicit selector modes, or absolute Selector-driven actions accept refs, explicit selector modes, or absolute
coordinates where appropriate. coordinates where appropriate.
## Input and mouse ## Keyboard and mouse input
```sh ```sh
deskctl type "hello world" deskctl type "hello world"
@ -71,16 +75,10 @@ Supported key names include `enter`, `tab`, `escape`, `backspace`, `delete`,
`space`, arrow keys, paging keys, `f1` through `f12`, and any single `space`, arrow keys, paging keys, `f1` through `f12`, and any single
character. character.
## Launch
```sh
deskctl launch firefox
deskctl launch code -- --new-window
```
## Selectors ## Selectors
Prefer explicit selectors when the target matters: Prefer explicit selectors when the target matters. They are clearer in logs,
more deterministic for automation, and easier to retry safely.
```sh ```sh
ref=w1 ref=w1

@ -8,27 +8,33 @@ import DocLayout from "../layouts/DocLayout.astro";
<img src="/favicon.svg" alt="" width="40" height="40" /> <img src="/favicon.svg" alt="" width="40" height="40" />
</header> </header>
<p class="tagline">non-interactive desktop control for AI agents</p> <p class="tagline">non-interactive desktop control cli for AI agents</p>
<p class="lede"> <p class="lede">
<code>deskctl</code> is a thin X11 control primitive for agent loops: diagnose A thin X11 control primitive for agent loops: diagnose the runtime, observe
the runtime, observe the desktop, wait for state transitions, act deterministically, the desktop, wait for state transitions, act deterministically, then verify.
then verify.
</p> </p>
<h2>Start here</h2> <h2>Start</h2>
<ul> <ul>
<li><a href="/installation">Installation</a></li> <li>
<li><a href="/quick-start">Quick start</a></li> <a href="/installation">Installation</a>
</li>
<li>
<a href="/quick-start">Quick start</a>
</li>
</ul> </ul>
<h2>Reference</h2> <h2>Reference</h2>
<ul> <ul>
<li><a href="/commands">Commands</a></li> <li>
<li><a href="/architecture">Architecture</a></li> <a href="/commands">Commands</a>
<li><a href="/runtime-contract">Runtime contract</a></li> </li>
<li>
<a href="/runtime-contract">Runtime contract</a>
</li>
</ul> </ul>
<h2>Links</h2> <h2>Links</h2>
@ -37,5 +43,8 @@ import DocLayout from "../layouts/DocLayout.astro";
<li> <li>
<a href="https://github.com/harivansh-afk/deskctl">GitHub</a> <a href="https://github.com/harivansh-afk/deskctl">GitHub</a>
</li> </li>
<li>
<a href="https://www.npmjs.com/package/deskctl">npm</a>
</li>
</ul> </ul>
</DocLayout> </DocLayout>

@ -6,19 +6,30 @@ toc: true
# Installation # Installation
## Default install Install the public `deskctl` command first, then validate the desktop runtime
with `deskctl doctor` before trying to automate anything.
## Recommended path
```sh ```sh
npm install -g deskctl npm install -g deskctl
deskctl doctor
``` ```
`deskctl` is the default install path. It installs the command by `deskctl` is the default install path. It installs the command by
downloading the matching GitHub Release asset for the supported runtime target. downloading the matching GitHub Release asset for the supported runtime target.
The repo skill lives under `skills/deskctl`, so `skills` can install it This path does not require a Rust toolchain. The installed command is always
directly from this GitHub repo. It is designed around the same observe -> wait `deskctl`, even though the release asset itself is target-specific.
-> act -> verify loop as the CLI. `-g` installs it globally; omit that flag if
you want a project-local install. ## Skill install
The repo skill lives under `skills/deskctl`, so you can install it
directly using `skills.sh`:
```sh
npx skills add harivansh-afk/deskctl
```
## Other install paths ## Other install paths
@ -29,7 +40,7 @@ nix run github:harivansh-afk/deskctl -- --help
nix profile install github:harivansh-afk/deskctl nix profile install github:harivansh-afk/deskctl
``` ```
### Build from source ### Rust
```sh ```sh
git clone https://github.com/harivansh-afk/deskctl git clone https://github.com/harivansh-afk/deskctl
@ -53,8 +64,13 @@ Source builds on Linux require:
The binary itself only depends on the standard Linux glibc runtime. The binary itself only depends on the standard Linux glibc runtime.
If setup fails, run: ## Verification
If setup fails for any reason, start here:
```sh ```sh
deskctl doctor deskctl doctor
``` ```
`doctor` checks X11 connectivity, window enumeration, screenshot viability, and
daemon/socket health before normal command execution.

@ -6,17 +6,19 @@ toc: true
# Quick start # Quick start
## Install and diagnose The fastest way to use `deskctl` is to follow the same four-step loop: observe, wait, act, verify.
## 1. Install and diagnose
```sh ```sh
npm install -g deskctl npm install -g deskctl
deskctl doctor deskctl doctor
``` ```
Use `deskctl doctor` first. It checks X11 connectivity, basic enumeration, Run `deskctl doctor` first. It checks X11 connectivity, basic enumeration,
screenshot viability, and socket health before you start driving the desktop. screenshot viability, and socket health before you start driving the desktop.
## Observe ## 2. Observe the desktop
```sh ```sh
deskctl snapshot --annotate deskctl snapshot --annotate
@ -29,7 +31,7 @@ Use `snapshot` when you want a screenshot artifact plus window refs. Use
`list-windows` when you only need the current window tree without writing a `list-windows` when you only need the current window tree without writing a
screenshot. screenshot.
## Target windows cleanly ## 3. Pick selectors that stay readable
Prefer explicit selectors when you need deterministic targeting: Prefer explicit selectors when you need deterministic targeting:
@ -44,7 +46,7 @@ focused
Legacy refs such as `@w1` still work after `snapshot` or `list-windows`. Bare Legacy refs such as `@w1` still work after `snapshot` or `list-windows`. Bare
strings like `firefox` are fuzzy matches and now fail on ambiguity. strings like `firefox` are fuzzy matches and now fail on ambiguity.
## Wait, act, verify ## 4. Wait, act, verify
The core loop is: The core loop is:
@ -69,7 +71,7 @@ deskctl snapshot
The wait commands return the matched window payload on success, so they compose The wait commands return the matched window payload on success, so they compose
cleanly into the next action. cleanly into the next action.
## Use `--json` when parsing matters ## 5. Use `--json` when parsing matters
Every command supports `--json` and uses the same top-level envelope: Every command supports `--json` and uses the same top-level envelope:

@ -11,7 +11,7 @@ This page defines the current public output contract for `deskctl`.
It is intentionally scoped to the current Linux X11 runtime surface. It does It is intentionally scoped to the current Linux X11 runtime surface. It does
not promise stability for future Wayland or window-manager-specific features. not promise stability for future Wayland or window-manager-specific features.
## JSON envelope ## Stable top-level envelope
Every command supports `--json` and uses the same top-level envelope: Every command supports `--json` and uses the same top-level envelope:
@ -32,7 +32,7 @@ Stable top-level fields:
If `success` is `false`, the command exits non-zero in both text mode and JSON If `success` is `false`, the command exits non-zero in both text mode and JSON
mode. mode.
## Stable window fields ## Stable window payload
Whenever a response includes a window payload, these fields are stable: Whenever a response includes a window payload, these fields are stable:

@ -224,30 +224,30 @@ hr {
} }
} }
nav { .breadcrumbs {
max-width: 50rem; max-width: 50rem;
margin: 0 auto; margin: 0 auto;
padding: 1.5rem clamp(1.25rem, 5vw, 3rem) 0; padding: 1.5rem clamp(1.25rem, 5vw, 3rem) 0;
font-size: 0.9rem; font-size: 0.9rem;
} }
nav a { .breadcrumbs a {
color: inherit; color: inherit;
text-decoration: none; text-decoration: none;
opacity: 0.6; opacity: 0.6;
transition: opacity 0.15s; transition: opacity 0.15s;
} }
nav a:hover { .breadcrumbs a:hover {
opacity: 1; opacity: 1;
} }
nav .title { .breadcrumbs .title {
font-weight: 500; font-weight: 500;
opacity: 1; opacity: 1;
} }
nav .sep { .breadcrumbs .sep {
opacity: 0.3; opacity: 0.3;
margin: 0 0.5em; margin: 0 0.5em;
} }