Mirror of https://github.com/harivansh-afk/deskctl.git (synced 2026-04-15 09:01:15 +00:00)
Commit bc43b5878b (parent c69d0fa569): scaffold docs
6 changed files with 356 additions and 78 deletions
|
|
@@ -8,12 +8,71 @@ toc: true
|
||||||
|
|
||||||
## Client-daemon model
|
## Client-daemon model
|
||||||
|
|
||||||
deskctl uses a client-daemon architecture over Unix sockets with an NDJSON wire protocol. The daemon starts automatically on the first command and keeps the X11 connection alive for fast repeated calls.
|
deskctl uses a client-daemon architecture over Unix sockets. The daemon starts automatically on the first command and keeps the X11 connection alive so repeated calls skip the connection setup overhead.
|
||||||
|
|
||||||
|
Each command opens a new connection to the daemon, sends a single NDJSON request, reads one NDJSON response, and exits.
|
||||||
|
|
||||||
|
## Wire protocol
|
||||||
|
|
||||||
|
Requests and responses are newline-delimited JSON (NDJSON) over a Unix socket.
|
||||||
|
|
||||||
|
**Request:**
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"id": "r123456", "action": "snapshot", "annotate": true}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Response:**
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"success": true, "data": {"screenshot": "/tmp/deskctl-1234567890.png", "windows": [...]}}
|
||||||
|
```
|
||||||
|
|
||||||
|
Error responses include an `error` field:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"success": false, "error": "window not found: @w99"}
|
||||||
|
```
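
As a rough illustration of the exchange described above, the following Python sketch encodes a request as one NDJSON line and decodes one response line. The function names (`encode_request`, `decode_response`, `call_daemon`) are hypothetical and not part of deskctl; only the field names `id`, `action`, `success`, `data`, and `error` come from the examples above.

```python
import json
import socket


def encode_request(request_id, action, **params):
    """Serialize one request as a single newline-terminated NDJSON line."""
    message = {"id": request_id, "action": action, **params}
    # Compact separators keep the whole request on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_response(line):
    """Parse one NDJSON response line; return `data` or raise on an error response."""
    response = json.loads(line)
    if not response.get("success"):
        raise RuntimeError(response.get("error", "unknown error"))
    return response.get("data")


def call_daemon(socket_path, request_id, action, **params):
    """Open a new connection, send one request, read one response line, and close.

    Assumes a daemon is already listening on `socket_path`.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(encode_request(request_id, action, **params))
        with sock.makefile("rb") as reader:
            return decode_response(reader.readline())
```

With a running daemon, `call_daemon(path, "r123456", "snapshot", annotate=True)` would send the request shown earlier. How the real client generates request ids is not specified here, so the caller supplies one.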
|
||||||
|
|
||||||
|
## Socket location
|
||||||
|
|
||||||
|
The daemon socket is resolved in this order:
|
||||||
|
|
||||||
|
1. `--socket` flag (highest priority)
|
||||||
|
2. `$DESKCTL_SOCKET_DIR/{session}.sock`
|
||||||
|
3. `$XDG_RUNTIME_DIR/deskctl/{session}.sock`
|
||||||
|
4. `~/.deskctl/{session}.sock`
|
||||||
|
|
||||||
|
PID files are stored alongside the socket.
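
The resolution order above can be sketched in Python as follows. This is a minimal illustration, not deskctl's implementation: the function name `resolve_socket_path` is hypothetical, and treating a variable as usable only when it is set and non-empty is an assumption.

```python
import os


def resolve_socket_path(session="default", socket_flag=None, env=None):
    """Return the socket path using the documented priority order."""
    env = os.environ if env is None else env

    # 1. An explicit --socket value wins over everything else.
    if socket_flag:
        return socket_flag

    # 2. $DESKCTL_SOCKET_DIR/{session}.sock
    socket_dir = env.get("DESKCTL_SOCKET_DIR")
    if socket_dir:
        return os.path.join(socket_dir, f"{session}.sock")

    # 3. $XDG_RUNTIME_DIR/deskctl/{session}.sock
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return os.path.join(runtime_dir, "deskctl", f"{session}.sock")

    # 4. ~/.deskctl/{session}.sock (assumption: HOME locates the home directory)
    home = env.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".deskctl", f"{session}.sock")
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment.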
|
||||||
|
|
||||||
|
## Sessions
|
||||||
|
|
||||||
|
Multiple isolated daemon instances can run simultaneously using the `--session` flag:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl --session workspace1 snapshot
|
||||||
|
deskctl --session workspace2 snapshot
|
||||||
|
```
|
||||||
|
|
||||||
|
Each session has its own socket, PID file, and window ref map.
|
||||||
|
|
||||||
## Backend design
|
## Backend design
|
||||||
|
|
||||||
The backend is trait-based, making it straightforward to add support for different display servers. The current implementation targets X11 via `x11rb`.
|
The core is built around a `DesktopBackend` trait. The current implementation uses `x11rb` for X11 protocol operations and `enigo` for input simulation.
|
||||||
|
|
||||||
## Wayland support
|
The trait-based design means adding Wayland support is a single trait implementation with no changes to the core, CLI, or daemon code.
|
||||||
|
|
||||||
Coming soon. The trait-based backend design means adding Hyprland/Wayland support is a single trait implementation with zero refactoring of the core.
|
## X11 integration
|
||||||
|
|
||||||
|
Window detection uses EWMH properties:
|
||||||
|
|
||||||
|
| Property | Purpose |
|
||||||
|
|----------|---------|
|
||||||
|
| `_NET_CLIENT_LIST_STACKING` | Window stacking order |
|
||||||
|
| `_NET_ACTIVE_WINDOW` | Currently focused window |
|
||||||
|
| `_NET_WM_NAME` | Window title (UTF-8) |
|
||||||
|
| `_NET_WM_STATE_HIDDEN` | Minimized state |
|
||||||
|
| `_NET_CLOSE_WINDOW` | Graceful close |
|
||||||
|
| `WM_CLASS` | Application class/name |
|
||||||
|
|
||||||
|
Window detection falls back to `XQueryTree` if `_NET_CLIENT_LIST_STACKING` is unavailable.
|
||||||
|
|
|
||||||
176
site/src/pages/commands.mdx
Normal file
|
|
@@ -0,0 +1,176 @@
|
||||||
|
---
|
||||||
|
layout: ../layouts/DocLayout.astro
|
||||||
|
title: Commands
|
||||||
|
toc: true
|
||||||
|
---
|
||||||
|
|
||||||
|
# Commands
|
||||||
|
|
||||||
|
## Snapshot
|
||||||
|
|
||||||
|
Capture a screenshot and get the window tree:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl snapshot
|
||||||
|
deskctl snapshot --annotate
|
||||||
|
```
|
||||||
|
|
||||||
|
With `--annotate`, colored bounding boxes and `@wN` labels are drawn on the screenshot. Each window gets a unique color from an 8-color palette. Minimized windows are skipped.
|
||||||
|
|
||||||
|
The screenshot is saved to `/tmp/deskctl-{timestamp}.png`.
|
||||||
|
|
||||||
|
## Click
|
||||||
|
|
||||||
|
Click the center of a window by ref, or click exact coordinates:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl click @w1
|
||||||
|
deskctl click 960,540
|
||||||
|
```
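
To make the two target forms concrete, here is a hedged Python sketch that turns either a `@wN` ref or an `x,y` string into click coordinates. The helper `parse_click_target` is hypothetical. It assumes window entries carry the `ref_id`, `x`, `y`, `width`, and `height` fields used in the JSON snapshot output, and that the center is computed with integer division; the real rounding may differ.

```python
def parse_click_target(target, windows):
    """Return (x, y) for a `@wN` window ref or an `x,y` coordinate string."""
    if target.startswith("@"):
        ref_id = target[1:]  # "@w1" -> "w1"
        for win in windows:
            if win["ref_id"] == ref_id:
                # Center of the window's bounding box.
                return (win["x"] + win["width"] // 2,
                        win["y"] + win["height"] // 2)
        raise LookupError(f"window not found: {target}")

    # Otherwise expect exact coordinates such as "960,540".
    x_text, y_text = target.split(",")
    return (int(x_text), int(y_text))
```

For a 1920x1080 window at the origin, the ref form and the literal `960,540` resolve to the same point.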
|
||||||
|
|
||||||
|
## Double click
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl dblclick @w1
|
||||||
|
deskctl dblclick 500,300
|
||||||
|
```
|
||||||
|
|
||||||
|
## Type
|
||||||
|
|
||||||
|
Type a string into the focused window:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl type "hello world"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Press
|
||||||
|
|
||||||
|
Press a single key:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl press enter
|
||||||
|
deskctl press tab
|
||||||
|
deskctl press escape
|
||||||
|
```
|
||||||
|
|
||||||
|
Supported key names: `enter`, `tab`, `escape`, `backspace`, `delete`, `space`, `up`, `down`, `left`, `right`, `home`, `end`, `pageup`, `pagedown`, `f1`-`f12`, or any single character.
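
An agent may want to validate a key name before calling `deskctl press`. The check below is a small sketch built only from the list above. `is_supported_key` is a hypothetical helper, and treating names as case-sensitive lowercase is an assumption.

```python
# Named keys from the list above, plus f1 through f12.
NAMED_KEYS = {
    "enter", "tab", "escape", "backspace", "delete", "space",
    "up", "down", "left", "right", "home", "end", "pageup", "pagedown",
} | {f"f{n}" for n in range(1, 13)}


def is_supported_key(name):
    """True for a named key or any single character."""
    return name in NAMED_KEYS or len(name) == 1
```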
|
||||||
|
|
||||||
|
## Hotkey
|
||||||
|
|
||||||
|
Send a key combination. List modifier keys first, then the target key:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl hotkey ctrl c
|
||||||
|
deskctl hotkey ctrl shift t
|
||||||
|
deskctl hotkey alt f4
|
||||||
|
```
|
||||||
|
|
||||||
|
Modifier names: `ctrl`, `alt`, `shift`, `super` (also `meta` or `win`).
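
The "modifiers first, then the target key" convention can be illustrated with a short Python sketch. `split_hotkey` and `MODIFIER_ALIASES` are hypothetical names, and normalizing `meta` and `win` to `super` is an assumption based on the alias list above.

```python
# Map every accepted modifier spelling to one canonical name.
MODIFIER_ALIASES = {
    "ctrl": "ctrl",
    "alt": "alt",
    "shift": "shift",
    "super": "super",
    "meta": "super",
    "win": "super",
}


def split_hotkey(keys):
    """Split ["ctrl", "shift", "t"] into (["ctrl", "shift"], "t")."""
    if not keys:
        raise ValueError("hotkey needs at least one key")
    # Everything before the last item must be a modifier.
    *modifier_names, target = keys
    modifiers = []
    for name in modifier_names:
        if name not in MODIFIER_ALIASES:
            raise ValueError(f"unknown modifier: {name}")
        modifiers.append(MODIFIER_ALIASES[name])
    return modifiers, target
```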
|
||||||
|
|
||||||
|
## Mouse move
|
||||||
|
|
||||||
|
Move the cursor to absolute coordinates:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl mouse move 100 200
|
||||||
|
```
|
||||||
|
|
||||||
|
## Mouse scroll
|
||||||
|
|
||||||
|
Scroll the mouse wheel. Positive values scroll down, negative scroll up:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl mouse scroll 3
|
||||||
|
deskctl mouse scroll -5
|
||||||
|
deskctl mouse scroll 3 --axis horizontal
|
||||||
|
```
|
||||||
|
|
||||||
|
## Mouse drag
|
||||||
|
|
||||||
|
Drag from one position to another:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl mouse drag 100 200 500 600
|
||||||
|
```
|
||||||
|
|
||||||
|
## Focus
|
||||||
|
|
||||||
|
Focus a window by ref or by name (case-insensitive substring match):
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl focus @w1
|
||||||
|
deskctl focus "firefox"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Close
|
||||||
|
|
||||||
|
Close a window gracefully:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl close @w2
|
||||||
|
deskctl close "terminal"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Move window
|
||||||
|
|
||||||
|
Move a window to an absolute position:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl move-window @w1 0 0
|
||||||
|
deskctl move-window "firefox" 100 100
|
||||||
|
```
|
||||||
|
|
||||||
|
## Resize window
|
||||||
|
|
||||||
|
Resize a window:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl resize-window @w1 1280 720
|
||||||
|
```
|
||||||
|
|
||||||
|
## List windows
|
||||||
|
|
||||||
|
List all windows without taking a screenshot:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl list-windows
|
||||||
|
```
|
||||||
|
|
||||||
|
## Get screen size
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl get-screen-size
|
||||||
|
```
|
||||||
|
|
||||||
|
## Get mouse position
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl get-mouse-position
|
||||||
|
```
|
||||||
|
|
||||||
|
## Screenshot
|
||||||
|
|
||||||
|
Take a screenshot without the window tree. Optionally specify a save path:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl screenshot
|
||||||
|
deskctl screenshot /tmp/my-screenshot.png
|
||||||
|
deskctl screenshot --annotate
|
||||||
|
```
|
||||||
|
|
||||||
|
## Launch
|
||||||
|
|
||||||
|
Launch an application:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl launch firefox
|
||||||
|
deskctl launch code --args /path/to/project
|
||||||
|
```
|
||||||
|
|
||||||
|
## Global options
|
||||||
|
|
||||||
|
| Flag | Env | Description |
|
||||||
|
|------|-----|-------------|
|
||||||
|
| `--json` | | Output as JSON |
|
||||||
|
| `--socket <path>` | `DESKCTL_SOCKET` | Path to daemon Unix socket |
|
||||||
|
| `--session <name>` | | Session name for multiple daemons (default: `default`) |
|
||||||
|
|
@@ -9,16 +9,22 @@ import DocLayout from "../layouts/DocLayout.astro";
|
||||||
</header>
|
</header>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
X11 desktop control CLI for AI agents on Linux. Snapshot, click, type, and
|
Desktop control CLI for AI agents on Linux X11. Compact JSON output
|
||||||
focus windows through a simple command-line interface with a client-daemon
|
for agent loops. Screenshot, click, type, scroll, drag, and manage
|
||||||
architecture over Unix sockets.
|
windows through a fast client-daemon architecture. 100% native Rust.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h2>Documentation</h2>
|
<h2>Getting started</h2>
|
||||||
|
|
||||||
<ul>
|
<ul>
|
||||||
<li><a href="/installation">Installation</a></li>
|
<li><a href="/installation">Installation</a></li>
|
||||||
<li><a href="/usage">Usage</a></li>
|
<li><a href="/quick-start">Quick start</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2>Reference</h2>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li><a href="/commands">Commands</a></li>
|
||||||
<li><a href="/architecture">Architecture</a></li>
|
<li><a href="/architecture">Architecture</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -1,6 +1,7 @@
|
||||||
---
|
---
|
||||||
layout: ../layouts/DocLayout.astro
|
layout: ../layouts/DocLayout.astro
|
||||||
title: Installation
|
title: Installation
|
||||||
|
toc: true
|
||||||
---
|
---
|
||||||
|
|
||||||
# Installation
|
# Installation
|
||||||
|
|
@@ -11,9 +12,17 @@ title: Installation
|
||||||
cargo install deskctl
|
cargo install deskctl
|
||||||
```
|
```
|
||||||
|
|
||||||
## Docker build
|
## From source
|
||||||
|
|
||||||
Build a Linux binary with Docker:
|
```sh
|
||||||
|
git clone https://github.com/harivansh-afk/deskctl
|
||||||
|
cd deskctl
|
||||||
|
cargo build --release
|
||||||
|
```
|
||||||
|
|
||||||
|
## Docker (cross-compile for Linux)
|
||||||
|
|
||||||
|
Build a static Linux binary from any platform:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
docker compose -f docker/docker-compose.yml run --rm build
|
docker compose -f docker/docker-compose.yml run --rm build
|
||||||
|
|
@@ -21,25 +30,19 @@ docker compose -f docker/docker-compose.yml run --rm build
|
||||||
|
|
||||||
This writes `dist/deskctl-linux-x86_64`.
|
This writes `dist/deskctl-linux-x86_64`.
|
||||||
|
|
||||||
## From source
|
|
||||||
|
|
||||||
```sh
|
|
||||||
git clone https://github.com/harivansh-afk/deskctl
|
|
||||||
cd deskctl
|
|
||||||
cargo build
|
|
||||||
```
|
|
||||||
|
|
||||||
## Deploy to a remote machine
|
## Deploy to a remote machine
|
||||||
|
|
||||||
Copy the binary to an SSH machine:
|
Copy the binary over SSH when `scp` is not available:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
ssh -p 443 deskctl@ssh.agentcomputer.ai 'cat > ~/deskctl && chmod +x ~/deskctl' < dist/deskctl-linux-x86_64
|
ssh -p 443 user@host 'cat > ~/deskctl && chmod +x ~/deskctl' < dist/deskctl-linux-x86_64
|
||||||
```
|
```
|
||||||
|
|
||||||
## Runtime requirements
|
## Requirements
|
||||||
|
|
||||||
- Linux with X11 session
|
- Linux with an active X11 session
|
||||||
- `DISPLAY` environment variable set
|
- `DISPLAY` environment variable set (e.g. `DISPLAY=:1`)
|
||||||
- `XDG_SESSION_TYPE=x11`
|
- `XDG_SESSION_TYPE=x11`
|
||||||
- A window manager exposing standard EWMH properties
|
- A window manager that exposes EWMH properties (`_NET_CLIENT_LIST_STACKING`, `_NET_ACTIVE_WINDOW`)
|
||||||
|
|
||||||
|
No extra native libraries are needed beyond the standard system runtime libraries (`libc`, `libm`, `libgcc_s`).
|
||||||
|
|
|
||||||
87
site/src/pages/quick-start.mdx
Normal file
|
|
@@ -0,0 +1,87 @@
|
||||||
|
---
|
||||||
|
layout: ../layouts/DocLayout.astro
|
||||||
|
title: Quick start
|
||||||
|
toc: true
|
||||||
|
---
|
||||||
|
|
||||||
|
# Quick start
|
||||||
|
|
||||||
|
## Core workflow
|
||||||
|
|
||||||
|
The typical agent loop is: snapshot the desktop, interpret the result, act on it.
|
||||||
|
|
||||||
|
```sh
|
||||||
|
# 1. see the desktop
|
||||||
|
deskctl --json snapshot --annotate
|
||||||
|
|
||||||
|
# 2. click a window by its ref
|
||||||
|
deskctl click @w1
|
||||||
|
|
||||||
|
# 3. type into the focused window
|
||||||
|
deskctl type "hello world"
|
||||||
|
|
||||||
|
# 4. press a key
|
||||||
|
deskctl press enter
|
||||||
|
```
|
||||||
|
|
||||||
|
The `--annotate` flag draws colored bounding boxes and `@wN` labels on the screenshot so agents can visually identify windows.
|
||||||
|
|
||||||
|
## Window refs
|
||||||
|
|
||||||
|
Every `snapshot` assigns a ref (`@w1`, `@w2`, ...) to each visible window, ordered top-to-bottom by stacking order. Use these refs anywhere a selector is expected:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl click @w1
|
||||||
|
deskctl focus @w3
|
||||||
|
deskctl close @w2
|
||||||
|
```
|
||||||
|
|
||||||
|
You can also select windows by name (case-insensitive substring match):
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl focus "firefox"
|
||||||
|
deskctl close "terminal"
|
||||||
|
```
|
||||||
|
|
||||||
|
## JSON output
|
||||||
|
|
||||||
|
Pass `--json` for machine-readable output. This is the primary mode for agent integrations:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
deskctl --json snapshot
|
||||||
|
```
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"success": true,
|
||||||
|
"data": {
|
||||||
|
"screenshot": "/tmp/deskctl-1234567890.png",
|
||||||
|
"windows": [
|
||||||
|
{
|
||||||
|
"ref_id": "w1",
|
||||||
|
"xcb_id": 12345678,
|
||||||
|
"title": "Firefox",
|
||||||
|
"app_name": "firefox",
|
||||||
|
"x": 0,
|
||||||
|
"y": 0,
|
||||||
|
"width": 1920,
|
||||||
|
"height": 1080,
|
||||||
|
"focused": true,
|
||||||
|
"minimized": false
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
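
An agent loop usually needs to pull one window out of this payload. The sketch below shows one way to do that in Python. `focused_window` is a hypothetical helper that relies only on the fields shown in the example response above.

```python
import json


def focused_window(snapshot_json):
    """Return the focused, non-minimized window from a snapshot payload, or None."""
    payload = json.loads(snapshot_json)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "snapshot failed"))
    for win in payload["data"]["windows"]:
        if win["focused"] and not win["minimized"]:
            return win
    return None
```

The input would typically be the standard output of `deskctl --json snapshot`, captured by whatever process runner the agent uses.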
|
||||||
|
|
||||||
|
## Daemon lifecycle
|
||||||
|
|
||||||
|
The daemon starts automatically on the first command. It keeps the X11 connection alive so repeated calls are fast. You do not need to manage it manually.
|
||||||
|
|
||||||
|
```sh
|
||||||
|
# check if the daemon is running
|
||||||
|
deskctl daemon status
|
||||||
|
|
||||||
|
# stop it explicitly
|
||||||
|
deskctl daemon stop
|
||||||
|
```
|
||||||
|
|
@@ -1,53 +0,0 @@
|
||||||
---
|
|
||||||
layout: ../layouts/DocLayout.astro
|
|
||||||
title: Usage
|
|
||||||
toc: true
|
|
||||||
---
|
|
||||||
|
|
||||||
# Usage
|
|
||||||
|
|
||||||
## Snapshot
|
|
||||||
|
|
||||||
Capture the current desktop state:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl snapshot
|
|
||||||
```
|
|
||||||
|
|
||||||
With annotations overlaid on windows:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl --json snapshot --annotate
|
|
||||||
```
|
|
||||||
|
|
||||||
## Click
|
|
||||||
|
|
||||||
Click a window by its annotation handle:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl click @w1
|
|
||||||
```
|
|
||||||
|
|
||||||
## Type
|
|
||||||
|
|
||||||
Type text into the focused window:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl type "hello world"
|
|
||||||
```
|
|
||||||
|
|
||||||
## Focus
|
|
||||||
|
|
||||||
Focus a window by name:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl focus "firefox"
|
|
||||||
```
|
|
||||||
|
|
||||||
## JSON output
|
|
||||||
|
|
||||||
Pass `--json` for machine-readable output, useful for agent integrations:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
deskctl --json snapshot
|
|
||||||
```
|
|
||||||