mirror of
https://github.com/harivansh-afk/deskctl.git
synced 2026-04-15 04:03:28 +00:00
256 lines
6.1 KiB
Markdown
256 lines
6.1 KiB
Markdown
# deskctl
|
|
|
|
Desktop control CLI for AI agents on Linux X11.
|
|
|
|
## Install
|
|
|
|
### Cargo
|
|
|
|
```bash
|
|
cargo install deskctl
|
|
```
|
|
|
|
Source builds on Linux require:
|
|
|
|
- Rust 1.75+
|
|
- `pkg-config`
|
|
- X11 development libraries for input and windowing, typically `libx11-dev` and `libxtst-dev` on Debian/Ubuntu
|
|
|
|
### npm
|
|
|
|
```bash
|
|
npm install -g deskctl-cli
|
|
deskctl --help
|
|
```
|
|
|
|
One-shot execution is also supported:
|
|
|
|
```bash
|
|
npx deskctl-cli --help
|
|
```
|
|
|
|
`deskctl-cli` currently supports `linux-x64` and installs the `deskctl` command by downloading the matching GitHub Release asset.
|
|
|
|
### Nix
|
|
|
|
```bash
|
|
nix run github:harivansh-afk/deskctl -- --help
|
|
nix profile install github:harivansh-afk/deskctl
|
|
```
|
|
|
|
The repo flake is the supported Nix install surface in this phase.
|
|
|
|
### Docker Convenience
|
|
|
|
Build a Linux binary locally with Docker:
|
|
|
|
```bash
|
|
docker compose -f docker/docker-compose.yml run --rm build
|
|
```
|
|
|
|
This writes `dist/deskctl-linux-x86_64`.
|
|
|
|
Copy it to an SSH machine where `scp` is unavailable:
|
|
|
|
```bash
|
|
ssh -p 443 deskctl@ssh.agentcomputer.ai 'cat > ~/deskctl && chmod +x ~/deskctl' < dist/deskctl-linux-x86_64
|
|
```
|
|
|
|
Run it on an X11 session:
|
|
|
|
```bash
|
|
DISPLAY=:1 XDG_SESSION_TYPE=x11 ~/deskctl --json snapshot --annotate
|
|
```
|
|
|
|
### Local Source Build
|
|
|
|
```bash
|
|
cargo build
|
|
```
|
|
|
|
## Quick Start
|
|
|
|
```bash
|
|
# Diagnose the environment first
|
|
deskctl doctor
|
|
|
|
# See the desktop
|
|
deskctl snapshot
|
|
|
|
# Query focused runtime state
|
|
deskctl get active-window
|
|
deskctl get monitors
|
|
|
|
# Click a window
|
|
deskctl click @w1
|
|
|
|
# Type text
|
|
deskctl type "hello world"
|
|
|
|
# Wait for a window or focus transition
|
|
deskctl wait window --selector 'title=Firefox' --timeout 10
|
|
deskctl wait focus --selector 'class=firefox' --timeout 5
|
|
|
|
# Focus by explicit selector
|
|
deskctl focus 'title=Firefox'
|
|
```
|
|
|
|
## Architecture
|
|
|
|
Client-daemon architecture over Unix sockets (NDJSON wire protocol).
|
|
The daemon starts automatically on first command and keeps the X11 connection alive for fast repeated calls.
|
|
|
|
Source layout:
|
|
|
|
- `src/lib.rs` exposes the shared library target
|
|
- `src/main.rs` is the thin CLI wrapper
|
|
- `src/` contains production code and unit tests
|
|
- `tests/` contains Linux/X11 integration tests
|
|
- `tests/support/` contains shared integration helpers
|
|
|
|
## Runtime Requirements
|
|
|
|
- Linux with X11 session
|
|
- Rust 1.75+ plus the source-build dependencies above when building from source
|
|
|
|
The binary itself only links the standard glibc runtime on Linux (`libc`, `libm`, `libgcc_s`).
|
|
|
|
For deskctl to be fully functional on a fresh VM you still need:
|
|
|
|
- an X11 server and an active `DISPLAY`
|
|
- `XDG_SESSION_TYPE=x11` or an equivalent X11 session environment
|
|
- a window manager or desktop environment that exposes standard EWMH properties such as `_NET_CLIENT_LIST_STACKING` and `_NET_ACTIVE_WINDOW`
|
|
- an X server with the extensions needed for input simulation and screen metadata, which is standard on normal desktop X11 setups
|
|
|
|
If setup fails, run:
|
|
|
|
```bash
|
|
deskctl doctor
|
|
```
|
|
|
|
## Contract Notes
|
|
|
|
- `@wN` refs are short-lived handles assigned by `snapshot` and `list-windows`
|
|
- `--json` output includes a stable `window_id` for programmatic targeting within the current daemon session
|
|
- `list-windows` is a cheap read-only operation and does not capture or write a screenshot
|
|
- the stable runtime JSON/error contract is documented in [docs/runtime-output.md](docs/runtime-output.md)
|
|
|
|
## Read and Wait Surface
|
|
|
|
The grouped runtime reads are:
|
|
|
|
```bash
|
|
deskctl get active-window
|
|
deskctl get monitors
|
|
deskctl get version
|
|
deskctl get systeminfo
|
|
```
|
|
|
|
The grouped runtime waits are:
|
|
|
|
```bash
|
|
deskctl wait window --selector 'title=Firefox' --timeout 10
|
|
deskctl wait focus --selector 'id=win3' --timeout 5
|
|
```
|
|
|
|
Successful `get active-window`, `wait window`, and `wait focus` responses return a `window` payload with:
|
|
- `ref_id`
|
|
- `window_id`
|
|
- `title`
|
|
- `app_name`
|
|
- geometry (`x`, `y`, `width`, `height`)
|
|
- state flags (`focused`, `minimized`)
|
|
|
|
`get monitors` returns:
|
|
- `count`
|
|
- `monitors[]` with geometry and primary/automatic flags
|
|
|
|
`get version` returns:
|
|
- `version`
|
|
- `backend`
|
|
|
|
`get systeminfo` stays runtime-scoped and returns:
|
|
- `backend`
|
|
- `display`
|
|
- `session_type`
|
|
- `session`
|
|
- `socket_path`
|
|
- `screen`
|
|
- `monitor_count`
|
|
- `monitors`
|
|
|
|
Wait timeout and selector failures are structured in `--json` mode so agents can recover without string parsing.
|
|
|
|
## Output Policy
|
|
|
|
Text mode is compact and follow-up-oriented, but JSON is the parsing contract.
|
|
|
|
- use `--json` when an agent needs strict parsing
|
|
- rely on `window_id`, selector-related fields, grouped read payloads, and structured error `kind` values for stable automation
|
|
- treat monitor naming, incidental whitespace, and default screenshot file names as best-effort
|
|
|
|
See [docs/runtime-output.md](docs/runtime-output.md) for the exact stable-vs-best-effort breakdown.
|
|
|
|
## Distribution
|
|
|
|
- GitHub Releases are the canonical binary source
|
|
- crates.io package: `deskctl`
|
|
- npm package: `deskctl-cli`
|
|
- installed command on every channel: `deskctl`
|
|
- repo-owned Nix install path: `flake.nix`
|
|
|
|
For maintainer publishing and release steps, see [docs/releasing.md](docs/releasing.md).
|
|
|
|
## Selector Contract
|
|
|
|
Explicit selector modes:
|
|
|
|
```bash
|
|
ref=w1
|
|
id=win1
|
|
title=Firefox
|
|
class=firefox
|
|
focused
|
|
```
|
|
|
|
Legacy refs remain supported:
|
|
|
|
```bash
|
|
@w1
|
|
w1
|
|
win1
|
|
```
|
|
|
|
Bare selectors such as `firefox` are still supported as fuzzy substring matches, but they now fail on ambiguity and return candidate windows instead of silently picking the first match.
|
|
|
|
## Support Boundary
|
|
|
|
`deskctl` supports Linux X11 in this phase. Wayland and Hyprland are explicitly out of scope for the current runtime contract.
|
|
|
|
## Workflow
|
|
|
|
Local validation uses the root `Makefile`:
|
|
|
|
```bash
|
|
make fmt-check
|
|
make lint
|
|
make test-unit
|
|
make test-integration
|
|
make site-format-check
|
|
make validate
|
|
```
|
|
|
|
`make validate` is the full repo-quality check and requires Linux with `xvfb-run` plus `pnpm --dir site install`.
|
|
|
|
The repository standardizes on `pre-commit` for fast commit-time checks:
|
|
|
|
```bash
|
|
pre-commit install
|
|
pre-commit run --all-files
|
|
```
|
|
|
|
See [CONTRIBUTING.md](CONTRIBUTING.md) for the full contributor guide.
|
|
|
|
## Acknowledgements
|
|
|
|
- [@barrettruth](github.com/barrettruth) - i stole the website from [vimdoc](https://github.com/barrettruth/vimdoc-language-server)
|