Phase 6: utility commands, SKILL.md, AGENTS.md, README.md

- Implement screen_size via xcap Monitor, mouse_position via x11rb
  query_pointer, standalone screenshot with optional annotation,
  launch for spawning detached processes
- Handler dispatchers for get-screen-size, get-mouse-position,
  screenshot, launch
- SKILL.md agent discovery file with allowed-tools frontmatter
- AGENTS.md contributor guidelines for AI agents
- README.md with installation, quick start, architecture overview
Harivansh Rathi 2026-03-24 21:40:29 -04:00
parent 567115a6c2
commit 03dfd6b6ea
5 changed files with 339 additions and 8 deletions

README.md Normal file

@@ -0,0 +1,53 @@
# desktop-ctl
Desktop control CLI for AI agents on Linux X11. A single installable binary that gives agents full desktop access: screenshots with window refs, mouse/keyboard input, and window management.
Inspired by [agent-browser](https://github.com/vercel-labs/agent-browser) - but for the full desktop.
## Install
```bash
cargo install desktop-ctl
```
System dependencies (Debian/Ubuntu):
```bash
sudo apt install libxcb-dev libxrandr-dev libclang-dev
```
## Quick Start
```bash
# See the desktop
desktop-ctl snapshot
# Click a window
desktop-ctl click @w1
# Type text
desktop-ctl type "hello world"
# Focus by name
desktop-ctl focus "firefox"
```
## Architecture
Client-daemon architecture over Unix sockets (NDJSON wire protocol). The daemon starts automatically on first command and keeps the X11 connection alive for fast repeated calls.
```
Agent -> desktop-ctl CLI (thin client) -> Unix socket -> desktop-ctl daemon -> X11
```
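To make the wire format concrete, here is a minimal sketch of one NDJSON request/response round trip over a Unix socket, using only the Rust standard library. It is an illustration under stated assumptions, not the project's actual implementation: the helper names (`frame`, `round_trip`, `serve_one`) and the JSON shapes (`{"command":"snapshot"}`, `{"ok":true}`) are hypothetical, and a connected socket pair stands in for the real daemon socket.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Frame one message as NDJSON: a single JSON document followed by '\n'.
/// A raw newline inside the payload would break the framing, so reject it.
fn frame(json: &str) -> io::Result<String> {
    if json.contains('\n') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "raw newline in NDJSON payload",
        ));
    }
    Ok(format!("{json}\n"))
}

/// Client side: write one request line, then block until one reply line arrives.
fn round_trip(stream: &UnixStream, request_json: &str) -> io::Result<String> {
    let mut writer = stream; // `&UnixStream` implements `Write`
    writer.write_all(frame(request_json)?.as_bytes())?;
    writer.flush()?;

    let mut reader = BufReader::new(stream);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line.trim_end_matches('\n').to_string())
}

/// Stand-in for the daemon: read one request line and acknowledge it.
/// A real daemon would parse the JSON and dispatch to a handler instead.
fn serve_one(stream: UnixStream) -> io::Result<()> {
    let mut reader = BufReader::new(&stream);
    let mut line = String::new();
    reader.read_line(&mut line)?;
    let reply = frame(r#"{"ok":true}"#)?; // hypothetical response shape
    (&stream).write_all(reply.as_bytes())
}

fn main() -> io::Result<()> {
    // A connected pair replaces the real socket path for this sketch.
    let (client, daemon) = UnixStream::pair()?;
    let handle = thread::spawn(move || serve_one(daemon));

    // Hypothetical request shape; the real protocol may differ.
    let reply = round_trip(&client, r#"{"command":"snapshot"}"#)?;
    println!("{reply}");

    handle.join().expect("daemon thread panicked")?;
    Ok(())
}
```

Newline-delimited framing keeps the thin client simple: each command is one write and one `read_line`, with no length prefix to parse.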
## Requirements
- Linux with an X11 session
- Rust 1.75+ (for building)
## Wayland Support
Planned for v0.2. Because backends sit behind a trait, adding Hyprland/Wayland support should only require a new trait implementation, without refactoring the core.
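As a rough illustration of what a trait-based backend can look like, the sketch below defines a small trait and two stub implementations. Everything here is hypothetical: the trait name `DesktopBackend`, its methods, and the stub types are assumptions made for the example and are not taken from the desktop-ctl source.

```rust
/// Hypothetical backend trait: the core only ever talks to this interface.
trait DesktopBackend {
    fn name(&self) -> &'static str;
    fn screen_size(&self) -> Result<(u32, u32), String>;
    fn mouse_position(&self) -> Result<(i32, i32), String>;
}

/// Stub standing in for an X11 backend; returns fixed values.
struct StubX11;

impl DesktopBackend for StubX11 {
    fn name(&self) -> &'static str {
        "x11-stub"
    }
    fn screen_size(&self) -> Result<(u32, u32), String> {
        Ok((1920, 1080))
    }
    fn mouse_position(&self) -> Result<(i32, i32), String> {
        Ok((10, 20))
    }
}

/// Stub standing in for a future Wayland backend; only the impl is new.
struct StubWayland;

impl DesktopBackend for StubWayland {
    fn name(&self) -> &'static str {
        "wayland-stub"
    }
    fn screen_size(&self) -> Result<(u32, u32), String> {
        Ok((2560, 1440))
    }
    fn mouse_position(&self) -> Result<(i32, i32), String> {
        Err("pointer query not implemented in this stub".to_string())
    }
}

/// Core logic written once against the trait, unaware of the concrete backend.
fn describe(backend: &dyn DesktopBackend) -> Result<String, String> {
    let (w, h) = backend.screen_size()?;
    let (x, y) = backend.mouse_position()?;
    Ok(format!("{}: {}x{}, pointer at ({}, {})", backend.name(), w, h, x, y))
}

fn main() {
    let backends: Vec<Box<dyn DesktopBackend>> =
        vec![Box::new(StubX11), Box::new(StubWayland)];
    for backend in &backends {
        match describe(backend.as_ref()) {
            Ok(summary) => println!("{summary}"),
            Err(err) => println!("{}: error: {err}", backend.name()),
        }
    }
}
```

In this shape, `describe` and any other core code stay unchanged when a new backend is added, which is the property the paragraph above relies on.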
## License
MIT OR Apache-2.0