deskctl/openspec/changes/archive/2026-03-25-stabilize-v0-2-foundation/design.md
Hari 6dce22eaef
stabilize (#3)
* specs

* Stabilize deskctl runtime foundation

Co-authored-by: Codex <noreply@openai.com>

* opsx archive

2026-03-25 18:31:08 -04:00

Context

deskctl already exposes a useful X11 runtime for screenshots, input, and window management, but the current implementation mixes concerns that must be separated before the repo can serve as a stable primitive. Public window data still exposes xcb_id, list-windows is routed through screenshot-producing code, setup failures are opaque, and daemon startup behavior is not explicit enough for higher-level tooling to reuse reliably.

Phase 1 is the foundation tranche. It should make the runtime contract clean and cheap to consume without expanding into packaging, skills, or non-X11 backends.

Goals / Non-Goals

Goals:

  • Define a backend-neutral public window identity and selector contract.
  • Make read-only window enumeration cheap and side-effect free.
  • Add a first-run diagnostics command that explains broken environments precisely.
  • Harden daemon startup and health behavior enough for predictable CLI use.
  • Keep the Phase 1 scope implementable in one focused change.

Non-Goals:

  • Wayland support or any additional backend implementation.
  • npm distribution, crates.io publishing, or release automation changes.
  • A broad new read surface, such as monitors, workspaces, clipboard, or batching.
  • Agent skills, config files, or policy/confirmation features.

Decisions

1. Public window identity becomes window_id, not xcb_id

The stable contract will expose an opaque window_id for programmatic targeting. Backend-specific handles such as X11 window IDs stay internal to daemon/backend state.

Rationale:

  • This removes X11 leakage from the public interface.
  • It keeps the future backend boundary open without promising a Wayland implementation now.
  • It makes selector behavior explicit: users and agents target @wN, window names, or window_id, not backend handles.

Alternatives considered:

  • Keep exposing xcb_id and add window_id alongside it. Rejected because it cements the wrong contract and encourages downstream dependence on X11 internals.
  • Hide programmatic identity entirely and rely only on refs. Rejected because refs are intentionally ephemeral and not sufficient for durable automation.
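
As a minimal sketch of the selector contract described in this decision, the parser below distinguishes the three targeting forms: @wN refs, window_id, and window names. The Selector type, the parse_selector function, and the "win-" prefix on window_id are all hypothetical; the spec only requires that window_id be opaque and does not fix its format.

```rust
/// Hypothetical selector type; variant names are illustrative only.
#[derive(Debug, PartialEq, Eq)]
pub enum Selector {
    /// Ephemeral ref such as `@w3`.
    Ref(u32),
    /// Opaque public identifier. The `win-` prefix is an assumption of
    /// this sketch, not something the spec defines.
    WindowId(String),
    /// Anything else is treated as a window name.
    Name(String),
}

/// Parses user input into a selector. Returns `None` for empty input
/// and for a malformed ref such as `@wx`.
pub fn parse_selector(input: &str) -> Option<Selector> {
    let input = input.trim();
    if input.is_empty() {
        return None;
    }
    if let Some(rest) = input.strip_prefix("@w") {
        // A ref must be `@w` followed by decimal digits only.
        if rest.is_empty() || !rest.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        return rest.parse::<u32>().ok().map(Selector::Ref);
    }
    if input.len() > "win-".len() && input.starts_with("win-") {
        return Some(Selector::WindowId(input.to_string()));
    }
    Some(Selector::Name(input.to_string()))
}
```

Note that no variant carries a backend handle, so a caller cannot express an X11 window ID through this type even by accident.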

2. Window enumeration and screenshot capture become separate backend operations

The backend API will separate:

  • window listing / state collection
  • screenshot capture
  • composed snapshot behavior

list-windows will call the cheap enumeration path directly. snapshot remains the convenience command that combines enumeration plus screenshot generation.

Rationale:

  • Read-only commands must not capture screenshots or write /tmp files.
  • This reduces latency and unintended side effects.
  • It clarifies which operations are safe to call frequently in agent loops.

Alternatives considered:

  • Keep the current snapshot(false) path and optimize it internally. Rejected because the coupling itself is the product problem; the API shape needs to reflect the intended behavior.
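
One possible shape for the split backend API is sketched below. The trait, struct names, and error type are assumptions made for illustration, not the actual deskctl code. FakeBackend is an in-memory stand-in that only counts capture calls and writes no files, which makes the "enumeration never captures" property easy to assert.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WindowInfo {
    pub window_id: String,
    pub title: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Screenshot {
    pub path: String,
}

#[derive(Debug)]
pub struct Snapshot {
    pub windows: Vec<WindowInfo>,
    pub screenshot: Screenshot,
}

/// Hypothetical backend boundary with the three operations kept apart.
pub trait Backend {
    /// Cheap, read-only enumeration: no capture, no files written.
    fn list_windows(&mut self) -> Result<Vec<WindowInfo>, String>;

    /// Screenshot capture; the only operation allowed to produce files.
    fn capture_screenshot(&mut self) -> Result<Screenshot, String>;

    /// Convenience composition built from the two primitives above.
    fn snapshot(&mut self) -> Result<Snapshot, String> {
        let windows = self.list_windows()?;
        let screenshot = self.capture_screenshot()?;
        Ok(Snapshot { windows, screenshot })
    }
}

/// In-memory stand-in used to observe how often capture is invoked.
pub struct FakeBackend {
    pub windows: Vec<WindowInfo>,
    pub captures: usize,
}

impl Backend for FakeBackend {
    fn list_windows(&mut self) -> Result<Vec<WindowInfo>, String> {
        Ok(self.windows.clone())
    }

    fn capture_screenshot(&mut self) -> Result<Screenshot, String> {
        // Only records the call; nothing is written to disk in this sketch.
        self.captures += 1;
        Ok(Screenshot { path: format!("/tmp/fake-{}.png", self.captures) })
    }
}
```

Because snapshot is a default method composed from the two primitives, both commands share the same window collection logic, and list-windows has no code path that reaches the capture step.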

3. deskctl doctor runs without requiring a healthy daemon

doctor will be implemented as a CLI command that can run before daemon startup. It will probe environment prerequisites directly and optionally inspect daemon/socket state as one of the checks.

Expected checks:

  • DISPLAY present
  • X11 session expectation (XDG_SESSION_TYPE=x11 or explicit note if missing)
  • X server connectivity
  • required extensions or equivalent runtime capabilities used by the backend
  • socket directory existence and permissions
  • basic window enumeration
  • screenshot capture viability

Rationale:

  • Diagnostics must work when daemon startup is the thing that is broken.
  • The command should return actionable failure messages, not “failed to connect to daemon.”

Alternatives considered:

  • Implement doctor as a normal daemon request. Rejected because it hides the startup path and cannot diagnose stale socket or spawn failures well.
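
The first two expected checks need no daemon and no X connection, which the sketch below illustrates. All type and function names are hypothetical. The environment lookup is passed in as a function so the checks can be exercised without touching the real process environment. Treating a non-x11 XDG_SESSION_TYPE as a failure (rather than a warning) is an assumption of this sketch; the spec only says a missing value gets an explicit note.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pass,
    Warn,
    Fail,
}

/// One line of doctor output: which check ran, how it went, and why.
#[derive(Debug)]
pub struct CheckResult {
    pub name: &'static str,
    pub status: Status,
    pub detail: String,
}

fn result(name: &'static str, status: Status, detail: String) -> CheckResult {
    CheckResult { name, status, detail }
}

/// Fails when DISPLAY is unset or empty.
pub fn check_display(get_env: &dyn Fn(&str) -> Option<String>) -> CheckResult {
    match get_env("DISPLAY") {
        Some(value) if !value.is_empty() => {
            result("display", Status::Pass, format!("DISPLAY={}", value))
        }
        _ => result(
            "display",
            Status::Fail,
            "DISPLAY is not set, so no X server can be located".to_string(),
        ),
    }
}

/// Passes on x11, warns when the variable is missing, and (by assumption
/// in this sketch) fails on any other session type.
pub fn check_session_type(get_env: &dyn Fn(&str) -> Option<String>) -> CheckResult {
    match get_env("XDG_SESSION_TYPE").as_deref() {
        Some("x11") => result("session", Status::Pass, "XDG_SESSION_TYPE=x11".to_string()),
        Some(other) => result(
            "session",
            Status::Fail,
            format!("XDG_SESSION_TYPE={}, but only X11 sessions are supported", other),
        ),
        None => result(
            "session",
            Status::Warn,
            "XDG_SESSION_TYPE is not set; cannot confirm an X11 session".to_string(),
        ),
    }
}

/// Runs the environment-only checks in a fixed order.
pub fn run_env_checks(get_env: &dyn Fn(&str) -> Option<String>) -> Vec<CheckResult> {
    vec![check_display(get_env), check_session_type(get_env)]
}
```

In a real CLI the lookup could be supplied as `|key: &str| std::env::var(key).ok()`. The remaining checks (X server connectivity, extensions, enumeration, capture) would follow the same result shape but depend on the backend and are left out here.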

4. Daemon hardening stays minimal and focused in this phase

Phase 1 daemon work will cover:

  • stale socket detection and cleanup on startup/connect
  • clearer startup/connect failure messages
  • a lightweight health-check request or equivalent status probe used by the client path

Rationale:

  • This is enough to make CLI behavior predictable without turning the spec into a full daemon observability project.

Alternatives considered:

  • Include idle timeout, structured logging, and full lifecycle policy now. Rejected because those are useful but not necessary to stabilize the basic runtime contract.
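
Stale socket detection can be sketched with the standard library alone: try to connect, and treat a refused connection on an existing socket file as evidence that no daemon is listening. The SocketState type and probe_socket function are hypothetical names, and removing the file immediately on a refused connection is one possible policy, not necessarily the one the implementation will adopt.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum SocketState {
    /// No socket file exists at the path.
    Missing,
    /// Something accepted the connection, so a daemon is presumably running.
    Live,
    /// The file existed but nothing was listening; it has been removed.
    StaleRemoved,
}

/// Probes the daemon socket and cleans up a stale file if one is found.
/// Any other I/O error (for example a permissions problem) is returned
/// to the caller so it can be reported with a precise message.
pub fn probe_socket(path: &Path) -> io::Result<SocketState> {
    match UnixStream::connect(path) {
        Ok(_stream) => Ok(SocketState::Live),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(SocketState::Missing),
        Err(e) if e.kind() == io::ErrorKind::ConnectionRefused => {
            // A leftover socket file from a daemon that is no longer listening.
            std::fs::remove_file(path)?;
            Ok(SocketState::StaleRemoved)
        }
        Err(e) => Err(e),
    }
}
```

A successful connect only shows that some process is listening. A lightweight health-check request over the established stream would be the natural next step for confirming that the listener is actually a responsive deskctl daemon.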

5. X11 remains the explicit support boundary for this change

The spec and implementation will define expected behavior for X11 environments only. Unsupported-session diagnostics are in scope; a second backend is not.

Rationale:

  • The repo needs a stable foundation more than a nominally portable abstraction.
  • Clear support boundaries are better than implying near-term Wayland support.

Risks / Trade-offs

  • [Breaking JSON contract for existing users] → Treat this as a deliberate pre-1.0 stabilization change, update docs/examples, and keep the new shape simple.
  • [More internal mapping state between public IDs and backend handles] → Keep one canonical mapping in daemon state and use it for selector resolution.
  • [Separate read and screenshot paths could drift] → Share the same window collection logic underneath both operations.
  • [doctor checks may be environment-specific] → Keep the checks narrow, tied to actual backend requirements, and report concrete pass/fail reasons.
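
The "one canonical mapping" mitigation could look roughly like the registry below. This is a sketch under stated assumptions: WindowRegistry and its methods are hypothetical, the backend handle is modeled as a u32 because X11 window IDs are 32-bit values, and the "win-N" format for public IDs is invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical daemon-side registry: the single place where public
/// window IDs and backend handles are associated.
pub struct WindowRegistry {
    next: u64,
    by_public: HashMap<String, u32>,
    by_handle: HashMap<u32, String>,
}

impl WindowRegistry {
    pub fn new() -> Self {
        WindowRegistry {
            next: 0,
            by_public: HashMap::new(),
            by_handle: HashMap::new(),
        }
    }

    /// Returns the existing public ID for a handle, or mints a new one.
    pub fn intern(&mut self, handle: u32) -> String {
        if let Some(id) = self.by_handle.get(&handle) {
            return id.clone();
        }
        self.next += 1;
        // The "win-N" format is an assumption of this sketch.
        let id = format!("win-{}", self.next);
        self.by_public.insert(id.clone(), handle);
        self.by_handle.insert(handle, id.clone());
        id
    }

    /// Resolves a public ID back to the backend handle, if still known.
    pub fn resolve(&self, window_id: &str) -> Option<u32> {
        self.by_public.get(window_id).copied()
    }

    /// Drops both directions of the mapping when a window goes away.
    pub fn forget(&mut self, handle: u32) {
        if let Some(id) = self.by_handle.remove(&handle) {
            self.by_public.remove(&id);
        }
    }
}
```

Interning makes repeated enumeration stable: the same backend handle always yields the same public ID until the window is forgotten, and selector resolution only ever goes through resolve.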

Migration Plan

  1. Introduce the new public window_id contract and update selector resolution to accept it.
  2. Refactor backend and daemon code so list-windows uses a pure read path.
  3. Add deskctl doctor and wire it into the CLI as a daemon-independent command.
  4. Add unit and integration coverage for the new contract and cheap read behavior.
  5. Update user-facing docs and examples after implementation so the new output shape is canonical.

Rollback strategy:

  • This is a pre-1.0 contract cleanup, so rollback is a normal code revert rather than a compatibility shim plan.

Open Questions

  • Whether the health check should be an explicit public daemon ping action or an internal client/daemon probe.
  • Whether doctor should expose machine-readable JSON on day one, or land text output first and add JSON later in the same tranche if the implementation cost is low.