Xa11y brings cross-platform desktop automation via accessibility trees
Developer _crowecawcaw built xa11y, a Rust library with Python and JS bindings that uses OS accessibility APIs for cross-platform desktop automation, modeled after Playwright's locator and auto-waiting patterns.
Teams automating or testing cross-platform desktop apps now have a Playwright-style accessibility-based library that avoids the cost and fragility of screenshot-driven agents.
_crowecawcaw built xa11y after finding that existing desktop automation tools fell short for testing cross-platform desktop apps. Screenshot-based computer use agents struggled with mouse accuracy, and model-dependent approaches like OmniParser were slow and costly. Accessibility APIs emerged as a more reliable alternative, but no existing library offered a unified abstraction across Mac, Windows, and Linux — and early experiments were flaky because UI elements take time to appear and receive new IDs as the interface re-renders.
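The flakiness described above is the problem that Playwright-style auto-waiting targets. The sketch below is a minimal illustration of that general pattern, not xa11y's actual API: `wait_for` and the `query` callable are hypothetical names chosen for this example. The key point is that the element is looked up again on every attempt, so a handle whose ID changed during a re-render is never reused.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    query: Callable[[], Optional[T]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> T:
    """Re-run `query` until it returns a non-None result or `timeout` elapses.

    `query` is a hypothetical stand-in for "look the element up again by
    selector". It is called fresh on each attempt instead of caching a
    handle, because handles can become stale when the UI re-renders.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = query()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no match within {timeout} s")
        time.sleep(interval)


# Usage: simulate an element that only shows up on the third poll.
polls = {"count": 0}


def find_save_button() -> Optional[str]:
    polls["count"] += 1
    return "save-button" if polls["count"] >= 3 else None


element = wait_for(find_save_button, timeout=1.0, interval=0.01)
```

A real implementation would presumably also wait for conditions such as "visible" or "enabled" before acting, as Playwright does for web pages; that is omitted here to keep the sketch short.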
Xa11y is built around two core ideas: a single accessibility abstraction that works across all three platforms, and an emulation of Playwright's auto-waiting, selectors, and locator patterns to make automation robust. The hardest engineering challenge was reconciling platform-specific API differences — for example, Windows exposes an entire app's accessibility tree in one API call, while Linux requires a separate call per element attribute. An early implementation that fetched the full tree before filtering could take 10 seconds or more on Linux for large apps; the library now evaluates filters while walking the tree and returns only matching elements rather than full subtrees.
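To make the difference between the two strategies concrete, here is a small simulation. It is a sketch under stated assumptions, not xa11y's implementation: `FakeBackend` is a hypothetical stand-in for a platform where every attribute read and every child listing costs one call (roughly the Linux situation described above), and the four attribute names are made up for illustration.

```python
from typing import Dict, Iterator, List


class FakeBackend:
    """Hypothetical backend: each attribute read or child listing is one call."""

    ATTRS = ("role", "name", "value", "bounds")

    def __init__(self, nodes: Dict[int, dict], children: Dict[int, List[int]]):
        self._nodes = nodes
        self._children = children
        self.calls = 0  # counts simulated round trips to the OS

    def get_attr(self, node_id: int, attr: str):
        self.calls += 1
        return self._nodes[node_id].get(attr)

    def get_children(self, node_id: int) -> List[int]:
        self.calls += 1
        return self._children.get(node_id, [])


def snapshot_then_filter(backend: FakeBackend, root: int, role: str, name: str) -> List[int]:
    """Eager approach: read every attribute of every node, then filter."""
    snapshot = []
    stack = [root]
    while stack:
        node_id = stack.pop()
        attrs = {a: backend.get_attr(node_id, a) for a in backend.ATTRS}
        snapshot.append({"id": node_id, **attrs})
        stack.extend(reversed(backend.get_children(node_id)))
    return [n["id"] for n in snapshot if n["role"] == role and n["name"] == name]


def filter_while_walking(backend: FakeBackend, root: int, role: str, name: str) -> Iterator[int]:
    """Lazy approach: read only what the filter needs and yield matches."""
    stack = [root]
    while stack:
        node_id = stack.pop()
        # Short-circuit: the name is fetched only when the role already matches.
        if backend.get_attr(node_id, "role") == role and backend.get_attr(node_id, "name") == name:
            yield node_id
        stack.extend(reversed(backend.get_children(node_id)))


def make_backend() -> FakeBackend:
    """Build a tiny fake tree: window -> (toolbar -> two buttons, pane -> text)."""
    nodes = {
        0: {"role": "window", "name": "App"},
        1: {"role": "toolbar", "name": ""},
        2: {"role": "pane", "name": ""},
        3: {"role": "button", "name": "Save"},
        4: {"role": "button", "name": "Open"},
        5: {"role": "text", "name": "Hello"},
    }
    children = {0: [1, 2], 1: [3, 4], 2: [5]}
    return FakeBackend(nodes, children)


eager = make_backend()
eager_matches = snapshot_then_filter(eager, 0, "button", "Save")

lazy = make_backend()
lazy_matches = list(filter_while_walking(lazy, 0, "button", "Save"))
```

Both functions find the same element, but comparing `eager.calls` with `lazy.calls` shows the lazy walk issuing fewer than half as many simulated calls on this six-node tree. The gap grows with the number of attributes per element and the size of the tree.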
The library is written in Rust, ships with Python and JavaScript bindings, and is released under the MIT license.
Key facts
- 01 Built by _crowecawcaw to enable cross-platform desktop automation via OS accessibility APIs on Mac, Windows, and Linux.
- 02 Screenshot-based agents and tools like OmniParser were found to be slow, expensive, or inaccurate for desktop automation.
- 03 The library emulates Playwright's auto-waiting, selectors, and locator patterns to reduce test flakiness.
- 04 A key challenge was abstracting platform-specific API differences: Windows reads an entire accessibility tree in one call, while Linux requires a separate call per element attribute.
- 05 An early full-tree-fetch approach could take 10 seconds or more on Linux for large apps; the library now filters while walking the tree.
- 06 Written in Rust with Python and JS bindings, released under the MIT license.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 09:19 UTC.