Automate any desktop app with an AI agent
Stop copy-pasting between apps. The agent sees your screen, clicks buttons, fills forms, and moves data between applications — so you do not have to.
- 1-click uninstall
- Cancel anytime
- Files never leave your computer

Impact
What changes
The same task, two ways — how it plays out by hand today, and what changes once Lapu AI runs it for you.
Without Lapu AI
An ops manager spends 2 hours copying 50 client records from a spreadsheet into a CRM, switching between windows, clicking through form fields, and double-checking each entry.
With Lapu AI
Lapu AI reads the spreadsheet, focuses the CRM window, and fills each record into the form. The manager approves each entry and handles the exceptions the agent flags.
The challenge
Repetitive desktop workflows (copying data between apps, filling forms, clicking through multi-step processes) eat hours every week. Traditional automation tools either require scripting knowledge or rely on brittle macro recorders that break when a UI changes. Most AI tools cannot interact with desktop applications at all.
How Lapu AI solves this
Lapu AI does the clicking for you. It looks at whatever app is on your screen, finds the buttons and fields, and works through them the way you would: filling a form, moving between windows, copying data from one app into another. You describe the job once and it handles the repetitive clicking and typing. Because every action waits for your approval, you are never watching it do something you did not sign off on.
Every click and keystroke requires your permission. The agent shows you exactly what it plans to do before acting.
Workflow
How it works
Describe the workflow
Tell the agent what you need — for example, 'Take the client list from this spreadsheet and enter each row into the CRM app.' The agent plans the sequence of actions across applications.
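For readers who want a concrete picture, a plan like this can be thought of as an ordered list of steps, one group per spreadsheet row. The sketch below is illustrative only: `Step`, `plan_rows`, the "CRM" label, and the field names are hypothetical and do not describe Lapu AI's actual internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    app: str         # which application the step targets (hypothetical label)
    action: str      # "focus", "type", or "click"
    target: str      # human-readable description of the UI element
    value: str = ""  # text to enter, if any


def plan_rows(rows, crm_app="CRM"):
    """Expand each spreadsheet row into the steps needed to enter it."""
    steps = []
    for row in rows:
        steps.append(Step(crm_app, "focus", "New contact form"))
        for field_name, value in row.items():
            steps.append(Step(crm_app, "type", field_name, value))
        steps.append(Step(crm_app, "click", "Save"))
    return steps


# Made-up sample rows standing in for the client spreadsheet.
rows = [
    {"Company": "Acme", "Email": "ops@acme.example"},
    {"Company": "Globex", "Email": "hello@globex.example"},
]

plan = plan_rows(rows)
for step in plan:
    print(step)
```

Keeping the plan as plain data means every step can be shown to you before anything is clicked or typed.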
The agent observes your desktop
The agent looks at what is on your screen and identifies every interactive element — buttons, text fields, menus, tabs. Each element gets a numbered reference.
Execute with permission-gated actions
The agent clicks, types, scrolls, and navigates between windows. Each input action (click, type, scroll, hotkey) is classified as critical-risk and requires your per-action approval. The agent verifies that each step succeeded before continuing.
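The approve, act, verify cycle described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not Lapu AI's implementation: `Action`, `run_with_approval`, and the callbacks are hypothetical names, and skipping a rejected step while stopping on a failed one is a design choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str        # "click", "type", "scroll", or "hotkey"
    element: int     # numbered reference of the target element
    text: str = ""   # text to enter, for "type" actions


def run_with_approval(actions, approve, perform, verify):
    """Run actions in order; nothing is performed without approval."""
    log = []
    for action in actions:
        if not approve(action):
            log.append((action, "rejected"))
            continue  # assumption: a rejected step is skipped
        perform(action)
        if not verify(action):
            log.append((action, "failed"))
            break  # do not build on a step that did not succeed
        log.append((action, "done"))
    return log


# Stand-in for a real form: element number -> current text.
form = {}


def perform(action):
    if action.kind == "type":
        form[action.element] = action.text


def verify(action):
    # Only typed text can be checked in this toy model.
    return action.kind != "type" or form.get(action.element) == action.text


actions = [
    Action("type", 1, "Acme"),
    Action("type", 2, "ops@acme.example"),
    Action("click", 3),
]

# A real approval callback would prompt the user; this one approves typing only.
log = run_with_approval(actions, lambda a: a.kind == "type", perform, verify)
for action, status in log:
    print(status, action)
```

In this run the two typing steps are approved and verified, while the click is rejected and never performed.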
Under the hood — for the technically curious
Under the surface, Lapu AI reads each app through native accessibility APIs (Swift on macOS, C# on Windows), the same interface screen readers use. It lists windows, reads the UI element tree, and takes annotated screenshots where every interactive control gets a numbered reference, so it knows exactly which button or field to act on instead of guessing at pixel coordinates.
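To illustrate the numbered-reference idea, the sketch below walks a small, made-up UI tree and assigns a number to each interactive control. It is a stand-in written in Python for readability: the real tree comes from the native accessibility APIs mentioned above, and the dictionary layout, role names, and `number_interactive` helper are assumptions made for this example.

```python
# Hypothetical role names; real accessibility APIs use their own vocabularies.
INTERACTIVE_ROLES = {"button", "textfield", "menu", "tab"}


def number_interactive(node, refs=None):
    """Depth-first walk that assigns 1, 2, 3, ... to interactive elements."""
    if refs is None:
        refs = {}
    if node["role"] in INTERACTIVE_ROLES:
        refs[len(refs) + 1] = node
    for child in node.get("children", []):
        number_interactive(child, refs)
    return refs


# A made-up element tree for a "New contact" window.
window = {
    "role": "window",
    "title": "New contact",
    "children": [
        {"role": "label", "title": "Company"},
        {"role": "textfield", "title": "Company"},
        {"role": "textfield", "title": "Email"},
        {
            "role": "group",
            "children": [
                {"role": "button", "title": "Save"},
                {"role": "button", "title": "Cancel"},
            ],
        },
    ],
}

refs = number_interactive(window)
for number, element in refs.items():
    print(number, element["role"], element["title"])
```

Labels and containers are skipped, so an instruction such as "click 3" refers to one specific control rather than a screen coordinate.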
Permissions it asks for
- Desktop Automation (screen capture) — to take annotated screenshots and read UI trees
- Desktop Automation (input) — to click, type, and scroll in desktop apps (critical-risk, per-action approval)
- File Read — to read source data files the workflow references
Each is permission-gated — Lapu AI asks before it runs.
Just ask
Say it in plain words
No commands to learn. Tell Lapu AI what you want the way you would tell a coworker.
You
Open the spreadsheet at ~/clients.xlsx, then for each row enter the company name and email into the HubSpot new contact form.
You
Take a screenshot of the current app, read the error message, and search for a fix in the browser.
You
Fill out the expense report form in SAP using the data from this CSV file.
Ready to try this workflow?
Download Lapu AI and run it on your own machine. Free to start — see exactly what it looks like first.

FAQ
Common questions
Which desktop apps does it work with?
Lapu AI works with any application that exposes its interface through the operating system's accessibility APIs, which covers most modern macOS and Windows apps. It can interact with native apps, Electron apps, web browsers, and system dialogs.
Can it break my apps or click the wrong thing?
Every input action (click, type, scroll, hotkey) is classified as critical-risk. The agent shows you exactly what element it plans to interact with and waits for your approval before each action. You can reject or modify any step.
How does it know what is on screen?
The agent uses native accessibility APIs to read the UI element tree, the same APIs screen readers use. It also takes annotated screenshots where interactive elements are highlighted with numbered labels. Together these form a perception pipeline that scores and ranks elements, so the agent knows which buttons, fields, and controls are available.
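The page does not say how elements are scored, so the sketch below uses an invented heuristic purely to show what "score and rank" can mean in practice: hidden or disabled controls score zero, and the rest earn a point for each word their label shares with the request. The `score` and `rank` functions and the element fields are hypothetical.

```python
def score(element, query):
    """Toy relevance score; hidden or disabled controls score 0."""
    if not element["visible"] or not element["enabled"]:
        return 0.0
    query_words = set(query.lower().split())
    label_words = set(element["label"].lower().split())
    # One base point for being actionable, plus one per shared word.
    return 1.0 + len(query_words & label_words)


def rank(elements, query):
    """Return elements ordered from most to least relevant."""
    return sorted(elements, key=lambda e: score(e, query), reverse=True)


# Made-up elements standing in for what the agent found on screen.
elements = [
    {"label": "Cancel", "visible": True, "enabled": True},
    {"label": "Delete contact", "visible": True, "enabled": False},
    {"label": "Save contact", "visible": True, "enabled": True},
    {"label": "Email", "visible": True, "enabled": True},
]

ranked = rank(elements, "save contact")
for element in ranked:
    print(score(element, "save contact"), element["label"])
```

Here "Save contact" ranks first for the request "save contact", and the disabled "Delete contact" button ranks last even though its label shares a word with the request.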
Explore more
Related use cases
Cross-App Workflows
One prompt, multiple apps. The agent reads a spreadsheet, runs a script, pastes results into a presentation, and logs the outcome — all in a single workflow.
Browser Automation
No more tab-juggling. The agent opens pages, reads content, fills forms, clicks through flows, and pulls the data back into your files — across sites, in a single workflow.
Document Processing
Process contracts, invoices, and reports without uploading them anywhere. Built-in skills extract, merge, and convert PDF, Word, Excel, and PowerPoint files on your machine.


