AI agent security: permissions, audit, sandboxing
A desktop AI agent runs real commands on real files. AI agent security is the engineering work that decides whether that is safe. Permission tiers, audit trails, workspace boundaries, and an honest account of what software cannot solve.
The threat model for a desktop AI agent
A chatbot returns text. An agent takes actions. When an agent runs on your desktop, “take an action” can mean deleting a file, sending an email, running a shell command, or paying an invoice. That changes the security question from “can the model say something wrong?” to “what is the worst thing this software can do before I notice?”
Three concrete failure modes drive the design of any serious agent.
- Prompt injection. The agent reads content — a webpage, a PDF, an email, the output of a shell command — and that content contains hidden instructions. OWASP ranks this as the number-one risk for LLM applications, splitting it into direct (the user types something hostile) and indirect (the model reads something hostile embedded in benign content) (OWASP LLM Top 10, 2025). The model cannot reliably tell the two apart.
- Excessive agency. An agent handed sweeping tool access, then left to run in a loop, will eventually do something its operator did not intend. OWASP's mitigation is blunt: scope tools to the minimum needed and require human approval for privileged operations.
- Tool-level vulnerabilities. In May 2026, Microsoft published research showing that two vulnerabilities in the Semantic Kernel agent framework let a successful prompt injection escalate into remote code execution on the host. Microsoft's framing: “AI models aren't security boundaries. The tools you expose define your attacker's affected scope.”
Read those three together. The threat is not that the model goes rogue. The threat is that the model receives bad instructions from data it processes, has too many tools wired up, and one of those tools has a bug. A permission system has to defend against all three. For the full breakdown, see our deep dive on desktop AI permission models.
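OWASP's excessive-agency mitigation is mechanical enough to sketch in code. The shape below is illustrative only (the type and class names are assumptions, not Lapu AI's API), but it shows the structural version of scoping tools to the minimum needed: a session exposes only the tools the task requires, and an unregistered tool cannot be invoked by any prompt, hostile or not.

```typescript
// Hypothetical sketch of least-privilege tool exposure; illustrative
// names, not Lapu AI's actual API.
type ToolName = "file:read" | "file:write" | "file:delete" | "shell:run";

interface Tool {
  name: ToolName;
  run(args: string[]): Promise<string>;
}

class SessionToolbox {
  private tools = new Map<ToolName, Tool>();

  // Register only the tools this specific task needs.
  expose(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // The agent loop can only reach tools that were exposed.
  // A prompt-injected call to anything else fails closed.
  async invoke(name: ToolName, args: string[]): Promise<string> {
    const tool = this.tools.get(name);
    if (tool === undefined) {
      throw new Error(`tool not exposed in this session: ${name}`);
    }
    return tool.run(args);
  }
}
```

The point is that denial is structural rather than behavioural: no amount of persuasion in the prompt can invoke a tool that was never registered.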
Permission tiers in detail
The NIST AI Risk Management Framework organises AI risk work into four functions — Govern, Map, Measure, Manage. Translated to a desktop AI agent, the practical question is: which actions does the agent perform freely, and which require an explicit human check? Lapu AI uses four tiers.
| Tier | Examples | Approval |
|---|---|---|
| file:read | List files, read a document, take a screenshot | Auto-approved inside workspace; logged |
| file:write | Create, rename, or edit a file in the workspace | Auto-approved inside workspace; logged |
| file:delete / out-of-workspace write | Delete files, write to ~/Library, registry, system folders | Per-action human confirmation |
| shell:run | Execute a shell command; pipes, redirects, sudo flagged | Per-action human confirmation |
| browser:control | Drive a browser tab, fill forms, submit pages | Per-action confirmation with preview |
| app:control | Drive other desktop apps via accessibility APIs | Per-action confirmation with preview |
Two consequences fall out of this design. First, scope matters more than tier. A file write is reversible inside a project workspace and catastrophic outside it. NVIDIA's AI Red Team explicitly recommends blocking writes outside the workspace to prevent “persistence mechanisms, sandbox escapes, and remote code execution techniques.” Second, approvals are not cached. If the agent earned permission to delete one file an hour ago, that is not a license to delete a different file now.
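Both consequences can be made concrete with a small decision function. This is a sketch under assumed types, not Lapu AI's real code: scope is checked before tier, and the function deliberately takes no approval cache, so every decision is computed fresh.

```typescript
import path from "node:path";

type Decision = "auto-approve" | "ask-user";

interface AgentAction {
  type: "file:read" | "file:write" | "file:delete" | "shell:run";
  targetPath?: string; // absolute path for file actions
}

// Note what this function does NOT take: an approval cache. Every
// action is decided fresh, so an hour-old yes cannot authorise a new action.
function decide(action: AgentAction, workspaceRoot: string): Decision {
  const inWorkspace =
    action.targetPath !== undefined &&
    contains(workspaceRoot, action.targetPath);

  switch (action.type) {
    // Reversible tiers: auto-approved only inside the workspace.
    case "file:read":
    case "file:write":
      return inWorkspace ? "auto-approve" : "ask-user";
    // Destructive and external tiers: per-action confirmation, always.
    case "file:delete":
    case "shell:run":
      return "ask-user";
  }
}

// Simplified containment check. A real implementation must resolve
// symlinks before comparing paths (see the sandboxing section below).
function contains(root: string, target: string): boolean {
  const rel = path.relative(root, target);
  return !rel.startsWith("..") && !path.isAbsolute(rel);
}
```

Keeping the cache out of the function signature is the design choice behind "approvals are not cached": there is nowhere for a stale yes to live.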
Audit trail: what is logged, and for how long
Every action Lapu AI takes is recorded as a row in a local audit log. The log answers four questions for any past action: what happened, when, on what input, and why the agent did it. Concretely, each row captures:
- Timestamp. ISO 8601, local timezone.
- Action type. One of file:read, file:write, file:delete, shell:run, net:fetch, app:control.
- Arguments. Full file path, full shell command, destination URL. No truncation.
- Triggering prompt. The user instruction (or upstream agent step) that led to this action.
- Outcome. Succeeded, rejected by user, errored, or cancelled.
Logs are stored on your machine. Default retention is 90 days, configurable. Logs never leave your machine in the Free, Premium, and Max tiers. Teams and Enterprise plans can export to a customer-managed sink — SIEM, S3, GCS — for longer retention required by compliance regimes.
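That field list maps naturally onto a flat record. The shape below is an assumption for illustration (Lapu AI's documented format is the source of truth), but it shows why one JSON object per action, appended to a local file, answers all four questions.

```typescript
import { appendFileSync } from "node:fs";

// Assumed shape for illustration; Lapu AI's documented log format may differ.
interface AuditRow {
  timestamp: string; // ISO 8601, local timezone
  action:
    | "file:read"
    | "file:write"
    | "file:delete"
    | "shell:run"
    | "net:fetch"
    | "app:control";
  args: string[]; // full path, full command, or URL; no truncation
  triggeringPrompt: string; // the instruction or upstream agent step
  outcome: "succeeded" | "rejected" | "errored" | "cancelled";
}

// One JSON object per line: each row stands alone, which makes grepping,
// pruning to a retention window, and export to a SIEM or S3 sink trivial.
function logAction(row: AuditRow, logPath = "audit.jsonl"): void {
  appendFileSync(logPath, JSON.stringify(row) + "\n");
}
```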
Sandboxing: what Lapu contains, what the OS contains
Lapu AI runs as a desktop application, not a kernel module. The permission system enforces what the agent can do at the application layer; the operating system enforces the rest. The honest split looks like this.
What Lapu AI contains. Process isolation between the renderer, the agent loop, and the tool executors. The agent cannot bypass its own permission checks to call a tool the user has not approved. Shell commands run in a subprocess that the agent cannot replace at runtime. File operations resolve symlinks before checking workspace boundaries, so a symlink trick cannot escape the workspace.
What the OS contains. macOS System Integrity Protection blocks writes to system folders even if Lapu is compromised. Windows User Account Control blocks privilege elevation. App sandbox entitlements scope what files Lapu can see at all. These are the real backstops if the agent layer is bypassed.
What no software can contain. A determined prompt injection attack delivered through hostile content the user instructed the agent to process. Lapu cannot fully prevent this — it can only contain blast radius through per-action approvals, workspace boundaries, and the audit trail. Anyone who tells you their agent is “immune to prompt injection” is selling marketing.
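The symlink point above is worth making concrete. A naive prefix check approves workspace/link/notes.txt even when link points at /etc; resolving the real path first closes that hole. A minimal sketch using Node's fs promises API, not Lapu AI's actual code:

```typescript
import { realpath } from "node:fs/promises";
import path from "node:path";

// Judge a path by where it lands, not where it appears to live:
// resolve symlinks on both sides before the containment check.
async function isInsideWorkspace(
  target: string,
  workspaceRoot: string,
): Promise<boolean> {
  // realpath follows symlinks; for a file that does not exist yet
  // (a pending write), resolve its parent directory instead.
  const realTarget = await realpath(target);
  const realRoot = await realpath(workspaceRoot);
  const rel = path.relative(realRoot, realTarget);
  return !rel.startsWith("..") && !path.isAbsolute(rel);
}
```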
How to configure permissions in Lapu AI
1. Choose a workspace. On first run, point Lapu AI at the folder you want it to work in. File writes default to inside this workspace; anything outside escalates to a per-action confirmation.
2. Start with read-only. Leave the reversible-write and external-action tiers locked. Run a few read-only tasks first — list files, summarise a document, take a screenshot — and watch the audit panel populate.
3. Approve writes per-action. When you ask for a task that needs to write, the agent surfaces a prompt showing the exact path, the action, and whether it is reversible. Approve once, approve for the session, or reject.
4. Review the audit trail. Open the audit panel after any session. Every read, write, shell command, and network call is logged with a timestamp and the prompt that triggered it. Export or clear as needed.
5. Revoke when you are done. Permissions are not sticky. Close the project, revoke session approvals from settings, and the agent re-prompts the next time it tries the same action. (A sketch of the end state these steps produce follows this list.)
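For readers who prefer to see the configuration as data, the defaults could look something like the sketch below. The keys and file shape are hypothetical, not Lapu AI's schema; the settings UI is the source of truth.

```typescript
// Hypothetical settings shape; illustrative keys only, not Lapu AI's schema.
const settings = {
  workspace: "/Users/you/projects/demo",
  auditRetentionDays: 90, // the documented default
  tiers: {
    "file:read": { approval: "auto", scope: "workspace" },
    "file:write": { approval: "auto", scope: "workspace" },
    "file:delete": { approval: "per-action" },
    "shell:run": { approval: "per-action" },
    "browser:control": { approval: "per-action", preview: true },
    "app:control": { approval: "per-action", preview: true },
  },
  sessionApprovals: [], // cleared on revoke; never carried across projects
};
```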
How Lapu's model compares to other agent security designs
Different agents make different bets about where to draw the security boundary. Each has trade-offs.
| Model | Boundary | Trade-off |
|---|---|---|
| OS-level admin grant | User runs agent with full admin rights | Maximum capability, no scope — one prompt injection ruins your machine |
| Anthropic computer use | Model capability; deployer runs it in a VM | Strong isolation if VM is used; most users will not set that up |
| OpenAI Operator browser sandbox | Remote browser in OpenAI's cloud | Contains browser-level damage; cannot reach your files at all |
| Lapu AI permission model | Local agent + per-action approvals + workspace boundary | Reaches real files and shell; relies on user reading prompts |
These are not all the same product. A remote browser agent and a desktop file-and-shell agent solve different problems with different threat models. The right comparison is not “which is safest” in the abstract — it is which security design matches the work you need an agent to do. If you want a head-to-head on the browser-agent flavour, see Lapu AI vs OpenAI Operator.
For enterprise: compliance, audit, SSO
The same permission and audit machinery that protects an individual user also answers the compliance questions a security review will raise. Two angles matter for buyers.
Audit trail as a compliance artefact. Every action an agent took on a regulated workstation can be reconstructed from the local audit log. On Teams and Enterprise plans, logs can be shipped to a customer-managed sink for retention periods longer than 90 days — SOC 2 evidence collection, HIPAA access logging, internal forensics. The audit format is documented and stable.
Identity and access. Teams and Enterprise plans support SSO and SAML, so the agent inherits the identity boundary the rest of the workstation already enforces. Combined with the workspace boundary, this scopes what the agent can touch to what the user is already authorised to touch. See pricing for the full Teams and Enterprise feature set, and Lapu AI for developers for the engineer-facing view.
Frequently asked questions
- Is it safe to give an AI agent access to my computer?
- It can be, but only if the agent uses a permission model with explicit scope. The safe pattern is read-only access by default, per-action confirmation for any write, delete, or external action, and an audit trail you can replay. An agent without those three things is not safe to install on a machine that holds your real work.
- What is the biggest security risk with AI agents?
- Prompt injection. OWASP ranks it as the number-one risk for LLM applications in 2025. The agent reads content — a webpage, a PDF, an email — and that content contains hidden instructions. The model cannot reliably distinguish those instructions from the user's. No filter is perfect, which is why the mitigation is human-in-the-loop confirmation for any consequential action, not better filtering.
- How long does Lapu AI retain audit logs?
- Lapu AI's audit trail is stored locally on your machine. Default retention is 90 days, configurable in settings. Logs are never uploaded to Lapu AI servers. On Teams and Enterprise plans, logs can be exported to a customer-managed sink for compliance retention longer than 90 days.
- Can Lapu AI escape its sandbox and do something I did not approve?
- Lapu AI cannot perform a write, delete, shell command, or external action outside the permission model — those actions are gated at the agent layer. It is not, however, a kernel-level sandbox. A determined prompt-injection attack combined with a vulnerable tool could in principle escalate. Defense in depth — workspace boundaries, per-action approvals, audit trail, and the underlying OS protections (System Integrity Protection on macOS, UAC on Windows) — is the practical mitigation.
- How does Lapu AI compare to Anthropic computer use or OpenAI Operator on security?
- Anthropic's computer use is a model capability, not a product — its safety guidance tells developers to run it in a dedicated VM with allowlisted internet, which most users will not do. OpenAI Operator runs in a remote browser sandbox in OpenAI's cloud, which contains browser-level damage but cannot touch your files. Lapu AI runs locally with per-action approvals and a complete audit trail. Different threat models — pick the one that matches your work.
- What does an AI agent audit trail actually record?
- In Lapu AI, every action gets a row: the timestamp, the action type (file read, file write, shell command, network call), the exact arguments (full file path, full shell command), the prompt that triggered it, and the outcome (succeeded, rejected, errored). You can replay any session and reconstruct what the agent did.
Try the permission model on your own machine
The fastest way to evaluate AI agent security is to install one and run a destructive task on a throwaway folder. Watch the prompts. Read the audit trail. Decide whether the trade-offs make sense for your work.
Put your busywork on autopilot
Lapu AI handles the repetitive work between you and outcomes. One desktop agent, zero tab-switching. Available now on macOS and Windows.
Create a free account. Download in under a minute.

