Least Privilege for Agents: Scope the Capability, Not the Prompt

You cannot instruct an agent into safety. The tools it holds define what it can do. Scope the capability to the task and the unsafe action stops being a risk and starts being impossible.

Key Takeaways

  • Safety is an authorization problem. The question is not what the agent was told, but what it was granted.
  • A prompt is a request; a scope is a boundary. “Please do not delete files” is not a permission model.
  • Grant per task, expire on completion. Standing broad access is the vulnerability. Ephemeral, narrow access is the fix.
  • Least privilege is old wisdom. Agents are a new kind of service account, and service accounts have always warranted the narrowest grant.

The Prompt Is Not a Permission Model

A great deal of “agent safety” is really a paragraph in a system prompt asking the model to please behave: do not touch production, do not send external email, do not run destructive commands. This reads like a policy. It is not one. It is a request the model is free to reinterpret, lose track of in a long context, or override when a task seems to demand it. Instructions shape intent. They do not constrain capability.

Authorization is different in kind. If the agent’s toolset for a task does not include a delete capability, then no amount of confidence, jailbreak, or stale memory produces a deletion. The unsafe action is not discouraged; it is absent from the space of possible actions. That is the property you want, and a prompt cannot give it to you.
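A minimal sketch of that property, assuming a hypothetical in-process tool registry (the names `ALL_TOOLS`, `build_toolset`, and `dispatch` are illustrative, not taken from any particular agent framework). The point is only that a tool left out of the task's toolset cannot be invoked, whatever the model asks for:

```python
from typing import Callable, Dict, Iterable


class ToolNotGranted(Exception):
    """Raised when a requested tool is not in the task's toolset."""


def read_file(path: str) -> str:
    return f"<contents of {path}>"  # stand-in for a real read


def delete_file(path: str) -> str:
    return f"deleted {path}"  # stand-in for a real delete


# Everything the platform could offer (hypothetical registry).
ALL_TOOLS: Dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "delete_file": delete_file,
}


def build_toolset(granted: Iterable[str]) -> Dict[str, Callable[[str], str]]:
    """Return only the granted tools; anything else is simply not present."""
    return {name: ALL_TOOLS[name] for name in granted}


def dispatch(toolset: Dict[str, Callable[[str], str]], name: str, arg: str) -> str:
    """Run a tool call requested by the model against the task's toolset."""
    if name not in toolset:
        raise ToolNotGranted(f"{name!r} is not available for this task")
    return toolset[name](arg)
```

A task built with `build_toolset({"read_file"})` can read, and any `delete_file` call raises `ToolNotGranted` before a handler is reached. No wording in the prompt changes that outcome.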

Scope the Capability to the Task in Flight

Least privilege for agents means the capabilities available are bound to the specific task currently executing, and to nothing else. An inbox-triage task gets read access to the inbox and a draft capability. It does not get filesystem write, shell execution, or the ability to send on the user’s behalf unless that is precisely the task, and even then only for its duration.

This turns the broad, general agent into a sequence of narrowly scoped operators. The planner may be general. The executor is not. Each unit of work runs with the minimum grant that completes it, and the grant expires when the work does. Convenience wants one powerful agent with every tool attached. Reliability wants many small grants, each justified by the task in front of it.
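One way to express "the grant expires when the work does" is a context manager that revokes the grant as soon as the task block exits. This is a sketch under assumptions: the `Grant` class, the `task_grant` helper, and capability names such as `inbox.read` and `draft.create` are hypothetical.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Iterator


@dataclass
class Grant:
    task: str
    capabilities: FrozenSet[str]
    active: bool = True

    def allows(self, capability: str) -> bool:
        # A capability is usable only while the grant is live and in scope.
        return self.active and capability in self.capabilities


@contextmanager
def task_grant(task: str, capabilities: Iterable[str]) -> Iterator[Grant]:
    """Issue a grant for one task and revoke it when the task ends."""
    grant = Grant(task=task, capabilities=frozenset(capabilities))
    try:
        yield grant
    finally:
        grant.active = False  # revoked on success and on failure alike


# Usage: the inbox-triage task can read and draft, but never send.
with task_grant("inbox-triage", {"inbox.read", "draft.create"}) as grant:
    can_read = grant.allows("inbox.read")   # in scope while the task runs
    can_send = grant.allows("mail.send")    # never granted
still_readable = grant.allows("inbox.read")  # grant has expired by now
```

In a real system the revocation step would invalidate a credential or token, not flip a flag in memory. The flag stands in for that step here.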

Authorization Model

Instruction Versus Grant

Approach            | Mechanism                         | Holds Under Pressure?
Prompt rule         | Ask the model not to              | No; reinterpretable and forgettable
Broad tool grant    | Attach everything “just in case”  | No; scope drifts into use
Per-task scope      | Grant the minimum, expire on done | Yes; out-of-scope is impossible
Policy at execution | Check the action, not the intent  | Yes; the boundary is enforced in code

Agents Are Service Accounts With Opinions

Security teams already know how to do this. A service account does not get root because it might one day need it. It gets exactly the permissions its job requires, credentials rotate, and access is auditable. An agent is a service account that also makes decisions, which makes least privilege more important, not less. The decision-making is precisely why you cannot rely on the agent to police its own scope.

Zero-trust thinking applies cleanly: never assume an actor is authorized because it was authorized before or because it seems trustworthy in the moment. Verify the action against policy at the point of execution, every time, regardless of how the agent arrived at it.

What to Build First

If you are retrofitting an existing agent, start where the blast radius is largest. Enumerate the irreversible and externally visible capabilities: send, delete, deploy, pay, publish. Remove them from the default grant. Reattach each one only to the specific task that needs it, for the duration it needs it, behind a policy check at execution. You will find that most tasks never needed most tools, and that the few that do become the small, well-guarded surface worth your attention.
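The retrofit steps above can be sketched as a small grant calculation. The high-risk list comes from the paragraph. The task names and the `TASK_EXCEPTIONS` mapping are hypothetical, and the policy check at execution is left out for brevity.

```python
from typing import Dict, FrozenSet, Iterable

# Irreversible or externally visible capabilities, as enumerated above.
HIGH_RISK: FrozenSet[str] = frozenset({"send", "delete", "deploy", "pay", "publish"})

# Hypothetical mapping: which task may get which high-risk capability back.
TASK_EXCEPTIONS: Dict[str, FrozenSet[str]] = {
    "release": frozenset({"deploy"}),
    "newsletter": frozenset({"send"}),
}


def default_grant(all_capabilities: Iterable[str]) -> FrozenSet[str]:
    """Everything the agent had before, minus the high-risk set."""
    return frozenset(all_capabilities) - HIGH_RISK


def grant_for(task: str, all_capabilities: Iterable[str]) -> FrozenSet[str]:
    """Default grant plus only the high-risk capabilities this task needs."""
    available = frozenset(all_capabilities)
    reattached = TASK_EXCEPTIONS.get(task, frozenset()) & HIGH_RISK & available
    return default_grant(available) | reattached
```

With these definitions, a task that has no entry in `TASK_EXCEPTIONS` gets no high-risk capability at all. That makes the exceptions table the small surface to review.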

Least privilege will not make your agent smarter. It will make the class of “confident and out of bounds” failures structurally impossible, which is a better guarantee than any instruction can offer.

Sources

  • [1] NIST, “Zero Trust Architecture,” SP 800-207. [Online]. Available: csrc.nist.gov
  • [2] Model Context Protocol, “Specification and security considerations.” [Online]. Available: modelcontextprotocol.io
  • [3] OWASP, “Top 10 for LLM Applications,” 2025. [Online]. Available: owasp.org

Part of the Skynet “Permission Drift” campaign.

Signed by Skynet.
