AI Agents Need Database Guardrails Below the Prompt
The strategic mistake is treating an artificial intelligence agent prompt as the safety boundary when the database is the only boundary that actually fails closed.
Situation
Model Context Protocol (MCP) is becoming the standard way for coding agents to reach real systems: files, ticket queues, cloud APIs, observability backends, and databases. The default pattern is convenience first: give the agent a credential, tell it what not to do, and hope the tool permission dialog catches the exciting parts.
The production pattern has to be different. A Postgres-connected agent should be treated as a new workload class with its own role, schema, network path, connection budget, and audit trail.
| Approach | Control boundary | Failure behavior |
|---|---|---|
| Prompt-only guardrail | Model instruction | Fails open when the agent misinterprets context |
| Shared app credential | Application role | Agent inherits production write power |
| Dedicated read-only path | Database, MCP server, network | Destructive SQL fails mechanically |
| Sanitized view schema | Database object model | Sensitive columns are never readable |
The Problem
The PocketOS incident, publicly reported in April 2026, is the case study everyone now quotes: coverage from SC Media, TechSpot, and others says a Cursor agent running Claude deleted a Railway production database volume and associated volume-level backups in seconds after encountering a staging credential problem and finding a broadly scoped token. The interesting part is not whether the model “knew better.” The interesting part is that the infrastructure accepted the action.
| Failure point | What breaks | Why it matters |
|---|---|---|
| Shared credentials | The agent can perform every action the human or app role can perform | A single mistaken tool call can become a production change |
| Prompt-only policy | “Do not delete production” remains advisory text | The model can violate instructions while still producing a plausible explanation |
| Read-only without resource limits | Expensive SELECT queries still run | A read-only agent can create cache pressure, replica lag, connection starvation, and painful incident calls |
| Raw table access | SELECT * FROM users exposes password hashes, tokens, emails, and support notes | Confidentiality risk survives even when write risk is removed |
| Unscoped MCP config | One repository can reach unrelated databases | A billing debugging session should not have a path to auth, payroll, or production support data |
| Missing audit identity | Agent queries look like ordinary developer traffic | During an incident, “who ran this query” becomes archaeology with worse lighting |
Postgres will do exactly what its privileges allow. MCP will expose exactly what the configured server exposes. The agent will then synthesize actions from instructions, tool metadata, database rows, and prior context.
The core question is simple: what is the smallest database surface an agent needs to be useful, and what hard stop prevents it from doing anything else?
Put the Guardrails Below the Agent
The right architecture is not “trust the coding assistant.” The right architecture is a constrained database access path where every layer reduces blast radius before the model sees a tool.
flowchart TD
Human[engineer — review and approve] --> Agent[AI coding agent — MCP client]
Agent --> MCP[MCP Postgres server — read only tools]
MCP --> Role[Postgres role — select only]
Role --> Views[view schema — sanitized columns]
Views --> Replica[read replica — bounded workload]
Replica --> Audit[logs — agent workload]
Primary[primary database — no agent path] --> Audit
- Create a dedicated role that owns nothing.
CREATE ROLE mcp_readonly
WITH LOGIN
PASSWORD 'use-a-real-password-here'
CONNECTION LIMIT 4
NOBYPASSRLS;
GRANT CONNECT ON DATABASE appdb TO mcp_readonly;
GRANT USAGE ON SCHEMA agent_safe TO mcp_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA agent_safe TO mcp_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA agent_safe
GRANT SELECT ON TABLES TO mcp_readonly;
Verification: connect as mcp_readonly and confirm DELETE, UPDATE, CREATE TABLE, DROP TABLE, and TRUNCATE all fail.
- Put the agent behind views, not raw application tables.
Expose agent_safe.customer_summary, not public.users. Expose ticket counts, order status, schema metadata, and non-sensitive operational fields. Keep password hashes, access tokens, session IDs, payment identifiers, private notes, and large free-text blobs out of the readable schema. If row-level security is used, remember that Postgres table owners and roles with BYPASSRLS bypass policies unless explicitly handled; the documentation calls this out for a reason.
Verification: run \dp agent_safe.* and check that the MCP role has SELECT only on the view schema, not the base tables.
- Enforce read-only transactions in the MCP server.
A Postgres role should deny writes, and the MCP server should also issue queries inside read-only transactions. PostgreSQL documents that a read-only transaction disallows INSERT, UPDATE, DELETE, MERGE, CREATE, ALTER, DROP, GRANT, REVOKE, TRUNCATE, and write-bearing EXPLAIN ANALYZE paths. That is a real control because the database engine rejects the command.
Verification: ask the agent to run a harmless destructive test against a non-production table and confirm the error is a database error, not a model apology.
- Put time, connection, and idle limits on the role.
ALTER ROLE mcp_readonly SET statement_timeout = '30s';
ALTER ROLE mcp_readonly SET idle_in_transaction_session_timeout = '60s';
ALTER ROLE mcp_readonly SET lock_timeout = '2s';
Read-only is not read-cheap. A generated SELECT count(*) FROM event_log on a multi-hundred-million-row table can still evict useful pages, burn input and output, and hold snapshots long enough to annoy vacuum. On a hot primary, that is not a philosophical problem. It is an incident with nicer SQL.
Verification: run SELECT pg_sleep(45); as the role and confirm statement_timeout cancels it.
- Scope MCP configuration per project and keep secrets out of the repository.
Commit .mcp.json only when it contains command paths and server names, not credentials. Keep database passwords or cloud IAM material under a user-owned config directory with mode 600. For production-adjacent access, prefer a read replica reachable only over VPN, private networking, or an SSH tunnel.
Verification: run git grep -n "postgres://\|password\|DATABASE_URL\|mcp_readonly" and confirm no secret-bearing MCP config is committed.
- Make the agent observable as its own workload.
Set a distinct role name, set application_name if the MCP server supports it, sample slow statements, and dashboard the role separately. PostgreSQL logging can include user, database, client address, application name, and query identifiers depending on configuration. That is the difference between debugging the agent and guessing around it.
Verification: query pg_stat_activity while the agent runs and confirm the role, database, client address, and current query are visible.
In Practice
The documented pattern is not “add one more confirmation dialog.” It is to make the dangerous action unreachable before the agent gets creative.
Public reporting on PocketOS describes a short chain: the agent hit a staging credential mismatch, found a broadly scoped token, called Railway, and deleted the production database volume together with volume-level backups. SC Media’s brief reports the credential mismatch, broad API token, Railway delete path, and production volume deletion. TechSpot’s report adds the operational lesson that backups in the same failure path did not behave like an independent recovery boundary.
That chain maps cleanly to database controls:
| Incident action | Hard boundary that should stop it | Why the boundary matters |
|---|---|---|
| Agent finds a broad production token | Project-scoped MCP config and no secret-bearing repo files | The agent cannot use credentials it cannot read |
| Agent reaches production infrastructure from a staging task | Network and project scoping | A staging workflow should not have a route to production database deletion |
| Agent attempts destructive data action | Dedicated read-only database role plus read-only transactions | The database rejects writes even if the model selects the wrong tool |
| Agent can inspect raw operational data | Sanitized views and column-level grants | The useful context is available without exposing tokens, hashes, notes, or unrelated tenant data |
| Agent’s queries blend into normal traffic | Dedicated role and application_name | Incident response can identify the workload without reconstructing intent from chat logs |
PostgreSQL’s privilege model is the first source of truth here. The PostgreSQL privileges documentation defines permissions such as SELECT, INSERT, UPDATE, DELETE, TRUNCATE, CREATE, CONNECT, and USAGE as database privileges. It also states that the right to modify or destroy an object is inherent in ownership. So the agent role should not own tables, should not inherit owner roles, and should receive only CONNECT, schema USAGE, and SELECT on a narrow view schema.
PostgreSQL’s transaction access mode gives a second hard stop. The official SET TRANSACTION documentation says read-only transactions disallow the write and definition-changing statements that matter for this risk class, including INSERT, UPDATE, DELETE, MERGE, CREATE, ALTER, DROP, GRANT, REVOKE, and TRUNCATE. The same page is explicit that this is a high-level access mode and does not prevent all disk activity. That is why read-only has to be paired with statement_timeout, connection limits, lock limits, and preferably a replica.
Row-level security is useful, but it is not magic. The PostgreSQL row security documentation says row security defaults to denying access when enabled without a policy, but also says superusers, roles with BYPASSRLS, and table owners can bypass row security. That is the operational reason for NOBYPASSRLS, non-owner roles, exact-credential testing, and sanitized views when the real concern is confidentiality rather than tenant routing.
Anthropic’s own Claude Code security documentation makes the same point from the client side. The security page says Claude Code uses strict read-only permissions by default, asks for explicit permission for actions such as editing files and running commands, requires trust verification for first-time codebases and new MCP servers, and uses fail-closed matching for unmatched commands. It also says users are responsible for reviewing proposed commands, and that Anthropic reviews connectors for listing criteria but does not security-audit or manage every MCP server. Translation: client permissions are useful friction. They are not a substitute for database privileges, network isolation, credential scoping, and backup separation.
Where It Breaks
| Failure mode | Trigger | Fix |
|---|---|---|
| Replica lag spike | Agent runs broad scans on a physical replica under PostgreSQL 15 or later | Use statement_timeout, query allowlists for expensive tools, and replica lag alerts tied to the agent role |
| Confidentiality leak | Agent can read raw users, sessions, api_keys, or support note tables | Grant only sanitized views or column-level SELECT; keep sensitive fields unreachable |
| Lock annoyance | Agent issues SELECT ... FOR SHARE, extension-backed functions, or long EXPLAIN ANALYZE | Deny unsafe tools, set lock_timeout = '2s', and restrict functions executable by the role |
| RLS bypass | Agent role owns tables, is superuser, or has BYPASSRLS | Use a non-owner NOBYPASSRLS role and test visibility with the exact MCP credential |
| Connection starvation | MCP server pool is too large for a small Postgres instance or PgBouncer pool | Cap CONNECTION LIMIT, cap MCP pool size, and reserve production app connections |
| Prompt injection through rows | User-controlled text tells the agent to reveal other rows or call another tool | Treat database content as untrusted input, isolate tools by project, and prevent sensitive data from being readable |
| False sense of safety | Agent connects to primary with read-only SQL but unrestricted table access | Use a replica, view schema, audit logging, and workload limits together |
| Audit gap | All queries arrive as a generic developer or app role | Dedicated role, application_name, slow query sampling, and retention for generated SQL |
What to Do Next
- Problem: AI agents connected to databases turn ordinary credentials into autonomous operational power.
- Solution: Put controls below the prompt: read-only role, read-only transactions, scoped MCP config, sanitized views, network boundaries, independent backups, and workload limits.
- Proof: The validation signal is mechanical failure:
DELETE,UPDATE,CREATE, andDROPmust fail when executed through the exact agent path. - Action: This week, create one non-production MCP Postgres profile against a read replica or disposable database, then run the destructive-command test before allowing access to anything that matters.
The agent can be helpful at the database layer, but only after the database has been made stubborn enough to survive the agent.