Natural Language SQL Agents Need Database Guardrails

The dangerous part of a natural-language SQL agent is not bad SQL. It is authority compilation: a sentence from a user becomes a database operation unless the system proves, before execution, which role, rows, columns, cost, endpoint, and business definitions the query is allowed to touch.

Situation

PostgreSQL chat agents are moving from demos into operational workflows: fraud review, support analytics, compliance pulls, finance close checks, customer health reports. The production pattern is not the chat interface. It is the control plane around database authority.

Default approach	Production approach
Prompt goes to LLM, LLM writes SQL, workflow runs it	Prompt becomes an authorized analytical request, SQL is generated, parsed, bounded, executed, audited, and summarized
Agent connects as a broad application user	Agent connects through a read-only role scoped to curated views
Safety lives in prompt instructions	Safety lives in PostgreSQL privileges, row-level security, SQL parsing, timeouts, execution policy, and audit records
Results are trusted because the query ran	Results are checked against definitions, row counts, tenant scope, freshness, truncation, and expected shape

A workflow stack using Crafted AI Framework, n8n, CopilotKit, Supabase, Slack, and PostgreSQL can be useful. The source pattern is attractive: natural-language request, generated PostgreSQL query, n8n workflow execution, CopilotKit-style summarization, and delivery to a UI or channel.

That is the easy part.

The harder question is: what happens when the user asks a plausible question that maps to an expensive, unauthorized, stale, or semantically wrong query?

The Problem

Natural-language SQL fails in production because language is flexible and databases are literal. “Show anomalous transactions in Q3” sounds harmless until the agent scans a large event table on the primary writer, omits the tenant predicate, reads restricted columns through broad credentials, and sends a confident summary to Slack.

Failure point	What breaks	Why it matters
PostgreSQL role design	Agent connects as an app owner, migration user, Supabase service role, or another role with broad grants	`SELECT` becomes only the visible part of authority; the same credentials may read sensitive columns, bypass RLS, or run write statements
SQL generation	LLM emits `SELECT *`, missing tenant filters, broad joins, ambiguous dates, unbounded detail queries, or `ORDER BY` on non-indexed expressions	A syntactically valid query can be operationally wrong, expensive, or unauthorized
PostgreSQL planner behavior	A generated query can choose a sequential scan, hash join, nested loop, or large sort based on predicates and statistics	The agent does not know that its “simple report” just became an OLTP workload problem
Row-level security	Policies apply only when enabled and evaluated for the role actually executing the query	Authorization bugs move from application code into database policy, where silent under-filtering is easy to miss
Workflow automation	Webhooks, schedules, and retries repeatedly trigger the same bad query	A single bad prompt becomes recurring workload
Result summarization	CopilotKit or another summarizer compresses rows into prose	The final answer can hide missing filters, partial results, timeout truncation, replica lag, or policy caveats

The core question is not “Can the agent write SQL?” The core question is “Can the system prove that the generated SQL is authorized, bounded, explainable, and cheap enough to run before PostgreSQL sees it?”

Architecture Problem

The architectural tension is that natural language and database authority operate on incompatible principles.

Natural language is designed to be flexible, contextual, and forgiving. “Show me the risky transactions last quarter” is meaningful to a human even without knowing which table, which column definition of risk, which fiscal calendar, which tenant, or how expensive the query is. The speaker expects the listener to resolve ambiguity gracefully.

Database authority is designed to be precise, bounded, and unforgiving. PostgreSQL does not interpret intent. It executes exactly what it receives: the role determines what can be read, the SQL determines what is read, and once a query runs, the cost and data exposure have already occurred.

A naive SQL agent architecture collapses these two systems directly: user text goes to a model, the model emits SQL, and that SQL runs. This architecture fails in production not because the model is incompetent but because the authority boundary is wrong. The model is solving a language problem. The authority problem requires a different layer.

The architecture problem is: how do you insert a control plane between language and authority that is narrow enough to be safe, without being so narrow that it is useless?

Design Options

Three common approaches exist, and each trades safety against capability differently.

Option	Description	Safety mechanism	Failure mode
Prompt-only guardrails	LLM is instructed not to write dangerous queries	Model compliance	Any prompt injection, jailbreak, or training gap can bypass it
Application-layer validation	Middleware checks SQL for banned patterns before execution	Regex and keyword matching	Multi-statement tricks, schema aliases, and edge-case syntax bypass string checks
Database-native boundaries + control plane	PostgreSQL role, RLS, views, parser gate, planner check, read-only execution, timeouts	Database engine and abstract syntax tree	Requires upfront investment; does not protect against slow but valid queries unless planner bounds are set

Option A: Prompt-only is appropriate for demos and internal low-risk tools where the SQL touches only non-sensitive read data and the blast radius of a wrong query is low. It should never be used in production with customer data, production credentials, or any write path.

Option B: Application-layer validation adds a middleware filter that scans SQL for DROP, DELETE, INSERT, and similar keywords. This is stronger than a prompt, but still weak: PostgreSQL syntax has too many legitimate variations and aliases to reliably block dangerous patterns with strings. String-based SQL validation fails open under adversarial pressure.
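
A minimal sketch of why keyword matching fails in both directions. The blocklist pattern and the function name are hypothetical, chosen to resemble the kind of filter Option B describes rather than any specific middleware:

```python
import re

# Hypothetical blocklist; real middleware patterns vary, but the failure shape is the same.
BANNED = re.compile(r"\b(DROP|DELETE|INSERT|UPDATE|TRUNCATE|ALTER)\b", re.IGNORECASE)


def keyword_filter_allows(sql: str) -> bool:
    """Return True when the string filter would let this SQL through."""
    return BANNED.search(sql) is None


# False positive: a read-only query is blocked because a string literal contains a keyword.
blocked_but_harmless = (
    "SELECT transaction_id FROM analytics_agent.agent_fraud_transactions_v1 "
    "WHERE risk_level = 'update'"
)

# False negatives: no banned keyword, yet one reads a base table and one holds a connection.
allowed_but_unsafe = [
    "SELECT * FROM app.customers",
    "SELECT pg_sleep(30)",
]

print(keyword_filter_allows(blocked_but_harmless))
print([keyword_filter_allows(q) for q in allowed_but_unsafe])
```

The filter rejects the harmless query and passes both unsafe ones, because neither authorization nor cost is visible at the keyword level.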

Option C: Database-native + control plane is the only option here suited to production. It removes the reliance on model compliance or string matching by enforcing authority at layers the model's output cannot override: the PostgreSQL role model, the AST parser, the transaction mode, and the execution endpoint.

Tradeoff Matrix

Dimension	Prompt-only	App-layer validation	Database-native control plane
Setup time	Minutes	Hours	Days
Authority enforcement	Model compliance only	Partial — string matching	Database engine — cannot be bypassed
Write protection	Advisory	Partial	Enforced
PII exposure risk	High	Partial	Low — views and column grants
Load isolation	None	None	Enforced by endpoint routing and timeouts
Prompt injection resistance	None	Low	High — model output cannot grant authority
Compliance defensibility	None	Low	High — role grants and RLS are auditable
Right for	Demos, internal tools	Low-risk read workflows	Customer data, production, regulated contexts

Build a SQL Agent Control Plane

The right architecture puts the LLM behind a policy boundary. The model may propose SQL. It does not decide whether the SQL is safe.

flowchart TD
    User[User question] --> Intake[request intake — identity and purpose]
    Intake --> Catalog[semantic catalog — approved metrics and views]
    Catalog --> Generator[LLM SQL generator]
    Generator --> Parser[SQL parser — inspect query tree]
    Parser --> Policy[policy gate — tables columns tenant and limits]
    Policy -->|approved query| Planner[PostgreSQL explain check]
    Policy -->|rejected query| Repair[repair prompt with policy error]
    Repair --> Generator
    Planner -->|acceptable cost| Replica[read replica or analytics endpoint]
    Planner -->|too expensive| Reject[reject with safer query shape]
    Replica --> Validator[result validator — shape and scope]
    Validator --> Summarizer[LLM report composer]
    Summarizer --> Delivery[Slack email dashboard or UI]
    Validator --> Audit[audit log — prompt query user result metadata]

The architecture has six controls. Skip any one of them and the agent has more authority than you think.

Constrain the data surface before prompting the model.

Do not expose base tables such as transactions, customers, accounts, or payments directly. Create approved views such as analytics_agent.agent_fraud_transactions_v1 and analytics_agent.agent_customer_activity_daily_v1. These views should encode allowed columns, masking rules, joins, freshness expectations, and business definitions such as “high-risk country” or “Q3 fiscal calendar.”

A useful view is boring on purpose:
```
CREATE SCHEMA IF NOT EXISTS analytics_agent;

CREATE VIEW analytics_agent.agent_fraud_transactions_v1
WITH (security_barrier = true) AS
SELECT
    t.tenant_id,
    t.transaction_id,
    t.user_id,
    t.amount_cents,
    t.transaction_at,
    t.destination_country,
    rc.risk_level,
    rc.definition_version AS risk_definition_version
FROM app.transactions t
JOIN app.risk_countries rc
    ON rc.country_code = t.destination_country
WHERE t.deleted_at IS NULL;
```
PostgreSQL security_barrier views matter because user-supplied predicates are not always innocent. PostgreSQL documents that view conditions are evaluated before user-added conditions for security-barrier views, with leakproof-function caveats (PostgreSQL 16 CREATE VIEW). That does not make a view a complete security system, but it makes predicate ordering part of the access design instead of an accident.

Verification:
```
SELECT grantee, table_schema, table_name, privilege_type
FROM information_schema.role_table_grants
WHERE grantee = 'agent_reader'
ORDER BY table_schema, table_name, privilege_type;
```
Then connect as the runtime role and confirm it has SELECT only on approved views:
```
psql "$AGENT_DATABASE_URL" -c "\dp analytics_agent.*"
```

Use PostgreSQL privileges and RLS as the first hard boundary.

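PostgreSQL row-level security restricts which rows are visible once row security is enabled. The documentation also states that table owners normally bypass row security unless FORCE ROW LEVEL SECURITY is set, and roles with BYPASSRLS bypass it (PostgreSQL 16 RLS). Views add another case: by default a view applies the row-security policies of the view owner, not the querying role, unless the view is created with security_invoker = true, an option available from PostgreSQL 15 (PostgreSQL 16 CREATE VIEW). Supabase has the same operational warning in another form: service keys can bypass RLS and should not be exposed to customers or browsers (Supabase RLS docs).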

For agent access, ownership, application runtime, and agent querying should be separate roles:

```
CREATE ROLE agent_reader NOLOGIN;
CREATE ROLE agent_runtime LOGIN PASSWORD 'use-secret-manager';

GRANT agent_reader TO agent_runtime;

REVOKE ALL ON SCHEMA app FROM agent_reader;
REVOKE ALL ON ALL TABLES IN SCHEMA app FROM agent_reader;

GRANT USAGE ON SCHEMA analytics_agent TO agent_reader;
GRANT SELECT ON analytics_agent.agent_fraud_transactions_v1 TO agent_reader;

ALTER ROLE agent_runtime SET statement_timeout = '5s';
ALTER ROLE agent_runtime SET lock_timeout = '500ms';
ALTER ROLE agent_runtime SET idle_in_transaction_session_timeout = '10s';
ALTER ROLE agent_runtime SET default_transaction_read_only = on;
ALTER ROLE agent_runtime SET work_mem = '16MB';
```

If tenant isolation is handled through RLS or session context, test the exact runtime role:

```
BEGIN READ ONLY;
SET LOCAL app.tenant_id = '42';

SELECT count(*)
FROM analytics_agent.agent_fraud_transactions_v1
WHERE tenant_id = current_setting('app.tenant_id')::bigint;

COMMIT;
```
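
The same transaction shape can be driven from application code. This is a minimal sketch, assuming a DB-API style cursor on an autocommit connection whose driver uses `%s` placeholders (psycopg-style). `run_tenant_scoped` and `RecordingCursor` are hypothetical names, and the recording cursor exists only so the sketch runs without a database. It uses PostgreSQL's `set_config(name, value, is_local)` function because `SET LOCAL` does not accept bind parameters:

```python
from typing import Any, List, Sequence, Tuple


def run_tenant_scoped(cur: Any, tenant_id: int, sql: str, params: Sequence[Any] = ()) -> List[Any]:
    """Run one already-approved SELECT in a read-only transaction bound to a tenant."""
    cur.execute("BEGIN READ ONLY")
    try:
        # is_local = true scopes the setting to this transaction, like SET LOCAL.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (str(tenant_id),))
        cur.execute(sql, params)
        rows = cur.fetchall()
        cur.execute("COMMIT")
        return rows
    except Exception:
        cur.execute("ROLLBACK")
        raise


class RecordingCursor:
    """Stand-in for a real cursor: records statements and returns canned rows."""

    def __init__(self, rows: List[Tuple[Any, ...]]):
        self.rows = rows
        self.statements: List[str] = []

    def execute(self, sql: str, params: Sequence[Any] = ()) -> None:
        self.statements.append(sql)

    def fetchall(self) -> List[Tuple[Any, ...]]:
        return self.rows


cur = RecordingCursor(rows=[(3,)])
result = run_tenant_scoped(
    cur,
    42,
    "SELECT count(*) FROM analytics_agent.agent_fraud_transactions_v1 "
    "WHERE tenant_id = current_setting('app.tenant_id')::bigint",
)
print(result)
print(cur.statements)
```

The tenant value comes from the authenticated request context, never from model output, so the generated SQL can reference `current_setting('app.tenant_id')` without being able to choose the tenant.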

Verification should compare at least three perspectives: table owner, application role, and agent role. The agent role is the one that matters.

Parse generated SQL before execution.

A regex that blocks DELETE is theater. Parse the query into an abstract syntax tree and inspect statement type, referenced relations, selected columns, functions, joins, predicates, LIMIT, comments, and statement count. For PostgreSQL-specific syntax, use a parser tied to PostgreSQL grammar, such as libpg_query, which exposes the PostgreSQL parser outside the server (pganalyze libpg_query).

The policy should reject multi-statement input before relying on database timeouts. PostgreSQL 16 documents that statement_timeout applies to each statement in a simple-query message, and that behavior changed from versions before PostgreSQL 13 (PostgreSQL 16 client defaults). That version detail matters: a control plane that accepts SELECT ...; DROP ...; and hopes timeout saves it has already failed.

The rejection suite should include at least these cases:
```
DELETE FROM app.transactions WHERE tenant_id = 42;

SELECT * FROM app.customers;

SELECT email, card_number
FROM analytics_agent.agent_fraud_transactions_v1;

SELECT *
FROM analytics_agent.agent_fraud_transactions_v1
WHERE amount_cents > 1000000;

SELECT pg_sleep(30);

SELECT *
FROM analytics_agent.agent_fraud_transactions_v1;
DROP TABLE app.transactions;
```
Verification: dangerous prompts should produce blocked SQL, not “best effort” repairs that silently weaken the policy.
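
A minimal sketch of the policy gate run against that rejection suite. It assumes a PostgreSQL-grammar parser has already reduced each input string to a summary; the `ParsedQuery` shape, the constants, and `policy_violations` are hypothetical, and the sketch does not model the aggregate-only exemption for `LIMIT`:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedQuery:
    """Hypothetical summary extracted from the parse tree of one input string."""
    statement_types: List[str]
    relations: List[str]
    columns: List[str]
    functions: List[str] = field(default_factory=list)
    has_tenant_predicate: bool = False
    limit: Optional[int] = None


ALLOWED_SCHEMA = "analytics_agent"
BLOCKED_COLUMNS = {"email", "ssn", "card_number", "access_token", "address"}
BLOCKED_FUNCTIONS = {"pg_sleep"}
MAX_LIMIT = 5000


def policy_violations(q: ParsedQuery) -> List[str]:
    """Return every violated rule; an empty list means the query may proceed."""
    out: List[str] = []
    if q.statement_types != ["SELECT"]:
        out.append("exactly one SELECT statement is required")
    out += [f"relation not allowlisted: {r}" for r in q.relations
            if not r.startswith(ALLOWED_SCHEMA + ".")]
    if "*" in q.columns:
        out.append("SELECT * is not allowed")
    out += [f"blocked column: {c}" for c in q.columns if c in BLOCKED_COLUMNS]
    out += [f"blocked function: {fn}" for fn in q.functions if fn in BLOCKED_FUNCTIONS]
    if q.relations and not q.has_tenant_predicate:
        out.append("missing tenant predicate")
    if q.limit is None or q.limit > MAX_LIMIT:
        out.append(f"LIMIT of at most {MAX_LIMIT} is required")
    return out


VIEW = "analytics_agent.agent_fraud_transactions_v1"

# One summary per statement in the rejection suite, in the same order.
rejection_suite = [
    ParsedQuery(["DELETE"], ["app.transactions"], []),
    ParsedQuery(["SELECT"], ["app.customers"], ["*"]),
    ParsedQuery(["SELECT"], [VIEW], ["email", "card_number"]),
    ParsedQuery(["SELECT"], [VIEW], ["*"]),
    ParsedQuery(["SELECT"], [], [], functions=["pg_sleep"]),
    ParsedQuery(["SELECT", "DROP"], [VIEW, "app.transactions"], ["*"]),
]
approved = ParsedQuery(["SELECT"], [VIEW], ["transaction_id", "amount_cents"],
                       has_tenant_predicate=True, limit=500)

for query in rejection_suite:
    print(policy_violations(query))
print(policy_violations(approved))
```

Returning every violation, rather than the first, gives the repair prompt a complete policy error and gives the audit log a full reason for the rejection.
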

Run planner checks before execution.

PostgreSQL EXPLAIN (FORMAT JSON) returns the selected plan without executing the statement. PostgreSQL also notes that planner decisions depend on up-to-date pg_statistic data (PostgreSQL 16 EXPLAIN). Treat planner checks as a guardrail, not as proof.

Example policy:
```
{
  "max_estimated_rows": 1000000,
  "max_total_cost": 250000,
  "forbid_seq_scan_on": [
    "app.transactions",
    "app.events",
    "app.audit_log"
  ],
  "require_limit_for_detail_queries": true,
  "max_limit": 5000
}
```
Use EXPLAIN without ANALYZE in the preflight path. EXPLAIN ANALYZE executes the statement, which defeats the purpose of a pre-execution gate.
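
A minimal sketch of applying part of that policy to planner output. It assumes the plan was produced with `EXPLAIN (VERBOSE, FORMAT JSON)`, since the `Schema` key appears only with `VERBOSE`; `plan_violations`, `iter_nodes`, and the sample plan are hypothetical. The plan names base tables because PostgreSQL expands views before planning, which is why the policy lists `app.*` relations:

```python
import json
from typing import Any, Dict, Iterator, List

POLICY = {
    "max_estimated_rows": 1_000_000,
    "max_total_cost": 250_000,
    "forbid_seq_scan_on": {"app.transactions", "app.events", "app.audit_log"},
}


def iter_nodes(node: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Walk a plan node and all of its children depth-first."""
    yield node
    for child in node.get("Plans", []):
        yield from iter_nodes(child)


def plan_violations(explain_json: str, policy: Dict[str, Any]) -> List[str]:
    """Compare planner estimates against policy thresholds without running the query."""
    root = json.loads(explain_json)[0]["Plan"]
    out: List[str] = []
    if root["Total Cost"] > policy["max_total_cost"]:
        out.append(f"total cost {root['Total Cost']} exceeds {policy['max_total_cost']}")
    for node in iter_nodes(root):
        # Check every node: a LIMIT at the root can hide a large scan underneath.
        if node["Plan Rows"] > policy["max_estimated_rows"]:
            out.append(f"{node['Node Type']} estimates {node['Plan Rows']} rows")
        if node["Node Type"] == "Seq Scan":
            relation = f"{node.get('Schema', '')}.{node.get('Relation Name', '')}"
            if relation in policy["forbid_seq_scan_on"]:
                out.append(f"sequential scan on {relation}")
    return out


# Hand-written sample in the shape EXPLAIN emits; the numbers are illustrative.
sample_plan = json.dumps([{
    "Plan": {
        "Node Type": "Limit", "Total Cost": 310000.0, "Plan Rows": 500,
        "Plans": [{
            "Node Type": "Seq Scan", "Schema": "app", "Relation Name": "transactions",
            "Total Cost": 309000.0, "Plan Rows": 4200000,
        }],
    }
}])

for problem in plan_violations(sample_plan, POLICY):
    print(problem)
```

These are estimates, so the thresholds bound the expected cost rather than the actual cost; the statement timeout remains the backstop.
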

Execute on isolated read capacity.

Natural-language analytics should not run on the primary writer unless the dataset is small and the blast radius is understood. Amazon RDS documents PostgreSQL read replicas as read-only instances used to scale read traffic (RDS PostgreSQL read replicas). Aurora reader endpoints provide connection balancing for read-only connections across reader instances, with the caveat that if a cluster has no Aurora Replicas the reader endpoint connects to the primary instance (Aurora reader endpoint).

Verification should be explicit:
```
SHOW transaction_read_only;
SELECT pg_is_in_recovery();
```
In ordinary PostgreSQL physical replicas, pg_is_in_recovery() returns true on a standby. In managed services, also verify the endpoint label and deployment topology because the connection string is part of the architecture.

Make audit records useful for replay.

Logging “user asked a question” is not enough. A production audit record should let a reviewer reconstruct the request, policy decision, query, plan, execution boundary, and delivered answer.

```
{
  "request_id": "req_01j...",
  "user_id": "user_12345",
  "tenant_id": "42",
  "source": "copilot_ui",
  "natural_language_prompt": "Show transactions over $10,000 in Q2 2025 for user 12345 and flag high-risk countries",
  "semantic_definitions": {
    "quarter": "calendar_quarter_v1",
    "risk_country": "risk_country_v2"
  },
  "generated_sql_hash": "sha256:...",
  "approved_sql_hash": "sha256:...",
  "referenced_relations": [
    "analytics_agent.agent_fraud_transactions_v1"
  ],
  "policy_decision": "approved",
  "policy_version": "sql_agent_policy_2026_05_23",
  "postgres_role": "agent_runtime",
  "execution_endpoint": "reader",
  "statement_timeout_ms": 5000,
  "estimated_rows": 840,
  "returned_rows": 3,
  "result_truncated": false,
  "replica_lag_ms": 1200,
  "delivered_to": "slack:fallback-review-channel"
}
```
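
A minimal sketch of how the two hash fields might be produced. `sql_hash`, `audit_record`, and the `sql_rewritten_by_policy` field are hypothetical additions for illustration, and only the hash-related subset of the record is built:

```python
import hashlib
import json
from typing import Any, Dict


def sql_hash(sql: str) -> str:
    """Hash the exact SQL text so a stored statement can be matched to its record."""
    return "sha256:" + hashlib.sha256(sql.encode("utf-8")).hexdigest()


def audit_record(request_id: str, prompt: str, generated_sql: str,
                 approved_sql: str, policy_version: str) -> Dict[str, Any]:
    """Build the hash-related part of an audit record (other fields omitted for brevity)."""
    generated, approved = sql_hash(generated_sql), sql_hash(approved_sql)
    return {
        "request_id": request_id,
        "natural_language_prompt": prompt,
        "generated_sql_hash": generated,
        "approved_sql_hash": approved,
        # Hypothetical extra field: makes policy rewrites visible to reviewers.
        "sql_rewritten_by_policy": generated != approved,
        "policy_version": policy_version,
    }


record = audit_record(
    request_id="req_example",
    prompt="Show transactions over $10,000 in Q2 2025 for user ID 12345",
    generated_sql="SELECT transaction_id FROM analytics_agent.agent_fraud_transactions_v1",
    approved_sql="SELECT transaction_id FROM analytics_agent.agent_fraud_transactions_v1 LIMIT 500",
    policy_version="sql_agent_policy_2026_05_23",
)
print(json.dumps(record, indent=2))
```

A hash supports integrity checks, not replay on its own; the sketch assumes the SQL text itself is stored elsewhere, keyed by the hash.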

A minimal guardrail policy looks like this:

Control	Example policy	Failure behavior
Statement type	Allow one `SELECT` statement only	Reject
Relation access	Allow `analytics_agent.*` views only	Reject
Column access	Block raw `email`, `ssn`, `card_number`, `access_token`, `address`	Reject
Tenant scope	Require `tenant_id = current_setting('app.tenant_id')` or enforce through RLS	Reject
Row bound	Require `LIMIT <= 5000` unless aggregate-only	Rewrite or reject
Time bound	Require date predicate for event tables over 10 million rows	Reject
Planner bound	Reject estimated rows over 1 million or total cost over policy threshold	Reject
Execution bound	`READ ONLY`, `statement_timeout`, `lock_timeout`, read endpoint	Cancel or reject
Summary bound	Require row count, filter statement, definition versions, and truncation status	Withhold summary

The uncomfortable detail: the LLM should not be asked to remember these controls. It should be allowed to fail against them.
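
One way to let the model fail against the controls is a bounded generate-validate-repair loop that never relaxes the policy. A minimal sketch, with hypothetical names throughout; `fake_model` and `toy_validate` are stand-ins so the loop runs, and a real validator would inspect the parsed tree rather than the string:

```python
from typing import Callable, List, Optional


class PolicyRejected(Exception):
    """Raised when no candidate passes the gate within the attempt budget."""


def generate_with_gate(
    generate: Callable[[str, Optional[str]], str],
    validate: Callable[[str], List[str]],
    prompt: str,
    max_attempts: int = 3,
) -> str:
    """Ask for SQL, validate it, and feed policy errors back; the policy stays fixed."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(prompt, feedback)
        errors = validate(candidate)
        if not errors:
            return candidate
        feedback = "; ".join(errors)
    # Fail closed: running out of attempts is a rejection, not a fallback.
    raise PolicyRejected(f"rejected after {max_attempts} attempts: {feedback}")


def fake_model(prompt: str, feedback: Optional[str]) -> str:
    """Stand-in generator that adds a LIMIT once it receives policy feedback."""
    base = "SELECT transaction_id FROM analytics_agent.agent_fraud_transactions_v1"
    return base + " LIMIT 500" if feedback else base


def toy_validate(sql: str) -> List[str]:
    """Toy check used only to exercise the loop."""
    return [] if " LIMIT " in sql.upper() else ["missing LIMIT"]


print(generate_with_gate(fake_model, toy_validate, "recent fraud transactions"))
```

The validator is the only component that can approve a query, so a model that ignores feedback exhausts its attempts and the request is rejected.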

In Practice

This is not a private case study. It follows from documented PostgreSQL behavior, Supabase security guidance, and public cloud database design.

Documented behavior or decision	Production lesson
PostgreSQL read-only transactions disallow `INSERT`, `UPDATE`, `DELETE`, `MERGE`, DDL, `TRUNCATE`, and other write-oriented commands, with documented exceptions and caveats (PostgreSQL 15 SET TRANSACTION)	A prompt instruction saying “never modify data” is weaker than a transaction mode that refuses write statements
PostgreSQL RLS applies policies once row security is enabled, but table owners normally bypass row security unless forced, and `BYPASSRLS` roles bypass it (PostgreSQL 16 RLS)	Agent isolation belongs in the database role model, not only in application middleware
Supabase service keys can bypass RLS and are intended for administrative server-side use, not exposed clients (Supabase RLS docs)	A database agent should not run with Supabase service-role authority unless it is performing an explicitly administrative workflow
PostgreSQL `security_barrier` views affect when view predicates are evaluated relative to user-supplied predicates, with leakproof-function caveats (PostgreSQL 16 CREATE VIEW)	Curated views are not just developer convenience; they are part of the access boundary for agent-generated predicates
PostgreSQL `statement_timeout` is measured from command arrival through completion and, since PostgreSQL 13, applies separately to each statement in a simple-query message (PostgreSQL 16 client defaults)	The parser must reject multiple statements; timeout policy is not a substitute for statement-shape validation
PostgreSQL `idle_in_transaction_session_timeout` terminates sessions idle inside an open transaction, and the docs note that open transactions can prevent cleanup of recently dead tuples (PostgreSQL 16 client defaults)	A chat workflow that starts a transaction and waits on an external LLM call can contribute to bloat if timeout policy is missing
Amazon RDS documents PostgreSQL read replicas as read-only instances for scaling read traffic (RDS PostgreSQL read replicas)	Analytical agent traffic should be isolated from the write path before recurring workflows depend on it
Aurora reader endpoints balance read-only connections across reader instances when replicas exist (Aurora reader endpoint)	The database endpoint is an architectural control, not a deployment detail

I have not run the exact Crafted AI Framework plus n8n plus CopilotKit stack at scale personally. The documented failure mode is still clear: any system that turns user language into PostgreSQL queries must defend against overbroad authority, expensive plans, ambiguous definitions, stale reads, and misleading summaries.

The production pattern is to split query authoring from query authority. The LLM authors a candidate. PostgreSQL, the parser, the policy engine, and the workflow orchestrator decide whether that candidate deserves execution.

For the source example, the user asks:

Show transactions over $10,000 in Q2 2025 for user ID 12345 and flag high-risk countries.

A weak agent might produce this:

```
SELECT
    t.*,
    c.risk_level
FROM transactions t
JOIN countries c ON t.destination_country = c.country_code
WHERE t.user_id = 12345
  AND t.amount > 10000
  AND t.date BETWEEN '2025-04-01' AND '2025-06-30'
  AND c.risk_level = 'high';
```

This query should be rejected, even though it looks close. It references base tables, uses SELECT *, relies on ambiguous money units, omits tenant binding, uses an inclusive date boundary on a likely timestamp column, relies on unversioned risk definitions, and has no explicit row bound.

A guarded system should repair it into a query against an approved surface:

```
SELECT
    transaction_id,
    user_id,
    amount_cents,
    transaction_at,
    destination_country,
    risk_level,
    risk_definition_version
FROM analytics_agent.agent_fraud_transactions_v1
WHERE tenant_id = current_setting('app.tenant_id')::bigint
  AND user_id = 12345
  AND amount_cents > 1000000
  AND transaction_at >= TIMESTAMPTZ '2025-04-01 00:00:00+00'
  AND transaction_at <  TIMESTAMPTZ '2025-07-01 00:00:00+00'
  AND risk_level = 'high'
ORDER BY amount_cents DESC
LIMIT 500;
```

The validation result should be explicit:

Check	Result	Reason
Statement type	Pass	Single `SELECT`
Relation allowlist	Pass	Uses `analytics_agent.agent_fraud_transactions_v1`
Base table access	Pass	No direct `app.*` relation
Sensitive columns	Pass	No raw email, card number, token, or address fields
Tenant scope	Pass	Binds to `current_setting('app.tenant_id')`
Time scope	Pass	Half-open Q2 2025 UTC range
Row bound	Pass	`LIMIT 500`
Planner check	Pass or reject	Based on `EXPLAIN (FORMAT JSON)` policy thresholds
Execution endpoint	Pass	Reader connection only
Summary contract	Pass	Must include filters, definitions, row count, and truncation status

The workflow output should not only say “3 transactions over $10,000 detected.” It should include the query boundary:

Q2 2025 was interpreted as 2025-04-01 through 2025-06-30 UTC. High-risk country came from risk_country_v2. Results were limited to tenant 42, user 12345, and 500 rows. The query returned 3 rows from the reader endpoint. No causal explanation was inferred from these rows.

That is not verbosity. That is evidence.
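
A minimal sketch of enforcing that boundary in code, so the summary is withheld when the evidence is incomplete. The envelope field names and `render_boundary` are hypothetical, loosely following the audit record fields:

```python
from typing import Any, Dict

REQUIRED_FIELDS = ("period", "definitions", "tenant_id", "row_limit",
                   "returned_rows", "result_truncated", "endpoint")


def render_boundary(envelope: Dict[str, Any]) -> str:
    """Render the query boundary, or refuse if any required field is missing."""
    missing = [name for name in REQUIRED_FIELDS if name not in envelope]
    if missing:
        raise ValueError("summary withheld, missing: " + ", ".join(missing))
    definitions = ", ".join(f"{k}={v}" for k, v in sorted(envelope["definitions"].items()))
    truncated = "truncated" if envelope["result_truncated"] else "complete"
    return (
        f"Period: {envelope['period']}. Definitions: {definitions}. "
        f"Tenant: {envelope['tenant_id']}. Row limit: {envelope['row_limit']}. "
        f"Returned {envelope['returned_rows']} rows ({truncated}) "
        f"from the {envelope['endpoint']} endpoint."
    )


envelope = {
    "period": "2025-04-01 to 2025-07-01 UTC (half-open)",
    "definitions": {"quarter": "calendar_quarter_v1", "risk_country": "risk_country_v2"},
    "tenant_id": "42",
    "row_limit": 500,
    "returned_rows": 3,
    "result_truncated": False,
    "endpoint": "reader",
}
print(render_boundary(envelope))
```

Building the boundary text from metadata, not from the summarizer model, keeps the filters and definitions out of reach of paraphrase.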

A useful workflow looks like this:

Stage	Input	Output	Control
User request	Natural-language question	Structured intent	Require authenticated user, tenant context, and purpose
Semantic lookup	“Q2 2025”, “high-risk country”, “transactions”	Approved metric and view definitions	Use catalog definitions, not model memory
SQL generation	Structured intent and schema subset	Candidate SQL	Prompt includes only approved views
SQL validation	Candidate SQL	Approved or rejected query	Parser enforces allowlist, predicates, and limits
Plan check	Approved query	Plan JSON	Reject large scans, unsafe joins, and high-cost plans
Execution	Final SQL	Rows or aggregate result	Read-only role, read endpoint, timeout, lock timeout
Result validation	Rows plus metadata	Validated result envelope	Check row count, truncation, tenant scope, and freshness
Summarization	Validated result envelope	Report	Include filters, row count, definitions, and caveats
Audit	Prompt, SQL, user, plan, result metadata	Immutable log	Support review, replay, and incident analysis
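
The result validation stage can be sketched as a small function that turns rows plus metadata into the validated envelope. This is an illustration under assumptions: `validate_result` is a hypothetical name, rows are dictionaries, and the executed query is assumed to return `tenant_id` so scope can be checked after the fact:

```python
from typing import Any, Dict, List


def validate_result(rows: List[Dict[str, Any]], tenant_id: int, row_limit: int,
                    replica_lag_ms: int, max_lag_ms: int) -> Dict[str, Any]:
    """Check tenant scope, possible truncation, and freshness before summarization."""
    foreign = [r for r in rows if r.get("tenant_id") != tenant_id]
    if foreign:
        # Fail closed: one row from another tenant invalidates the whole result.
        raise ValueError(f"{len(foreign)} rows outside tenant {tenant_id}")
    return {
        "rows": rows,
        "returned_rows": len(rows),
        # A result that fills the LIMIT may be incomplete, so flag it.
        "result_truncated": len(rows) >= row_limit,
        "replica_lag_ms": replica_lag_ms,
        "fresh_enough": replica_lag_ms <= max_lag_ms,
    }


rows = [
    {"tenant_id": 42, "transaction_id": 9001, "amount_cents": 1250000},
    {"tenant_id": 42, "transaction_id": 9002, "amount_cents": 1100000},
]
checked = validate_result(rows, tenant_id=42, row_limit=500,
                          replica_lag_ms=1200, max_lag_ms=5000)
print(checked["returned_rows"], checked["result_truncated"], checked["fresh_enough"])
```

The summarizer then receives the envelope, not the raw rows, so truncation and freshness travel with the data.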

A basic PostgreSQL harness should be part of the release checklist:

```
-- Run from a session whose login role is a member of agent_runtime.
-- SET ROLE switches privileges only; ALTER ROLE ... SET timeouts apply at login,
-- so test those by connecting as agent_runtime directly.

-- Must fail: no base table access
SET ROLE agent_runtime;
SELECT count(*) FROM app.transactions;

-- Must fail: no write path
BEGIN READ ONLY;
DELETE FROM analytics_agent.agent_fraud_transactions_v1 WHERE tenant_id = 42;
ROLLBACK;

-- Must pass: approved view and bounded tenant context
BEGIN READ ONLY;
SET LOCAL app.tenant_id = '42';
SELECT transaction_id
FROM analytics_agent.agent_fraud_transactions_v1
WHERE tenant_id = current_setting('app.tenant_id')::bigint
ORDER BY transaction_at DESC
LIMIT 10;
COMMIT;

-- Must be inspected before execution in the control plane.
-- Set the tenant context first so the plan is built against the same predicate.
BEGIN READ ONLY;
SET LOCAL app.tenant_id = '42';
EXPLAIN (FORMAT JSON)
SELECT transaction_id
FROM analytics_agent.agent_fraud_transactions_v1
WHERE tenant_id = current_setting('app.tenant_id')::bigint
ORDER BY transaction_at DESC
LIMIT 10;
ROLLBACK;
```

This is the difference between a demo and an operating surface: the negative tests are as important as the happy path.

Where It Breaks

Failure mode	Trigger	Fix
The agent omits tenant scope	User asks a broad question, schema includes `tenant_id`, prompt does not force tenant binding	Enforce tenant scope through RLS or reject SQL missing the required tenant predicate
The query is read-only but still harmful	`SELECT count(*)` or a broad join scans a large event table on the writer	Route to a replica, require date predicates, set `statement_timeout`, and block high-cost plans from `EXPLAIN (FORMAT JSON)`
RLS gives false confidence	Policy exists, but the agent executes as table owner, a `BYPASSRLS` role, or a Supabase service role	Test access as the exact runtime role; avoid service-role credentials for user-scoped analytics
Views leak more than intended	A curated view includes sensitive columns, unsafe functions, or unclear predicate behavior	Keep views narrow, use `security_barrier` where appropriate, and test selected columns through the agent role
`LIMIT` hides correctness bugs	Agent adds `LIMIT 100` to satisfy policy but summarizes as if the result is complete	Require the report to state row limits and total count strategy; use aggregates for counts and samples for inspection
Replica lag creates stale answers	Agent reads from an asynchronous replica during incident response or fraud review	Include replica lag in result metadata; route freshness-critical questions to a dedicated bounded primary path
SQL parser and database version drift	Parser supports a different PostgreSQL grammar than the server executes	Pin parser support to the database major version; reject unsupported syntax rather than falling back to string checks
n8n retries multiply load	Workflow retry policy repeats a timeout-heavy query after transient failures	Add idempotency keys, exponential backoff, per-user rate limits, and query fingerprint throttling
LLM call happens inside a transaction	Workflow opens a transaction, calls the model, and waits while the database session sits idle	Generate and validate before `BEGIN`; set `idle_in_transaction_session_timeout` anyway
Summarizer invents explanation	Result table has sparse evidence, but the LLM describes causality or risk with high confidence	Give the summarizer only rows, schema definitions, and allowed explanation patterns; separate observation from interpretation
Business terms drift	“High risk,” “active user,” or “Q3” changes across finance, fraud, and product teams	Store definitions in a semantic catalog with versioned names such as `risk_country_v2` and `fiscal_quarter_calendar_v1`

The version-specific gotcha worth repeating is parser and server drift. PostgreSQL syntax and timeout behavior change across major versions. If the validation service parses a different dialect than the server executes, the safety layer can reject valid queries, accept wrong assumptions, or fail open under pressure. A SQL agent control plane should fail closed. Annoying users is cheaper than explaining why an assistant queried outside its boundary.
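
A minimal sketch of a startup check for that drift. It assumes the service reads `server_version_num` from the server (for example with `SHOW server_version_num`) and knows which major version its parser grammar targets; both function names are hypothetical, and the arithmetic holds for PostgreSQL 10 and later:

```python
def server_major(server_version_num: int) -> int:
    """Major version from server_version_num, e.g. 160004 -> 16 (PostgreSQL 10 and later)."""
    return server_version_num // 10000


def check_parser_pin(server_version_num: int, parser_major: int) -> None:
    """Fail closed when the parser grammar and the server major version differ."""
    actual = server_major(server_version_num)
    if actual != parser_major:
        raise RuntimeError(
            f"parser targets PostgreSQL {parser_major} but server reports {actual}; "
            "refusing to validate SQL"
        )


check_parser_pin(160004, parser_major=16)  # same major version: no error
try:
    check_parser_pin(170002, parser_major=16)
except RuntimeError as exc:
    print(exc)
```

Running the check at startup and after failover keeps a major-version upgrade from silently changing what the gate understands.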

What to Do Next

Problem: A natural-language SQL agent concentrates risk because it converts ambiguous user intent into executable database authority.
Solution: Put the LLM behind a control plane with curated views, PostgreSQL roles, RLS, SQL parsing, planner checks, read-only execution, timeouts, endpoint isolation, result validation, and audit logs.
Proof: The first validation signal is a rejection suite where dangerous prompts produce blocked SQL and every approved query has a stored prompt, query, plan, role, timeout, row count, freshness marker, and delivery target.
Action: This week, build one read-only agent role that can query only two approved views, then add a parser gate that rejects writes, cross-schema reads, missing tenant scope, sensitive columns, multi-statement input, and unbounded selects.

A database agent is production-ready only when the least interesting part of the system is the chat box.

Rajiv
