Token-Efficient Tool Use

Every tool you expose has a context cost before the agent does any work.

Situation

Database and cloud teams love tool catalogs. There is a script for schema diff, a dashboard for replication lag, a CLI for backups, a Terraform wrapper, a ticket API, and a dozen MCP servers. Connecting all of them feels powerful.

The pattern matters for database, cloud, and platform teams because agents do not operate in a vacuum. They inherit repository rules, tool permissions, deployment workflows, incident history, and the quality of the evidence available to them.

Operating layer	Default approach	Better alternative
Context	Rely on a long prompt or chat history	Give the agent task-specific evidence and rules
Tooling	Expose broad tools and inspect later	Expose narrow tools with clear approval boundaries
Verification	Read the final answer	Check the artifact, trace, and final state

The Problem

Tool abundance can make agents worse. Tool definitions consume context. Raw outputs consume more. The model spends tokens reading tools it will never call and terminal output it does not need.

The practical question is not whether an agent can produce a convincing response. The question is whether the engineering system around that response makes the work observable, reversible, and reviewable.

Failure point	What breaks	Why it matters
Weak boundary	Agent authority is broader than the task	A diagnostic run can become an unsafe change
Missing evidence	The agent cannot cite the state it used	Review becomes opinion instead of verification
No lifecycle	The workflow ends at a message	Ownership, audit, cleanup, and rollback disappear

Context Budgeted Tools

Design tools around intent, not infrastructure inventory. Expose a small set of high-value actions and summaries rather than every low-level API.
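As a rough illustration of an intent-shaped tool, here is a minimal sketch. The tool name `check_replication_health`, its parameters, and the 30-second threshold are all hypothetical, and the definition dict only imitates the JSON-schema style that many tool-calling APIs accept. The point is the return value: a handful of fields the agent can reason over, not the raw lag samples.

```python
from statistics import median

# Hypothetical tool definition, shaped like the JSON-schema style
# definitions that many tool-calling APIs accept.
CHECK_REPLICATION_HEALTH = {
    "name": "check_replication_health",
    "description": "Summarize replication lag for one cluster over a recent window.",
    "input_schema": {
        "type": "object",
        "properties": {"cluster": {"type": "string"}},
        "required": ["cluster"],
    },
}


def check_replication_health(cluster: str, lag_seconds: list[float],
                             threshold: float = 30.0) -> dict:
    """Return a compact summary instead of the raw lag samples."""
    if not lag_seconds:
        return {"cluster": cluster, "status": "no_data", "samples": 0}
    worst = max(lag_seconds)
    return {
        "cluster": cluster,
        # Assumed rule: any sample above the threshold counts as lagging.
        "status": "lagging" if worst > threshold else "healthy",
        "samples": len(lag_seconds),
        "median_lag_s": median(lag_seconds),
        "max_lag_s": worst,
    }
```

One tool like this can stand in for a dashboard query, a metrics API call, and a log search, while costing the agent a single definition and a five-field result.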

flowchart TD
    A[task request — bounded intent] --> B[context budgeted tools — controls]
    B --> C[tool execution — evidence collected]
    C --> D[verification — final state checked]
    D --> E[human handoff — audit retained]

1. Define the operating boundary. Write down the task class, allowed tools, environment, data class, and approval mode before the agent runs.
2. Shape the evidence. Return compact observations instead of raw dumps. The agent should see enough to reason, but not so much that context is wasted.
3. Require proof of completion. Completion should be an artifact or state check: a passing test, a reviewed plan, a valid rollback, a trace, or a linked ticket.
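The boundary and the completion check can be written down as data instead of living in a prompt. The sketch below is one possible shape, not a prescribed one: the class name, field values, tool names, and artifact names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingBoundary:
    """Hypothetical record of what one agent run is allowed to touch."""
    task_class: str
    allowed_tools: frozenset[str]
    environment: str
    data_class: str
    approval_mode: str  # e.g. "read_only" or "human_approval" (assumed values)

    def permits(self, tool_name: str) -> bool:
        # A tool call is in bounds only if it was listed up front.
        return tool_name in self.allowed_tools


def completion_proven(artifacts: dict[str, bool], required: set[str]) -> bool:
    """True only if every required artifact or state check is present and passed."""
    return all(artifacts.get(name, False) for name in required)


# Example boundary for a read-only diagnostic run (all values illustrative).
boundary = OperatingBoundary(
    task_class="replication_diagnosis",
    allowed_tools=frozenset({"check_replication_health"}),
    environment="staging",
    data_class="internal",
    approval_mode="read_only",
)
```

With this in place, `boundary.permits("terraform_apply")` is false, and a run that produced a passing test but no linked ticket would fail `completion_proven` if both were required.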

Measure the token footprint of tool definitions, tool outputs, and conversation history. Treat that footprint as a budget with owners.
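A budget needs a number, even a crude one. The sketch below estimates tokens with an assumed ratio of four characters per token, which is only a rule of thumb; a real tokenizer will give different counts, so treat the output as a relative signal. The function names and budget categories are hypothetical.

```python
import json

CHARS_PER_TOKEN = 4  # rough assumption; a real tokenizer will differ


def estimate_tokens(text: str) -> int:
    """Crude estimate: characters divided by an assumed ratio, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def footprint(tool_definitions: list[dict], tool_outputs: list[str],
              history: list[str]) -> dict[str, int]:
    """Estimated tokens per category for one workflow run."""
    return {
        "tool_definitions": sum(estimate_tokens(json.dumps(d)) for d in tool_definitions),
        "tool_outputs": sum(estimate_tokens(o) for o in tool_outputs),
        "history": sum(estimate_tokens(h) for h in history),
    }


def over_budget(usage: dict[str, int], budget: dict[str, int]) -> list[str]:
    """Names of categories whose estimated usage exceeds the budget."""
    return [name for name, used in usage.items() if used > budget.get(name, 0)]
```

Running `footprint` once per workflow and logging the result alongside the owner of each category is usually enough to make the largest consumer visible.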

In Practice

Context: Anthropic’s advanced tool use guidance calls out the token cost of tool definitions and describes patterns for more efficient tool use, including reducing unnecessary context and using tools programmatically. Source: Anthropic, Introducing advanced tool use.

Action: Measure the token footprint of tool definitions, tool outputs, and conversation history. Treat that footprint as a budget with owners.

Result: A smaller, better-described tool surface lets the model spend more context on the task evidence and less on unused affordances.

Learning: Design tools around intent, not infrastructure inventory. Expose a small set of high-value actions and summaries rather than every low-level API.

Where It Breaks

Failure mode	Trigger	Fix
Tool overload	Agent receives every tool in every task	Load tools by task class
Raw dumps	SQL or logs return thousands of lines	Return summarized deltas
Ambiguous names	Agent chooses wrong tool	Use intent-based names
No budget	Context consumption is invisible	Track token cost per workflow
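Two of the fixes in the table are simple enough to sketch: loading tools by task class, and returning a delta instead of a dump. The registry contents, tool names, and snapshot format below are assumptions made for the example.

```python
# Hypothetical mapping from task class to the tools that class needs.
TOOLS_BY_TASK_CLASS = {
    "replication_diagnosis": ["check_replication_health"],
    "schema_review": ["summarize_schema_diff"],
}


def tools_for(task_class: str) -> list[str]:
    """Return only the tools registered for this task class (none if unknown)."""
    return list(TOOLS_BY_TASK_CLASS.get(task_class, []))


def summarize_delta(before: dict[str, int], after: dict[str, int]) -> dict:
    """Report only what changed between two snapshots, not the snapshots themselves."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = {
        key: (before[key], after[key])
        for key in sorted(before.keys() & after.keys())
        if before[key] != after[key]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

For two snapshots of per-table row counts, the agent would see only the tables that appeared, disappeared, or changed, which stays small even when the snapshots themselves run to thousands of entries.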

What to Do Next

Problem: Tool abundance can make agents worse. Tool definitions consume context. Raw outputs consume more. The model spends tokens reading tools it will never call and terminal output it does not need.
Solution: Design tools around intent, not infrastructure inventory. Expose a small set of high-value actions and summaries rather than every low-level API.
Proof: A smaller, better-described tool surface lets the model spend more context on the task evidence and less on unused affordances.
Action: Pick one agent workflow and remove every tool that is not needed for its first successful execution path.

The teams that get value from agents will not be the teams with the longest prompts. They will be the teams that turn agent work into a controlled engineering workflow.

Rajiv
