The Three-Layer Agent Infrastructure Stack for Database Operations (April 2025)

Building an AI agent for database operations (one that validates migrations, answers schema questions, or walks engineers through recovery procedures) requires three infrastructure layers that most teams don’t have pre-assembled: a workflow framework that handles multi-step logic, an observability system to debug the agent in production, and an inference serving layer that scales under concurrent load. In April 2025, open-source projects addressing all three layers became publicly available, each at a different level of maturity.

Situation

Database teams that want to automate operations using AI agents face a build-first problem: the tooling to write agent logic, observe what agents do in production, and serve the inference workload at scale has historically required assembling multiple independent systems. Google’s Agent Development Kit (ADK), VoltAgent, and llm-d each address one of these three layers. ADK v0.1.0 launched April 9, 2025 at Google Cloud Next; llm-d entered CNCF sandbox the same month; VoltAgent reached GitHub in April 2025.

The Problem

The infrastructure gaps that block database teams from shipping their first agent:

Infrastructure gap	What breaks	Why it matters
No agent framework with workflow support	Multi-step operations require custom state machines	Agent logic becomes unmaintainable as workflows grow beyond 3-4 steps
No agent observability	Agents that fail in production are opaque — no trace of tool call, context, or model input	Debugging production agent failures takes hours without structured traces
Dev inference server in production	Single vLLM instance can’t handle concurrent agent requests at real load	Agents time out under realistic multi-user workload
No routing intelligence	All requests go to the same model instance regardless of cache state	Prefix cache misses on repeated system prompts; latency stays high

The question for a database team building its first agent: is there now an open-source path to all three layers without building the infrastructure independently?

The Three-Layer Agent Stack for Database Teams

Together, the three projects cover the layers a database operations agent needs:

flowchart TD
    DBAgent[database operations agent]
    DBAgent --> LogicLayer[agent workflow and task coordination]
    DBAgent --> ObsLayer[production observability and debugging]
    DBAgent --> InfraLayer[scalable LLM inference on Kubernetes]
    LogicLayer --> ADK[Google ADK v0.1.0 — multi-agent workflow runtime]
    ObsLayer --> VoltAgent[VoltAgent — observability console and evals]
    InfraLayer --> llmd[llm-d — Kubernetes-native distributed inference]
    ADK --> Outcome1[multi-step DB agent logic without custom state machines]
    VoltAgent --> Outcome2[trace every agent decision in production]
    llmd --> Outcome3[inference scales to concurrent agent load]

Google ADK — Agent Workflow Framework

The problem it solves: Multi-step database operations — retrieve schema, evaluate migration safety, route to approval workflow, execute or reject — require an agent that can compose steps, delegate to sub-agents, and support human-in-the-loop pauses. Building this as custom code produces brittle state machines. ADK provides multi-agent composition through a subagent delegation model.

Google released ADK v0.1.0 on April 9, 2025 at Google Cloud Next under Apache 2.0. According to the v0.1.0 release notes, the initial release shipped: multi-agent support, tool authentication, rich tool support including MCP, callback support, built-in code execution, asynchronous runtime, and experimental live/bidirectional agent support. Multi-agent coordination in the v0.x releases uses subagent delegation — a parent agent routes tasks to specialized sub-agents declared at construction time.

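from google.adk.agents import Agent

# Specialist sub-agent: reviews DDL and reports destructive statements.
schema_review = Agent(
    name="schema_review",
    model="gemini-2.5-flash",
    # The parent model reads this description when deciding whether to delegate.
    description="Reviews DDL and flags destructive changes.",
    instruction="Review the DDL. Flag any DROP, TRUNCATE, or destructive column type changes.",
)

# Parent agent: sub-agents are declared at construction time via sub_agents.
migration_agent = Agent(
    name="migration_agent",
    model="gemini-2.5-flash",
    instruction=(
        "Coordinate schema review before executing migrations. "
        "If schema review flags destructive changes, stop and report; do not proceed."
    ),
    sub_agents=[schema_review],
)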
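
The review step does not have to rely on the model alone. As a minimal sketch, a deterministic check for the statement kinds named in the instruction could look like the following. The function name, the pattern list, and the PostgreSQL-style `ALTER COLUMN ... TYPE` syntax are illustrative assumptions, not part of ADK.

```python
import re

# Patterns that commonly indicate destructive DDL. This list is an
# illustrative assumption (PostgreSQL-style syntax), not an exhaustive check.
DESTRUCTIVE_PATTERNS = {
    "DROP": re.compile(r"\bDROP\s+(TABLE|COLUMN|INDEX|DATABASE|SCHEMA)\b", re.IGNORECASE),
    "TRUNCATE": re.compile(r"\bTRUNCATE\b", re.IGNORECASE),
    "ALTER COLUMN TYPE": re.compile(
        r"\bALTER\s+COLUMN\s+\w+\s+(SET\s+DATA\s+)?TYPE\b", re.IGNORECASE
    ),
}


def flag_destructive_ddl(ddl: str) -> dict:
    """Return which destructive statement kinds appear in the given DDL text."""
    flags = [name for name, pattern in DESTRUCTIVE_PATTERNS.items() if pattern.search(ddl)]
    return {"destructive": bool(flags), "flags": flags}


if __name__ == "__main__":
    print(flag_destructive_ddl("ALTER TABLE users DROP COLUMN email;"))
```

ADK accepts plain Python functions as tools, so a function like this could plausibly be attached to schema_review through its tools list, letting the model call a fixed check instead of judging the DDL by reading it. Pattern matching of this kind will also flag keywords inside SQL comments, so treat it as a first filter, not a safety guarantee.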

The ADK web interface (adk web path/to/agents_dir) was available from v0.1.0 and provides a browser-based UI for testing agents during development — a meaningful reduction in friction for iterating on database agent logic before production deployment.

Where it breaks: ADK v0.x was an early-stage release. The project shipped weekly versions in April–May 2025 (v0.1.0 through v0.5.0), each carrying breaking changes. Teams that built on an early 0.x version should check the release notes before upgrading. The multi-agent subagent API is different from the graph-based Workflow API that shipped in later major versions — any migration will require rewriting agent composition code.

VoltAgent — Agent Observability and Operations

The problem it solves: An agent running against a database in production is opaque without structured observability. When an agent produces a wrong schema recommendation or calls the wrong tool, you need structured traces — which tool was invoked, what context the model received, what decision was made, and why. VoltAgent provides this observability layer.

According to the project README, VoltAgent consists of two components: an open-source TypeScript framework and VoltOps Console (available as cloud-hosted or self-hosted). The framework provides Memory, RAG, Guardrails, Tools, MCP support, and a Workflow Engine. VoltOps Console adds Observability, Automation, Deployment, Evals, Guardrails, and Prompt management for production agent operations. Multi-agent systems are supported, with supervisor coordination between specialized agents.

For a database operations agent, the observability layer is the production-critical component: when an agent produces incorrect output, structured traces from VoltOps Console allow debugging the decision chain rather than replaying the interaction from scratch or adding ad-hoc logging.

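import { Agent } from "@voltagent/core";
import { openai } from "@ai-sdk/openai";

// schemaLookupTool, queryExplainTool, and runbookSearchTool are assumed to be
// defined elsewhere. The model name is a placeholder, and memory is left at the
// framework default. Constructor options have changed between VoltAgent
// releases, so check the docs for the version you install.
const dbOpsAgent = new Agent({
  name: "db-ops-agent",
  instructions: "You are a database operations assistant. Help engineers with schema questions and query optimization.",
  model: openai("gpt-4o-mini"),
  tools: [schemaLookupTool, queryExplainTool, runbookSearchTool],
});
// Once connected, VoltOps Console traces every tool call, model input, and decision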
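
To make "structured trace" concrete, here is a minimal Python sketch of the kind of record such a system keeps per tool call. The schema and field names are hypothetical and chosen for illustration. This is not the VoltOps data model.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class ToolCallTrace:
    """One structured record per tool call (hypothetical schema for illustration)."""
    agent: str
    tool: str
    tool_input: dict
    tool_output: str
    model_input_chars: int  # size of the context the model received
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)


def to_json_line(trace: ToolCallTrace) -> str:
    """Serialize a trace as one JSON line so it can be filtered later."""
    return json.dumps(asdict(trace), sort_keys=True)


def calls_for_tool(lines: list[str], tool: str) -> list[dict]:
    """Return all recorded calls to the given tool, in the order they were logged."""
    records = [json.loads(line) for line in lines]
    return [r for r in records if r["tool"] == tool]
```

With records like these, a question such as "what did the agent pass to queryExplainTool before it gave the wrong answer?" becomes a filter over logged data instead of a replay of the conversation.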

Where it breaks: VoltOps Console’s self-hosted deployment adds operational overhead. The project README describes it as “cloud or self-hosted” but does not detail the self-hosted infrastructure requirements in the repository. Teams that need full observability without cloud dependencies should verify the self-hosted deployment footprint against their infrastructure before adopting. The framework itself is MIT-licensed and self-contained; the observability console is the component that requires external deployment decisions.

llm-d — Kubernetes-Native Distributed LLM Inference

The problem it solves: A database operations agent serving multiple engineers concurrently needs an inference layer that scales. A single vLLM instance is bounded by the throughput and KV-cache capacity of the hardware it runs on; production agent workloads need intelligent routing, KV-cache management across instances, and autoscaling tied to real inference signals.

llm-d is a CNCF sandbox project, co-founded by Red Hat, Google Cloud, IBM Research, CoreWeave, and NVIDIA according to the project README. It provides distributed LLM serving on Kubernetes as an orchestration layer above model servers (vLLM or SGLang). According to the README, llm-d’s four core capabilities are: intelligent routing (prefix-cache-aware and load-aware request balancing), advanced KV-cache management (tiered offloading to CPU or disk with global indexing), large-model serving via prefill/decode disaggregation, and SLO-aware autoscaling based on real-time inference signals. An OpenAI-compatible Batch API is documented for asynchronous large-scale inference jobs.

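# Verify the chart repository URL, chart name, and --set keys against the
# current llm-d deployment guides before running; they may differ by release.
helm repo add llm-d https://llm-d.github.io/charts
helm install llm-d-inference llm-d/llm-d \
  --set model.name=meta-llama/Llama-3.1-8B-Instruct \
  --set inference.replicaCount=3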

The README documents Helm charts and benchmarked deployment recipes (“well-lit path guides”) for common hardware and model combinations. These provide a baseline for teams deploying specific model sizes without running their own performance characterization from scratch.
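
The idea behind prefix-cache-aware, load-aware routing can be sketched in a few lines of Python. This is a toy model under stated assumptions, not llm-d's scheduler: blocks are fixed-size character chunks instead of token blocks, and the Replica class, BLOCK_SIZE, and route function are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # characters per block; real servers cache token blocks (assumption)


def prefix_block_hashes(prompt: str) -> list[str]:
    """Hash each full block together with everything before it,
    so equal hashes imply equal prefixes."""
    hashes, running = [], hashlib.sha256()
    for start in range(0, len(prompt) - BLOCK_SIZE + 1, BLOCK_SIZE):
        running.update(prompt[start:start + BLOCK_SIZE].encode("utf-8"))
        hashes.append(running.hexdigest())
    return hashes


@dataclass
class Replica:
    name: str
    in_flight: int = 0
    cached: set = field(default_factory=set)

    def cached_prefix_len(self, hashes: list[str]) -> int:
        """Count how many leading blocks of the prompt this replica already holds."""
        count = 0
        for block_hash in hashes:
            if block_hash not in self.cached:
                break
            count += 1
        return count


def route(prompt: str, replicas: list[Replica]) -> Replica:
    """Prefer the longest cached prefix; break ties by the lowest in-flight load."""
    hashes = prefix_block_hashes(prompt)
    best = max(replicas, key=lambda r: (r.cached_prefix_len(hashes), -r.in_flight))
    best.cached.update(hashes)  # the chosen replica now holds these blocks
    best.in_flight += 1         # the caller decrements this when the response completes
    return best
```

Because every agent request starts with the same system prompt, requests after the first tend to land on a replica that already holds that prefix, which is the cache-hit behavior the routing layer is meant to produce. Unrelated prompts fall through to the tie-break and go to the least loaded replica.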

Where it breaks: llm-d is optimized for Kubernetes deployments with GPU accelerators. It requires an existing cluster with GPU node pools — teams without that infrastructure will need to provision it before llm-d adds value. For database teams running small-scale agents where a single GPU instance handles the request volume, the Kubernetes operational overhead is not warranted until agent workload requires horizontal scaling. CNCF sandbox status indicates early-stage evaluation, not production maturity equivalent to Incubating or Graduated CNCF projects.

In Practice

All claims above come from the respective project READMEs and release notes. Items to verify before relying on these:

ADK v0.1.0 through v0.5.0 carried breaking changes between minor versions. The features described (multi-agent subagent delegation, MCP tool support, async runtime, built-in code execution) are from the v0.1.0 release notes and have been verified against the official GitHub release. The subagent API described here reflects the 0.x era; ADK’s composition model changed significantly in later major versions, so check the ADK docs for the version you are installing.

VoltAgent’s open-source TypeScript framework is available under MIT license at the documented npm package (@voltagent/core). VoltOps Console is described as “cloud or self-hosted” — cloud pricing and self-hosted requirements are on the VoltAgent website, not in the project README. Teams should verify both before committing to the platform for production observability.

llm-d’s co-founding institutions (Red Hat, Google Cloud, IBM Research, CoreWeave, NVIDIA) are listed in the project README. CNCF sandbox acceptance is a documented fact; it indicates a project in active early development with CNCF oversight, not a project that has passed the maturity bar of CNCF Incubating or Graduated status.

Where It Breaks

Failure mode	Trigger	Fix
ADK 0.x breaking changes between minor versions	Each 0.x release carried API changes in April–May 2025	Pin to a specific 0.x version in requirements.txt; upgrade only after reviewing the release notes for each intermediate version
VoltOps Console self-host complexity	Team needs observability without cloud dependency	Verify self-hosted deployment requirements; consider cloud tier for initial adoption
llm-d K8s prerequisite	No GPU node pool in existing cluster	Start with single-node vLLM for low-concurrency workloads; add llm-d when horizontal scaling is needed
Agent debugging without observability	Complex ADK workflows produce opaque failure traces	Integrate VoltOps from the first production deployment — retrofitting observability is harder
llm-d model server version lock	llm-d pinned to specific vLLM or SGLang versions	Review llm-d release notes before upgrading the underlying model server
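
The version-pinning fix in the first row can be backed by a startup check. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the pin value comes from whatever line your own requirements.txt contains.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pin(requirement_line: str) -> str:
    """Extract the version from an exact pin such as 'google-adk==0.3.0'."""
    name, sep, pinned = requirement_line.strip().partition("==")
    if not sep or not pinned:
        raise ValueError(f"not an exact pin: {requirement_line!r}")
    return pinned.strip()


def matches_pin(installed: str, pinned: str) -> bool:
    """True when the installed version equals the pin exactly (no range matching)."""
    return installed.strip() == pinned.strip()


def check_pin(requirement_line: str, package: str = "google-adk") -> str:
    """Raise if the installed version differs from the pin; return it otherwise."""
    pinned = parse_pin(requirement_line)
    try:
        installed = version(package)
    except PackageNotFoundError as exc:
        raise RuntimeError(f"{package} is not installed") from exc
    if not matches_pin(installed, pinned):
        raise RuntimeError(f"{package} {installed} installed, but {pinned} is pinned")
    return installed
```

Calling check_pin at agent startup turns an accidental upgrade into an immediate, explicit failure instead of a behavior change discovered later in production.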

What to Do Next

Problem: Database operations agents require three infrastructure layers (workflow framework, production observability, and scalable inference) that most teams have to assemble from scratch.
Solution: Google ADK (v0.1.0+) for agent workflow logic and multi-agent composition, VoltAgent for production observability and evals, llm-d for Kubernetes-native inference serving at concurrent load.
Proof: Build a single-step ADK agent that accepts a slow query log entry and returns an index recommendation. If the agent returns a useful recommendation consistently, you have validated the ADK layer — then add VoltOps observability before exposing the agent to a second engineer.
Action: This week, install google-adk (pip install google-adk) and run adk web against a minimal schema Q&A agent. The built-in browser UI was available from v0.1.0 and provides enough feedback to iterate on agent logic before VoltAgent observability is needed for production use. Check the ADK release notes for the Python version requirement of the version you are installing.
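
"Consistently" in the Proof step can be given a number. As a minimal sketch, the harness below runs the same slow query log entry through the agent several times and reports how often the most common recommendation appears. The agent is represented by any callable that takes a string and returns a string; wiring it to a real ADK agent is left out, and the function names are hypothetical.

```python
from collections import Counter
from typing import Callable


def normalize(recommendation: str) -> str:
    """Collapse case and whitespace so trivially different phrasings compare equal."""
    return " ".join(recommendation.lower().split())


def consistency_rate(agent_fn: Callable[[str], str], log_entry: str, runs: int = 5) -> float:
    """Fraction of runs that return the most common (normalized) recommendation."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    answers = Counter(normalize(agent_fn(log_entry)) for _ in range(runs))
    return answers.most_common(1)[0][1] / runs
```

A rate near 1.0 on a handful of representative log entries is a reasonable signal that the single-step agent is stable enough to put in front of a second engineer. String equality is a crude measure, so recommendations that differ in wording but agree in substance will count as disagreements here.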

Rajiv
