Architecture Review Checklists

All Posts

Apr 16, 2026 2 min read

L1 Field Note

SQL Server to PostgreSQL Migration Cost Defense Checklist

A pragmatic checklist to defend the business case for migrating away from Microsoft SQL Server.

#checklist #databases

Jun 26, 2023 13 min read

L3 Reference Guide

Schema Deployment Risk Checklist

Assessing lock type, table size, reversibility, and rollback plan before every schema migration — a structured checklist for zero-downtime deployments.

#databases #checklist #architecture

Mar 18, 2024 10 min read

L3 Reference Guide

Index Debt Review: How to Find Bad, Missing, and Duplicate Indexes

A SQL-driven audit workflow for identifying unused, duplicate, bloated, and missing indexes in PostgreSQL before they drain write performance and storage.

#databases #checklist #failures

Apr 8, 2024 7 min read

L2 Deep Dive

MongoDB Version Upgrade Risk Review

A systematic runbook for assessing MongoDB version upgrade risk — FCV, driver compatibility, deprecated operators, and rollback paths before any production cutover.

#databases #checklist #architecture

May 20, 2024 5 min read

L1 Field Note

Database Security Review for AI Access

Granting an autonomous AI agent access to your database breaks every assumption of traditional RBAC. How to secure databases against unpredictable, unbounded AI queries.

#ai-engineering #databases #checklist

Sep 12, 2024 7 min read

L3 Reference Guide

Cloud Architecture Review Checklist for Database-Backed Applications

Review checklist for database-backed cloud applications: connection saturation, migration locking, retry amplification, and region dependency failures.

#architecture #cloud #databases #failures

May 12, 2025 7 min read

L3 Reference Guide

MongoDB Queryable Encryption Architecture Review

A pre-go-live architecture review for MongoDB Queryable Encryption — key management, field classification, query type constraints, driver requirements, and key rotation.

#databases #architecture #checklist

Feb 6, 2026 4 min read

L1 Field Note

#ai-engineering #architecture #checklist

Agent-to-Agent Review Loops

A practical review pattern where one agent creates a change and specialized agents review risk, rollback, security, and observability.

Sep 14, 2021 7 min read

L2 Deep Dive

Automation Readiness Review: Inputs, State, Permissions, Rollback, and Audit

A five-question checklist before running automation in production: are inputs bounded, is state understood, are permissions scoped, is rollback credible, and is the audit trail durable enough to reconstruct what happened.

Dec 14, 2021 7 min read

L2 Deep Dive

Automation Incident Review: When the Tool Worked and the System Failed

The hardest automation incidents are not broken tools — they happen when every tool executes exactly as asked while the surrounding system loses the ability to evaluate whether that action is still safe.

Mar 8, 2022 7 min read

L2 Deep Dive

Terraform Plan Review: What Senior Engineers Look For

Terraform plan review is not a syntax check — it is the last cheap place to catch a production architecture mistake before an API turns intent into infrastructure. What senior engineers actually look for in a plan output.

Jun 14, 2022 7 min read

L2 Deep Dive

Terraform Module Design Checklist for Database Infrastructure

Database Terraform modules fail when they hide operational decisions behind convenient defaults — a checklist covering parameter groups, backup policies, encryption, and the boundaries that must never be automated away.

Jun 25, 2022 7 min read

L2 Deep Dive

System Design Review Checklist for Senior Engineers

Most system designs fail for reasons visible at review time: overloaded dependencies, ambiguous ownership, unsafe retries, unbounded queues, and missing rollback paths — a checklist senior engineers use to surface those risks early.

Oct 11, 2022 7 min read

L2 Deep Dive

Policy as Code for Terraform: OPA, Sentinel, Checkov, and Human Review

Terraform review fails when humans rediscover the same constraints in every PR — how OPA, Sentinel, and Checkov encode policy gates that catch public storage buckets, unencrypted databases, and missing tags at plan time.

Jan 21, 2023 7 min read

L2 Deep Dive

Azure Database Reliability Review: Failover Groups, Backups, and Geo-Replication

Azure database recovery beyond 'we have backups': failover group cutover, geo-replication lag, and backup restore testing as the real reliability floor.

May 6, 2023 6 min read

L2 Deep Dive

GCP Database Cost Review: Cloud SQL, Spanner, Bigtable, Memorystore, and BigQuery

Cloud SQL, Spanner, Bigtable, Memorystore, and BigQuery each bill differently — cost overruns trace to applying the wrong model to the wrong workload.

Aug 4, 2023 7 min read

L2 Deep Dive

OCI Disaster Recovery Review: Regions, ADs, Backups, Data Guard, and GoldenGate

OCI disaster recovery gaps that emerge when teams rely on regional failover alone, and how Data Guard and GoldenGate address the database replication tier.

Jan 1, 2024 8 min read

L2 Deep Dive

Black Friday Database Readiness: Hot Keys, Connection Pools, Cache Misses, and Queue Depth

Hot key contention, connection pool exhaustion, and cache miss bursts each hit local thresholds before aggregate dashboards show anything alarming.

May 20, 2024 7 min read

L2 Deep Dive

#ai-engineering #architecture

The Harness Around the Agent: How Stripe Runs 1,000 Unattended Code Reviews per Week

Stripe's Minions system runs over a thousand AI code reviews weekly using a fork of an open-source agent. The reliability comes from the deterministic pipeline around it, not the model inside.

Jun 10, 2024 5 min read

L3 Reference Guide

pgcrypto vs KMS vs HSM: Decision Framework

Engineers often over-rotate to Hardware Security Modules (HSMs) for non-regulatory workloads or under-rotate to database extensions. How to map data classification to the right cryptographic tier.

#architecture #cloud #security

Jun 18, 2024 7 min read

L2 Deep Dive

Terraform in CI/CD: Plan, Review, Apply, Lock, and Rollback Boundaries

Terraform in CI/CD requires different gates than application deployments: plan review thresholds, apply lock design, environment promotion, and a rollback boundary that actually works when state diverges.

Aug 13, 2024 7 min read

L3 Reference Guide

Event-Driven Architecture Review: Schema Evolution, Ordering, Replay, and Dead Letters

The four failure boundaries in event-driven systems: schema evolution contracts, ordering guarantees, consumer replay safety, and dead-letter queue handling.

Aug 28, 2024 7 min read

L2 Deep Dive

Service Decomposition Review: When a New Microservice Creates a Worse Database Problem

Splitting a service without relocating the database boundary creates distributed coordination overhead worse than the monolith the split was meant to fix.

Sep 27, 2024 9 min read

L3 Reference Guide

AWS vs Azure vs GCP vs OCI for Database-Backed Systems: Decision Framework

How to choose between AWS, Azure, GCP, and OCI for database-backed systems by matching managed database failure behavior to your system's dominant recovery requirement.

#architecture #cloud #databases

Nov 26, 2024 6 min read

L2 Deep Dive

The Staff Engineer's System Design Review: Questions That Expose Real Risk

Review questions a staff engineer asks to surface cascade failures, missing fallbacks, state boundaries, and load assumptions that design docs bury.

Jan 28, 2025 23 min read

L3 Reference Guide

GitHub Year in Review: 2024 — What Open Source Changed in the Engineering Stack

Nine breakout repositories across three themes — agents that operated computers, RAG that grew a graph spine, and databases that finally spoke natively to LLMs — define what actually shifted in the engineering stack in 2024.

#ai-engineering #architecture #databases #cloud

Oct 14, 2025 7 min read

L2 Deep Dive

AI Agents in Platform Automation: Useful Assistant or Unreviewed Change Engine

When AI agents accelerate platform operations versus when they generate unreviewed changes — the permission boundary and audit design that separates useful from risky.

#ai-engineering #architecture #cloud

Jan 12, 2026 4 min read

L1 Field Note

#ai-engineering #architecture

Outcome-Based Agent Evaluation vs Transcript Review

A field note on why agent evaluation should measure verified state changes instead of polished reasoning traces.

Jan 20, 2026 4 min read

L1 Field Note

#ai-engineering #databases #architecture

Agentic Code Review for Database Repositories

Database repositories contain hidden rules human reviewers know: never add a blocking index at peak hours, never widen IAM without owner approval. Agent review surfaces these violations before merge — without displacing the human judgment that set the rules.

Jan 28, 2026 16 min read

L3 Reference Guide

GitHub Year in Review: 2025 — What Open Source Changed in the Engineering Stack

Nine breakout repos across four themes — MCP protocol adoption, agent memory infrastructure, AI-native platform ops, and database automation — that eliminated the hand-built glue code between AI agents and production systems.

#ai-engineering #architecture #databases #cloud

Jun 12, 2026 4 min read

L1 Field Note