Retrieval architecture across pgvector, hybrid search, embeddings, GPU databases, and the operational boundaries around AI database access.
14 posts · Databases
Who This Is For
Engineers building RAG pipelines, semantic search, or recommendation systems who need to make real architectural decisions — index type, embedding model, hybrid vs pure vector, and when to move off Postgres.
What You Will Be Able to Do
Choose between pgvector, Qdrant, Pinecone, and Weaviate based on query pattern and scale
Understand HNSW vs IVFFlat tradeoffs and when each index type breaks down
Design hybrid search that combines BM25 and vector scores without making recall worse
Estimate memory and throughput requirements before committing to a vector DB deployment
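As a rough illustration of the memory-estimation goal above, the sketch below computes a back-of-the-envelope footprint for a vector collection. The function name, the 1M-vector and 768-dimension figures, and the HNSW link-overhead formula (about 2 × M four-byte neighbor ids per vector on the base layer, upper layers ignored) are assumptions for illustration, not figures from any specific database; real deployments add per-row and per-page overhead on top.

```python
def estimate_vector_memory_bytes(num_vectors, dim, bytes_per_value=4, hnsw_m=0):
    """Rough lower-bound memory estimate for a vector collection.

    bytes_per_value=4 corresponds to float32 components.
    hnsw_m is the HNSW 'M' parameter; 0 means no graph index is counted.
    """
    # Raw vector payload: one value per dimension per vector.
    raw = num_vectors * dim * bytes_per_value
    # Assumption: each vector keeps up to 2*M neighbor ids (4 bytes each)
    # on the base layer of the graph; upper layers are ignored here.
    links = num_vectors * 2 * hnsw_m * 4
    return raw + links


def to_gib(num_bytes):
    """Convert bytes to GiB for readability."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical workload: 1M vectors of 768 float32 dimensions.
    raw_only = estimate_vector_memory_bytes(1_000_000, 768)
    with_graph = estimate_vector_memory_bytes(1_000_000, 768, hnsw_m=16)
    print(f"raw vectors: {to_gib(raw_only):.2f} GiB")
    print(f"with HNSW links (M=16): {to_gib(with_graph):.2f} GiB")
```

Treat the result as a floor rather than a sizing guarantee: it is mainly useful for checking whether an index can plausibly stay in memory before you benchmark.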
Prerequisites
You know what an embedding is and have at least experimented with semantic search. Familiarity with PostgreSQL is helpful.
How CPU, GPU, and TPU architectures differ in ways that matter for databases and AI workloads — and which compute class to reach for when adding vector search, embedding generation, or GPU-accelerated analytics.
How pgvector adds vector storage and similarity search to PostgreSQL, what the three distance operators do, and the index you must create before you hit 100K rows.
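For orientation, the three operators referred to above are presumably pgvector's `<->` (L2 distance), `<#>` (negative inner product), and `<=>` (cosine distance). The sketch below reproduces the math behind each in plain Python; the function names are hypothetical and this is not pgvector's implementation, only what each operator computes.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the quantity behind pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negated dot product, the quantity behind <#>.

    It is negated so that 'smaller is closer' holds for ORDER BY.
    """
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity behind <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = [1.0, 0.0]
    candidate = [0.0, 1.0]
    print(l2_distance(query, candidate))
    print(negative_inner_product(query, candidate))
    print(cosine_distance(query, candidate))
```

All three return smaller values for closer vectors, which is why a nearest-neighbor query sorts ascending on the operator result.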
A practical, DBA-friendly explanation of why modern analytical databases are increasingly using GPUs for scans, joins, aggregations, and AI-adjacent workloads.
A DBA-friendly walkthrough of how modern GPU databases execute large analytical SQL queries using columnar storage, parallel scans, and GPU aggregation.
Why PostgreSQL and MySQL use B-trees while Cassandra and RocksDB use LSM trees — the read/write tradeoff that determines which storage engine fits your workload.
Isolating the OCI Autonomous Transaction Processing write path from catalog and analytics load using GoldenGate replication and Object Storage offloading.
Cloud cost triage across compute, storage, data transfer, logs, and managed services — a repeatable workflow for finding runaway spend before the bill arrives.
The second wave of March 2026 breakouts: an agent that learns from every conversation, a Rust vector index that outperforms FAISS at a fraction of the memory, and a Kubernetes-native agent control plane.