pgvector Basics: Embeddings Inside PostgreSQL

pgvector lets you store and query embeddings directly in PostgreSQL — no separate vector database required. The extension is straightforward to install and the SQL surface is small. What catches engineers is that PostgreSQL will silently fall back to a full sequential scan if you never create a vector index, and at 10K rows that’s fine, but at 1M rows it’s unusable.

Situation

Embedding-based search has moved from ML research into standard backend work. Any feature that does semantic search, recommendations, or RAG retrieval needs to store embedding vectors and query them by similarity. The default answer for the past few years was to reach for a dedicated vector database — Pinecone, Weaviate, Qdrant. That’s still reasonable for pure vector workloads at scale. But for teams already running PostgreSQL, adding a second operational system for vectors means new infrastructure, new credentials, a second backup strategy, and cross-system consistency problems when the embedding and the source document live in different stores.

pgvector, a PostgreSQL extension maintained on GitHub at pgvector/pgvector, adds a native vector column type and three index strategies to an existing Postgres instance. If your application already runs on PostgreSQL and your vector search latency requirements are in the tens-of-milliseconds range rather than single-digit milliseconds, pgvector lets you keep vectors and metadata in the same rows, under the same ACID guarantees, queried with the same SQL you already write.

The Problem

Engineers discover pgvector, install it in an afternoon, add a vector(1536) column to an existing table, and populate it with OpenAI embeddings using text-embedding-ada-002. The first few similarity queries are fast. They ship the feature. Six months later, the table has grown to several hundred thousand rows and those queries are timing out.

The root cause is almost always the same: no index was created on the vector column. PostgreSQL’s query planner has no way to prune a vector search geometrically without an index, so it scans every row and computes the distance to the query vector one row at a time. At 10K rows a sequential scan takes milliseconds. At 1M rows it takes seconds. The extension documentation on the pgvector GitHub README is explicit about this — approximate nearest-neighbor indexes are required for large datasets — but the requirement is easy to miss when the extension works so well at small scale.

The core question this post answers: what do you need to set up correctly on day one so that pgvector stays fast as data grows?

Core Concept

flowchart TD
  App[Application] --> Query[SQL Query with Embedding]
  Query --> PG[PostgreSQL — pgvector extension]
  PG --> Planner[Query Planner]
  Planner --> CheckIndex{Vector Index Exists}
  CheckIndex -->|No| SeqScan[Sequential Scan]
  SeqScan --> ComputeAll[Compute Distance for Every Row]
  CheckIndex -->|Yes| IndexScan[HNSW or IVFFlat Index Scan]
  IndexScan --> ComputeApprox[Approximate Nearest Neighbor Search]
  ComputeAll --> Results[Return Top K Results]
  ComputeApprox --> Results

Installation. pgvector ships as a standard PostgreSQL extension. On most managed cloud databases (Amazon RDS, Google Cloud SQL, Supabase, Neon) it’s already available. On a self-managed Postgres instance, install from the pgvector GitHub repository or via your distro’s package manager, then run:

CREATE EXTENSION IF NOT EXISTS vector;

That’s the full installation step. No daemon, no separate service.

Column type and table shape. pgvector adds a vector(n) column type where n is the number of dimensions. OpenAI’s text-embedding-ada-002 model produces 1536-dimensional vectors; text-embedding-3-small and text-embedding-3-large use variable dimensions configurable at generation time with 1536 as a common default. A minimal embeddings table looks like:

CREATE TABLE documents (
  id       bigserial PRIMARY KEY,
  content  text       NOT NULL,
  embedding vector(1536)
);

Inserting a row with an embedding means passing the vector as a string literal or using a client library that serializes it for you:

INSERT INTO documents (content, embedding)
VALUES ('The query planner chooses scan strategies based on statistics.', '[0.021, -0.008, 0.034, ...]');

The three distance operators. pgvector exposes three similarity operators, each suited to different use cases:

Operator	Name	When to use
`<->`	L2 (Euclidean) distance	General-purpose; works on raw or normalized vectors
`<=>`	Cosine distance	Text embeddings; robust to vectors of different magnitudes
`<#>`	Negative inner product	Normalized vectors only; fastest to compute

A cosine similarity query — “return the 5 documents most semantically similar to this query embedding” — looks like:

SELECT id, content, embedding <=> '[0.021, -0.008, 0.034, ...]' AS distance
FROM documents
ORDER BY distance
LIMIT 5;

For text embeddings, <=> (cosine) is the safe default. It is magnitude-insensitive, which matters because embedding models do not guarantee that all vectors will have the same norm.

Index types. Without an index, every query above is a full sequential scan. pgvector supports two approximate nearest-neighbor index types:

Index	Build cost	Query recall	Memory use	Good for
IVFFlat	Lower	Tunable (lists parameter)	Lower	Datasets that change infrequently; faster to build
HNSW	Higher	Higher by default	Higher	Datasets that are queried heavily; better recall at same speed

For an initial deployment, IVFFlat is simpler. The lists parameter divides the vector space into clusters; a good starting value is sqrt(row_count). A minimal IVFFlat index on cosine distance:

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

For HNSW:

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

At datasets below roughly 10K rows, a sequential scan will often outperform an approximate index because the index lookup overhead isn’t amortized. At 100K rows and beyond, the index becomes necessary. There is no harm in creating the index early.

In Practice

The pgvector GitHub README documents the full operator and index syntax. The project is maintained at pgvector/pgvector on GitHub and the README is the authoritative source for supported Postgres versions, operator names, and index parameter ranges.

OpenAI’s embeddings API documentation specifies that text-embedding-ada-002 produces 1536-dimensional vectors. That dimension count is a fixed constraint — the vector(n) column type enforces an exact match, and a query embedding with a different dimension count will return a PostgreSQL type error at runtime. This is a documented behavior of the pgvector type system, not an edge case.

The documented behavior of PostgreSQL’s query planner is that without a vector index, the planner will perform a sequential scan and compute all distances. EXPLAIN ANALYZE on a similarity query against an unindexed column will show Seq Scan in the plan. Adding an IVFFlat or HNSW index causes the planner to switch to an index scan for large enough datasets — observable directly in the EXPLAIN output.

The documented pattern for vector deployments is to implement index assertions in CI to prevent regressions. Because pgvector will silently fall back to a sequential scan if the vector index is invalid or dropped, automated tests running EXPLAIN against a sample dataset ensure that the planner selects an Index Scan rather than a Seq Scan before code reaches production.

Where It Breaks

Scenario	What breaks	Why
No index at scale	Similarity queries time out above ~100K rows	PostgreSQL falls back to sequential scan, computing all pairwise distances in memory
Dimension mismatch	Type error at query time	pgvector enforces exact dimension count; query embedding must match column definition
Cosine similarity on non-normalized vectors	Unexpected result rankings	Cosine distance accounts for angle only; two vectors with very different magnitudes can rank highly even when semantically distant if norms are unequal — use `<=>` not `<#>` unless you normalize at insertion time

What to Do Next

Problem: pgvector silently uses a sequential scan on unindexed vector columns, so similarity queries that are fast at development scale become unusable in production without a code change.
Solution: Create an IVFFlat or HNSW index on the vector column at table creation time, using vector_cosine_ops for text embeddings; verify with EXPLAIN ANALYZE that the planner uses the index.
Proof: Run EXPLAIN ANALYZE on your similarity query — the plan should show Index Scan using ... on documents rather than Seq Scan.
Action: This week, add the CREATE INDEX ... USING hnsw statement to your schema migration for any table with a vector column, and add a EXPLAIN assertion to your staging smoke test so index regression is caught before it reaches production.

Situation

The Problem

Core Concept

In Practice

Where It Breaks

What to Do Next

Rajiv

Related Posts

Vectorless RAG Patterns for Database Knowledge Systems

The Stack for AI-Accelerated Database Operations Is Now Open Source

Stop Writing Ad-Hoc Queries: Build a Skill Backbone for Your DB Engineering Workflows