log in
consulting hosting industries the daily tools about contact

pgvector: Embedding Search Inside Postgres, No Extra DB Required

I was skeptical about skipping a dedicated vector DB. After shipping pgvector on two real projects, I'm convinced it's the right call for most teams.

The pitch for a dedicated vector database sounds reasonable until you realize you already have a perfectly good database sitting there. pgvector is a Postgres extension that adds a vector column type and similarity search operators, and it's good enough that I've stopped reaching for Pinecone or Weaviate by default.

I want to be clear about what "good enough" means, because that phrase does a lot of work. It means: fast enough for most apps, operationally simple, and inside the transactional boundary you already trust.

What pgvector Actually Solves

The problem is this: you have text (documents, product descriptions, support tickets, whatever), you embed it into a high-dimensional float vector using a model like text-embedding-3-small, and now you need to find the most semantically similar rows to a query vector. Cosine similarity, not keyword matching.

The naive approach — pulling all vectors into PHP and doing the math yourself — falls apart fast. Even 10,000 rows with 1536-dimensional embeddings is 60MB of floats you're moving across the wire every query. That's not a plan.

A dedicated vector DB solves this by doing the search server-side, but now you have two systems to operate, two sources of truth to keep in sync, and two failure modes to page you at 2am. For a lot of the projects I build — healthcare portals, internal tools, e-commerce search — that operational overhead is the wrong trade.

pgvector moves the similarity search into Postgres where it belongs, next to the rest of your data. You get joins. You get WHERE clauses. You get transactions. You get pg_dump. It's just SQL.

Installation and Setup

If you're on a managed Postgres host (RDS, Supabase, Railway, Render), pgvector is likely already available. Enable it:

CREATE EXTENSION IF NOT EXISTS vector;

Then add a vector column to your table. I'll use a Laravel migration:

<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;
use Illuminate\Support\Facades\DB;

return new class extends Migration
{
    public function up(): void
    {
        Schema::create('documents', function (Blueprint $table) {
            $table->id();
            $table->string('title');
            $table->text('content');
            $table->timestamps();
        });

        // Blueprint doesn't know about vector columns, so we do it raw.
        DB::statement('ALTER TABLE documents ADD COLUMN embedding vector(1536)');

        // IVFFlat index for approximate nearest-neighbor search.
        // lists = roughly sqrt(row count). Tune after you have real data.
        DB::statement('
            CREATE INDEX documents_embedding_ivfflat_idx
            ON documents
            USING ivfflat (embedding vector_cosine_ops)
            WITH (lists = 100)
        ');
    }

    public function down(): void
    {
        Schema::dropIfExists('documents');
    }
};

The dimension count (1536) must match your model. OpenAI's text-embedding-3-small is 1536. Their text-embedding-3-large is 3072. Ada 002 was also 1536. Get this wrong and Postgres will reject the insert.

Storing and Querying Embeddings in Laravel

I'm using the OpenAI PHP client and plain PDO-style bindings here. There's no Laravel-native pgvector package I'd trust in production yet — the casting layer is thin enough that raw queries are fine.

<?php

namespace App\Services;

use Illuminate\Support\Facades\DB;
use OpenAI\Laravel\Facades\OpenAI;

class DocumentSearchService
{
    /**
     * Embed and store a document.
     */
    public function store(string $title, string $content): void
    {
        $response = OpenAI::embeddings()->create([
            'model' => 'text-embedding-3-small',
            'input' => $content,
        ]);

        $vector = $response->embeddings[0]->embedding; // float[]
        $vectorLiteral = '[' . implode(',', $vector) . ']';

        DB::statement(
            'INSERT INTO documents (title, content, embedding, created_at, updated_at)
             VALUES (?, ?, ?::vector, now(), now())',
            [$title, $content, $vectorLiteral]
        );
    }

    /**
     * Find the k most similar documents to a query string.
     *
     * @return array<int, object>
     */
    public function search(string $query, int $k = 5): array
    {
        $response = OpenAI::embeddings()->create([
            'model' => 'text-embedding-3-small',
            'input' => $query,
        ]);

        $vector = $response->embeddings[0]->embedding;
        $vectorLiteral = '[' . implode(',', $vector) . ']';

        return DB::select(
            'SELECT id, title, content,
                    1 - (embedding <=> ?::vector) AS similarity
             FROM documents
             WHERE embedding IS NOT NULL
             ORDER BY embedding <=> ?::vector
             LIMIT ?',
            [$vectorLiteral, $vectorLiteral, $k]
        );
    }
}

The <=> operator is cosine distance. Lower is closer, which is why I do 1 - (embedding <=> ...) to get a similarity score between 0 and 1. The other operators are <-> (L2/Euclidean) and <#> (negative inner product). For text embeddings, cosine is almost always what you want.

Notice I'm binding $vectorLiteral twice. That's intentional — Postgres can't use the index if you alias the expression, so you need the literal in the ORDER BY too. Yes, it's repetitive. Yes, it matters for performance.

The Gotchas That Actually Bit Me

The index doesn't help until you probe it correctly. IVFFlat divides your vectors into lists clusters at index creation time. At query time, it searches ivfflat.probes clusters (default: 1). One probe is fast but misses results. I set this at the session level before queries that need recall:

DB::statement('SET ivfflat.probes = 10');

Higher probes = better recall, slower queries. I've settled on 10 for most setups. Profile your own data.

Build the index after you load your data. IVFFlat needs to see real rows to build meaningful clusters. If you create the index on an empty table and then insert 100k rows, the index is built on nothing and performs like garbage. Bulk load first, index second. I learned this the hard way on a biotech client's document corpus — search results were inexplicably bad for a week before I figured out why.

HNSW is the better index if you're on pgvector 0.5.0+. IVFFlat was the only option early on. HNSW (hierarchical navigable small world) has better recall, no need to probe-tune, and doesn't require bulk loading before indexing. The tradeoff is higher memory use and slower build time. I'm migrating new projects to HNSW:

CREATE INDEX documents_embedding_hnsw_idx
ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Large dimensions kill index performance. 1536 works fine. 3072 (text-embedding-3-large) starts to hurt. At 3072 dimensions the index build is slow and memory pressure is real. I stick with text-embedding-3-small unless the accuracy difference is measurable in production.

You can't update a vector column in place efficiently via Eloquent. Because Eloquent doesn't know the column type, you'll hit casting issues if you try to do $doc->embedding = $vector; $doc->save(). Use raw DB::statement or DB::update for writes. It's a small annoyance but it catches people off guard.

Null embeddings will ruin your index scans. Any row with a NULL embedding gets excluded from index-based search but Postgres still has to account for it in certain query plans. I add WHERE embedding IS NOT NULL to every search query as a habit, and I set the column NOT NULL with a default deferred constraint when the workflow allows it.

When I'd Reach for This

Honestly, for most projects up to a few million rows, pgvector is the right answer. If you're already on Postgres — and you probably are — the incremental operational cost is zero. You don't learn a new query language, you don't manage a separate service, and your embeddings live in the same transaction as the rows they describe.

I'd use pgvector for:

  • Semantic search on internal documents, support tickets, product catalogs
  • RAG (retrieval-augmented generation) pipelines where the chunk count is under ~5M
  • Any project where joining vectors to relational data is useful (and it always is)
  • Teams that don't have dedicated ML infra and aren't going to get it

I'd consider a dedicated vector DB when:

  • You're storing hundreds of millions of vectors and need sub-10ms p99 at scale
  • Your vectors are produced by a different system and Postgres isn't the source of truth anyway
  • You need features like multi-tenant namespace isolation at the vector layer specifically

That last scenario is rarer than the vendors want you to think. I've had exactly one client where Pinecone made more sense than pgvector, and it was because their entire data architecture was already outside Postgres.

The Bottom Line

pgvector isn't a toy. It's a well-maintained extension with active development, real indexing strategies, and enough performance headroom for the vast majority of apps. The instinct to reach for a specialized tool when a general-purpose one already works is a habit worth fighting. Postgres has been quietly eating specialized databases for 20 years — vector search is just the latest course.

Need help shipping something like this? Get in touch.