log in
consulting hosting industries the daily tools about contact

Postgres JSON Columns: When They're a Lifesaver and When They're a Trap

I've shipped both approaches in production. Here's when JSON columns save you and when they quietly wreck you.

I've reached for Postgres JSON columns at least a dozen times in the last five years and been grateful for it maybe half of those times. The other half I was cursing myself six months later when the "flexible" column had become an unmapped wasteland of inconsistent keys and missing data. The choice isn't complicated once you've been burned by both sides.

What you're actually choosing between

A proper relational schema means you know every field, every type, every constraint — at the database level. Foreign keys enforce relationships. NOT NULL means it actually can't be null. You can index individual columns, write clean joins, and your ORM stays happy.

A JSON or JSONB column means you're storing a blob of structured-ish data inside a single field. Postgres is genuinely good at this — JSONB is binary-indexed, you can query into it, and you can even put GIN indexes on it. But the database doesn't know what's in there. Your application does, or it thinks it does, or it used to.

That's the crux of it. A proper schema is a contract enforced by the database. A JSON column is a contract enforced by your application code — and application code changes, gets forgotten, and breaks in ways that silently corrupt data for weeks before anyone notices.

Where JSON columns genuinely win

Truly variable structure. I built an intake form system for a Seattle healthcare client a couple years back. Facilities could define their own custom fields — dropdown, freetext, date, checkbox arrays — and different patient types had different field sets. Normalizing that would have required a full EAV (entity-attribute-value) schema, which is arguably worse than JSON both in query complexity and in maintainability. A JSONB custom_fields column was the right call. We knew the shape would vary by design.

External API responses you need to stash. When I'm integrating a third-party API — an MLS feed, a LIMS result payload, a payment webhook — I often store the raw response as JSONB alongside my normalized columns. That way I've got the canonical source of truth if my parsing logic was wrong, and I can re-process it later without calling the API again. I'm not querying that raw blob in application logic; I'm just keeping it.

Prototype schemas that are genuinely in flux. Early in a project, before you understand the domain, JSON gives you room to iterate without migrations. The key word is early. This is not a permanent state of affairs.

Sparse data with hundreds of optional attributes. Think product catalogs with wildly different attribute sets per category. A bicycle has gears and frame size; a t-shirt has fabric and cut. You could do a separate attributes table with a FK back to products. Or you could use JSONB and accept the tradeoff. I've done both depending on how often those attributes need to be queried.

Where proper schemas win (which is most of the time)

Anytime you need to filter, sort, aggregate, or join on a value, the normalized column wins. Full stop.

Querying into JSONB is syntactically ugly and performance degrades fast without careful indexing:

-- This works, but it's not pretty
SELECT * FROM orders
WHERE metadata->>'status' = 'pending'
  AND (metadata->>'amount')::numeric > 500;

Compare that to:

SELECT * FROM orders
WHERE status = 'pending'
  AND amount > 500;

The second one uses indexes naturally, reads instantly, and doesn't require a cast that silently returns nothing if the key doesn't exist.

And that's the quiet killer: metadata->>'amount' returns NULL if amount isn't present in the JSON. No error. No warning. Your query just excludes those rows. With a proper column, a missing value is either NOT NULL enforced at write time or an explicit NULL you can reason about.

A Laravel pattern I actually use

In Laravel projects I'll often do a hybrid: normalized columns for anything I query on, a JSONB meta column for supplemental data that varies by record type and doesn't drive queries.

// Migration
Schema::create('products', function (Blueprint $table) {
    $table->id();
    $table->string('sku')->unique();
    $table->string('name');
    $table->decimal('price', 10, 2);
    $table->string('status')->default('active');
    $table->jsonb('attributes')->nullable(); // sparse, variable, rarely queried
    $table->timestamps();
});

// GIN index if you ever need to search inside it
// DB::statement('CREATE INDEX products_attributes_gin ON products USING GIN (attributes)');
// Model
class Product extends Model
{
    protected $casts = [
        'attributes' => 'array',
    ];

    // Helper to get a typed attribute safely
    public function getAttribute(string $key, mixed $default = null): mixed
    {
        return data_get($this->attributes_data, $key, $default);
    }

    // Avoid collision with Eloquent's getAttribute
    public function getAttributesDataAttribute(): array
    {
        return $this->attributes ?? [];
    }
}

The sku, price, and status fields are normalized because I'm filtering and sorting on them constantly. The attributes column holds things like { "frame_size": "56cm", "material": "carbon" } that are display-only and vary by product category. If I ever find myself writing a query that filters on something inside attributes, that's my signal to promote it to a real column.

The gotchas that bit me

Type coercion is all on you. Postgres stores JSON values as strings, numbers, booleans, etc., but when you pull them out through ->> you get text. Every numeric comparison, every date parse, every boolean check needs an explicit cast. Miss one and you get silent wrong results. I had a billing report at a print management client that was subtly undercounting for three months because (payload->>'quantity')::int was quietly returning NULL for records written by an older code path that used qty as the key instead of quantity. The column had both. Nobody caught it until a client audit.

Migrations don't protect you. You can add a key to your JSON payload in one deploy and never add it to old records. Six months later someone writes code that assumes every record has that key and you've got a production error at 2am. With a proper column and a migration, the database backfills or rejects. With JSON, you've got a subtle inconsistency baked in.

GIN indexes help but don't replace columns. A GIN index on a JSONB column can make @> containment queries fast, but it's not the same as a btree index on a regular column. Range queries, ordering, and certain equality checks won't use it the way you expect. Profile before you assume.

Eloquent's $casts doesn't validate. Casting a JSONB column to 'array' in Laravel will happily give you a PHP array, but it won't tell you if the shape is wrong. I add a value object or a dedicated DTO when the JSON structure starts to matter:

// Better: wrap it so your code has a contract
class ProductAttributes
{
    public function __construct(
        public readonly ?string $frameSize = null,
        public readonly ?string $material = null,
        public readonly ?float $weight = null,
    ) {}

    public static function fromArray(array $data): self
    {
        return new self(
            frameSize: $data['frame_size'] ?? null,
            material: $data['material'] ?? null,
            weight: isset($data['weight']) ? (float) $data['weight'] : null,
        );
    }
}

Now at least your application layer has a defined shape, even if the database doesn't enforce it.

When I'd reach for JSON columns

  • Custom fields defined by end users at runtime
  • Raw API response archival alongside normalized data
  • Sparse, display-only attributes that will never be filtered or aggregated
  • Early prototype work where the schema is deliberately unfinished

When I wouldn't

  • Any field that appears in a WHERE clause more than occasionally
  • Any field that's joined to another table
  • Any field with referential integrity requirements
  • Any field that's financially or clinically significant — the database constraint is your last line of defense
  • Anything that starts as JSON but "we'll clean it up later" (you won't)

The real question

Every time I'm tempted by a JSON column, I ask one question: will I ever need to query on this value? If the answer is yes, or maybe, or I don't know yet — that's a real column. JSON is for data you carry along for the ride, not data you steer by.

Postgres JSONB is a genuinely powerful feature. It's not an excuse to skip schema design.

Need help shipping something like this? Get in touch.