Cold Comfort: When PostgreSQL Protects Your Data by Locking You Out

Cold Comfort: When PostgreSQL Protects Your Data by Locking You Out

PostgreSQL’s MVCC architecture is designed to preserve data integrity at all costs.

But when autovacuum falls behind, that protection can come at a steep operational price: your data is still there — but you can’t write to it, and in some cases, you can’t even access it.

Over the past three years, PostgreSQL has experienced multiple production-halting outages across major cloud providers and enterprise deployments.

The root cause in every case? Vacuum lag.

Three Confirmed Outages (2024–2025)

  1. AWS RDS PostgreSQL Wraparound Incident
    Date: December 2024 (updated February 2025)
    Impact: PostgreSQL entered read-only mode
    Cause: Transaction ID wraparound protection triggered due to autovacuum lag
    Recovery: Required emergency vacuuming and downtime
    Source: AWS Blog
  2. Google Cloud SQL PostgreSQL Lockouts
    Date: Throughout 2024
    Impact: PostgreSQL refused new transactions to prevent wraparound data loss
    Cause: Missed autovacuum cycles; stale replication slots and prepared transactions
    Recovery: Required single-user mode vacuum and cleanup
    Source: Google Cloud Docs
  3. Metronome Multixact Exhaustion
    Date: May 2025
    Impact: Four outages in one week; blocked write operations across API and UI
    Cause: Multixact member exhaustion from long-lived transactions and insufficient vacuuming
    Recovery: Required emergency tuning and freeze vacuuming
    Source: Metronome Postmortem

Meanwhile: No Documented MariaDB Outages

In the same timeframe, MariaDB has had no confirmed production-halting incidents.

Its architecture avoids transaction ID wraparound and multixact exhaustion entirely. It does not rely on background vacuuming to maintain write availability.

Stewardship Risk Summary (2023–2025)

System Outages Business Impact Stewardship Risk
PostgreSQL 3 confirmed High Requires freeze monitoring, vacuum tuning, emergency playbooks
MariaDB None Low Stable, no vacuum dependencies

MariaDB has had no documented outages in the past three years.

It doesn’t rely on autovacuum, doesn’t suffer from transaction ID wraparound, and doesn’t require freeze monitoring to stay writable.

MariaDB's architecture avoids the entire class of risks that have repeatedly halted PostgreSQL workloads.

PostgreSQL’s vacuum system is not optional — it’s existential.

In high-write environments, autovacuum can consume 20–40% of CPU and I/O.

When vacuuming falls behind, PostgreSQL will halt writes or enter read-only mode to prevent corruption.

If you’re running PostgreSQL at scale, you need:

  • Freeze-age monitoring
  • Autovacuum tuning
  • Emergency vacuum playbooks
  • Contributor-safe diagnostics and alerting
“Catastrophic data loss. (Actually the data is still there, but that's cold comfort if you cannot get at it.)”
PostgreSQL Documentation

Tags:
#PostgreSQL #Autovacuum #DatabaseOutages #MVCC #WraparoundRisk #MariaDB #Benchmarking #DatabaseStewardship #FreezeMonitoring #CloudReliability #MariaDBFoundation #OpenSource #PerformanceTesting #ContributorDriven #MySQLCompatible #CloudDatabases #LegacyDriven

Comments

Popular posts from this blog

MySQL Cluster Replication

New MySQL System QA blog

Perl IO::Socket, how to make them talk on both ends Cient/Server