Long running transactions can prevent postgres from vacuuming tables. But why?
Let’s first grab our current connections from pg_stat_activit
.
SELECT application_name, pid, datname, usename, state, backend_xmin
FROM pg_stat_activit WHERE backend_xmin IS NOT NULL OR backend_xid IS NOT NULL;
application_name | pid | datname | usename | state | backend_xmin ------------------+---------+---------+----------+-----------------------+--------------
1025706 | tester | postgres | idle in transaction | 3290353754
psql | 4161240 | tester | tester | active | 3290354754 Consumer |
In this example we have 2 application names: psql
& Consumer
.
The important column here is backend_xmin
which represents the current state of the database
that the application is seeing. Everytime we start a transaction the transaction is seeing the state
of the database at the point of transaction start. As a result postgres can not vacuum tuples that have become dead
after a transaction has started which yet has to end. The lowest number of backend_xmin
represents the cut-off
for dead tuples that postgres can remove since these tuples are not needed anymore.
Long running transaction can cause a lot of harm to applications where we are updating tables frequently. They contribute to table bloat.
One way to protect your database against bad client code is to set (idle_in_transaction_session_timeout)[https://postgresqlco.nf/doc/en/param/idle_in_transaction_session_timeout/] which is by default disabled.