In situations where you have very large tables, then relying on autovacuum_vacuum_scale_factor can mean many thousands or hundreds of thousands of rows are updated before a vacuum is triggered.
One way to ensure you have regular vacuuming is to reduce autovacuum_vacuum_scale_factor = 0 for those tables, and increase the autovacuum_vacuum_threshold to an appropriate level. Be warned that you can trap yourself into a constant vacuum spiral if you don't make the threshold high enough, so be informed about your update volume before you make changes and monitor it afterwards to ensure you're not constantly vacuuming.
I tested this on PostgreSQL v9.6.7 on CentOS v7.4, running standard parameters except for setting log_autovacuum_min_duration = 0 so that we can see the autovacuum activity:
postgres=# SELECT version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)
postgres=# SHOW autovacuum_vacuum_threshold;
autovacuum_vacuum_threshold
-----------------------------
50
(1 row)
postgres=# SHOW autovacuum_vacuum_scale_factor;
autovacuum_vacuum_scale_factor
--------------------------------
0.2
(1 row)
postgres=# show log_autovacuum_min_duration;
log_autovacuum_min_duration
-----------------------------
0
(1 row)
Create a table, and set the autovacuum parameters explicitly:
postgres=# CREATE TABLE vacuum_threshold_test(id int);
CREATE TABLE
postgres=# ALTER TABLE vacuum_threshold_test SET (autovacuum_vacuum_scale_factor = 0);
ALTER TABLE
postgres=# ALTER TABLE vacuum_threshold_test SET (autovacuum_vacuum_threshold = 1000);
ALTER TABLE
postgres=# \d+ vacuum_threshold_test
Table "public.vacuum_threshold_test"
Column | Type | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
id | integer | | plain | |
Options: autovacuum_vacuum_scale_factor=0, autovacuum_vacuum_threshold=1000
Create some dummy data, in excess of the autovacuum_vacuum_threshold:
postgres=# INSERT INTO vacuum_threshold_test(id) (SELECT generate_series(1,2000));
INSERT 0 2000
Update exactly the same number of rows as the autovacuum_vacuum_threshold, and for good measure force a checkpoint:
postgres=# UPDATE vacuum_threshold_test SET id = id + 1 WHERE id BETWEEN 1 AND 1000;
UPDATE 1000
postgres=# CHECKPOINT;
CHECKPOINT
That isn't enough to trigger an autovacuum, as it must be GREATER THAN the autovacuum_vacuum_threshold, so update one more row:
postgres=# UPDATE vacuum_threshold_test SET id = id + 1 WHERE id = 2000;
UPDATE 1
That then triggers an autovacuum, but strangely it only removed 1 tuple... not 1001 as you would expect:
< 2018-02-14 11:59:05.571 GMT > LOG: automatic vacuum of table "postgres.public.vacuum_threshold_test": index scans: 0
pages: 0 removed, 14 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1 removed, 2000 remain, 0 are dead but not yet removable
buffer usage: 51 hits, 2 misses, 12 dirtied
avg read rate: 0.613 MB/s, avg write rate: 3.677 MB/s
system usage: CPU 0.00s/0.00u sec elapsed 0.02 sec
Alternatively if you exceed the autvacuum_vacuum_threshold in one transaction, it removes all 1001 of them:
postgres=# UPDATE vacuum_threshold_test SET id = id + 1 WHERE id BETWEEN 2 AND 1002;
UPDATE 1001
< 2018-02-14 12:07:05.648 GMT > LOG: automatic vacuum of table "postgres.public.vacuum_threshold_test": index scans: 0
pages: 0 removed, 18 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1001 removed, 2000 remain, 0 are dead but not yet removable
buffer usage: 60 hits, 0 misses, 0 dirtied
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
If you know what you want, then you can tune individual tables to always vacuum at the right point to prevent bloat, rather than relying on the cluster wide parameters in postgresql.conf.
Interestingly, if you set both autovacuum_vacuum_scale_factor = 0, and autovacuum_vacuum_threshold = 0, then any single update/delete to the table will result in autovacuum being triggered.
This has been logged as a bug on the PostgreSQL mailing list:
https://www.postgresql.org/message-id/151861949011.1450.18007996621988910831%40wrigleys.postgresql.org
Interestingly, if you set both autovacuum_vacuum_scale_factor = 0, and autovacuum_vacuum_threshold = 0, then any single update/delete to the table will result in autovacuum being triggered.
This has been logged as a bug on the PostgreSQL mailing list:
https://www.postgresql.org/message-id/151861949011.1450.18007996621988910831%40wrigleys.postgresql.org