PostgreSQL Blog by Greg Clough: January 2018

Wednesday, 17 January 2018

Babelfish for Oracle and PostgreSQL

I did a talk on the similarities and differences between Oracle and PostgreSQL at CodeNode in London:

https://skillsmatter.com/skillscasts/11414-london-postgresql-january#video

Wednesday, 10 January 2018

Faster shutdowns with checkpoint

Have you ever restarted your PostgreSQL cluster and wondered why it's taking so long? The cause of this is generally one of these:

You didn't include the "-m fast" flag (v9.4 and earlier), and sessions are still connected
A large update query is rolling back
There are a large number of dirty blocks, which are not yet written to disk

You can cover off the first two by:

Using "pg_ctl -m fast"
Checking in pg_stat_activity for long-running queries

But to ensure you minimise your downtime when restarting a PostgreSQL cluster, then consider forcing a "checkpoint" beforehand.

The normal process of a shutdown is:

Stop any new connections
Disconnect any existing connections, and rollback transactions (if "-m fast" is used)
Write all dirty blocks to disk
Shutdown the database cleanly, and restart

The 3rd step can sometimes take a long time if you've got a lot of dirty buffers. Maybe you've got a very long checkpoint_timeout set, or possibly your server is just very busy.

If you run a manual "CHECKPOINT" immediately prior to running the shutdown, then you can sometimes drastically shorten the time a restart takes and keep your users happy:

postgres@localhost[default:5432] ~/$ psql

psql (10.1)

Type "help" for help.

postgres=# CHECKPOINT;

CHECKPOINT

postgres=# \q

postgres@localhost[default:5432] ~/$ pg_ctl restart -m fast

waiting for server to shut down.... done

server stopped

waiting for server to start....LOG: listening on IPv4 address "0.0.0.0", port 5432

LOG: listening on IPv6 address "::", port 5432

LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"

LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"

LOG: redirecting log output to logging collector process

HINT: Future log output will appear in directory "log".

done

server started

You can combine this all into a single step by running:

postgres@localhost[default:5432] ~/$ psql -c CHECKPOINT && pg_ctl restart -m fast
CHECKPOINT
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-01-11 09:53:21.419 GMT [8919] LOG: listening on IPv4 address "0.0.0.0", port 5432
LOG: listening on IPv6 address "::", port 5432
LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "log".
done

server started

Bug in pg_rewind overwrites pg_control

For those of you that don't monitor the pgsql-bugs mailing list, this interesting bug was posted recently:

>> I have encountered a bug in PostgreSQL 10.1: when the target
>> directory for pg_rewind contains a read-only file (e.g. server.key),
>> pg_rewind exits with "could not open target file" (legitimate) and
>> corrupts the control file global/pg_control to size 0 (bug). From now
>> on, pg_rewind always exits with "unexpected control file size 0,
>> expected 8192" and a restore from pg_basebackup is needed.
>
> Likely that's reproducible down to 9.5 where pg_rewind has been
> introduced. I agree that we should do better with failure handling
> here. Corrupting the control file is not cool.

I can already confirm that this also occurs with PostgreSQL 9.6.

So if you want to use pg_rewind, then make sure you don't have any read-only files in your $PGDATA directory... otherwise your control file will get overwritten. :-O

I suspect this will get fixed in v9.5.11, v9.6.7, and v10.2... but there are no guarantees yet.

Here's the link to the original chat:

https://www.postgresql.org/message-id/20180104200633.17004.16377%40wrigleys.postgresql.org

Autovacuum broken by Incorrect localhost entry

If you're seeing an error like this in your postgresql.log:

LOG: could not connect socket for statistics collector: Invalid argument
LOG: disabling statistics collector for lack of working socket
WARNING: autovacuum not started because of misconfiguration
HINT: Enable the "track_counts" option.

There's a good chance that your localhost entry in /etc/hosts is broken. In my case, there was a typo in the IP address, which should have been 127.0.0.1:

27.0.0.1 localhost

Once that was fixed, then the PostgreSQL database started up normally and Autovacuum was running.

PostgreSQL Blog by Greg Clough