I did a talk on the similarities and differences between Oracle and PostgreSQL at CodeNode in London:
https://skillsmatter.com/skillscasts/11414-london-postgresql-january#video
I'm a PostgreSQL DBA, and when I run across interesting facts then I'll post them here.
Wednesday, 17 January 2018
Wednesday, 10 January 2018
Faster shutdowns with checkpoint
Have you ever restarted your PostgreSQL cluster and wondered why it's taking so long? The cause of this is generally one of these:
You can cover off the first two by:
The normal process of a shutdown is:
- You didn't include the "-m fast" flag (v9.4 and earlier), and sessions are still connected
- A large update query is rolling back
- There are a large number of dirty blocks, which are not yet written to disk
You can cover off the first two by:
- Using "pg_ctl -m fast"
- Checking in pg_stat_activity for long-running queries
The normal process of a shutdown is:
- Stop any new connections
- Disconnect any existing connections, and rollback transactions (if "-m fast" is used)
- Write all dirty blocks to disk
- Shutdown the database cleanly, and restart
The 3rd step can sometimes take a long time if you've got a lot of dirty buffers. Maybe you've got a very long checkpoint_timeout set, or possibly your server is just very busy.
If you run a manual "CHECKPOINT" immediately prior to running the shutdown, then you can sometimes drastically shorten the time a restart takes and keep your users happy:
postgres@localhost[default:5432] ~/$ psql
psql (10.1)
Type "help" for help.
postgres=# CHECKPOINT;
CHECKPOINT
postgres=# \q
postgres@localhost[default:5432] ~/$ pg_ctl restart -m fast
waiting for server to shut down.... done
server stopped
waiting for server to start....LOG: listening on IPv4 address "0.0.0.0", port 5432
LOG: listening on IPv6 address "::", port 5432
LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "log".
done
server started
You can combine this all into a single step by running:
postgres@localhost[default:5432] ~/$ psql -c CHECKPOINT && pg_ctl restart -m fast
CHECKPOINT
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-01-11 09:53:21.419 GMT [8919] LOG: listening on IPv4 address "0.0.0.0", port 5432
LOG: listening on IPv6 address "::", port 5432
LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "log".
done
server started
You can combine this all into a single step by running:
postgres@localhost[default:5432] ~/$ psql -c CHECKPOINT && pg_ctl restart -m fast
CHECKPOINT
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-01-11 09:53:21.419 GMT [8919] LOG: listening on IPv4 address "0.0.0.0", port 5432
LOG: listening on IPv6 address "::", port 5432
LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "log".
done
server started
Bug in pg_rewind overwrites pg_control
For those of you that don't monitor the pgsql-bugs mailing list, this interesting bug was posted recently:
>> I have encountered a bug in PostgreSQL 10.1: when the target
>> directory for pg_rewind contains a read-only file (e.g. server.key),
>> pg_rewind exits with "could not open target file" (legitimate) and
>> corrupts the control file global/pg_control to size 0 (bug). From now
>> on, pg_rewind always exits with "unexpected control file size 0,
>> expected 8192" and a restore from pg_basebackup is needed.
>
> Likely that's reproducible down to 9.5 where pg_rewind has been
> introduced. I agree that we should do better with failure handling
> here. Corrupting the control file is not cool.
I can already confirm that this also occurs with PostgreSQL 9.6.
So if you want to use pg_rewind, then make sure you don't have any read-only files in your $PGDATA directory... otherwise your control file will get overwritten. :-O
I suspect this will get fixed in v9.5.11, v9.6.7, and v10.2... but there are no guarantees yet.
Here's the link to the original chat:
https://www.postgresql.org/message-id/20180104200633.17004.16377%40wrigleys.postgresql.org
>> I have encountered a bug in PostgreSQL 10.1: when the target
>> directory for pg_rewind contains a read-only file (e.g. server.key),
>> pg_rewind exits with "could not open target file" (legitimate) and
>> corrupts the control file global/pg_control to size 0 (bug). From now
>> on, pg_rewind always exits with "unexpected control file size 0,
>> expected 8192" and a restore from pg_basebackup is needed.
>
> Likely that's reproducible down to 9.5 where pg_rewind has been
> introduced. I agree that we should do better with failure handling
> here. Corrupting the control file is not cool.
I can already confirm that this also occurs with PostgreSQL 9.6.
So if you want to use pg_rewind, then make sure you don't have any read-only files in your $PGDATA directory... otherwise your control file will get overwritten. :-O
I suspect this will get fixed in v9.5.11, v9.6.7, and v10.2... but there are no guarantees yet.
Here's the link to the original chat:
https://www.postgresql.org/message-id/20180104200633.17004.16377%40wrigleys.postgresql.org
Autovacuum broken by Incorrect localhost entry
If you're seeing an error like this in your postgresql.log:
LOG: could not connect socket for statistics collector: Invalid argument
LOG: disabling statistics collector for lack of working socket
WARNING: autovacuum not started because of misconfiguration
HINT: Enable the "track_counts" option.
LOG: could not connect socket for statistics collector: Invalid argument
LOG: disabling statistics collector for lack of working socket
WARNING: autovacuum not started because of misconfiguration
HINT: Enable the "track_counts" option.
There's a good chance that your localhost entry in /etc/hosts is broken. In my case, there was a typo in the IP address, which should have been 127.0.0.1:
27.0.0.1 localhost
Once that was fixed, then the PostgreSQL database started up normally and Autovacuum was running.
Subscribe to:
Posts (Atom)
New PostgreSQL RDS versions - v10.14 , v11.9, and v12.4
It appears that new patch versions of PostgreSQL RDS are being rolled out in AWS.. so we should expect an announcement soon: REGION: us-east...
-
I realise it's a blanket statement to say NEVER to use pg_resetwal / pg_resetxlog , but in almost 100% of cases this will be true. One...
-
Have you ever restarted your PostgreSQL cluster and wondered why it's taking so long? The cause of this is generally one of these: Y...
-
It's always something that comes up, even in today's vitrualised and containerised environments... "How can I run multiple Post...