
Wednesday, 23 October 2019

postgresql.log file grows infinitely, and never rotates... did your disk fill up previously?

An interesting (and currently undocumented) behaviour of PostgreSQL is that when there is an error creating a new log file (as configured by log_filename and log_directory), all future log rotations are disabled by this piece of code in syslogger.c:


/*
 * ENFILE/EMFILE are not too surprising on a busy system; just
 * keep using the old file till we manage to get a new one.
 * Otherwise, assume something's wrong with Log_directory and stop
 * trying to create files.
 */
if (errno != ENFILE && errno != EMFILE)
{
    ereport(LOG,
            (errmsg("disabling automatic rotation (use SIGHUP to re-enable)")));
    rotation_disabled = true;
}


In most default configurations this will never be noticed, because the log_directory is on the same filesystem as the data_directory.  In that case the entire instance will crash when the disk fills, and once space has been cleared it can be restarted afresh with log rotation enabled.




If, however, you had the foresight to put your log_directory onto a different filesystem, then it can fill up without taking the entire cluster down.  Once you clear some space, logging will resume, but both log_rotation_age and log_rotation_size will be ignored:


if (Log_RotationAge > 0 && !rotation_disabled)
{
    /* Do a logfile rotation if it's time */
    now = (pg_time_t) time(NULL);
    if (now >= next_rotation_time)
        rotation_requested = time_based_rotation = true;
}

if (!rotation_requested && Log_RotationSize > 0 && !rotation_disabled)
{
    /* Do a rotation if file is too big */
    if (ftell(syslogFile) >= Log_RotationSize * 1024L)
    {
        rotation_requested = true;
        size_rotation_for |= LOG_DESTINATION_STDERR;
    }
    if (csvlogFile != NULL &&
        ftell(csvlogFile) >= Log_RotationSize * 1024L)
    {
        rotation_requested = true;
        size_rotation_for |= LOG_DESTINATION_CSVLOG;
    }
}


With rotation disabled, everything keeps being written to one single log file, which grows and grows until it once again fills the disk.  If you're relying on log_truncate_on_rotation to run, or on cron jobs to clean up old logs, then neither will work.
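
If you want to keep an eye on this, one option (a minimal check, assuming PostgreSQL 10 or later with logging_collector = on and sufficient privileges) is to ask the server for the size of the log file it is currently writing:

-- Current log file and its size in bytes (PostgreSQL 10+ only;
-- requires logging_collector = on and superuser or equivalent grants)
SELECT pg_current_logfile() AS current_logfile,
       (pg_stat_file(pg_current_logfile())).size AS size_bytes;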

To correct this, once some space has been cleared you just need to send a SIGHUP to the postmaster process.  This is safely done by calling pg_reload_conf(), but you could also use kill or pg_ctl reload if you prefer.  On receiving the SIGHUP, this code in syslogger.c re-enables rotation and forces one immediately:


/*
 * If we had a rotation-disabling failure, re-enable rotation
 * attempts after SIGHUP, and force one immediately.
 */
if (rotation_disabled)
{
    rotation_disabled = false;
    rotation_requested = true;
}
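
In practice, any of the following will deliver that SIGHUP (a sketch only; run as the postgres OS user, with $PGDATA standing in for your data directory):

# From SQL, via psql, as a superuser:
psql -c "SELECT pg_reload_conf();"

# Or from the shell:
pg_ctl reload -D "$PGDATA"
kill -HUP "$(head -1 "$PGDATA"/postmaster.pid)"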


After that your logging will go back to normal, and will rotate as you have configured in your postgresql.conf.

Thursday, 1 February 2018

Monitoring bgwriter efficiency during checkpoints with pg_buffercache

Have you ever wanted to know what's happening inside your PostgreSQL buffer cache?  Is your bgwriter keeping up, or is a pile of dirty buffers building up for each checkpoint to flush?

Uncloak some of the mystery by using the pg_buffercache extension.

CREATE EXTENSION pg_buffercache;


You can then use the following query, and plot the results into your favourite graphing solution:

SELECT buffer_status, sum(count) AS count
  FROM (SELECT CASE isdirty
                 WHEN true THEN 'dirty'
                 WHEN false THEN 'clean'
                 ELSE 'empty'
               END AS buffer_status,
               count(*) AS count
          FROM pg_buffercache
          GROUP BY buffer_status
        UNION ALL
          SELECT * FROM (VALUES ('dirty', 0), ('clean', 0), ('empty', 0)) AS tab2 (buffer_status,count)) tab1
  GROUP BY buffer_status;


The output will look like this:

postgres=# SELECT buffer_status, sum(count) AS count
  FROM (SELECT CASE isdirty
                 WHEN true THEN 'dirty'
                 WHEN false THEN 'clean'
                 ELSE 'empty'
               END AS buffer_status,
               count(*) AS count
          FROM pg_buffercache
          GROUP BY buffer_status
        UNION ALL
          SELECT * FROM (VALUES ('dirty', 0), ('clean', 0), ('empty', 0)) AS tab2 (buffer_status,count)) tab1
  GROUP BY buffer_status;
 buffer_status | count
---------------+--------
 clean         |    544
 dirty         |     18
 empty         | 523726
(3 rows)


And whenever there is a checkpoint, the dirty count should go to zero:

postgres=# checkpoint;
CHECKPOINT

postgres=# SELECT buffer_status, sum(count) AS count
  FROM (SELECT CASE isdirty
                 WHEN true THEN 'dirty'
                 WHEN false THEN 'clean'
                 ELSE 'empty'
               END AS buffer_status,
               count(*) AS count
          FROM pg_buffercache
          GROUP BY buffer_status
        UNION ALL
          SELECT * FROM (VALUES ('dirty', 0), ('clean', 0), ('empty', 0)) AS tab2 (buffer_status,count)) tab1
  GROUP BY buffer_status;
 buffer_status | count
---------------+--------
 clean         |    562
 dirty         |      0
 empty         | 523726
(3 rows)


ACTION: If a large number of dirty buffers is being flushed at each checkpoint, then you should tune the bgwriter to be more aggressive.
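
As a rough starting point, "more aggressive" usually means some combination of the following postgresql.conf settings (illustrative values only, not a recommendation; the right numbers depend on your workload and I/O capacity):

# Illustrative example only - tune for your own workload
bgwriter_delay = 100ms           # wake the bgwriter more often (default 200ms)
bgwriter_lru_maxpages = 400      # allow more pages per round (default 100)
bgwriter_lru_multiplier = 4.0    # write further ahead of demand (default 2.0)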


Should you want the output as a single row per sample (one column per buffer status), then you can use crosstab:

CREATE EXTENSION tablefunc;

SELECT * FROM crosstab(
  $$
    SELECT now() sample_timestamp, buffer_status, sum(buffer_count)::int AS buffer_count
          FROM (SELECT CASE isdirty
                         WHEN false THEN 'clean'
                         WHEN true THEN 'dirty'
                         ELSE 'empty'
                       END AS buffer_status,
                       count(*) AS buffer_count
                  FROM pg_buffercache
                  GROUP BY buffer_status
                UNION ALL
                  SELECT * FROM (VALUES ('clean', 0), ('dirty', 0), ('empty', 0)) AS dummy_table(buffer_status, buffer_count)
               ) outer_table
          GROUP BY buffer_status
          ORDER BY buffer_status
  $$)
AS ct(sample_timestamp timestamptz, clean int, dirty int, empty int);

The output will then look like:

       sample_timestamp        | clean | dirty | empty
-------------------------------+-------+-------+--------
 2018-02-02 11:38:00.915978+00 |  3108 |     0 | 521180
(1 row)


I created a test database, and ran pgbench against it whilst sampling every 10 seconds:

pgbench -i -s 100 pgbench

then

pgbench -c 32 -j 4 -P 10 -T 1200 pgbench


The database was configured with these parameters:

checkpoint_timeout = 5min
checkpoint_completion_target = 0.9

bgwriter_delay = 100ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 3.0


The resulting graph was:


Some notable points:

1) shared_buffers is big enough to hold the entire data set, because we never run out of empty buffers (grey).
2) Work starts at #14 and the dirty buffers (orange) start to climb.
3) A checkpoint starts at #27, and dirty blocks are written to disk.  This results in clean buffers (blue) balancing out the new dirty buffers being created.
4) This holds in equilibrium until a forced manual checkpoint is issued at #98.  This writes all dirty buffers to disk immediately, instead of spreading the writes out in a controlled fashion.
5) Things quickly get back to equilibrium.
6) The test completes at #128, and the buffers stay in their current clean/dirty/empty state.
7) Once the next timed checkpoint is triggered at #140, the dirty blocks are written to disk at the paced rate determined by checkpoint_completion_target.
8) All dirty blocks are written to disk by #167.

Wednesday, 10 January 2018

Faster shutdowns with checkpoint

Have you ever restarted your PostgreSQL cluster and wondered why it's taking so long?  The cause of this is generally one of these:

  • You didn't include the "-m fast" flag (v9.4 and earlier), and sessions are still connected
  • A large update query is rolling back
  • There are a large number of dirty blocks, which are not yet written to disk


You can cover off the first two by:

  • Using "pg_ctl stop/restart -m fast"
  • Checking pg_stat_activity for long-running queries

But to minimise your downtime when restarting a PostgreSQL cluster, consider forcing a CHECKPOINT beforehand.

The normal process of a shutdown is:

  1. Stop any new connections
  2. Disconnect any existing connections, and roll back their transactions (if "-m fast" is used)
  3. Write all dirty blocks to disk
  4. Shut down the database cleanly, and restart

The 3rd step can sometimes take a long time if you've got a lot of dirty buffers.  Maybe you've got a very long checkpoint_timeout set, or possibly your server is just very busy.
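
If you have the pg_buffercache extension installed (see the earlier post), a quick way to gauge how much step 3 will have to write is:

-- Rough count of dirty buffers still to be flushed (needs pg_buffercache)
SELECT count(*) AS dirty_buffers
  FROM pg_buffercache
 WHERE isdirty;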

If you run a manual "CHECKPOINT" immediately prior to running the shutdown, then you can sometimes drastically shorten the time a restart takes and keep your users happy:

postgres@localhost[default:5432] ~/$ psql
psql (10.1)
Type "help" for help.

postgres=# CHECKPOINT;
CHECKPOINT
postgres=# \q

postgres@localhost[default:5432] ~/$ pg_ctl restart -m fast
waiting for server to shut down.... done
server stopped
waiting for server to start....LOG:  listening on IPv4 address "0.0.0.0", port 5432
LOG:  listening on IPv6 address "::", port 5432
LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "log".
 done
server started


You can combine this all into a single step by running:

postgres@localhost[default:5432] ~/$ psql -c CHECKPOINT && pg_ctl restart -m fast
CHECKPOINT
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-01-11 09:53:21.419 GMT [8919] LOG:  listening on IPv4 address "0.0.0.0", port 5432
LOG:  listening on IPv6 address "::", port 5432
LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "log".
 done
server started

Bug in pg_rewind overwrites pg_control

For those of you that don't monitor the pgsql-bugs mailing list, this interesting bug was posted recently:

>> I have encountered a bug in PostgreSQL 10.1: when the target 
>> directory for pg_rewind contains a read-only file (e.g. server.key), 
>> pg_rewind exits with "could not open target file" (legitimate) and 
>> corrupts the control file global/pg_control to size 0 (bug). From now 
>> on, pg_rewind always exits with "unexpected control file size 0, 
>> expected 8192" and a restore from pg_basebackup is needed.

> Likely that's reproducible down to 9.5 where pg_rewind has been 
> introduced. I agree that we should do better with failure handling 
> here. Corrupting the control file is not cool.

I can already confirm that this also occurs with PostgreSQL 9.6.


So if you want to use pg_rewind, then make sure you don't have any read-only files in your $PGDATA directory... otherwise your control file will get truncated to zero bytes. :-O
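
Until a fix lands, a quick pre-flight check on the target data directory could look like this (a sketch only, assuming GNU find; server.key is the usual culprit when SSL is enabled):

# List files in the target data directory that are not writable by the
# user who will run pg_rewind (uses the GNU find -writable extension)
find "$PGDATA" -type f ! -writable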

I suspect this will get fixed in v9.5.11, v9.6.7, and v10.2... but there are no guarantees yet.


Here's the link to the original chat:

https://www.postgresql.org/message-id/20180104200633.17004.16377%40wrigleys.postgresql.org

Autovacuum broken by incorrect localhost entry

If you're seeing an error like this in your postgresql.log:

LOG:  could not connect socket for statistics collector: Invalid argument
LOG:  disabling statistics collector for lack of working socket
WARNING:  autovacuum not started because of misconfiguration
HINT:  Enable the "track_counts" option.


There's a good chance that your localhost entry in /etc/hosts is broken.  In my case, there was a typo in the IP address, which should have been 127.0.0.1:

27.0.0.1   localhost
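
The corrected entry should read:

127.0.0.1   localhost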

Once that was fixed, the PostgreSQL database started up normally and autovacuum was running.

New PostgreSQL RDS versions - v10.14 , v11.9, and v12.4

It appears that new patch versions of PostgreSQL RDS are being rolled out in AWS... so we should expect an announcement soon: REGION: us-east...