Pythian Blog: Technical Track

New Features for DBAs in PostgreSQL 16

PostgreSQL 16 has just been launched, and this year, we have a raft of new features to look through and evaluate.

Many of these new features are aimed at developers, with improvements in JSON functionality and better text collations.

However, I would argue that administrators get the best deal this time round: we get new monitoring features for I/O, logical replication can now apply large transactions in parallel, and there is built-in load balancing for connections made through libpq.

All the new features are covered in the official PostgreSQL release notes, linked below.  I want to review a couple of them and see how they can help diagnose or fix specific issues that DBAs encounter in their day-to-day work.  The two I’ll be reviewing are the new load-balancing and monitoring features.

https://www.postgresql.org/about/news/postgresql-16-released-2715/

Load balancing

The new load-balancing feature is pretty straightforward to use.  I tested it with psql, but any client that uses the libpq library can take advantage of this functionality; psycopg2 and psqlODBC, for example, both get this type of load balancing.

Note: There is a new connection parameter (which can also be set as an environment variable, PGLOADBALANCEHOSTS) called load_balance_hosts, which has two values:

  • disable - This is the default value; the hosts are connected to (or at least an attempt is made) in the order they are listed.
  • random - The same list of hosts is tried, but in a random order.

The documentation mentions that the algorithm uses two levels of random selection.  First, the hosts themselves are sorted into a random order.  Then all the addresses of the current host are tried, again in a random order.  As the documentation notes, this can skew the distribution if a host resolves to more than one IP address.
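To see why multiple addresses per host cause skew, here is a small Python sketch. This is not PostgreSQL code, just a simulation of the two-level shuffle the documentation describes, using two hypothetical hosts, one of which resolves to two addresses:

```python
import random
from collections import Counter

# Hypothetical hosts: pg1 resolves to one address, pg2 to two.
HOSTS = {
    "pg1": ["10.0.0.1"],
    "pg2": ["10.0.0.2", "10.0.0.3"],
}

def pick_address(rng):
    """Simulate the documented two-level shuffle: the host list is
    shuffled first, then the addresses of the chosen host are shuffled.
    We 'connect' to the first address of the first host."""
    hosts = list(HOSTS)
    rng.shuffle(hosts)
    addresses = list(HOSTS[hosts[0]])
    rng.shuffle(addresses)
    return addresses[0]

rng = random.Random(42)
counts = Counter(pick_address(rng) for _ in range(10_000))

# pg1's single address is picked roughly 50% of the time, while each of
# pg2's two addresses is picked only about 25% -- the skew the
# documentation warns about.
for addr, n in sorted(counts.items()):
    print(addr, n)
```

The shuffle happens per host first, so a host with many addresses does not get proportionally more connections; each of its addresses gets a smaller share.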

Additional functionality can ensure that the database being connected to is a primary or a standby by setting the target_session_attrs parameter. This lets you specify what type of PostgreSQL server the client should be connected to. In PostgreSQL 10, there were only two options, but since PostgreSQL 14, there are six options to choose from (although some overlap in functionality).

The six options are:

  • any - This is the DEFAULT option; any server type is acceptable.
  • primary - Make a connection to the primary database instance.
  • standby - Make a connection to a database instance which is in standby mode.
  • read-write - The session must be able to do read-write transactions.
  • read-only - The session has to refuse read-write transactions (the opposite of the read-write option).
  • prefer-standby - Try to find a standby server, but if none are available, assume the any option and try the list again.

This functionality has been around since PostgreSQL 10, but the hosts were always tried in order, so connections would always hit the same hosts in the order they were listed.  This could cause logon storms (too many connections being started at the same time) and performance issues on the first listed server.

Combine that with the new random setting of the load_balance_hosts parameter, and we have real load balancing.

So, up to PostgreSQL 15, you would have had a list, and the first server that met the criteria in target_session_attrs would be connected to.  Now, if there is more than one standby, queries can be balanced between them.  DML (Data Manipulation Language) and DDL (Data Definition Language) still have to go to a primary server, but SELECT queries can be spread across the PostgreSQL member instances in a High Availability cluster.  Even bi-directional replication setups will benefit from this, although those probably already have dedicated load balancers sitting in front of the PostgreSQL instances.
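To make this concrete, here is a sketch in Python of how such a libpq connection string can be assembled (the host names and database name are hypothetical). libpq keyword/value connection strings are space-separated key=value pairs, and the same string can be handed to psql or any libpq-based driver:

```python
# Hypothetical three-node cluster: one primary, two standbys.
params = {
    "host": "pg1,pg2,pg3",
    "port": "5432",
    "dbname": "db01",
    # New in PostgreSQL 16: try the listed hosts in random order.
    "load_balance_hosts": "random",
    # Prefer a standby for read-only work; fall back to any server.
    "target_session_attrs": "prefer-standby",
}

# Build a libpq keyword/value connection string, usable as:
#   psql "<conninfo>"
conninfo = " ".join(f"{k}={v}" for k, v in params.items())
print(conninfo)
```

The same two settings are also available as the PGLOADBALANCEHOSTS and PGTARGETSESSIONATTRS environment variables, which is handy when you cannot change the connection string an application uses.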

In conclusion, it is now possible to set up load balancing for any connections that use libpq without the need for a dedicated load balancer.  However, it should still be noted that a dedicated load balancer also adds a security layer, so any database exposed to the internet should keep a dedicated load balancer or proxy server in front of it.

Monitoring

The most important monitoring feature for me is the addition of columns that record the last sequential or index scan made against an object (table or index) in the pg_stat_*_tables and pg_stat_*_indexes views.

Last scan times in the pg_stat_*_tables and pg_stat_*_indexes views

This is the easiest to explain.  The feature adds an additional timestamp column to the pg_stat_*_tables and pg_stat_*_indexes views.  It records when the table or index was last scanned, making it easier to spot which tables are being used and how often.

A simple example of this shows how this works.

First, let's create a table called exam_tab01.

db01=# CREATE TABLE exam_tab01 (
 id serial primary key,
 name text,
 insert_date date
);
CREATE TABLE

Now, let’s insert some data into the table via the generate_series function:

db01=# INSERT INTO exam_tab01 (name, insert_date) SELECT md5(random()::text) name, now() insert_date FROM generate_series(1,30000);
INSERT 0 30000

We can have a look at the pg_stat_all_tables:

db01=# 
SELECT relid,
       schemaname,
       relname,
       seq_scan,
       last_seq_scan,
       seq_tup_read,
       idx_scan,
       last_idx_scan,
       idx_tup_fetch,
       n_tup_ins,
       n_live_tup,
       n_mod_since_analyze,
       n_ins_since_vacuum,
       last_autovacuum,
       last_autoanalyze
FROM  pg_stat_all_tables
WHERE relname = 'exam_tab01';

-[ RECORD 1 ]-------+------------------------------
relid               | 16390
schemaname          | public
relname             | exam_tab01
seq_scan            | 1
last_seq_scan       | 2023-10-05 09:56:37.554215+01
seq_tup_read        | 0
idx_scan            | 0
last_idx_scan       |
idx_tup_fetch       | 0
n_tup_ins           | 30000
n_live_tup          | 30000
n_mod_since_analyze | 30000
n_ins_since_vacuum  | 30000
last_autovacuum     |
last_autoanalyze    |

As we can see, there has been only one seq_scan so far (from the INSERT), and no autovacuum or autoanalyze has yet run.

So, now run a COUNT against the table:

db01=# SELECT count(*) FROM exam_tab01;
-[ RECORD 1 ]
count | 30000

Let’s see the pg_stat_all_tables entry for exam_tab01, after the COUNT:

-[ RECORD 1 ]-------+------------------------------
relid               | 16390
schemaname          | public
relname             | exam_tab01
seq_scan            | 2
last_seq_scan       | 2023-10-05 09:59:41.019942+01
seq_tup_read        | 30000
idx_scan            | 0
last_idx_scan       |
n_tup_ins           | 30000
n_live_tup          | 30000
last_autovacuum     | 2023-10-05 09:59:35.569023+01
last_autoanalyze    | 2023-10-05 09:59:35.667303+01

We can see a few things now.  The last_autovacuum and last_autoanalyze columns have been populated (i.e., both have now run), the last_seq_scan timestamp has moved forward, and seq_scan has incremented.

It is the last_seq_scan column that is new in PostgreSQL 16, so we can now tell the last time an object was scanned.

So, let’s do another count:

db01=# SELECT count(*) FROM exam_tab01;
count | 30000

db01=# SELECT now();
now | 2023-10-05 10:00:07.102724+01

We can see from the data below that the last_seq_scan time has changed again.

-[ RECORD 1 ]-------+------------------------------
relid               | 16390
schemaname          | public
relname             | exam_tab01
seq_scan            | 3
last_seq_scan       | 2023-10-05 10:00:05.551341+01
seq_tup_read        | 60000
idx_scan            | 0
last_idx_scan       |
n_tup_ins           | 30000
n_live_tup          | 30000
last_autovacuum     | 2023-10-05 09:59:35.569023+01
last_autoanalyze    | 2023-10-05 09:59:35.667303+01

One last thing: checking the index.  As we can see, the last_idx_scan column is still blank.  So, let’s run this SQL:

db01=# SELECT count(*) FROM exam_tab01 WHERE id > 28000;

As one can see from the EXPLAIN plan, the query planner wants to use the primary key.

db01=# EXPLAIN SELECT count(*) FROM exam_tab01 WHERE id > 28000;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Aggregate  (cost=64.29..64.30 rows=1 width=8)
   ->  Index Only Scan using exam_tab01_pkey on exam_tab01  (cost=0.29..59.29 rows=2000 width=0)
         Index Cond: (id > 28000)

So, what does that do to the stats in pg_stat_all_tables?
-[ RECORD 1 ]-------+------------------------------
relid               | 16390
schemaname          | public
relname             | exam_tab01
seq_scan            | 3
last_seq_scan       | 2023-10-05 10:00:05.551341+01
seq_tup_read        | 60000
idx_scan            | 1
last_idx_scan       | 2023-10-05 10:51:18.111977+01
idx_tup_fetch       | 0
n_tup_ins           | 30000
n_live_tup          | 30000
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_autovacuum     | 2023-10-05 09:59:35.569023+01
last_autoanalyze    | 2023-10-05 09:59:35.667303+01

The last_seq_scan has stayed the same, but we have a new date in last_idx_scan, as we used an index to get the information rather than a table scan.  Note, however, that at the table level this column covers all indexes on the table combined.

As a final example, what if we created a new index, but this time on the “insert_date” column, and added another 30,000 rows?

 CREATE INDEX idx_exam_tab_ins_date ON exam_tab01 (insert_date);

Then insert another 30,000 rows (the same way as before), for a total of 60,000 rows.

We run a query against the “insert_date” column like this:

db01=# EXPLAIN ANALYZE 
SELECT count(*) 
FROM   exam_tab01 
WHERE  insert_date >= (now() - INTERVAL '1 MIN');
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.32..4.33 rows=1 width=8) (actual time=0.018..0.018 rows=1 loops=1)
   ->  Index Only Scan using idx_exam_tab_ins_date on exam_tab01  (cost=0.29..4.31 rows=1 width=0) (actual time=0.016..0.016 rows=0 loops=1)
         Index Cond: (insert_date >= (now() - '00:01:00'::interval))
         Heap Fetches: 0
 Planning Time: 0.085 ms
Execution Time: 0.037 ms

When we run a query against the pg_stat_all_tables,  what do we get?

-[ RECORD 1 ]-------+------------------------------
relid               | 16390
schemaname          | public
relname             | exam_tab01
seq_scan            | 4
last_seq_scan       | 2023-10-05 11:00:21.644475+01
seq_tup_read        | 90000
idx_scan            | 2
last_idx_scan       | 2023-10-05 11:03:48.902501+01
idx_tup_fetch       | 0
n_tup_ins           | 60000
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_tup_newpage_upd   | 0
n_live_tup          | 59987
n_dead_tup          | 0
n_mod_since_analyze | 0
n_ins_since_vacuum  | 0
last_vacuum         |
last_autovacuum     | 2023-10-05 11:01:36.641706+01
last_analyze        | 2023-10-05 11:01:06.613578+01
last_autoanalyze    | 2023-10-05 09:59:35.667303+01
vacuum_count        | 0
autovacuum_count    | 2
analyze_count       | 2
autoanalyze_count   | 1

We can see that the table-level index stats have increased, even though the query didn’t use the primary key.  However, if we look in the pg_stat_all_indexes view, we get a per-index breakdown:

db01=# 
SELECT * 
FROM  pg_stat_all_indexes 
WHERE relname = 'exam_tab01';
-[ RECORD 1 ]-+------------------------------
relid         | 16390
indexrelid    | 16396
schemaname    | public
relname       | exam_tab01
indexrelname  | exam_tab01_pkey
idx_scan      | 1
last_idx_scan | 2023-10-05 10:51:18.111977+01
idx_tup_read  | 2000
idx_tup_fetch | 0
-[ RECORD 2 ]-+------------------------------
relid         | 16390
indexrelid    | 16399
schemaname    | public
relname       | exam_tab01
indexrelname  | idx_exam_tab_ins_date
idx_scan      | 1
last_idx_scan | 2023-10-05 11:03:48.902501+01
idx_tup_read  | 0
idx_tup_fetch | 0

We have two rows, one for each index on the exam_tab01 table, and we can see the last time each individual index was used.  Therefore, for tracking individual indexes, pg_stat_all_indexes is the view to query.
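The obvious day-to-day use of last_idx_scan is finding indexes that are never (or rarely) used and are therefore candidates for review. A minimal Python sketch of that bookkeeping is below; the rows are made-up stand-ins for a pg_stat_all_indexes result set (in practice you would fetch indexrelname and last_idx_scan with any libpq-based driver):

```python
from datetime import datetime, timedelta, timezone

# Made-up (indexrelname, last_idx_scan) rows standing in for a
# pg_stat_all_indexes result set; None means "never scanned".
rows = [
    ("exam_tab01_pkey", datetime(2023, 10, 5, 9, 51, tzinfo=timezone.utc)),
    ("idx_exam_tab_ins_date", datetime(2023, 10, 5, 10, 3, tzinfo=timezone.utc)),
    ("idx_never_used", None),
]

def stale_indexes(rows, now, max_age):
    """Return index names not scanned within max_age (or never scanned)."""
    cutoff = now - max_age
    return [name for name, last in rows if last is None or last < cutoff]

# Anything untouched for a week, as of a hypothetical "now":
now = datetime(2023, 11, 5, tzinfo=timezone.utc)
print(stale_indexes(rows, now, timedelta(days=7)))
```

One caveat worth remembering: these statistics reset on a stats reset (and, before checking, you should make sure the cluster has been up long enough to have seen its full workload, including monthly or quarterly jobs).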

Conclusion

These are important and welcome features.  Other tools can already perform these functions (PgBouncer and Pgpool-II for the load balancing), but they are now part of the core functionality as of PostgreSQL 16.  That is good news for admins in general.  I am sure that some clients will be very interested in this new functionality, and it saves me from having to write functions and code to store this information myself.

Add in the improvements to logical replication and the new security features, and I think PostgreSQL 16 will be a very important major version.
