
An Initial Test of Google’s AlloyDB Columnar Engine

Google recently launched the public preview of AlloyDB, its new customized cloud version of PostgreSQL. This Google Cloud Platform (GCP) managed database service is distinct from the existing Cloud SQL for PostgreSQL service, which remains available as a separate offering.

One of the main features of the AlloyDB flavor of PostgreSQL is the new Columnar Engine (CE). It consists of dedicated memory for data in columnar format, which Google calls the column store, plus a corresponding columnar query planner and execution engine. Keeping data in memory in columnar format can significantly speed up certain analytical queries.

Conceptually, this is similar to the In-Memory feature of Oracle Database, but delivered by AlloyDB in a 100% PostgreSQL-compatible managed database service.

Although real-world workloads are usually far more complex and mixed, we decided to run an initial quick test of this one feature to show its potential for applicable queries. (AlloyDB has other technical differences that are beyond the scope of this article.)

Our test environment uses a mid-sized AlloyDB instance: 16 vCPUs with 128GB of memory.

PostgreSQL Setup and Parameters

We made minimal changes to the core PostgreSQL parameters (managed as “flags” within the service, adjustable through the Web UI or gcloud commands) and mostly used the AlloyDB defaults. The one exception is:

work_mem = 65536

We also had the pg_stat_statements extension enabled. Apart from that, our baseline was an “out-of-the-box” AlloyDB deployment.
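
To confirm that the flag values took effect, a quick check from psql works just as it would on any PostgreSQL server (a minimal sketch; the expected values in the comments are based on the settings described above):

-- Verify the effective parameter values for this session
SHOW work_mem;                         -- expect 64MB (65536 kB)
SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('work_mem', 'shared_buffers', 'effective_cache_size');

-- Confirm the pg_stat_statements extension is installed in this database
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_stat_statements';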

Generating a Baseline

We used the standard PostgreSQL pgbench utility for a quick test, initially populating the database so that the pgbench_accounts table had 50 million rows.

Commands (pgbench and psql) are run from a Google Compute Engine (GCE) virtual machine in the same region as the AlloyDB cluster. Since AlloyDB is fully PostgreSQL-compatible, utility connections (including pgbench and psql) work exactly as they would against any other PostgreSQL cluster. (The initial AlloyDB deployment and GCP network configuration steps are outside the scope of this article.)

$ pgbench pgbench_test --initialize --init-steps=dtgv --fillfactor=90 --scale=500
dropping old tables...
creating tables...
generating data (client-side)...
50000000 of 50000000 tuples (100%) done (elapsed 46.37 s, remaining 0.00 s)
vacuuming...
done in 49.87 s (drop tables 1.21 s, create tables 0.01 s, client-side generate 48.25 s, vacuum 0.41 s).
$

Since this initial test is narrow in scope and focused on sequential scans of the entire table, there was no need to add the corresponding primary and foreign keys.
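
As a quick sanity check after the load, the row count and on-disk size can be confirmed with standard catalog functions (a minimal sketch; the size comment is an estimate based on the ~909k 8kB buffers reported in the plans below):

-- Confirm the generated data volume before testing
SELECT count(*) AS row_count FROM pgbench_accounts;                                -- expect 50,000,000
SELECT pg_size_pretty(pg_total_relation_size('pgbench_accounts')) AS total_size;  -- roughly 7GB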

We prepared three simple queries designed specifically to perform seq scans of the entire table:

Query 1: SELECT * FROM pgbench_accounts WHERE bid = 500 AND abalance > 100;
Query 2: SELECT * FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
Query 3: SELECT SUM(abalance) FROM pgbench_accounts WHERE bid = 500 OR bid = 100;

Running these very simple (and similar) queries against the pgbench_accounts table shows:

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT * FROM pgbench_accounts WHERE bid = 500 AND abalance > 100;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1222591.12 rows=1 width=97) (actual time=1537.827..1541.348 rows=0 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=909091
   ->  Parallel Seq Scan on public.pgbench_accounts  (cost=0.00..1221591.02 rows=1 width=97) (actual time=1531.381..1531.382 rows=0 loops=3)
         Output: aid, bid, abalance, filler
         Filter: ((pgbench_accounts.abalance > 100) AND (pgbench_accounts.bid = 500))
         Rows Removed by Filter: 16666667
         Buffers: shared hit=909091
         Worker 0:  actual time=1528.139..1528.140 rows=0 loops=1
           Buffers: shared hit=302957
         Worker 1:  actual time=1528.617..1528.617 rows=0 loops=1
           Buffers: shared hit=305201
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: -2789621609201040747
 Planning Time: 0.075 ms
 Execution Time: 1541.372 ms
(18 rows)

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT * FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
                                                                     QUERY PLAN                                                                     
----------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1241743.42 rows=191524 width=97) (actual time=332.758..1640.550 rows=200000 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=909091
   ->  Parallel Seq Scan on public.pgbench_accounts  (cost=0.00..1221591.02 rows=79802 width=97) (actual time=325.744..1619.348 rows=66667 loops=3)
         Output: aid, bid, abalance, filler
         Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
         Rows Removed by Filter: 16600000
         Buffers: shared hit=909091
         Worker 0:  actual time=322.531..1620.121 rows=89935 loops=1
           Buffers: shared hit=302604
         Worker 1:  actual time=322.158..1621.575 rows=81025 loops=1
           Buffers: shared hit=305687
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: 3730570278250096652
 Planning Time: 0.073 ms
 Execution Time: 1648.524 ms
(18 rows)

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT SUM(abalance) FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
                                                                          QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1222790.74..1222790.75 rows=1 width=8) (actual time=1624.157..1627.709 rows=1 loops=1)
   Output: sum(abalance)
   Buffers: shared hit=909091
   ->  Gather  (cost=1222790.53..1222790.74 rows=2 width=8) (actual time=1623.927..1627.700 rows=3 loops=1)
         Output: (PARTIAL sum(abalance))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=909091
         ->  Partial Aggregate  (cost=1221790.53..1221790.54 rows=1 width=8) (actual time=1617.558..1617.559 rows=1 loops=3)
               Output: PARTIAL sum(abalance)
               Buffers: shared hit=909091
               Worker 0:  actual time=1614.305..1614.307 rows=1 loops=1
                 Buffers: shared hit=303993
               Worker 1:  actual time=1614.672..1614.674 rows=1 loops=1
                 Buffers: shared hit=302784
               ->  Parallel Seq Scan on public.pgbench_accounts  (cost=0.00..1221591.02 rows=79802 width=4) (actual time=320.520..1613.159 rows=66667 loops=3)
                     Output: aid, bid, abalance, filler
                     Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
                     Rows Removed by Filter: 16600000
                     Buffers: shared hit=909091
                     Worker 0:  actual time=318.019..1609.445 rows=73930 loops=1
                       Buffers: shared hit=303993
                     Worker 1:  actual time=317.168..1610.406 rows=64800 loops=1
                       Buffers: shared hit=302784
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: -1803082051895113688
 Planning Time: 0.087 ms
 Execution Time: 1627.749 ms
(28 rows)

pgbench_test=>

The average execution time of each of those is about 1600ms or 1.6 seconds.
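
Since pg_stat_statements is enabled, the same baseline can also be pulled from that view instead of from individual EXPLAIN runs (a sketch using the standard PostgreSQL 13+ column names; the filter pattern is just an assumption for this test):

-- Average execution time of the test queries captured so far
SELECT left(query, 60)                   AS query,
       calls,
       round(mean_exec_time::numeric, 1) AS mean_ms
  FROM pg_stat_statements
 WHERE query ILIKE '%pgbench_accounts%'
 ORDER BY mean_exec_time DESC
 LIMIT 5;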

Enabling and Re-testing with the Columnar Engine

Enabling the AlloyDB Columnar Engine (CE) and configuring it so our queries can take advantage of it is a multi-step process.

First, we need to adjust the instance so that the new google_columnar_engine.enabled flag is set to ON. This can be done either through the GCP Web UI or through a gcloud command such as:

gcloud beta alloydb instances update <alloydb_instance_name> \
    --database-flags work_mem=65536,google_columnar_engine.enabled='on' \
    --region=<region_name> \
    --cluster=<alloydb_cluster_name> \
    --project=${DEVSHELL_PROJECT_ID}

IMPORTANT: At this point, our test table is small enough that the entire relation fits in the CE column store, which is sized at the default of 1024MB. As the applicable dataset grows, we’ll also need to set the google_columnar_engine.memory_size_in_mb flag to a larger value.

After changing the flag/parameter (which requires an instance restart), we also need to enable the CE extension (in each applicable database):

pgbench_test=> CREATE EXTENSION google_columnar_engine;
CREATE EXTENSION
pgbench_test=>
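
Before moving on, it’s worth confirming that the flag change survived the restart. Assuming these flags are exposed like regular server parameters, SHOW can confirm them:

-- Confirm the Columnar Engine flags after the restart
SHOW google_columnar_engine.enabled;            -- expect: on
SHOW google_columnar_engine.memory_size_in_mb;  -- expect: 1024 (the default mentioned above)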

Finally, we need to load the table into the column store. For now, we’ll simply load the entire table since it’s small. With larger tables, we’ll also use the recommendation engine to load only the columns that are useful for our queries.

Loading using the new CE function takes a bit of time:

pgbench_test=> \timing on
Timing is on.
pgbench_test=> SELECT google_columnar_engine_add('pgbench_accounts');
 google_columnar_engine_add 
----------------------------
                        624
(1 row)

Time: 9235.726 ms (00:09.236)
pgbench_test=>
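
Before re-running the queries, we can confirm that the table is actually available to the CE by checking the g_columnar_relations view (covered in more detail later in this post); a minimal sketch:

-- Check that pgbench_accounts has been populated into the column store
SELECT relation_name, status, columnar_unit_count
  FROM g_columnar_relations
 WHERE relation_name = 'pgbench_accounts';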

And now we can re-run the EXPLAIN tests and our simple queries:

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT * FROM pgbench_accounts WHERE bid = 500 AND abalance > 100;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=20.00..29.78 rows=2 width=97) (actual time=0.505..0.506 rows=0 loops=1)
   ->  Custom Scan (columnar scan) on public.pgbench_accounts  (cost=20.00..25.72 rows=1 width=97) (actual time=0.504..0.504 rows=0 loops=1)
         Output: aid, bid, abalance, filler
         Filter: ((pgbench_accounts.abalance > 100) AND (pgbench_accounts.bid = 500))
         Rows Removed by Columnar Filter: 50000000
         CU quals: ((pgbench_accounts.abalance > 100) AND (pgbench_accounts.bid = 500))
         Columnar cache search mode: native
   ->  Seq Scan on public.pgbench_accounts  (cost=0.00..4.06 rows=1 width=97) (never executed)
         Output: aid, bid, abalance, filler
         Filter: ((pgbench_accounts.abalance > 100) AND (pgbench_accounts.bid = 500))
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: -2789621609201040747
 Planning Time: 0.580 ms
 Execution Time: 0.550 ms
(14 rows)

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT * FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..7354.38 rows=191524 width=97) (actual time=0.537..54.927 rows=200000 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=0.00..4423.46 rows=79803 width=97) (actual time=0.265..18.123 rows=66667 loops=3)
         Worker 0:  actual time=0.012..0.012 rows=0 loops=1
         Worker 1:  actual time=0.532..27.130 rows=100000 loops=1
         ->  Parallel Custom Scan (columnar scan) on public.pgbench_accounts  (cost=20.00..4419.43 rows=79802 width=97) (actual time=0.264..13.714 rows=66667 loops=3)
               Output: aid, bid, abalance, filler
               Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
               Rows Removed by Columnar Filter: 16600000
               CU quals: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
               Columnar cache search mode: native
               Worker 0:  actual time=0.010..0.010 rows=0 loops=1
               Worker 1:  actual time=0.531..20.585 rows=100000 loops=1
         ->  Parallel Seq Scan on public.pgbench_accounts  (cost=0.00..4.03 rows=1 width=97) (never executed)
               Output: aid, bid, abalance, filler
               Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: 3730570278250096652
 Planning Time: 0.437 ms
 Execution Time: 63.027 ms
(22 rows)

pgbench_test=> EXPLAIN (ANALYZE,COSTS,SETTINGS,BUFFERS,TIMING,SUMMARY,WAL,VERBOSE)
pgbench_test-> SELECT SUM(abalance) FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
                                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=2622.64..2622.65 rows=1 width=8) (actual time=10.143..13.340 rows=1 loops=1)
   Output: sum(abalance)
   ->  Gather  (cost=2622.42..2622.63 rows=2 width=8) (actual time=2.839..13.330 rows=3 loops=1)
         Output: (PARTIAL sum(abalance))
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1622.42..1622.43 rows=1 width=8) (actual time=0.875..0.876 rows=1 loops=3)
               Output: PARTIAL sum(abalance)
               Worker 0:  actual time=0.014..0.015 rows=1 loops=1
               Worker 1:  actual time=0.015..0.016 rows=1 loops=1
               ->  Parallel Append  (cost=0.00..1422.91 rows=79803 width=4) (actual time=0.062..0.872 rows=0 loops=3)
                     Worker 0:  actual time=0.012..0.013 rows=0 loops=1
                     Worker 1:  actual time=0.013..0.013 rows=0 loops=1
                      ->  Parallel Custom Scan (columnar scan) on public.pgbench_accounts  (cost=20.00..1418.89 rows=79802 width=4) (actual time=0.061..0.869 rows=66667 loops=3)
                           Output: abalance
                           Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
                           Rows Removed by Columnar Filter: 16600000
                           Rows Aggregated by Columnar Scan: 66667
                           CU quals: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
                           Columnar cache search mode: native
                           Worker 0:  actual time=0.011..0.011 rows=0 loops=1
                           Worker 1:  actual time=0.011..0.011 rows=0 loops=1
                     ->  Parallel Seq Scan on public.pgbench_accounts  (cost=0.00..4.03 rows=1 width=4) (never executed)
                           Output: abalance
                           Filter: ((pgbench_accounts.bid = 500) OR (pgbench_accounts.bid = 100))
 Settings: effective_cache_size = '52768720kB', work_mem = '64MB'
 Query Identifier: -1803082051895113688
 Planning Time: 0.461 ms
 Execution Time: 13.400 ms
(29 rows)

pgbench_test=>

From the above, we can see that query 1 went from 1541.372 ms to just 0.550 ms (roughly 2,800 times faster), although it performed no aggregation and returned no rows. Query 2, which returned 200,000 rows, went from 1648.524 ms to 63.027 ms; still a huge jump with the CE: 1648.524/63.027 ≈ 26.2 times faster. And query 3, which added an aggregation, went from 1627.749 ms to just 13.400 ms, about 121 times faster.

In the plans, we can also see “Rows Removed by Columnar Filter“. The more selective the query, the more useful the CE filtering.

Testing with Larger Tables and Using the Recommendation Engine

As we start to test with larger amounts of data, we also need to adjust the amount of memory reserved for the Columnar Engine.

Failing to do so will result in an error when we try to load the table:

pgbench_test=> SELECT google_columnar_engine_add('pgbench_accounts');
INFO:  Only 343 columnar units have been populated.
WARNING:  FixedArray::Fill: Out of memory
 google_columnar_engine_add 
----------------------------
                          0
(1 row)

pgbench_test=>

The solution to this is to turn on the recommendation engine, run some “training queries,” adjust the memory setting, and then re-test. Reminder: adjusting the memory flag means an instance restart.

To turn on the recommendation engine, we must ensure that the parameter google_columnar_engine.enable_columnar_recommendation is set to 'on', which is the default.

With this parameter set (and after running the three queries), the CE recommendation engine can learn which columns should be added and how large the column store needs to be.
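
In practice that means checking the flag and then simply running the three test queries again so the engine has something to analyze (the SHOW check assumes the flag is exposed like a regular parameter):

-- Confirm the recommendation flag, then run the training queries
SHOW google_columnar_engine.enable_columnar_recommendation;  -- expect: on
SELECT * FROM pgbench_accounts WHERE bid = 500 AND abalance > 100;
SELECT * FROM pgbench_accounts WHERE bid = 500 OR bid = 100;
SELECT SUM(abalance) FROM pgbench_accounts WHERE bid = 500 OR bid = 100;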

We first check how much memory the column store requires by running:

pgbench_test=> SELECT google_columnar_engine_run_recommendation(65536,'PERFORMANCE_OPTIMAL');
                   google_columnar_engine_run_recommendation                    
--------------------------------------------------------------------------------
 (1099,"pgbench_test.public.pgbench_accounts(abalance:1,aid:1,bid:1,filler:1)")
(1 row)

pgbench_test=>

This reports the optimal column store size for the tracked queries, with the first argument (65536MB, or 64GB) being the maximum we’re willing to provide; in our case, that high water mark is 50% of the machine’s 128GB. The output shows that we need a column store of 1099MB, only slightly larger than the 1024MB default. For added safety, a good practice is to add 10% to 20% to this value (roughly 1,210MB to 1,320MB here) to account for estimate inaccuracies or data-volume-driven changes.

To ensure a column store of at least 1099MB, we need to set the google_columnar_engine.memory_size_in_mb flag. Since we have a large machine with 128GB of memory and want to test with larger tables, we bumped the column store up to a generous 8GB.
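
After the restart, the new value can be confirmed from psql in the same way as before (again assuming the flag is exposed as a regular parameter):

-- Confirm the larger column store allocation after the restart
SHOW google_columnar_engine.memory_size_in_mb;  -- expect: 8192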

Now that more memory is allocated to the column store, we redo the training queries and then use a similar command to adopt the recommendation and add the suggested columns:

pgbench_test=> SELECT google_columnar_engine_run_recommendation(0,'FIXED_SIZE',TRUE);
                    google_columnar_engine_run_recommendation                    
---------------------------------------------------------------------------------
 (8192,"pgbench_test.public.pgbench_accounts(abalance:1,aid:1,bid:1,filler:1)")
(1 row)

pgbench_test=>

This time, the first argument of 0 tells the engine that there’s no limit on memory utilization: the recommendation can use as much of the column store as needed for the currently recommended columns. The final argument of TRUE means to actually add the columns.

Note that anything we populate manually using these functions won’t persist through an instance restart. To have the columns added automatically, we can instead set the google_columnar_engine.relations flag using the output of the recommendation function. (There is also an auto-recommendation and auto-population option that runs both in the background on a schedule.)

Finally, we can verify what tables and columns are in the column store using some new views:

pgbench_test=> SELECT relation_name, status, columnar_unit_count, block_count_in_cc, total_block_count
pgbench_test-> FROM g_columnar_relations;

  relation_name   | status | columnar_unit_count | block_count_in_cc | total_block_count 
------------------+--------+---------------------+-------------------+-------------------
 pgbench_accounts | Usable |                 444 |           1818182 |           1818182
(1 row)

pgbench_test=> SELECT relation_name, column_name, status FROM g_columnar_columns;

  relation_name   | column_name | status 
------------------+-------------+--------
 pgbench_accounts | aid         | Usable
 pgbench_accounts | bid         | Usable
 pgbench_accounts | abalance    | Usable
 pgbench_accounts | filler      | Usable
(4 rows)

pgbench_test=>

And now, we’re ready to test and compare with the larger table.

Testing Results with Larger Tables

We increased the table size and re-tested using the procedure described above. The results were:

 pgbench_accounts rows | Query 1 Ratio | Query 2 Ratio | Query 3 Ratio
-----------------------+---------------+---------------+---------------
            50,000,000 |      2802.49x |        26.16x |       121.47x
           100,000,000 |      3412.97x |       304.45x |       250.50x
           500,000,000 |     12382.44x |       786.17x |      3499.08x

These results look impressive, showing a great performance benefit. But keep in mind that our tests use a single table with “Seq Scan” operations, scanning the entire table without indexes. Your mileage may vary, so don’t take these results as the typical outcome of this feature: they’re the product of very specific queries for which the CE provides lookup capabilities similar to indexes.

And creating even larger tables to scale up the testing becomes pointless, as the table size would quickly exceed the available machine memory and force more IOs.

Conclusions

After this initial testing, with some admittedly narrowly scoped queries designed specifically to illustrate the potential benefits of the Columnar Engine in the new GCP AlloyDB service, we saw significant performance improvements: at least 26x faster, and potentially much more.

Keep in mind that:

  • A columnar data store is optimal for the right type of query–specifically analytical queries. In OLTP and mixed query workloads, the results may be very different.
  • Even with analytical queries, the right columns must be properly cached in the column store, and the queries must be structured accordingly.
  • Columnar store memory caching and memory hits vs. IOs (truly physical or serviced from the OS file cache) are also crucial.

It’s important to note that some up-front work is required: you’ll need to properly size the column store, train the recommendation engine, and ensure that the columns are actually loaded, or rely on the automatic recommendation and adoption options (likely the best approach). But with some planning, and for the right type of query, the Columnar Engine has the potential to deliver significant performance improvements.

The tests described above are just a starting point. Next steps would include further testing with a more application-specific or industry-standard benchmarking tool, with consideration of caching and warm-up runs (to maximize memory hits and minimize IOs).

I hope you’ve found this post helpful. Feel free to share your questions in the comments and subscribe to catch the next post.