
Benchmarking Postgres on AWS 4,000 PIOPs EBS volumes

Introduction

Disk I/O is frequently the performance bottleneck for relational databases. With AWS recently releasing 4,000 PIOPs EBS volumes, I wanted to do some benchmarking with pgbench and PostgreSQL 9.2. Prior to this release, the maximum available I/O capacity was 2,000 IOPs per volume. EBS I/O is read and written in 16KB chunks, with performance limited by both the provisioned capacity of the EBS volumes and the network bandwidth between an EC2 instance and the EBS network. My goal isn't to provide a PostgreSQL tuning guide, an EC2 tuning guide, or a database deathmatch complete with graphs; I'll simply show what kind of performance is available out of the box without substantive tuning. In other words, this is an exploratory benchmark, not a comparative one. I would have liked to compare the performance of 4,000 PIOPs volumes with 2,000 PIOPs volumes, but I ran out of time, so that will have to wait for a future post.

Setup

Region

I conducted my testing in AWS' São Paulo region. One benefit of testing in sa-east-1 is that spot prices for larger instances are (anecdotally) more stable than in us-east. Unfortunately, sa-east-1 doesn't have any cluster compute (CC) instances available. CC instances have twice the bandwidth to the EBS network of non-CC EC2 instances, and that additional bandwidth allows you to construct larger software RAID volumes. My cocktail-napkin calculations suggest that it should be possible to reach 50,000 PIOPs on an EBS-backed CC instance without much of a problem.
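
For reference, here's the napkin math behind that estimate: EBS meters I/O in 16KB chunks, so the throughput needed for 50,000 IOPs works out in a couple of lines of shell. (The ~10Gb/s figure for a CC instance's network link is my assumption for illustration, not a number taken from AWS documentation.)

echo $(( 50000 * 16 / 1024 ))      # ~781 MB/s of aggregate throughput
echo $(( 50000 * 16 * 8 / 1024 ))  # ~6250 Mb/s, comfortably inside a ~10Gb/s link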

EC2 instances

I tested with three EC2 instances: an m1.large from which to run pgbench, an m2.2xlarge with four EBS volumes, and an m1.xlarge with one EBS volume. All EBS volumes were 400GB with 4,000 provisioned IOPs. The m1.large instance was an on-demand instance; the other instances (the pgbench target database servers) were spot instances with a maximum bid of $0.05. In one case our first spot instance was terminated, and we had to rebuild it. Some brief testing showed that having an external machine drive the benchmark was critical to getting the best results.

Operating System

All EC2 instances are running Ubuntu 12.10. A custom sysctl.conf tuned Sys V shared memory and set swappiness to zero and memory overcommit to two.
kernel.shmmax = 13355443200
kernel.shmall = 13355443200
vm.swappiness = 0
vm.overcommit_memory = 2
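
If you want these values live without a reboot, they can be applied and verified with ordinary sysctl invocations:

sysctl -p /etc/sysctl.conf
sysctl kernel.shmmax kernel.shmall vm.swappiness vm.overcommit_memory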
 
Packages

The following packages were installed via apt-get:
  • htop
  • xfsprogs
  • debian-keyring
  • mdadm
  • postgresql-9.2
  • postgresql-contrib-9.2
In order to install the PostgreSQL packages, a pgdb.list file containing

deb https://apt.postgresql.org/pub/repos/apt/ squeeze-pgdg main

was placed in /etc/apt/sources.list.d, and the following commands were run:
gpg --keyserver pgp.mit.edu --recv-keys ACCC4CF8
gpg --armor --export ACCC4CF8 | apt-key add -
apt-get update
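
With the repository keyed and updated, installing the packages listed above is a single step:

apt-get install htop xfsprogs debian-keyring mdadm postgresql-9.2 postgresql-contrib-9.2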

RAID and Filesystems

For the one-volume instance, I simply created an XFS file system and mounted it on /mnt/benchmark.
mkdir /mnt/benchmark
mkfs.xfs /dev/xvdf
mount -t xfs /dev/xvdf /mnt/benchmark
echo "/dev/xvdf /mnt/benchmark xfs defaults 1 2" >> /etc/fstab
 
For the four-volume instance, it was only slightly more involved. mkfs.xfs analyzes the underlying disk objects and determines appropriate values for stripe unit and stripe width. Below are the commands for assembling a four-volume mdadm software RAID 0 array that is mounted at boot (assuming you've attached the EBS volumes to your EC2 instance). Running dpkg-reconfigure mdadm rebuilds the initrd image so the array is assembled during boot.
mkdir /mnt/benchmark
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
mkfs.xfs /dev/md0
echo "/dev/md0 /mnt/benchmark xfs defaults 1 2" >> /etc/fstab
dpkg-reconfigure mdadm
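
Before trusting the array, it's worth confirming what mdadm actually assembled:

cat /proc/mdstat          # md0 should show as an active raid0 across four devices
mdadm --detail /dev/md0   # reports chunk size, state, and member volumes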
 

Benchmarking

pgbench is a utility included in the postgresql-contrib-9.2 package. It approximates the TPC-B benchmark and can be thought of as a database stress test whose output is measured in transactions per second. It involves a significant amount of disk I/O with transactions that run for relatively short amounts of time. vacuumdb was run before each pgbench iteration. For each database server, pgbench was run mimicking 16, 32, 48, 64, 80, and 96 clients. At each of those client counts, pgbench iterated ten times, stepping by 100 from 100 to 1,000 transactions per client. It's important to realize that pgbench's stress test is not typical of a web application workload; most consumer-facing web applications could achieve much higher rates than those mentioned here. The only pgbench results against AWS EBS volumes that I'm aware of (or could quickly find by googling) are from early 2012 and, at their best, achieve rates 50% lower than the lowest rates found here. I drove the benchmark using a very small, very unfancy bash script. An example pgbench command line would be:
pgbench -h $DBHOST -j4 -r -M extended -n -c48 -t600 -U $DBUSER
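
A minimal sketch of what such a driver script could look like follows. It is a reconstruction, not the original: $DBHOST, $DBUSER, and $DBNAME are placeholders, and the pgbench tables are assumed to have been initialized beforehand with pgbench -i (the scale factor isn't stated in this post).

#!/bin/bash
# Hypothetical reconstruction of the benchmark driver, not the original script.
DBHOST=db.example.com   # placeholder
DBUSER=postgres         # placeholder
DBNAME=pgbench          # placeholder

for clients in 16 32 48 64 80 96; do
    for txns in $(seq 100 100 1000); do
        # vacuumdb was run before each pgbench iteration
        vacuumdb -h $DBHOST -U $DBUSER $DBNAME
        pgbench -h $DBHOST -j4 -r -M extended -n -c$clients -t$txns -U $DBUSER $DBNAME
    done
done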
 

m1.xlarge with single 4,000 PIOPs volume

This instance hit its maximum transaction volume when running below 48 concurrent clients and under 500 transactions per client. While throughput never dropped precipitously, loads outside of that range exhibited varying performance. Even at its worst, though, this instance handled between 600 and 700 transactions per second.

m2.2xlarge with four 4,000 PIOPs volumes

I was impressed; at no point did the benchmark stress this instance. The tps rate was between 1,700 and 1,900 in most situations, with peaks up to 2,200 transactions per second. If I were asked to blindly size a "big" PostgreSQL database server running on AWS, this is probably where I would start. It's not so large that you run into operational issues like worrying about MTBFs for ten-volume RAID arrays or trying to snapshot 4TB of disk space, but it is large enough to absorb a substantial amount of traffic.

Graphs and Tabular Data

single-4K-volume tps

The spread of transactions/second irrespective of number of clients.

Data grouped by number of concurrent clients, with each bar representing an increase of 100 transactions per client, ranging from 100 to 1,000.

Progression of tps by individual concurrency level. The x-axis tick marks represent single pgbench runs, from 100 transactions per client up to 1,000.

 

Raw tabular data: m1.xlarge with a single 4,000 PIOPs EBS volume (values are transactions/second)

clients \ txns per client    100   200   300   400   500   600   700   800   900  1000
16                          1455  1283  1183   653  1197   533   631  1009   923   648
32                          1500  1242  1232   757   747   630  1067   665   688   709
48                           281   864   899   705  1029   749   736   593   766   641
64                           944  1281   704  1010   739   596   778   662   820   612
80                           815   893  1055   809   597   801   684   708   736   663
96                           939   889   774   772   798   682   725   662   776   708

four-4,000-PIOPs-volumes tps

Again, a box plot of the data with a y-axis of transactions/second.

 

Grouped by number of concurrent clients, from 100 to 1,000 transactions per client.

 

TPS by number of concurrent clients. The x-axis ticks mark pgbench runs progressing from 100 transactions per client to 1,000 transactions per client.

 

Tabular data: m2.2xlarge with four 4,000 PIOPs EBS volumes (values are transactions/second)

clients \ txns per client    100   200   300   400   500   600   700   800   900  1000
16                          1487  1617  1877  1415  1388  1882  1897  1771  1267  1785
32                          1804  2083  2160  1791  1259  1997  2230  1501  1717  1918
48                          1810  2152  1296  1951  2117  1775  1709  1803  1817  1847
64                          1810  1580  1568  2056  1811  1784  1849  1909  1942  1658
80                          1802  2044  1467  2142  1645  1896  1933  1740  1821  1851
96                          1595  1403  2047  1731  1783  1859  1708  1896  1751  1801
 
