Pythian Blog: Technical Track

CPU utilization is not a useful metric

Once upon a time, CPU utilization was quite a useful metric. The following is the output of several tools that report CPU utilization and load:

top

top reports a load of 1.66. Is this correct? No. The correct load number is probably closer to 2.4.
 # top -b -n 1| head -20
 top - 11:27:45 up 151 days, 1:55, 7 users, load average: 1.66, 1.84, 1.88
 Tasks: 389 total, 3 running, 386 sleeping, 0 stopped, 0 zombie
 Cpu(s): 0.7%us, 20.6%sy, 1.2%ni, 77.3%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
 Mem: 32639636k total, 32206476k used, 433160k free, 235732k buffers
 Swap: 16359420k total, 10285664k used, 6073756k free, 2354840k cached
 
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 16702 root 20 0 8274m 5.0g 5.0g S 85.1 16.1 59164:55 VirtualBox
  4657 root 20 0 9.8g 5.2g 5.1g S 45.5 16.6 26518:13 VirtualBox
  6239 root 20 0 9.8g 5.1g 5.1g S 39.6 16.5 31200:52 VirtualBox
 27070 root 20 0 7954m 5.4g 5.4g S 17.8 17.5 17049:30 VirtualBox
 27693 root 20 0 2233m 441m 20m S 5.9 1.4 3407:34 firefox
  7648 root 20 0 6758m 4.1g 4.1g S 4.0 13.2 17069:52 VirtualBox
  6633 root 20 0 368m 63m 31m R 2.0 0.2 1338:58 Xorg
 14727 root 20 0 15216 1344 828 R 2.0 0.0 0:00.01 top
  1 root 20 0 19416 932 720 S 0.0 0.0 0:00.90 init
  2 root 20 0 0 0 0 S 0.0 0.0 0:03.53 kthreadd
  3 root 20 0 0 0 0 S 0.0 0.0 2:08.23 ksoftirqd/0
  5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
  7 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/u:0H
 

sar

sar does not show the load average, but it does report what it thinks is CPU utilization. Is it correct? Again, no. Actual idle should be closer to 45-50%.
 # sar 1 1
 Linux 3.8.13-16.2.1.el6uek.x86_64 (myserver.jks.com) 01/22/2018 _x86_64_ (8 CPU)
 
 11:29:32 AM CPU %user %nice %system %iowait %steal %idle
 11:29:33 AM all 0.88 1.00 17.27 0.00 0.00 80.85
 Average: all 0.88 1.00 17.27 0.00 0.00 80.85
 

mpstat

mpstat reports utilization per CPU. Again, these values are not quite correct.
 # mpstat -P ALL
 Linux 3.8.13-16.2.1.el6uek.x86_64 (myserver.jks.com) 01/22/2018 _x86_64_ (8 CPU)
 
 11:35:49 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
 11:35:49 AM all 0.74 1.19 20.58 0.11 0.00 0.06 0.00 0.00 77.32
 11:35:49 AM 0 1.11 1.18 20.24 0.58 0.00 0.48 0.00 0.00 76.42
 11:35:49 AM 1 0.88 1.32 22.45 0.08 0.00 0.02 0.00 0.00 75.25
 11:35:49 AM 2 0.84 1.34 22.78 0.06 0.00 0.01 0.00 0.00 74.98
 11:35:49 AM 3 0.81 1.31 21.69 0.05 0.00 0.00 0.00 0.00 76.15
 11:35:49 AM 4 0.64 1.00 16.76 0.05 0.00 0.00 0.00 0.00 81.54
 11:35:49 AM 5 0.57 1.11 19.28 0.02 0.00 0.00 0.00 0.00 79.02
 11:35:49 AM 6 0.57 1.10 19.46 0.02 0.00 0.00 0.00 0.00 78.85
 
 
uptime

Finally, the venerable uptime command:

 # uptime
  11:29:48 up 151 days, 1:57, 7 users, load average: 1.70, 1.81, 1.87
 
 
Notice that mpstat and sar both report 8 CPUs, and that is the crux of the problem. This machine does not have 8 CPUs; it has only 4 physical cores. The CPU is an Intel i7-4790S with hyperthreading enabled. When hyperthreading is enabled, Linux utilities see twice as many CPUs as are physically present. In this case top, sar, mpstat and uptime all behave as if there are 8 CPUs, when in reality there are only 4.
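
To see the logical CPU count that these utilities work from, the OS can be asked directly. The following is a minimal sketch only: the function name logical_cpus is hypothetical, and it assumes that either getconf or nproc is installed.

```shell
#!/bin/sh
# Print the number of logical CPUs currently online, as seen by the OS.
# With hyperthreading enabled this counts hardware threads, not physical cores.
logical_cpus() {
    if command -v getconf >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        # Fallback: nproc from GNU coreutils (respects CPU affinity).
        nproc
    fi
}

echo "logical CPUs reported by the OS: $(logical_cpus)"
```

On the example machine in this post, the expected figure would be 8, matching the "(8 CPU)" shown in the sar and mpstat headers.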

What is Hyperthreading?

"But wait; doesn't hyperthreading double the processing power of my CPU?" you may ask. Well, no, it doesn't. Hyperthreading is a clever bit of technology from Intel that allows the operating system to better take advantage of a CPU during what would otherwise be idle time. Please refer to the references list if you would like more detail. There are many sources that estimate the performance advantage of enabling hyperthreading vs. not enabling it. The following is a summary of the rules of thumb for the expected performance benefit when hyperthreading is enabled:

 Socket Count   Max Benefit %
 1              30%
 2              15%
 3+             testing required

When the previously noted utilities report 8 CPUs, that is not quite correct, as enabling hyperthreading does not double the number of CPUs. Given the example i7 processor, the best we can hope for is that this single-socket, 4-core CPU will provide the equivalent work of approximately 5.6 cores:

 8 * ( ( 100 - 30 ) / 100 ) = 5.6

The metric adjustment is then:

 estimated CPUs / reported CPUs = metric adjustment
 5.6 / 8 = 0.7

When CPU utilization is reported as 80% idle, the real value is more like 56%:

 80 * 0.70 = 56

Load averages can be treated the same way. The load of 1.66 is actually ~2.4:

 1.66 / 0.7 = 2.37
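
The arithmetic above can be scripted. This is a rough sketch that mirrors the post's rule-of-thumb calculation and is not a precise measurement. The function names are hypothetical, and the 30% figure is the single-socket value from the table.

```shell
#!/bin/sh
# Sketch of the adjustment arithmetic from this post, using awk for the math.

# adjustment_factor REPORTED_CPUS MAX_BENEFIT_PCT
# estimated CPUs / reported CPUs, e.g. (8 * 0.70) / 8 = 0.70
adjustment_factor() {
    awk -v n="$1" -v pct="$2" 'BEGIN {
        estimated = n * (100 - pct) / 100   # estimated equivalent CPUs
        printf "%.2f\n", estimated / n      # n cancels; kept to mirror the post
    }'
}

# adjusted_idle REPORTED_IDLE_PCT FACTOR
adjusted_idle() {
    awk -v idle="$1" -v f="$2" 'BEGIN { printf "%.1f\n", idle * f }'
}

# adjusted_load REPORTED_LOAD FACTOR
adjusted_load() {
    awk -v load="$1" -v f="$2" 'BEGIN { printf "%.2f\n", load / f }'
}

factor=$(adjustment_factor 8 30)
echo "adjustment factor: $factor"
echo "adjusted idle %:   $(adjusted_idle 80 "$factor")"
echo "adjusted load:     $(adjusted_load 1.66 "$factor")"
```

With the example values this prints a factor of 0.70, an adjusted idle of 56.0 and an adjusted load of 2.37, matching the figures worked out by hand.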

Is hyperthreading enabled?

By now you would probably like to know how to determine whether hyperthreading is enabled. There are a couple of things you need to know to investigate this. The following instructions are for Linux. Start by determining the CPU model. Here is one easy method to find it:
 # grep CPU /proc/cpuinfo
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
 
The next step is to point your browser at https://ark.intel.com/, and then search for the exact CPU model. Searching for i7-4790S shows there are 4 cores and 8 threads available, so this CPU is capable of hyperthreading. The next step is to determine if hyperthreads are enabled. Doing so is less straightforward than previous steps. The following process can be used to determine the actual number of physical cores, and then compare that to the number of cores presented to the OS.

number of physical cores

There are 4 cores in this case
 # grep 'core id' /proc/cpuinfo | sort -u
 core id : 0
 core id : 1
 core id : 2
 core id : 3
 

number of processors

8 are shown
 # grep 'processor' /proc/cpuinfo | sort -u
 processor : 0
 processor : 1
 processor : 2
 processor : 3
 processor : 4
 processor : 5
 processor : 6
 processor : 7
 
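The number of reported processors is double the number of physical cores, indicating that hyperthreading is enabled. This was tested on another server as well, one with 4 sockets of 10 cores each and hyperthreading known to be enabled. Because core ids repeat on each socket, the first command reports the 10 unique core ids of a single socket, for 4 * 10 = 40 physical cores in total. As 80 processors are reported for only 40 physical cores, it is clear that hyperthreading is enabled.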
 $ grep 'core id' /proc/cpuinfo | sort -u| wc -l
 10
 
 $ grep 'processor' /proc/cpuinfo | sort -u| wc -l
 80
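
The comparison above can be wrapped in a small script. The following is a sketch under stated assumptions, not a definitive test: the function names are hypothetical, and it assumes the x86-style "processor", "physical id" and "core id" fields are present in /proc/cpuinfo. Because core ids repeat across sockets, it counts unique (physical id, core id) pairs rather than unique core ids alone.

```shell
#!/bin/sh
# Sketch: compare physical cores with logical processors in a cpuinfo-style file.

# Count unique (physical id, core id) pairs: physical cores across all sockets.
count_physical_cores() {
    awk -F: '
        /^physical id/ { socket = $2 }
        /^core id/     { seen[socket "," $2] = 1 }
        END            { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}"
}

# Count "processor" entries: logical CPUs presented to the OS.
count_logical_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Report on the local machine when /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    cores=$(count_physical_cores)
    cpus=$(count_logical_cpus)
    if [ "$cores" -eq 0 ]; then
        echo "no core id fields found; cannot tell"
    elif [ "$cpus" -gt "$cores" ]; then
        echo "hyperthreading appears enabled: $cpus processors on $cores cores"
    else
        echo "hyperthreading appears disabled: $cpus processors on $cores cores"
    fi
fi
```

Both functions accept an optional file argument, so a saved copy of another server's /proc/cpuinfo can be checked as well.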
 
 

So, what now?

The time for using CPU utilization as a metric to drive performance improvements is now long past. CPU technology has advanced so much in the past several years that this metric now has limited usefulness. Load averages and CPU utilization may still be useful as barometers on systems where it is known that exceeding a certain threshold indicates there may be issues to look at. Other than that, though, these metrics have outlived their usefulness if the goal is to drive performance improvement through monitoring and mitigation of key metrics. For much more detailed information, please refer to the References section at the end of this blog.

References

Will Hyper-Threading Improve Processing Performance?
CPU Utilization is Wrong
Utilization is Virtually Useless as a Metric!
Linux Load Averages: Solving the Mystery
