I like the discipline of publishing my results on a blog. It forces me to double-check my results and look for discrepancies in the data.
The numbers I got in my recent Taurus test seemed off. Using a modified example configuration (concurrency 50, time 300s), Taurus was able to completely saturate the 4 cores of the VM it was running on. At the same time, it was not able to saturate the VM with the Nginx instance under test: Nginx’s VM was using, on average, 80% of its single core. Taurus generated 352 requests per second to the main page, which works out to 2812 requests per second once I count the static dependencies that were being downloaded, since Taurus was ignoring cache settings. Those numbers agree with the Nginx access.log entries.
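For reference, the config was a lightly modified version of the Taurus quick-start example. A minimal sketch of its shape (the hostname is a placeholder, not my actual setup):

```yaml
execution:
- concurrency: 50      # 50 virtual users
  hold-for: 300s       # hold the load for 5 minutes
  scenario: homepage

scenarios:
  homepage:
    requests:
    - https://nginx-under-test/   # placeholder for the Nginx VM’s address
```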
Test 1 - Taurus-generated test
- Total test time: 306s
- Average Taurus host’s CPU utilization (4 cores): 100%
- Average Nginx host’s CPU utilization (1 core): 80%
- Average network traffic: 20 MB/s
- Predicted network traffic: 15 MB/s
- RPS for the html: 352 rps
- Total RPS: 2812 rps
I read the documentation and found that Taurus defaults to JMeter to conduct tests. I installed JMeter locally and used its recording function (with a proxy) to create a simple test: opening the main page. I cleaned it up, saved it as a .jmx file, and used the same instance of Taurus to execute this test. Both tests request the same static resources in the same proportion: the main html file and 7 static files.
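Pointing Taurus at an existing .jmx file only takes a `script` entry in the scenario. A minimal sketch, with the file name assumed for illustration:

```yaml
execution:
- concurrency: 50
  hold-for: 300s
  scenario: recorded

scenarios:
  recorded:
    script: homepage.jmx   # the cleaned-up JMeter recording (name assumed)
```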
However, the second test was much more powerful: it easily saturated the Nginx host, generated an average of 160 MB/s of traffic, and used only 55% of the total CPU on the Taurus host. The result was 3351 rps for loading the html and 26809 rps in total, as calculated from the Nginx access.log entries. That is a much better result for my single-core Nginx instance.
Test 2 - JMeter-generated equivalent test running under Taurus
- Total test time: 373s
- Average Taurus host’s CPU utilization (4 cores): 55%
- Average Nginx host’s CPU utilization (1 core): 120%
- Average network traffic: 160 MB/s
- Predicted network traffic: 140 MB/s
- RPS for the html: 3351 rps
- Total RPS: 26809 rps
The 120% Nginx host saturation is the sum of 100% VM utilization and 20% Proxmox host overhead for handling IO; this is a known quirk of Proxmox reporting.
I repeated the same test as Test 2, but ran it directly from the JMeter GUI (not recommended, as the GUI adds overhead). This saturated the CPU on my PC. The results were:
Test 3 - JMeter-generated equivalent test running under JMeter
- Total test time: 477s
- Average PC’s CPU utilization (8 cores): 100%
- Average Nginx host’s CPU utilization (1 core): 62%
- Average network traffic: 70 MB/s
- Predicted network traffic: 89 MB/s
- RPS for the html: 1644 rps
- Total RPS: 13157 rps
A few quirks in the data:
- The proportions between predicted and real network traffic do not match across the tests. For Tests 1 and 3 it is close to 0.74-0.80, for Test 2 it is 0.89. Some difference was expected, as I simply calculate the size of the data to be sent in responses without taking the HTTP(S) overhead into account, but I expected the proportions to be consistent across tests. I’m not sure why they differ.
- Based on the Nginx access.log, the response for each static file was identical regardless of the source. However, the response for the html file was not: for each request generated by Test 1 (the Taurus example), the response was 7760 bytes long, while for requests generated by the JMeter-recorded test it was 3080 bytes long in Tests 2 and 3. This is strange, because my html file is exactly 7760 bytes. It partially explains the discrepancy in the predicted vs. real traffic proportion in Test 2: after adding this to my formula, the proportion in Test 2 changes to 0.78, but it also changes the proportion in Test 3 to 0.70.
- CPU utilization is not a good measurement of resources allocated, as it is influenced by many variables that are hard to control, like Turbo Boost. Even so, my results seem off. Looking at the CPU utilization of the Nginx host, I calculated a coefficient representing the percentage of CPU consumed per 1K requests/s (see the quick check after this list). For Tests 2 and 3 (both generated from JMeter, but executed via Taurus and the JMeter GUI respectively) I got a value close to 4.5, even though those tests were at opposite ends of the throughput spectrum. The same coefficient calculated for Test 1 (created and executed via Taurus) came out at over 28, even though Test 1’s utilization figures sat between those of Tests 2 and 3.
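A quick sanity check of both quirks, recomputed from nothing but the figures listed above. I take the smaller of predicted/real traffic over the larger, which is how the proportions are quoted in the text; small rounding differences are expected, since the summary values are themselves rounded:

```python
# Recompute the two quirky metrics from the test summaries above.
tests = {
    # name: (predicted MB/s, real MB/s, Nginx CPU %, total rps)
    "Test 1": (15, 20, 80, 2812),
    "Test 2": (140, 160, 120, 26809),
    "Test 3": (89, 70, 62, 13157),
}

for name, (predicted, real, cpu, rps) in tests.items():
    ratio = min(predicted, real) / max(predicted, real)  # predicted-vs-real proportion
    coeff = cpu / (rps / 1000)                           # CPU % per 1K requests/s
    print(f"{name}: proportion {ratio:.2f}, coefficient {coeff:.1f}")

# Output:
# Test 1: proportion 0.75, coefficient 28.4
# Test 2: proportion 0.88, coefficient 4.5
# Test 3: proportion 0.79, coefficient 4.7
```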
I’m not sure why Test 1, generated by Taurus, is so slow. First, I suspected Taurus overhead, but Test 2, for which I used a .jmx file and ran it under Taurus, proved me wrong, as it easily generated an order of magnitude more traffic. Then I assumed that Test 1, generated by Taurus, was always parsing the html to find all static dependencies to download. I suspected that the JMeter-generated Test 2 did not do that, as in the recorded JMeter test the static files were shown as separate ’transactions’. But careful analysis of the JMeter test proved this to be wrong too. The recording was incorrect and the static files were downloaded twice in each loop: first when the html was downloaded and its dependencies were detected, and then separately as the other transactions. So I had to fix it. It then became obvious that JMeter, by default, also looks for dependencies. Perhaps it processes dependencies in a different way, such as once at the beginning instead of for each request?
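The relevant knob lives on the HTTP sampler as the "Retrieve All Embedded Resources" checkbox, which maps to a boolean property in the .jmx. A trimmed, illustrative fragment (the property names are JMeter’s, everything else is pruned):

```xml
<HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy"
                  testname="open main page" enabled="true">
  <stringProp name="HTTPSampler.path">/</stringProp>
  <!-- "Retrieve All Embedded Resources": parse the html and download
       its static dependencies on every request -->
  <boolProp name="HTTPSampler.image_parser">true</boolProp>
  <!-- fetch embedded resources concurrently from a small thread pool -->
  <boolProp name="HTTPSampler.concurrentDwn">true</boolProp>
</HTTPSamplerProxy>
```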
What else?
Test 3, which was run from the JMeter GUI, was much slower due to GUI overhead.
Tests 1 and 2 were both run by Taurus, via the CLI.
Fortunately, Taurus always leaves a copy of the .jmx files it generates for each test defined in the .yml file, so the test can be replicated. When I run this .jmx file directly, the results are the same as in Test 1.
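Replaying such an artifact without Taurus is a single JMeter non-GUI command; the artifact file name below is assumed, as it depends on the Taurus version:

```sh
# -n: non-GUI mode, -t: test plan, -l: results log
jmeter -n -t artifacts/modified_requests.jmx -l results.jtl
```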
.jmx files are XML, and they are not easy to read; it takes a good understanding of JMeter to figure out what is what.
The CPU overhead on the Nginx host in Test 1 suggests that the test generated by Taurus is doing something special. All tests use https, Nginx always returns code 200, and the predicted traffic is not far enough from the real traffic to suspect that compression was enabled only for Test 1. I’m not sure what the problem is. One hypothesis is that 100% utilization on the Taurus host could affect TCP/IP handling and introduce overhead at the other end. The fact that both Test 1 and Test 3 suffer from client-side CPU saturation, and both show a lower proportion between predicted and real traffic, supports this explanation. Unfortunately, it does not explain the CPU coefficient. A second hypothesis is that the longer time Taurus needs to handle each request and receive data, due to its saturated CPU, could affect the Nginx host, introducing some sort of overhead.
Those results suggest that my Nginx-hosted static page can handle up to 3351 views per second with caching disabled, under certain conditions (Test 2). At the same time, it looks like some undisclosed variable can easily limit this value to under 400 (Test 1).
So what is Taurus doing in a test generated from the .yml config that causes more CPU to be consumed on both the Nginx instance and Taurus itself?