Saturday, December 26, 2015

Improving dom0 Responsiveness Under Load




Recently, the XenServer Engineering Team has been working on improving the responsiveness of the control domain when it is under heavy load. Many VMs doing lots of I/O operations can prevent one from connecting to the host through ssh or cause the XenCenter session to disconnect for no apparent reason. All of this happened when the datapath overloaded the control plane, leaving very little CPU for such important processes as sshd or xapi. Let's have a look at how much time it takes to repeatedly execute a simple 'xe vm-list' command on a host with 20 VM pairs doing heavy network communication:

b2ap3_thumbnail_01.png

Most of the commands took around 2-3 seconds to complete, but some of them took as long as 30 seconds. The 2-3 seconds is slower than it should be, and 20-30 seconds is way outside of a reasonable operating window. The slow reaction time of 3 seconds and the heavy spikes of 30 seconds visible on the graph above are two separate issues affecting the responsiveness of the control commands. Let's tackle them one by one.
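A measurement of this kind can be reproduced with a small timing loop. The following is only a minimal sketch, not the harness used for the graph above: the function names time_command and summarize are hypothetical, and the demo runs a trivial Python subprocess so that it works anywhere. On a XenServer host the command list would be replaced with ["xe", "vm-list"].

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd repeatedly and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the time to completion matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report the typical latency (median) and the worst spike (max)."""
    return {"median": statistics.median(durations), "max": max(durations)}


if __name__ == "__main__":
    # Placeholder command (assumption): swap in ["xe", "vm-list"] on a real host.
    demo_cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(demo_cmd)))
```

Reporting the median and the maximum separately mirrors the two issues: the median captures the general 2-3 second slowdown, while the maximum exposes the occasional 30 second spikes.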

To fix the 2-3 second slowdown, we took advantage of the Linux kernel feature called cgroups (control groups). Cgroups allow processes to be aggregated into separate hierarchies that manage their access to various resources (CPU, memory, network). In our case, we used CPU resource isolation, placing all control path daemons in a dedicated group of the cpu control group subsystem. Giving them ten times more CPU share than the datapath processes guarantees that they get enough computing power to execute control operations in a timely fashion. It's worth pointing out that this does not slow down the datapath when the control plane is idle. The datapath reduces its CPU usage only when control operations need to run.

b2ap3_thumbnail_02_20151222-162855_1.png

We can see that the majority of the commands took just a fraction of a second to execute, which solves the first of our problems.
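The effect of a 10:1 share ratio can be illustrated with a little arithmetic. The sketch below is an idealized model of proportional, work-conserving sharing, not the kernel scheduler itself and not XenServer's actual configuration. The group names "control" and "datapath", the function cpu_allocation, and the demand figures are assumptions made for illustration.

```python
def cpu_allocation(shares, demand, capacity=1.0):
    """Split CPU capacity among groups in proportion to their shares.

    shares: group name -> relative weight (in the spirit of cgroup cpu shares)
    demand: group name -> fraction of capacity the group wants to use
    A group never receives more than it demands; capacity it leaves unused
    is redistributed to the groups that still want more (work-conserving).
    """
    allocation = {name: 0.0 for name in shares}
    active = {name for name in shares if demand[name] > 0}
    remaining = capacity
    while active and remaining > 1e-12:
        total = sum(shares[name] for name in active)
        # Groups whose outstanding demand fits inside their fair share.
        satisfied = {
            name for name in active
            if demand[name] - allocation[name] <= remaining * shares[name] / total
        }
        if not satisfied:
            # Everyone wants more than their fair share: split by weight.
            for name in active:
                allocation[name] += remaining * shares[name] / total
            break
        for name in satisfied:
            remaining -= demand[name] - allocation[name]
            allocation[name] = demand[name]
        active -= satisfied
    return allocation


if __name__ == "__main__":
    shares = {"control": 10, "datapath": 1}  # hypothetical 10:1 weights
    # Control plane idle: the datapath may use the whole CPU.
    print(cpu_allocation(shares, {"control": 0.0, "datapath": 1.0}))
    # Both saturated: the control plane is entitled to 10/11 of the CPU.
    print(cpu_allocation(shares, {"control": 1.0, "datapath": 1.0}))
```

In this model the shares only matter under contention. A control plane that needs 20% of the CPU gets exactly that, and the datapath keeps the remaining 80%.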

What about the commands that took 20-30 seconds to print out the list of VMs? This was caused by the way in which xapi handles the creation of threads, limiting the rate based on current load and memory usage in dom0. When the load goes too high, there are not enough xapi threads to handle the requests, which results in periodic spikes in the execution times of the xe commands. This feature was useful when dom0 was 32-bit and an increased number of threads might have threatened the stability of the whole system. Now that dom0 is 64-bit and control groups are enabled, we decided it is perfectly safe to get rid of xapi's thread-limiting feature.

With these changes applied, the execution times of control path commands became as one would expect them to be:

b2ap3_thumbnail_03_20151222-162856_1.png

In spite of heavy I/O load, control path processes receive all the CPU they need and can do their job without delay, leaving the user with a responsive host regardless of the load in the guests or in dom0. This makes a tremendous difference to the user experience when interacting with the host via XenCenter, the xe CLI or over SSH.

Another real-world example in which we expected significant improvements is the bootstorm. In this benchmark we start more than a hundred VMs and measure how much time it takes for the guests to become fully operational (time measured from starting the 1st VM to the completion of the n-th VM). The usual strategy is to start 25 VMs at a time. The following is a comparison of the results before and after the changes:

b2ap3_thumbnail_4495.png

Before, booting guests overloaded the control path, which slowed down the boot process of later VMs. After our improvements, the time to boot consecutive guests grows linearly, and the whole benchmark completes twice as fast as on the build without the changes.

Another view of the same data, showing the time to boot a single VM:

b2ap3_thumbnail_4496.png

CPU resource isolation and the xapi improvements make VMs resilient to the load generated by simultaneously booting guests. Each of them takes the same amount of time to become ready, in contrast to the significant increase seen on the host without the changes. That is how you would expect the control plane to operate.
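Both views of the bootstorm data can be derived from the same per-VM timestamps. The sketch below shows one way to do it. The function name bootstorm_views and the sample numbers are made up for illustration and are not the benchmark's real tooling or results. It also assumes that "the n-th VM" means the n-th guest to become ready.

```python
def bootstorm_views(start_times, ready_times):
    """Derive the two views from per-VM timestamps (in seconds).

    start_times[i] and ready_times[i] are when VM i was started and when
    it became fully operational. Returns (cumulative, per_vm):
      cumulative[n-1]: time from starting the first VM until n VMs are ready
      per_vm[i]:       time VM i itself took to become ready
    """
    if len(start_times) != len(ready_times):
        raise ValueError("need one start and one ready timestamp per VM")
    t0 = min(start_times)
    cumulative = [ready - t0 for ready in sorted(ready_times)]
    per_vm = [ready - start for start, ready in zip(start_times, ready_times)]
    return cumulative, per_vm


if __name__ == "__main__":
    # Made-up timestamps: two VMs started at t=0, a third at t=10.
    cumulative, per_vm = bootstorm_views([0, 0, 10], [30, 35, 42])
    print(cumulative)  # time until the 1st, 2nd and 3rd VM are ready
    print(per_vm)      # boot duration of each individual VM
```

A flat per_vm series corresponds to the behaviour described above, where each guest takes the same time to become ready, while a rising series indicates an overloaded control path.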

What other benefits will these improvements bring to XenServer users? They will have no more problems synchronizing XenCenter with the host or issuing commands to xapi. We now expect XenDesktop users to be able to start many VMs on the pool master while it keeps responding to control path commands. This would allow them to start more VMs on the master, reducing the necessary hardware and decreasing the total cost of ownership. Cloud administrators can have increased confidence in their ability to administer the host despite unknown and unpredictable workloads in their tenants' VMs.

The above improvements are planned for the forthcoming XenServer Dundee release, and can be experienced with the Dundee beta.2 preview.

