Thursday, August 30, 2018

Kali Linux 2018.3 Release
// Kali Linux

Another edition of Hacker Summer Camp has come and gone. We had a great time meeting our users, new and old, particularly at our Black Hat and DEF CON Dojos, which were led by our great friend @ihackstuff and the rest of the Offensive Security crew. Now that everyone is back home, it's time for our third Kali release of 2018, which is available for immediate download.

Kali 2018.3 brings the kernel up to version 4.17.0. While 4.17.0 did not introduce many changes, 4.16.0 had a huge number of additions and improvements, including more Spectre and Meltdown fixes, improved power management, and better GPU support.

New Tools and Tool Upgrades

Since our last release, we have added a number of new tools to the repositories, including:

  • idb – An iOS research / penetration testing tool
  • gdb-peda – Python Exploit Development Assistance for GDB
  • datasploit – OSINT Framework to perform various recon techniques
  • kerberoast – Kerberos assessment tools

In addition to these new packages, we have also upgraded a number of tools in our repos including aircrack-ng, burpsuite, openvas, wifite, and wpscan.
For the complete list of updates, fixes, and additions, please refer to the Kali Bug Tracker Changelog.

Download Kali Linux 2018.3

If you would like to check out this latest and greatest Kali release, you can find download links for ISOs and Torrents on the Kali Downloads page along with links to the Offensive Security virtual machine and ARM images, which have also been updated to 2018.3. If you already have a Kali installation you're happy with, you can easily upgrade in place as follows.

root@kali:~# apt update && apt -y full-upgrade

Making sure you are up-to-date

To double-check your version, first make sure your Kali package repositories are correct.

root@kali:~# cat /etc/apt/sources.list
deb http://http.kali.org/kali kali-rolling main non-free contrib

Then after running apt -y full-upgrade, you may require a reboot before checking:

root@kali:~# grep VERSION /etc/os-release

If you come across any bugs in Kali, please open a report on our bug tracker. It's more than a little challenging to fix what we don't know about.



Are Containers Replacing Virtual Machines?
// Docker Blog

With 20,000 partners and attendees converging at VMworld in Las Vegas this week, we often get asked if containers are replacing virtual machines (VMs). Many of our Docker Enterprise customers do run their containers on virtualized infrastructure, while others run them on bare metal. Docker provides IT and operators choice on where to run their applications – in a virtual machine, on bare metal, or in the cloud. In this blog we'll provide a few thoughts on the relationship between VMs and containers.

Containers versus Virtual Machines

Point #1: Containers Are More Agile than VMs

At this stage of container maturity, there is very little doubt that containers give both developers and operators more agility. Containers deploy quickly, deliver immutable infrastructure and solve the age-old "works on my machine" problem. They also replace the traditional patching process, allowing organizations to respond to issues faster and making applications easier to maintain.

Point #2: Containers Enable Hybrid and Multi-Cloud Adoption

Once containerized, applications can be deployed on any infrastructure – on virtual machines, on bare metal, and on various public clouds running different hypervisors. Many organizations start with running containers on their virtualized infrastructure and find it easier to then migrate to the cloud without having to change code.

Point #3: Integrate Containers with Your Existing IT Processes

Most enterprise organizations have a mature virtualization environment which includes tooling around backups, monitoring, and automation, and people and processes that have been built around it. By running Docker Enterprise on virtualized infrastructure, organizations can easily integrate containers into their existing practices and get the benefits of points 1 and 2 above.

Running Containers Inside Virtual Machines

Point #4: Containers Save on VM Licensing

Containerized applications share common operating system and software libraries, which greatly improves CPU utilization within a VM. This means an organization can reduce the overall number of virtual machines needed to operate their environment and increase the number of applications that can run on a server. Docker Enterprise customers often see 50% increased server consolidation after containerizing, which means lower hardware costs and savings on VM and OS licensing.

What About Bare Metal?

Just as organizations have reasons for using different servers or different operating systems, there are reasons that some organizations will want to run containers directly on bare metal. This is often due to performance or latency concerns or for licensing and cost reasons.

What About Security?

Containers are inherently secure on their own. Docker containers create isolation layers between applications and between the application and host and reduce the host surface area which protects both the host and the co-located containers by restricting access to the host. Docker containers running on bare-metal have the same high-level restrictions applied to them as they would if running on virtual machines. But Docker containers also pair well with virtualization technologies by protecting the virtual machine itself and providing defense in-depth for the host.

And the Winner Is…

In the end, Docker containers can run inside a virtual machine or on bare metal – the choice is up to you. Just like every other decision in the data center, the path you want to go down should align to your business priorities. Containers work well with virtual machines, but they can also run without them.


To learn more about the relationship between containers and virtual machines, check out these resources:




The "Depend on Docker" Philosophy at Baker Hughes, a GE Company
// Docker Blog

Alex Iankoulski and Arun Subramaniyan co-authored this blog.

BHGE is the world's leading full stream Oil & Gas company on a mission to find better ways to deliver energy to the world. BHGE Digital develops enterprise grade cloud-first SaaS solutions to improve efficiency and reduce non-productive time for the Oil & Gas industry.

In our group, we have developed an analytics-driven product portfolio to enable company-wide digital transformation for our customers. The challenges range from predicting the failures of mission-critical industrial assets such as gas turbines to optimizing the conditions of an Electric Submersible Pump (ESP) to increase production, all of which require building and maintaining sophisticated analytics at scale.

The past few years have taught us this: where there is a whale, there is a way!

We were happy to share our story at DockerCon recently, and wanted to share it here on the Docker blog as well. You can watch the session here:



We face two major challenges in delivering advanced analytics:

  1. Data silos
    We must handle a multitude of data sources that range from disconnected historical datasets to high-speed sensor streams. Industrial data volumes and velocities dwarf even the largest ERP implementations, as shown below.

  2. Analytics silos
    Analytics silos consist of complex analytics written over several decades in multiple programming languages (polyglot) and runtime environments. The need to orchestrate these analytics to work together to produce a valuable outcome makes the challenge doubly hard.


Our approach to solving the hardest problems facing the industrial world: combine the power of domain expertise with modern deep learning/machine learning/probabilistic techniques and scalable software practices.

At BHGE, we have developed innovative solutions to accelerate software development in a scalable and sustainable way. The top two questions that our developers in the industrial world face are: How can we make software development easier? How can we make software that can be built, shipped, and run on Mac, Windows, Linux, on-prem, and on any cloud platform?

Docker Enterprise allows us to break down silos, reduce complexities, encapsulate dependencies, accelerate development, and scale at will. We use Docker Enterprise for everything from building to testing and deploying software. Other than a few specialized cases, we find very little reason to run anything outside of the Docker container platform.

We gave a live talk as part of the Transformational Stories track at DockerCon 2018, titled "Depend on Docker" where we discussed our journey to accelerate ideas to production software.

In our talk, we cover use cases that need a polyglot infrastructure with highly diverse groups from scientists, aerospace and petroleum engineers to software architects to co-create a production application (you can watch the video or see the slides).

For us, a project qualifies as "depend-on-docker" if the only "external" dependency it needs to go from source to running software is Docker. In the spirit of DockerCon, at the talk we demonstrated and open-sourced our depend-on-docker project, and showed examples of some projects that follow the "depend-on-docker" philosophy, such as semtktree and enigma (follow the links to our Github pages).

In addition to its ease of use, we have made starting your own depend-on-docker project on Linux or Windows really simple. We hope that after you take a look at our GitHub or watch our DockerCon video you will be inspired to build anything you can imagine and convinced that the only external dependency you need is Docker!





Networking KVM for CloudStack – a 2018 revisit for CentOS7 and Ubuntu 18.04
// CloudStack Consultancy & CloudStack...


We published the original blog post on KVM networking in 2016 – but in the meantime we have moved on a generation in CentOS and Ubuntu operating systems, and some of the original information is therefore out of date. In this revisit of the original blog post we cover new configuration options for CentOS 7.x as well as Ubuntu 18.04, both of which are now supported hypervisor operating systems in CloudStack 4.11. Ubuntu 18.04 has replaced the legacy networking model with the new Netplan implementation, and this means different configuration both for linux bridge setups and for OpenVswitch.

KVM hypervisor networking for CloudStack can sometimes be a challenge, considering KVM doesn't quite have the same mature guest networking model found in the likes of VMware vSphere and Citrix XenServer. In this blog post we're looking at the options for networking KVM hosts using bridges and VLANs, and dive a bit deeper into the configuration for these options. Installation of the hypervisor and CloudStack agent is pretty well covered in the CloudStack installation guide, so we'll not spend too much time on this.

Network bridges

On a linux KVM host guest networking is accomplished using network bridges. These are similar to vSwitches on a VMware ESXi host or networks on a XenServer host (in fact networking on a XenServer host is also accomplished using bridges).

A KVM network bridge is a Layer-2 software device which allows traffic to be forwarded between ports internally on the bridge and the physical network uplinks. The traffic flow is controlled by MAC address tables maintained by the bridge itself, which determine which hosts are connected to which bridge port. The bridges allow for traffic segregation using traditional Layer-2 VLANs as well as SDN Layer-3 overlay networks.


Linux bridges vs OpenVswitch

The bridging on a KVM host can be accomplished using traditional linux bridge networking or by adopting the OpenVswitch back end. Traditional linux bridges have been implemented in the linux kernel since version 2.2, and have been maintained through the 2.x and 3.x kernels. Linux bridges provide all the basic Layer-2 networking required for a KVM hypervisor back end, but they lack some automation options and are configured on a per-host basis.

OpenVswitch was developed to address this, and provides additional automation in addition to new networking capabilities like Software Defined Networking (SDN). OpenVswitch allows for centralised control and distribution across physical hypervisor hosts, similar to distributed vSwitches in VMware vSphere. Distributed switch control does require additional controller infrastructure like OpenDaylight, Nicira, VMware NSX, etc. – which we won't cover in this article as it's not a requirement for CloudStack.

It is also worth noting Citrix started using the OpenVswitch backend in XenServer 6.0.

Network configuration overview

For this example we will configure the following networking model, assuming a linux host with four network interfaces which are bonded for resilience. We also assume all switch ports are trunk ports:

  • Network interfaces eth0 + eth1 are bonded as bond0.
  • Network interfaces eth2 + eth3 are bonded as bond1.
  • Bond0 provides the physical uplink for the bridge "cloudbr0". This bridge carries the untagged host network interface / IP address, and will also be used for the VLAN tagged guest networks.
  • Bond1 provides the physical uplink for the bridge "cloudbr1". This bridge handles the VLAN tagged public traffic.

The CloudStack zone networks will then be configured as follows:

  • Management and guest traffic is configured to use KVM traffic label "cloudbr0".
  • Public traffic is configured to use KVM traffic label "cloudbr1".

In addition to the above it's important to remember CloudStack itself requires internal connectivity from the hypervisor host to system VMs (Virtual Routers, SSVM and CPVM) over the link local subnet. This is done over a host-only bridge "cloud0", which is created by CloudStack when the host is added to a CloudStack zone.



Linux bridge configuration – CentOS

In the following CentOS example we have changed the NIC naming convention back to the legacy "eth0" format rather than the new "eno16777728" format. This is a personal preference – and is generally done to make automation of configuration settings easier. The configuration suggested throughout this blog post can also be implemented using the new NIC naming format.

Across all CentOS versions the "NetworkManager" service is also generally disabled, since this has been found to complicate KVM network configuration and cause unwanted behaviour:

# systemctl stop NetworkManager
# systemctl disable NetworkManager

To enable bonding and bridging CentOS 7.x requires the modules installed / loaded:

# modprobe --first-time bonding
# yum -y install bridge-utils

If IPv6 isn't required we also add the following lines to /etc/sysctl.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

In CentOS the linux bridge configuration is done with configuration files in /etc/sysconfig/network-scripts/. Each of the four individual NIC interfaces are configured as follows (eth0 / eth1 / eth2 / eth3 are all configured the same way). Note there is no IP configuration against the NICs themselves – these purely point to the respective bonds:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0  
DEVICE=eth0
NAME=eth0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
HWADDR=00:0C:12:xx:xx:xx
NM_CONTROLLED=no

The bond configurations are specified in the equivalent ifcfg-bond scripts and specify bonding options as well as the upstream bridge name. In this case we're just setting a basic active-passive bond (mode=1) with up/down delays of zero and status monitoring every 100ms (miimon=100). Note there are a multitude of bonding options – please refer to the CentOS / RedHat official documentation to tune these to your specific use case.

# vi /etc/sysconfig/network-scripts/ifcfg-bond0  
DEVICE=bond0
NAME=bond0
TYPE=Bond
BRIDGE=cloudbr0
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

The same goes for bond1:

# vi /etc/sysconfig/network-scripts/ifcfg-bond1  
DEVICE=bond1
NAME=bond1
TYPE=Bond
BRIDGE=cloudbr1
ONBOOT=yes
NM_CONTROLLED=no
BONDING_OPTS="mode=active-backup miimon=100 updelay=0 downdelay=0"

Cloudbr0 is configured in the ifcfg-cloudbr0 script. In addition to the bridge configuration we also specify the host IP address, which is tied directly to the bridge since it is on an untagged VLAN:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0  
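A minimal sketch of what this file typically contains, assuming placeholder addressing for the untagged host network (substitute your own IPADDR, NETMASK, GATEWAY and DNS values):

```ini
DEVICE=cloudbr0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=static
NM_CONTROLLED=no
STP=yes
DELAY=0
IPADDR=192.168.100.20
NETMASK=255.255.255.0
GATEWAY=192.168.100.1
DNS1=192.168.100.5
```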

Cloudbr1 does not have an IP address configured hence the configuration is simpler:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1  
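A minimal sketch of the corresponding bridge configuration, with no IP address assigned:

```ini
DEVICE=cloudbr1
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no
STP=yes
DELAY=0
```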

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic, this can be accomplished by creating a VLAN on top of the bond and tying this to a dedicated bridge. In this case we create a new bridge on bond0 using VLAN 100:

# vi /etc/sysconfig/network-scripts/ifcfg-bond0.100  
DEVICE=bond0.100
VLAN=yes
BOOTPROTO=none
ONBOOT=yes
TYPE=Unknown
BRIDGE=cloudbr100

The bridge can now be configured with the desired IP address for storage connectivity:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100  
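A minimal sketch, assuming a placeholder storage subnet (substitute your own addressing):

```ini
DEVICE=cloudbr100
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=static
NM_CONTROLLED=no
IPADDR=10.0.100.20
NETMASK=255.255.255.0
```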

Internal bridge cloud0

When using linux bridge networking there is no requirement to configure the internal "cloud0" bridge, this is all handled by CloudStack.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network  

To check the bridges use the brctl command:

# brctl show  
bridge name	bridge id		STP enabled	interfaces
cloudbr0	8000.000c29b55932	no		bond0
cloudbr1	8000.000c29b45956	no		bond1

The bonds can be checked with:

# cat /proc/net/bonding/bond0  
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Linux bridge configuration – Ubuntu

With the 18.04 "Bionic Beaver" release, Ubuntu has retired the legacy way of configuring networking through /etc/network/interfaces in favour of Netplan. This changes how networking is configured, although the principles around bridge configuration are the same as in previous Ubuntu versions.

First of all ensure correct hostname and FQDN are set in /etc/hostname and /etc/hosts respectively.

To stop network bridge traffic from traversing IPtables / ARPtables on the host, add the following lines to /etc/sysctl.conf:

# vi /etc/sysctl.conf  
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

Ubuntu 18.04 installs the "bridge-utils" and bridge/bonding kernel options by default, and the corresponding modules are also loaded by default, hence there are no requirements to add anything to /etc/modules.

In Ubuntu 18.04 all interface, bond and bridge configuration is handled using cloud-init and the Netplan configuration in /etc/netplan/XX-cloud-init.yaml. As for CentOS, we are configuring basic active-passive bonds (mode=1) with status monitoring every 100ms (miimon=100), and configuring bridges on top of these. As before, the host IP address is tied to cloudbr0:

# vi /etc/netplan/50-cloud-init.yaml  
network:
    ethernets:
        eth0:
            dhcp4: no
        eth1:
            dhcp4: no
        eth2:
            dhcp4: no
        eth3:
            dhcp4: no
    bonds:
        bond0:
            dhcp4: no
            interfaces:
                - eth0
                - eth1
            parameters:
                mode: active-backup
                primary: eth0
        bond1:
            dhcp4: no
            interfaces:
                - eth2
                - eth3
            parameters:
                mode: active-backup
                primary: eth2
    bridges:
        cloudbr0:
            addresses:
                -
            gateway4:
            nameservers:
                search: [mycloud.local]
                addresses: [,]
            interfaces:
                - bond0
        cloudbr1:
            dhcp4: no
            interfaces:
                - bond1
    version: 2

Optional tagged interface for storage traffic

To add an optional VLAN tagged interface for storage traffic, add a VLAN and a new bridge to the above configuration:

# vi /etc/netplan/50-cloud-init.yaml  
    vlans:
        bond100:
            id: 100
            link: bond0
            dhcp4: no
    bridges:
        cloudbr100:
            addresses:
                -
            interfaces:
                - bond100

Internal bridge cloud0

When using linux bridge networking the internal "cloud0" bridge is again handled by CloudStack, i.e. there's no need for specific configuration to be specified for this.

Network startup

Note – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To apply the configuration, reload Netplan with:

# netplan apply  

To check the bridges use the brctl command:

# brctl show  
bridge name	bridge id		STP enabled	interfaces
cloud0		8000.000000000000	no
cloudbr0	8000.52664b74c6a7	no		bond0
cloudbr1	8000.2e13dfd92f96	no		bond1
cloudbr100	8000.02684d6541db	no		bond100

To check the VLANs and bonds:

# cat /proc/net/vlan/config  
VLAN Dev name | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
bond100 | 100 | bond0

# cat /proc/net/bonding/bond0  
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 10
Permanent HW addr: 00:0c:xx:xx:xx:xx
Slave queue ID: 0


OpenVswitch bridge configuration – CentOS

The OpenVswitch version in the standard CentOS repositories is relatively old (version 2.0). To install a newer version, either locate and install it from a third-party CentOS/Fedora/RedHat repository, or alternatively download and compile the packages from the OVS website (notes on how to compile the packages can be found in the OVS documentation).

Once packages are available, install and enable OVS with:

# yum localinstall openvswitch-<version>.rpm
# systemctl start openvswitch
# systemctl enable openvswitch

In addition to this the bridge module should be blacklisted. Experience has shown that even blacklisting this module does not prevent it from being loaded. To force this set the module install to /bin/false. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf  

As with linux bridging above the following examples assumes IPv6 has been disabled and legacy ethX network interface names are used. In addition the hostname has been set in /etc/sysconfig/network and /etc/hosts.

Add the initial OVS bridges using the ovs-vsctl toolset:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3

This will configure the bridges in the OVS database, but the settings will not be persistent. To make the settings persistent we need to configure the network configuration scripts in /etc/sysconfig/network-scripts/, similar to when using linux bridges.

Each individual network interface has a generic configuration – note there is no reference to bonds at this stage. The following ifcfg-eth script applies to all interfaces:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0  
DEVICE="eth0"
TYPE="Ethernet"
BOOTPROTO="none"
NAME="eth0"
ONBOOT="yes"
NM_CONTROLLED=no
HOTPLUG=no
HWADDR=00:0C:xx:xx:xx:xx

The bonds reference the interfaces as well as the upstream bridge. In addition the bond configuration specifies the OVS specific settings for the bond (active-backup, no LACP, 100ms status monitoring):

# vi /etc/sysconfig/network-scripts/ifcfg-bond0  
DEVICE=bond0
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr0
BOOTPROTO=none
BOND_IFACES="eth0 eth1"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no

# vi /etc/sysconfig/network-scripts/ifcfg-bond1  
DEVICE=bond1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBond
OVS_BRIDGE=cloudbr1
BOOTPROTO=none
BOND_IFACES="eth2 eth3"
OVS_OPTIONS="bond_mode=active-backup lacp=off other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100"
HOTPLUG=no

The bridges are now configured as follows. The host IP address is specified on the untagged cloudbr0 bridge:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr0  

Cloudbr1 is configured without an IP address:

# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr1  
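A minimal sketch of the corresponding OVS bridge configuration, with no IP address assigned:

```ini
DEVICE=cloudbr1
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=none
HOTPLUG=no
```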

Internal bridge cloud0

Under CentOS7.x and CloudStack 4.11 the cloud0 bridge is automatically configured, hence no additional configuration steps are required.

Optional tagged interface for storage traffic

If a dedicated VLAN tagged IP interface is required for e.g. storage traffic this is accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100:

# ovs-vsctl add-br cloudbr100 cloudbr0 100  
# vi /etc/sysconfig/network-scripts/ifcfg-cloudbr100  
DEVICE=cloudbr100
ONBOOT=yes
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=
NETMASK=
OVS_OPTIONS="cloudbr0 100"
HOTPLUG=no

Additional OVS network settings

To finish off the OVS network configuration specify the hostname, gateway and IPv6 settings:

# vi /etc/sysconfig/network  
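A minimal sketch of /etc/sysconfig/network, assuming a placeholder hostname and gateway, with IPv6 disabled:

```ini
NETWORKING=yes
HOSTNAME=kvmhost1.mycloud.local
GATEWAY=192.168.100.1
NETWORKING_IPV6=no
IPV6INIT=no
```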

VLAN problems when using OVS

Kernel versions older than 3.3 had some issues with VLAN traffic propagating between KVM hosts. This has not been observed in CentOS 7.5 (kernel version 3.10) – however if this issue is encountered look up the OVS VLAN splinter workaround.

Network startup

Note – as mentioned for linux bridge networking – once all network startup scripts are in place and the network service is restarted you may lose connectivity to the host if there are any configuration errors in the files, hence make sure you have console access to rectify any issues.

To make the configuration live restart the network service:

# systemctl restart network  

To check the bridges use the ovs-vsctl command. The following shows the optional cloudbr100 on VLAN 100:

# ovs-vsctl show  
49cba0db-a529-48e3-9f23-4999e27a7f72
    Bridge "cloudbr0"
        Port "cloudbr0"
            Interface "cloudbr0"
                type: internal
        Port "cloudbr100"
            tag: 100
            Interface "cloudbr100"
                type: internal
        Port "bond0"
            Interface "veth0"
            Interface "eth0"
    Bridge "cloudbr1"
        Port "bond1"
            Interface "eth1"
            Interface "veth1"
        Port "cloudbr1"
            Interface "cloudbr1"
                type: internal
    Bridge "cloud0"
        Port "cloud0"
            Interface "cloud0"
                type: internal
    ovs_version: "2.9.2"

The bond status can be checked with the ovs-appctl command:

# ovs-appctl bond/show bond0  
---- bond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: off
active slave mac: 00:0c:xx:xx:xx:xx(eth0)

slave eth0: enabled
  active slave
  may_enable: true

slave eth1: enabled
  may_enable: true

To ensure that only OVS bridges are used also check that linux bridge control returns no bridges:

# brctl show  
bridge name	bridge id		STP enabled	interfaces

As a final note – the CloudStack agent also requires the following two lines added to /etc/cloudstack/agent/ after install:
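The two lines themselves are not reproduced here; a sketch of the OVS-related agent settings typically used (verify against the CloudStack installation guide for your version):

```ini
network.bridge.type=openvswitch
libvirt.vif.driver=com.cloud.hypervisor.kvm.resource.OvsVifDriver
```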


OpenVswitch bridge configuration – Ubuntu

As discussed earlier in this blog post Ubuntu 18.04 introduced Netplan as a replacement to the legacy "/etc/network/interfaces" network configuration. Unfortunately Netplan does not support OVS, hence the first challenge is to revert Ubuntu to the legacy configuration method.

To disable Netplan first of all add "netcfg/do_not_use_netplan=true" to the GRUB_CMDLINE_LINUX option in /etc/default/grub. The following example also shows the use of legacy interface names as well as IPv6 being disabled:

GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 ipv6.disable=1 netcfg/do_not_use_netplan=true"  

Then rebuild GRUB and reboot the server:

grub-mkconfig -o /boot/grub/grub.cfg  

To set the hostname first of all edit "/etc/cloud/cloud.cfg" and set this to preserve the system hostname:

preserve_hostname: true  

Thereafter set the hostname with hostnamectl:

hostnamectl set-hostname --static --transient --pretty <hostname>  

Now remove Netplan, then install OVS from the Ubuntu repositories as well as the "ifupdown" package to get standard network functionality back:

apt-get purge nplan
apt-get install openvswitch-switch
apt-get install ifupdown

As for CentOS we need to blacklist the bridge module to prevent standard bridges being created. Please note the CloudStack agent install depends on the bridge module being in place, hence this step should be carried out after agent install.

echo "install bridge /bin/false" > /etc/modprobe.d/bridge-blacklist.conf  

To stop network bridge traffic from traversing IPtables / ARPtables also add the following lines to /etc/sysctl.conf:

# vi /etc/sysctl.conf  
net.bridge.bridge-nf-call-ip6tables = 0  net.bridge.bridge-nf-call-iptables = 0  net.bridge.bridge-nf-call-arptables = 0  

Same as for CentOS we first of all add the OVS bridges and bonds from command line using the ovs-vsctl command line tools. In this case we also add the additional tagged fake bridge cloudbr100 on VLAN 100:

# ovs-vsctl add-br cloudbr0
# ovs-vsctl add-br cloudbr1
# ovs-vsctl add-bond cloudbr0 bond0 eth0 eth1 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-bond cloudbr1 bond1 eth2 eth3 bond_mode=active-backup other_config:bond-detect-mode=miimon other_config:bond-miimon-interval=100
# ovs-vsctl add-br cloudbr100 cloudbr0 100

As with Linux bridge, all network configuration is applied in "/etc/network/interfaces":

# vi /etc/network/interfaces  
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual

auto cloudbr0
allow-ovs cloudbr0
iface cloudbr0 inet static
  address
  netmask
  gateway
  dns-nameserver
  ovs_type OVSBridge
  ovs_ports bond0

allow-cloudbr0 bond0
iface bond0 inet manual
  ovs_bridge cloudbr0
  ovs_type OVSBond
  ovs_bonds eth0 eth1
  ovs_option bond_mode=active-backup other_config:miimon=100

auto cloudbr1
allow-ovs cloudbr1
iface cloudbr1 inet manual

allow-cloudbr1 bond1
iface bond1 inet manual
  ovs_bridge cloudbr1
  ovs_type OVSBond
  ovs_bonds eth2 eth3
  ovs_option bond_mode=active-backup other_config:miimon=100

Network startup

Since Ubuntu 14.04, the bridges have started automatically without any requirement for additional startup scripts. Since OVS uses the same toolset across both CentOS and Ubuntu, the same processes described earlier in this blog post can be utilised.

# ovs-appctl bond/show bond0
# ovs-vsctl show

To ensure that only OVS bridges are used, also check that Linux bridge control returns no bridges:

# brctl show
bridge name	bridge id		STP enabled	interfaces

As mentioned earlier, the following also needs to be added to the /etc/cloudstack/agent/ file:


Internal bridge cloud0

In Ubuntu there is no requirement to add additional configuration for the internal cloud0 bridge, CloudStack manages this.

Optional tagged interface for storage traffic

Additional VLAN tagged interfaces are again accomplished by creating a VLAN tagged fake bridge on top of one of the cloud bridges. In this case we add it to cloudbr0 with VLAN 100 at the end of the interfaces file:

# ovs-vsctl add-br cloudbr100 cloudbr0 100  
# vi /etc/network/interfaces  
auto cloudbr100
allow-cloudbr0 cloudbr100
iface cloudbr100 inet static
  address
  netmask
  ovs_type OVSIntPort
  ovs_bridge cloudbr0
  ovs_options tag=100


As KVM is becoming more stable and mature, more people are going to start looking at using it rather than the more traditional XenServer or vSphere solutions, and we hope this article will assist in configuring host networking. As always we're happy to receive feedback, so please get in touch with any comments, questions or suggestions.

About The Author

Dag Sonstebo is  a Cloud Architect at ShapeBlue, The Cloud Specialists. Dag spends most of his time designing, implementing and automating IaaS solutions based on Apache CloudStack.

The post Networking KVM for CloudStack – a 2018 revisit for CentOS7 and Ubuntu 18.04 appeared first on The CloudStack Company.


Read in my feedly

Sent from my iPhone

Chef Open Source Community News – August 2018

Chef Open Source Community News – August 2018
// Chef Blog

Here's this month's round up of what happened in August across the Chef, Habitat, and InSpec open-source communities.


This month's release of Chef Client 14.4 is the newest in the Chef 14 series, and includes seven new preview resources, many improvements to existing resources, and a brand new Knife profile management command. Chef Client 14.4 also includes OpenSSL 1.0.2p to resolve two vulnerabilities: CVE-2018-0732 and CVE-2018-0737.

In Chef Client 14.3, which we released last month, we introduced the concept of a preview resource. These resources are ready to use in production today, but because they were cleaned up and migrated from community cookbooks, they may conflict with those cookbooks. They are marked as preview because the cookbook version will take precedence if that cookbook is found in your run list; that behavior will persist until Chef 15, when only the version in core Chef will run.

This month's new resources are cron_d, cron_access, openssl_x509_certificate, openssl_x509_request, openssl_x509_crl, openssl_ec_private_key and openssl_ec_public_key. Improved resources this month are sysctl, windows_task, ifconfig, route, and systemd_unit. You can read all about the new and improved resources in the release notes.

Finally, this month's new feature is Knife configuration profiles. In the past, connecting to multiple Chef servers and switching between their configurations required the use of tools like knife-block or chefvm. No more! You can now set and use multiple configuration profiles using several profile-related subcommands to knife config. For more information, consult the documentation for Knife.
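As an illustrative sketch (the server URLs, profile names, and key paths below are hypothetical), profiles live in a ~/.chef/credentials file and are selected with knife config subcommands:

[default]
client_name = "jdoe"
client_key = "jdoe.pem"
chef_server_url = "https://chef.example.com/organizations/prod"

[staging]
client_name = "jdoe"
client_key = "jdoe-staging.pem"
chef_server_url = "https://chef.example.com/organizations/staging"

You could then switch servers with knife config use-profile staging and see what is available with knife config list-profiles.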


We released Habitat 0.61.0 during the month of August. This is primarily a bugfix release, but is also laying the groundwork to support older Linux kernels as a result of the core plans rebuild from June. We made several improvements to supervisor operability as well.

Stuart Paterson also published a blog post on lift, shift and modernize of legacy applications in practice.


We released InSpec several times during the month of August, introducing a few new resources including aws_ecs_cluster and iis_app_pool. We fixed some other bugs including one that precluded you from running certain InSpec commands like check, archive and json without being a privileged user.

The post Chef Open Source Community News – August 2018 appeared first on Chef Blog.



Wednesday, August 29, 2018

Foreshadow vulnerability on XenServer and XCP-ng

Foreshadow vulnerability on XenServer and XCP-ng
// Xen Orchestra

Foreshadow vulnerability on XenServer and XCP-ng

This is a recap on the latest Foreshadow vulnerability and how it affects XenServer and XCP-ng.

Foreshadow, XSA-273

Yet another Intel x86 security issue… Basically, someone could steal data in RAM, outside the VM boundaries (i.e. from other VMs on the same host). If you have non-trusted users in your VMs, it's time to patch ASAP. And maybe disable hyper-threading.

Foreshadow vulnerability on XenServer and XCP-ng

You can find more details here and here.

Should I disable hyper-threading?

No obvious answer sadly:

If an HVM guest kernel is untrusted (i.e. not under host admin control), it is probably not safe to be scheduled with hyper-threading active.

But if you have control on your VMs, please be sure you have all recent fixes available from your OS vendor. Then, no "need" to disable HT.

XAPI security issue, XSA-271

Let's quote the XSA document:

An unauthenticated user with access to the management network can read arbitrary files from the dom0 filesystem. This includes the pool secret /etc/xensource/ptoken which grants the attacker full administrator access.

This is… big. Update ASAP (see below for how) or close your XAPI from outside, now! If you have hosts all around the world, another possibility is to make your XAPI reachable only through a secured tunnel, without external access.

On XenServer

There are multiple patches, depending on your current XenServer version. Citrix did a recap on those vulnerabilities here:

Patched versions are:

  • 7.0
  • 7.1 CU1
  • 7.4
  • 7.5

You can patch directly from the Xen Orchestra UI as soon as Citrix publishes the patches in their official online XML (usually within a few days).

XenServer 7.2/7.3

If you are using XenServer 7.2 (and 7.3) you have 2 options:

If you already have a paid contract with Citrix: 7.2 and 7.3 aren't supported anymore, so please upgrade to XenServer 7.5!

On XCP-ng

You can read this official XCP-ng blog post regarding both XSAs. As usual and as documented, 2 possibilities:

  • CLI: yum update on each host
  • Web UI in Xen Orchestra (see screenshots below)

Foreshadow vulnerability on XenServer and XCP-ng

Foreshadow vulnerability on XenServer and XCP-ng

Please reboot your hosts then, and always reboot the pool master first.

Note: a toolstack restart is enough to fix XSA-271, but reboot is needed for XSA-273
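If you only need the toolstack restart for XSA-271, it can be done from dom0 with the standard XenServer/XCP-ng tooling:

xe-toolstack-restart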

Foreshadow vulnerability on XenServer and XCP-ng



XCP-ng Security Bulletin

XCP-ng Security Bulletin
// XCP-ng

Latest news regarding security on XCP-ng! For this first bulletin, it's all about Foreshadow vulnerability, but also a specific XAPI problem. Take time to read it or at least stay up-to-date!


Also known as "L1 Terminal Fault speculative side channel" or even shorter: Foreshadow.

The most interesting details are available here.

In short, with Intel CPUs, if you don't have control over each of your VMs (i.e. you are selling VPS services), this can pose a major confidentiality risk. Indeed, people in those VMs would be able to read data in RAM outside their own VM.


This is a XAPI HTTP security issue, potentially leading to full root access to the dom0 (and all its VMs).

Note: fresh XCP-ng installs aren't impacted because it's caused by a dedicated folder used with Citrix hotfixes.

All the vulnerability details are available here.

Solution: just update

Please keep your XCP-ng hosts up-to-date. RPMs are already available!

Remember you can update from the command line with yum update, but you can also use Xen Orchestra to update your whole pool just by clicking on "Install pool patches"! For the full procedure, see our official Wiki.

You can see the list of updates in the host view:



Snort 3 beta available now!

Snort 3 beta available now!
// Snort Blog

We know our customers and community members have been waiting a while for this — so we are thrilled to announce that Snort 3 (build 247) is available in beta now. Snort 3 is a redesign of Snort 2 with a number of significant improvements.

Here are some highlights you should know about before downloading:
  • Configuration — We use LuaJIT for configuration. The config syntax is simple, consistent, and executable. LuaJIT plugins for rule options and loggers are supported, too.
  • Detection — We have worked closely with Cisco Talos to update rules to meet their needs, including a feature they call "sticky buffers." With the use of the Hyperscan search engine, regex fast patterns make rules faster and more accurate.
  • HTTP — We have a new and stateful HTTP inspector that currently handles 99 percent of the HTTP Evader cases, and will soon cover all of them. There are many new features, as well, including new rule options. HTTP/2 support is under development.
  • Performance — We have substantially increased performance for deep packet inspection.  Snort 3 supports multiple packet-processing threads, and scales linearly with a much smaller amount of memory required for shared configs, like rule engines.
  • JSON event logging — These can be used to integrate with tools such as the Elastic Stack.  See this blog post for more details.
  • Plugins — Snort 3 was designed to be extensible, and there are over 225 plugins of various types. It is easy to add your own codec, inspector, rule action, rule option, or logger.  SO rules are plugins, too, and it is much easier to add your own.
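As a rough sketch of the LuaJIT-based configuration (the network values and rules file name below are illustrative, following the layout of the stock snort.lua):

-- snort.lua (sketch)
HOME_NET = '192.168.1.0/24'
EXTERNAL_NET = '!$HOME_NET'
http_inspect = { }                  -- enable the new HTTP inspector with defaults
ips = { include = 'local.rules' }   -- load detection rules

You could then validate the configuration with snort -c snort.lua, and scale packet processing with the --max-packet-threads option.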
You can get Snort 3 from or from GitHub.

These packages / repositories are available:
  • snort3 — The main engine source code and plugins
  • snort3_extra — Other experimental and example plugins
  • snort3_demo — A test suite with working examples
We push updates to GitHub multiple times per week, and the master branch is always stable.

In addition to the cool new features, Snort 3 also supports all the capabilities of Snort 2.9.11, but we aren't done. Coming soon, we have:
  • Next generation DAQ
  • Connection events
  • Search engine acceleration
  • ... and much more.
Please submit bugs, questions, and feedback to or the Snort-Users mailing list.

Happy Snorting!
The Snort Release Team



HashiCorp Vault 0.11

HashiCorp Vault 0.11
// Hashicorp Blog

Vault 0.11

We are excited to announce the release of HashiCorp Vault 0.11! Vault is a security tool for secrets management, data encryption, and identity-based access among other features.

The 0.11 release of Vault delivers new features to streamline the management of tokens for applications and users attempting to access Vault, provide secure multi-tenancy for multiple teams and organizations using a single Vault installation, and other features focused on enhancing system performance and automation.

New features in 0.11 include:

  • Namespaces (Enterprise): Provide Secure Multi-tenancy within Vault via isolated, self-managed environments.
  • Performance Standby Nodes (Enterprise): Multiply read performance for Vault Enterprise infrastructure via a new type of performance-focused standby node.
  • Vault Agent: Automatically manage the secure introduction and renewal of tokens for local applications.
  • ACL Templates: Support templating for identity groups, entities, and metadata within ACL policies.
  • Alibaba Cloud Support: Support Alibaba Cloud identity systems and provide dynamic credential creation for Alibaba Cloud infrastructures via Vault.
  • Microsoft Azure Secrets Engine: Generate dynamic credential access to Microsoft Azure infrastructure via Vault.

The release also includes additional new features, secure workflow enhancements, general improvements, and bug fixes. The Vault 0.11 changelog provides a full list of features, enhancements, and bug fixes.

As always, we send a big thank-you to our community for their ideas, bug reports, and pull requests.


Note: This is a Vault Enterprise Pro feature


Vault 0.11 introduces Namespaces, a suite of features that allows Vault Enterprise users to create isolated environments to support secure multi-tenancy within a single Vault Enterprise infrastructure. This allows multiple teams or organizations to operate within separate environments that can be centrally managed and configured by a central ops or security team.

Within a namespace, users and applications can create and manage separate versions of the following:

  • Secret Engines
  • Auth Methods
  • Identities (Entities and Identity Groups)
  • Policies
  • Tokens

Namespaces also allow members of a namespace to be delegated as administrators, allowing them to self-manage policies that apply only within that namespace. This significantly reduces the management burden of Vault Enterprise, allowing teams (and even individuals) to self-manage their own environments.
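As a quick CLI sketch (the namespace name is hypothetical; requires Vault Enterprise 0.11+):

$ vault namespace create engineering
$ VAULT_NAMESPACE=engineering vault policy list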

For more on namespaces, see our in-depth feature preview.

Performance Standby Nodes

Note: This is a Vault Enterprise feature

Performance Standby Nodes (or simply "Performance Standbys") are a new node type within Vault to multiply Vault's ability to serve read-only operations (that is, operations that do not modify Vault's storage) within a single cluster. A selection of performance standby nodes come standard with Vault Enterprise Premium, and they can be added to Vault Enterprise Pro infrastructures.

A performance standby is just like a traditional High Availability (HA) standby node but is able to service read-only requests from users or applications. This allows for Vault to quickly scale its ability to service these kinds of operations, providing near-linear request-per-second scaling in many common scenarios for some secrets engines like K/V and Transit. By spreading traffic across performance standby nodes, clients can scale these IOPS horizontally to handle extremely high traffic workloads.

Vault Agent

Vault Agent is a new mode for the Vault binary that allows Vault to automatically manage the process of securely introducing and rotating access tokens for a system. By configuring an auto-auth system with a Vault 0.11+ binary, Vault can be run as an agent that provides fresh local access tokens on a system for applications and users to leverage in accessing secrets.
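A minimal agent configuration sketch (the AWS IAM auth method and the role and file paths below are illustrative; any supported auto-auth method can be used):

auto_auth {
  method "aws" {
    config = {
      type = "iam"
      role = "my-role" # hypothetical Vault role
    }
  }
  sink "file" {
    config = {
      path = "/tmp/vault-token"
    }
  }
}

The agent would then be started with vault agent -config=agent.hcl, writing a fresh token to the sink path for local applications to use.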

For more on Vault Agent, see our in-depth feature preview.

ACL Templates

In Vault 0.11, policies may now use templates to explicitly refer to entities, identity groups, and metadata within policies. This allows for policies that are easier to manage and more explicit when granting RBAC to specific identities within Vault.

For example, a policy may now be written to carve out storage for a specific entity:

path "secret/data/{{}}/*" { capabilities = ["create", "update", "read", "delete"] }

Or a policy can be written to assign RBAC to an identity group, allowing any member of the group to successfully perform operations but disallowing anyone else:

path "secret/data/groups/{{}}/*" { capabilities = ["create", "update", "read", "delete"] }

Alibaba Cloud Support

Vault now supports integration with Alibaba Cloud. Vault 0.11 sees the release of Alibaba Auth Methods and an Alibaba Cloud Secrets Engine, which respectively allow users to log in with Alibaba Cloud credentials and to generate dynamic credentials for access to Alibaba Cloud infrastructure.

Vault users can also configure Alibaba Cloud storage targets as a Storage backend with Vault 0.11, and in the near future we will release functionality to allow Vault Enterprise users to Auto Unseal and Seal Wrap using Alibaba Cloud KMS.

Microsoft Azure Secret Engine

Vault 0.11 now supports a Secrets Engine plugin that allows for Vault users to create dynamic access credentials to Microsoft Azure systems. Using time-limited service principals, Azure Secrets Engine allows Vault to broker secure access for users and applications provisioning resources on Azure.

Other Features

There are many new features in Vault 0.11 that have been developed over the course of the 0.10.x releases. We have summarized a few of the larger features below, and as always consult the Changelog for full details.

  • JWT/OIDC Discovery Auth Method: A new auth method that accepts JWTs and either validates signatures locally or uses OIDC Discovery to fetch the current set of keys for signature validation. Various claims can be specified for validation (in addition to the cryptographic signature) and a user and optional groups claim can be used to provide Identity information.
  • UI Control Group Workflow (Enterprise): The UI will now detect control group responses and provides a workflow to view the status of the request and to authorize requests.
  • Active Directory Secrets Engine: A new ad secrets engine has been created which allows Vault to rotate and provide credentials for configured AD accounts. This Secrets Engine also supports automated rotation of its root credential.
  • Azure Key Vault Support: Support for Microsoft Azure Key Vault for Auto Unseal and Seal Wrap.
  • HA Support for MySQL Storage: MySQL storage now supports HA.
  • Vault UI Browser CLI: The UI now supports usage of read/write/list/delete commands in a CLI that can be accessed from the nav bar. Complex inputs such as JSON files are not currently supported. This surfaces features otherwise unsupported in Vault's UI.
  • FoundationDB Storage: You can now use FoundationDB for storing Vault data.

Upgrade Details

Vault 0.11 introduces significant new functionality. As such, we provide both general upgrade instructions and a Vault 0.11-specific upgrade page.

As always, we recommend upgrading and testing this release in an isolated environment. If you experience any issues, please report them on the Vault GitHub issue tracker or post to the Vault mailing list.

For more information about HashiCorp Vault Enterprise, visit Users can download the open source version of Vault at

We hope you enjoy Vault 0.11!



Building Resilient Infrastructure with Nomad: Scheduling and Self-Healing

Building Resilient Infrastructure with Nomad: Scheduling and Self-Healing
// Hashicorp Blog

This is the second post in our series Building Resilient Infrastructure with Nomad. In this series we explore how Nomad handles unexpected failures, outages, and routine maintenance of cluster infrastructure, often without operator intervention required.

In this post we'll look at how the Nomad client enables fast and accurate scheduling as well as self-healing through driver health checks and liveness heartbeats.

Nomad client agent

The Nomad agent is a long-running process which runs on every machine that is part of the Nomad cluster. The behavior of the agent depends on whether it is running in client or server mode. Clients are responsible for running tasks, while servers are responsible for managing the cluster. Each cluster usually has 3 or 5 server agents and potentially thousands of clients.

The primary purpose of client-mode agents is to run user workloads such as Docker containers. To enable this, the client fingerprints its environment to determine the capabilities and resources of the host machine, and also to determine what drivers are available. Once this is done, clients register with servers and continue to check in with them regularly in order to provide node information, heartbeat to provide liveness, and run any tasks assigned to them.

Diagram showing Nomad client-server architecture.


Scheduling is a core function of Nomad servers. It is the process of assigning tasks from jobs to client machines. This process must respect the constraints as declared in the job file, and optimize for resource utilization.

You'll recall from Part 1 of this series that a job is a declarative description of tasks, including their constraints and resources required. Jobs are submitted by users and represent a desired state. The mapping of a task group in a job to clients is done using allocations. An allocation declares that a set of tasks in a job should be run on a particular node. Scheduling is the process of determining the appropriate allocations and is done as part of an evaluation.

Evaluations are created when a job is created or updated, or when a node fails.

Diagram showing Nomad scheduling process.

Schedulers, part of the Nomad server, are responsible for processing evaluations and generating allocation plans. There are three scheduler types in Nomad, each optimized for a specific type of workload: service, batch, and system.

First the scheduler reconciles the desired state (indicated by the job file) with the real state of the cluster to determine what must be done. New allocations may need to be placed. Existing allocations may need to be updated, migrated, or stopped.

Placing allocations is split into two distinct phases: feasibility checking and ranking. In the first phase the scheduler finds nodes that are feasible by filtering unhealthy nodes, those missing necessary drivers, and those failing the specified constraints for the job. This is where Nomad uses the node fingerprinting and driver information provided by Nomad clients.

The second phase is ranking, where the scheduler scores feasible nodes to find the best fit. Scoring is based on a combination of bin packing and anti-affinity (co-locating multiple instances of a task group is discouraged), which optimizes for density while reducing the likelihood of correlated failures. In Nomad 0.9.0, the next major release, scoring will also take into consideration user-specified affinities and anti-affinities.

In a traditional data center environment, where and how to place workloads is typically a manual operation requiring decision-making and intervention by an operator. With Nomad, scheduling decisions are automatic and are optimized for the desired workload and the present state and capabilities of the cluster.

Limiting job placement based on driver health

Task drivers are used by Nomad clients to execute tasks and provide resource isolation. Nomad provides an extensible set of task drivers in order to support a broad set of workloads across all major operating systems. Task drivers vary in their configuration options, the environments they can be used in, and the resource isolation mechanisms available.

The types of task drivers in Nomad are: Docker, isolated fork/exec, Java, LXC, Qemu, raw fork/exec, Rkt, and custom drivers written in Go (pluggable driver system coming soon in Nomad 0.9.0).

Driver health checking capabilities, introduced in Nomad 0.8, enable Nomad to limit placement of allocations based on driver health status and by surfacing driver health status to operators. For task drivers that support health-checking, Nomad will exclude allocating jobs to nodes whose drivers are reported as unhealthy.

Healing from lost client nodes

While the Nomad client is running, it performs heartbeating with servers to maintain liveness. If the heartbeats fail, the Nomad servers assume the client node has failed, and they stop assigning new tasks and start creating replacement allocations. It is impossible to distinguish between a network failure and a Nomad agent crash, so both cases are handled the same. Once the network recovers or a crashed agent restarts, the node status will be updated and normal operation resumed.

Limiting job placement based on driver health, and automatically detecting failed client nodes and rescheduling jobs accordingly, are two self-healing features of Nomad that occur without the need for additional monitoring, scripting, or other operator intervention.


In this second post in our series on Building Resilient Infrastructure with Nomad (part 1), we covered how the Nomad client-server agents enable fast and accurate scheduling as well as self-healing through driver health checks and liveness heartbeats.

Nomad client agents are responsible for determining the resources and capabilities of their hosts, including which drivers are available, and for running tasks. Nomad server agents are responsible for maintaining cluster state and for scheduling tasks. Client and server agents work together to enable fast, accurate scheduling as well as self-healing actions such as automatic rescheduling of tasks off failed nodes and marking nodes with failing drivers as ineligible to receive tasks requiring those drivers.

In the next post, we'll look at how Nomad helps operators manage the Job Lifecycle: updates, rolling deployments, including canary and blue-green deployments, as well as migrating tasks as part of client node decommissioning.



Building Resilient Infrastructure with Nomad: Restarting tasks

Building Resilient Infrastructure with Nomad: Restarting tasks
// Hashicorp Blog

Nomad is a powerful and flexible scheduler designed for long-running services and batch jobs. Through a wide range of drivers, Nomad can schedule container-based workloads, raw binaries, Java applications, and more. Nomad is easy to operate and scale, and integrates seamlessly with HashiCorp Consul for service-to-service communication and HashiCorp Vault for secrets management.

Nomad provides developers with self-service infrastructure. Nomad jobs are described using a high-level declarative syntax that is version controlled and promotes infrastructure as code. Once a job is submitted, Nomad is responsible for deploying it and ensuring the availability of the service. One of the benefits of running Nomad is increased reliability and resiliency of your computing infrastructure.

Welcome to our series on Building Resilient Infrastructure with Nomad, where we explore how Nomad handles unexpected failures, outages, and routine maintenance of cluster infrastructure, often without operator intervention required.

In this first post, we'll look at how Nomad automates the restart of failed and unresponsive tasks as well as reschedule of repeatedly failing tasks to other nodes.

Tasks and job declaration

Diagram showing simplified Nomad job workflow.

A Nomad task is a command, service, application, or other workload executed on Nomad client nodes by their driver. Tasks can be short-lived batch jobs or long-running services such as a web application, database server, or API.

Tasks are defined in a declarative jobspec in HCL syntax. Job files are submitted to the Nomad server and the server determines where and how the task(s) defined in the job file should be allocated to client nodes. Another way to conceptualize this is: The job spec represents the desired state of the workload and the Nomad server creates and maintains the actual state.

The hierarchy for a job is: job → group → task. Each job file has only a single job; however, a job may have multiple groups, and each group may have multiple tasks. Groups contain a set of tasks that are to be co-located on the same node.

Here is a simplified job file defining a Redis workload:

```hcl
job "example" {
  datacenters = ["dc1"]
  type        = "service"

  constraint {
    attribute = "${}"
    value     = "linux"
  }

  group "cache" {
    count = 1

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
      }
    }
  }
}
```

Job authors can define constraints as well as resources for their workloads. Constraints limit the placement of workloads on nodes by attributes such as kernel type and version. Resource requirements include the memory, network, CPU, etc. required for the task to run.

There are three types of jobs: system, service, and batch; the type determines the scheduler Nomad will use for the tasks in the job. The service scheduler is designed for scheduling long-lived services that should never go down. Batch jobs are much less sensitive to short-term performance fluctuations and are short-lived, finishing in a few minutes to a few days. The system scheduler is used to register jobs that should be run on all clients that meet the job's constraints. It is also invoked when clients join the cluster or transition into the ready state.

Nomad makes task workloads resilient by allowing job authors to specify strategies for automatically restarting failed and unresponsive tasks as well as automatically rescheduling repeatedly failing tasks to other nodes.

Diagram showing Nomad job workflow, local restarts, and rescheduling.

Restarting failed tasks

Task failure can occur when a task fails to complete successfully, as in the case of a batch type job or when a service fails due to a fatal error or running out of memory.

Nomad will restart failed tasks on the same node according to the directives in the restart stanza of the job file. Operators specify the number of restarts allowed with attempts, how long Nomad should wait before restarting the task with delay, and the window of time that attempted restarts are limited to with interval. Use the (failure) mode to specify what Nomad should do if the job is still not running after all restart attempts within the given interval have been exhausted.

The default failure mode is fail, which tells Nomad not to attempt to restart the job. This is the recommended value for non-idempotent jobs which are not likely to succeed after a few failures. The other option is delay, which tells Nomad to wait the amount of time specified by interval before restarting the job.

The following restart stanza tells Nomad to attempt a maximum of 2 restarts within 30 minutes, delaying 15s between each restart, and not to attempt any more restarts after those are exhausted. This is also the default restart policy for non-batch type jobs.

```hcl
group "cache" {
  ...
  restart {
    attempts = 2
    interval = "30m"
    delay    = "15s"
    mode     = "fail"
  }

  task "redis" {
    ...
  }
}
```

This local restart behavior is designed to make tasks resilient against bugs, memory leaks, and other ephemeral issues. This is similar to using a process supervisor such as systemd, upstart, or runit outside of Nomad.

Restarting unresponsive tasks

Another common scenario is needing to restart a task that is not yet failing but has become unresponsive or otherwise unhealthy.

Nomad will restart unresponsive tasks according to the directives in the check_restart stanza. This works in conjunction with Consul health checks. Nomad will restart tasks when a health check has failed limit times. A value of 1 causes a restart on the first failure. The default, 0, disables health check based restarts.

Failures must be consecutive. A single passing check will reset the count, so services alternating between a passing and failing state may not be restarted. Use grace to specify a waiting period to resume health checking after a restart. Set ignore_warnings = true to have Nomad treat a warning status like a passing one and not trigger a restart.

The following check_restart policy tells Nomad to restart the Redis task after its health check has failed 3 consecutive times, to wait 90 seconds after restarting the task to resume health checking, and to restart upon a warning status (in addition to failure).

```hcl
task "redis" {
  ...
  service {
    check_restart {
      limit           = 3
      grace           = "90s"
      ignore_warnings = false
    }
  }
}
```
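For context, check_restart evaluates the Consul health checks defined in the same service block. A fuller sketch might look like the following; the service name, port label, and check parameters here are illustrative assumptions, not part of the original example:

```hcl
task "redis" {
  service {
    name = "redis"
    port = "db"  # assumes a network port labeled "db" in the task group

    # The Consul health check that check_restart watches.
    check {
      type     = "tcp"
      interval = "10s"
      timeout  = "2s"
    }

    # Restart the task after 3 consecutive check failures,
    # then pause health checking for 90s after the restart.
    check_restart {
      limit           = 3
      grace           = "90s"
      ignore_warnings = false
    }
  }
}
```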

In a traditional data center environment, restarting failed tasks is often handled with a process supervisor, which needs to be configured by an operator. Automatically detecting and restarting unhealthy tasks is more complex, and requires either custom scripts to integrate monitoring systems or operator intervention. With Nomad, both happen automatically with no operator intervention required.

Rescheduling failed tasks

Tasks that are not running successfully after the specified number of restarts may be failing due to an issue with the node they are running on such as failed hardware, kernel deadlocks, or other unrecoverable errors.

Using the reschedule stanza, operators tell Nomad under what circumstances to reschedule failing jobs to another node.

Nomad prefers to reschedule to a node not previously used for that task. As with the restart stanza, you can specify the number of reschedule attempts Nomad should try with attempts, how long Nomad should wait between reschedule attempts with delay, and the amount of time to limit reschedule attempts to with interval.

Additionally, specify the function used to calculate subsequent reschedule delays after the initial delay with delay_function. The options are constant, exponential, and fibonacci. For service jobs, the fibonacci delay function has the nice property of rescheduling quickly at first to recover from short-lived outages, while slowing down to avoid churn during longer outages. When using the exponential or fibonacci delay functions, use max_delay to set the upper bound for the delay time, after which it will not increase. Set unlimited to true to allow unlimited reschedule attempts, or false to cap them at the number given by attempts.
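To illustrate the fibonacci option, the following sketch (values are illustrative) starts at a 30-second delay and grows each subsequent delay as the sum of the two previous ones, until capped by max_delay:

```hcl
group "cache" {
  reschedule {
    delay          = "30s"
    delay_function = "fibonacci"
    max_delay      = "10m"
    unlimited      = true
    # Delays grow as 30s, 30s, 60s, 90s, 150s, 240s, ... capped at 10m.
  }
}
```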

To disable rescheduling completely, set attempts = 0 and unlimited = false.
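For example, to opt a group out of rescheduling entirely:

```hcl
group "cache" {
  reschedule {
    attempts  = 0
    unlimited = false
  }
}
```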

The following reschedule stanza tells Nomad to try rescheduling the task group an unlimited number of times and to increase the delay between subsequent attempts exponentially, with a starting delay of 30 seconds up to a maximum of 1 hour.

```hcl
group "cache" {
  ...
  reschedule {
    delay          = "30s"
    delay_function = "exponential"
    max_delay      = "1hr"
    unlimited      = true
  }
}
```

The reschedule stanza does not apply to system jobs because they run on every node.

As of Nomad version 0.8.4, the reschedule stanza applies during deployments.

In a traditional data center, node failures would be detected by a monitoring system and trigger an alert for operators. Then operators would need to manually intervene either to recover the failed node, or migrate the workload to another node. With the reschedule feature, operators can plan for the most common failure scenarios and Nomad will respond automatically, avoiding the need for manual intervention. Nomad applies sensible defaults so most users get local restarts and rescheduling without having to think about the various restart parameters.


In this first post in our series Building Resilient Infrastructure with Nomad, we covered how Nomad provides resiliency for computing infrastructure through automated restarting and rescheduling of failed and unresponsive tasks.

Operators specify Nomad's local restart strategy for failed tasks with the restart stanza. When used in conjunction with Consul and the check_restart stanza, Nomad will locally restart unresponsive tasks according to the restart parameters. Operators specify Nomad's strategy for rescheduling failed tasks with the reschedule stanza.

In the next post we'll look at how the Nomad client enables fast and accurate scheduling as well as self-healing through driver health checks and liveliness heartbeats.

