Building an Openstack Cloud

Background

Last February I started looking at Openstack.  The project seemed quite impressive, but still felt rough around the edges.  Armed with the unwieldy installation manual, I made an attempt at installing my own cloud.  After a few days of tinkering I promptly gave up and moved on to using devstack.

Devstack is a set of tools that automatically installs an Openstack cloud for developers.  The problem is that it's not a production-ready environment; it's intended to give you an environment to test changes to Openstack code, blow away, and start again.  It was still useful enough to let me proof-of-concept a few projects using Openstack and Heat.

Since then Openstack has moved on massively: Grizzly was released, Havana is almost available (later in October), and RedHat has spent much time and effort announcing RedHat Openstack (RHOS).  While RHOS looks like an attractive proposition for large companies who want the comfort blanket of commercial support, it's the upstream cousin, RDO, that is getting much of the attention from everyone else.  Part of the RedHat/RDO effort has been the creation of packstack, a set of Puppet modules that helps install Openstack.

Out of the box, it's very simple to run packstack --allinone, and about 15 minutes later you will have an Openstack cloud running.
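
For example:

packstack --allinone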

Packstack works by generating an answer file with defaults, which you can then override.  Using the --allinone flag will insert sensible defaults based on your machine's interfaces.  If you want to set up a multi-node deployment, you will have to do a little more work.

Over the past few days I've gone through several cycles of installing and uninstalling, and eventually reached the point where I have my very own multi-node cloud.  Packstack works very well, and in fairness most of the issues I faced were due to working around my own environment, where I am trying to integrate into an existing development lab.  The rest of this blog will explain how I actually set up my system.


Cloud Design

I started by roughly sketching the physical layout and topology of my cloud.   I had access to three HP DL360g7 servers in the existing lab, each with 4x1G ports and 2x10G ports.  Two of the 1G ports are connected as access switch ports, and one of the 10G ports is connected as a trunk.

In my work we develop a server that handles a high volume of TCP traffic.  On a physical box we aim to support around 30K TPS, which equates to around 5Gbps with the traffic profile we use.  The intention was to build a cloud where my VMs would be connected via the 10G interface, and to evaluate what performance we could get from a cloud platform.

The diagram below illustrates my setup:

[Diagram: Openstack physical topology]

I have designated the three servers 'server 01', 'server 02' and 'server 03'.   Server 01 will be used as the controller, network node, compute node and storage node.  Servers 02 and 03 will be purely compute nodes.  This is simply to make the most of my limited hardware.  In production you would want to make sure you have a dedicated network node, as well as active standbys for the controller and network node to provide HA.


Public/API Network

All 3 machines are connected to the existing lab network, and have an address in the 192.168.0.0/24 range.  There are several other non-Openstack servers connected to this network, and it is routable from employees' desktops. It also provides external access to the internet.   By default VMs on the cloud will get a private address, but we will be able to assign a floating IP from this network to provide access into the VMs from the lab network.
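
Once the cloud is up, attaching one of these floating IPs to a VM looks roughly like this (the pool and VM names here are just illustrative; I'll cover configuring the cloud itself in the next post):

nova floating-ip-create public
nova add-floating-ip my-vm 192.168.0.100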

Management Network

Openstack has a number of services that need to talk to one another to coordinate tasks like booting VMs, monitoring, etc.  It's best to place this traffic on a dedicated management network.  Here I have created a private VLAN that uses 172.16.0.0/24.

VM Network

The VM network is connected using the 10G port.  The physical machines don't actually take an IP address on this network; instead Openstack's networking component, Neutron, will use Open vSwitch to plumb VMs onto this network.

Any communication between the virtual machines will go across this network.   The networking mode I use is tenant networks with VLANs.  When you create a network within Openstack, it assigns a VLAN id, so it's important that the physical port these machines are connected to is configured as a trunk.  In my case I was assigned VLAN ids in the range 10-20, which I can configure Openstack to use.  Initially my ports were configured as access ports, which meant that VMs on different physical hosts weren't able to talk to one another as the switch rejected the tagged packets.
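
If you hit the same problem, a quick diagnostic once VMs are running is to watch the trunk interface for 802.1Q-tagged frames with tcpdump (adjust the interface name to match your own setup):

# print link-level headers (-e) so the VLAN tags are visible
tcpdump -e -n -i eth2 vlan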

Host Installation

I started by freshly installing my machines with CentOS 6.4. Normally we would use RHEL 6.4, but this install requires some packages that aren't available in the RHEL repos yet, so you would need to add a CentOS repo either way.

I installed using the minimal server configuration.
Once complete, I added the Openstack RDO Havana repo, upgraded the system and installed the following packages:


yum install -y http://rdo.fedorapeople.org/openstack-havana/rdo-release-havana.rpm
yum update -y
yum install nmap tcpdump openstack-packstack

At this point you should make sure your network interfaces are configured. You can use DHCP, but I found there was a lot of messing around with the network, so it was easier to configure them statically. I also had a few issues around routing, where NetworkManager seemed to select the wrong default route. To avoid any messing around I disabled NetworkManager too.

The table below illustrates the interfaces configured:

Server      Public/API            Management            VM Net
Server 01   eth0: 192.168.0.11    eth1: 172.16.0.11     eth2: (no IP)
Server 02   eth0: 192.168.0.12    eth1: 172.16.0.12     eth2: (no IP)
Server 03   eth0: 192.168.0.13    eth1: 172.16.0.13     eth2: (no IP)

Here is an example of the interface cfg files from server 01.
/etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=static
IPADDR=192.168.0.11
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
DOMAIN=mydomain.com
DNS1=192.168.10.110
DNS2=192.168.10.111
ONBOOT=yes
DEFROUTE=yes

/etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.16.0.11
DEFROUTE=no

/etc/sysconfig/network-scripts/ifcfg-eth2

DEVICE=eth2
TYPE=Ethernet
ONBOOT=yes
DEFROUTE=no

Shut down NetworkManager, and start the network daemon. You should do this from the console.

service NetworkManager stop
chkconfig NetworkManager off

service network start
chkconfig network on

Once this is complete, verify you still have connectivity on each of your interfaces, and that your /etc/resolv.conf is correct. As NetworkManager no longer looks after it, you may need to manually populate the details of your nameservers, i.e.

search mydomain.com
domain mydomain.com
nameserver 192.168.10.110
nameserver 192.168.10.111

Finally, there seems to be an issue around SELinux policies and Openstack. For now you need to edit /etc/selinux/config and set

SELINUX=permissive
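
You can also switch the running system to permissive mode immediately, without waiting for the reboot below:

setenforce 0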

At this stage you're almost ready to start installing, but as part of the yum update a new Openstack kernel that includes support for GRE will have been installed.  You should reboot your server at this point to load the new kernel.
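
After the reboot you can confirm which kernel is running with:

uname -r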


Openstack Installation

When your system reboots, log onto Server 01 to begin installing Openstack.  From the command line run

packstack --gen-answer-file=multi-node.txt

This will generate an answer file with some defaults that you can modify to suit your needs.

Open multi-node.txt with vim, or some other suitable editor.  If you search for the string 'CONFIG_MYSQL_HOST' you should see that packstack has configured the host using your Public/API network address.  If this is the case, do a search and replace of 192.168.0.11 with your management network address, i.e. 172.16.0.11.
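
For example, with sed:

sed -i 's/192.168.0.11/172.16.0.11/g' multi-node.txt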

Next, configure the following settings.

Host Settings

CONFIG_NOVA_VNCPROXY_HOST

You should set this to be the public IP of your controller node i.e.

CONFIG_NOVA_VNCPROXY_HOST=192.168.0.11

This will allow you to VNC into VMs from the Openstack horizon dashboard.

CONFIG_NOVA_COMPUTE_HOSTS

You should specify the management addresses of all machines you wish to use as compute hosts i.e.

CONFIG_NOVA_COMPUTE_HOSTS=172.16.0.11,172.16.0.12,172.16.0.13

Network Settings

The following settings relate to setting up the network for inter-VM communication.

CONFIG_NEUTRON_OVS_TENANT_NETWORK_TYPE

The default is local, but you should change this to vlan.

CONFIG_NEUTRON_OVS_TENANT_NETWORK_TYPE=vlan


CONFIG_NEUTRON_OVS_VLAN_RANGES

You can specify the VLAN id range to use here.  If you were allocated a specific set of VLAN ids, enter them here; I was given 10-20, so I would specify

CONFIG_NEUTRON_OVS_VLAN_RANGES=physnet1:10:20


CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS

You should specify the physical bridge that will be used for VM communication.  Packstack will create this, but the convention is br-<interface name>, i.e.

CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth2


CONFIG_NEUTRON_OVS_BRIDGE_IFACES

This specifies which port to add to the bridge above.  So in our case we want to add eth2 as a port to br-eth2 i.e.

CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-eth2:eth2
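
Once the installation has run, you can verify the bridge wiring with Open vSwitch's CLI, e.g.:

ovs-vsctl show               # overall bridge and port layout
ovs-vsctl list-ports br-eth2 # should include eth2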


CONFIG_NTP_SERVERS

Specify one or more NTP servers to keep your nodes in sync

CONFIG_NTP_SERVERS=0.uk.pool.ntp.org,1.uk.pool.ntp.org,2.uk.pool.ntp.org,3.uk.pool.ntp.org


Services

You may want to install some services which are off by default.  I changed the following:

CONFIG_SWIFT_INSTALL=y
CONFIG_HEAT_INSTALL=y
CONFIG_HEAT_CFN_INSTALL=y

For my purposes we are interested in performance; however, by default Openstack will overcommit your CPU and RAM resources. To avoid this you can change the ratio to 1.0, i.e. if you have 24 logical CPUs (on my DL360g7: 2 CPUs x 6 cores x 2 hyperthreads = 24) then you will have 24 vCPUs available in Openstack.

CONFIG_NOVA_SCHED_CPU_ALLOC_RATIO=1.0
CONFIG_NOVA_SCHED_RAM_ALLOC_RATIO=1.0
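
As far as I can tell, these correspond to the scheduler allocation ratios that packstack writes into /etc/nova/nova.conf (an assumption on my part, so verify on your own install):

# /etc/nova/nova.conf -- no overcommit
cpu_allocation_ratio=1.0
ram_allocation_ratio=1.0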

You can now start your Openstack installation by running

packstack --answer-file=multi-node.txt

When it completes, packstack should present you with the URL to log in to. It will also have generated a file called keystonerc_admin. Take a look at this file to get the admin password ('OS_PASSWORD') to log in with.
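
You can also use this file to drive the command-line clients; for example (assuming packstack wrote keystonerc_admin to your home directory):

source ~/keystonerc_admin
nova list          # should return an empty instance list on a fresh cloud
keystone user-list # confirms the admin credentials work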

Now that you have a cloud, take a look around the dashboard before reading my next post on how to configure it.

Dealing with duplicate photos

Over the past couple of years I've gone through a few different computers.  Due to laziness on my behalf I've let my photo library grow wildly out of control.  It got to the point where I had manually created backups on three different drives.  Unfortunately I had no designated master, and now had a jumble of 100G of photos to sort out.

I knew there were a lot of duplicates, but because I'd used different media managers on different computers, the duplicate photos were often named differently.  It was time to sort it all out – enter fdupes.

fdupes is a nice utility that will go through a collection of files and identify duplicates.  It does this by doing a binary comparison, so it doesn’t matter if your files are named differently.

The syntax is pretty straightforward

fdupes -R . -s -1

This simply tells fdupes to recursively (-R) go through all directories from the current one (.), checking all files including symlinks (-s), and to list all occurrences of a duplicate file on the same line (-1).

If fdupes identifies any duplicates you will get output like this:

./2012/Jan/img-01.jpg ./2012/restored/03243.jpg

./2013/Jan/img-01.jpg ./someotherfolder/img-01.jpg

Each line indicates that a duplicate photo has been found, and within each line all occurrences of the duplicate are listed.


A simple script can then be used to process this data.  The following script uses awk to ignore the first column of each line, and print the remaining columns (the duplicates) into a new file.  This new file can then be processed however you like (moving or deleting the files listed in it).


#!/bin/bash

FILE=$1
TMP_FILE=/tmp/files_to_delete.$RANDOM

# Process the fdupes output, printing each duplicate file onto a new line.
# The first file on each line is treated as the master; the rest are
# duplicates, split on the " ./" separator.
# Note: this assumes file paths don't contain spaces, since fdupes -1
# separates duplicates with spaces.
while read -r line
do
    echo "$line" | awk -F' ./' '{for(i=2;i<=NF;i++){printf "./%s\n", $i};}' >> "$TMP_FILE"
done < "$FILE"

echo "Wrote duplicates to $TMP_FILE"

You could modify this script to delete your duplicates, but by writing them to a separate file first, it gives you the opportunity to review the data before doing anything destructive.

If you want to delete the files, try this:


#!/bin/bash

FILE=$1

# Remove every file listed in the input file, one path per line.
while read -r file
do
    if [ -f "$file" ]; then
        rm -f "$file"
    fi
done < "$FILE"
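
Putting it all together, the whole workflow looks something like this (the script names here are just illustrative):

fdupes -R . -s -1 > duplicates.txt
./extract-duplicates.sh duplicates.txt   # writes /tmp/files_to_delete.NNNNN
# review the list carefully, then:
./delete-duplicates.sh /tmp/files_to_delete.NNNNN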