Setup of Ganeti cluster on root servers of Hetzner

I'll set up a 3-node Ganeti cluster on Hetzner's root servers to run static and Drupal-based sites for friends and NGOs. Ganeti is designed to facilitate cluster management of virtual servers (the "instances") using commodity hardware (the "nodes"). It provides fast and simple recovery after physical failures, disk creation management, operating system installation, startup, shutdown, and failover between physical systems. Ganeti is built on top of existing virtualization technologies such as Xen or KVM and other open source software. It's easy to start with one physical node, and one cluster can scale up to 150 physical nodes. Below I'll follow Ganeti terminology: the root servers are called "nodes" and act as KVM hosts for the guest systems, and the guest systems (the virtual machines) are called "instances".

The recent Ganeti stack on Debian Wheezy, as of October 2014, consists of:

  • Debian Wheezy 7.7
  • DRBD 8.3.11
  • KVM 1.1.2
  • Ganeti 2.11.5.

This will definitely change over time, of course. Next to the Ganeti stack, some other tools will also be installed: pound as reverse proxy and load balancer, and Shorewall and fail2ban as firewall to reject malicious IP addresses.

Get root servers with Debian image

First, order root servers with one additional IP each. Register at Hetzner, and either order some new E40 servers or go to the server bidding for a bargain. After 30-60 minutes and some confirmation emails containing the node IP address and a temporary root password, you can log in to your fresh root server. Mine has an Intel i7-3770, 32 GB RAM and 2x3TB disks, with 10 TB traffic per month.

Log in and run the installimage script to deploy Hetzner's prepared Debian OS image on the node.

Select Debian, and the 64 bit minimal image named debian-77-wheezy-64-minimal. In the editor use software RAID, and set the hostname as FQDN like this: node1.example.com

For the partition layout, set a 1GB /boot partition and 3 volume groups. I take 120GB for the system volume group, which holds 16GB swap and the 100+GB ext4 root filesystem for the OS, a 1500GB volume group for Ganeti, and a volume group named unused for the rest of the disk space.

PART   /boot  ext4    1G
PART   lvm    system  120G
PART   lvm    ganeti  1500G
PART   lvm    unused  all
LV     system swap    swap  swap  16G
LV     system root    /     ext4  all

Save (F2) and exit (F10), then the install process will run.

When it's ready, reboot and log in again with the temporary password. Check a few things:

  • RAID status

    Check with cat /proc/mdstat; it should look similar to this:

    Personalities : [raid1]
    md3 : active (auto-read-only) raid1 sda5[0] sdb5[1]
          253635520 blocks super 1.2 [2/2] [UU]
              resync=PENDING
    
    
    md2 : active (auto-read-only) raid1 sda3[0] sdb3[1]
          1572732736 blocks super 1.2 [2/2] [UU]
              resync=PENDING
    
    
    md1 : active raid1 sda2[0] sdb2[1]
          125763456 blocks super 1.2 [2/2] [UU]
          [>....................]  resync =  0.4% (572672/125763456) finish=1901.8min speed=1096K/sec
    
    
    md0 : active raid1 sda1[0] sdb1[1]
          1048000 blocks super 1.2 [2/2] [UU]
    
    
    unused devices: <none>
    
  • Volume groups status

    Run vgscan; the result should look like this:

    Reading all physical volumes.  This may take a while...
    Found volume group "unused" using metadata type lvm2
    Found volume group "ganeti" using metadata type lvm2
    Found volume group "system" using metadata type lvm2
    
  • Networking

    Check with ifconfig.

  • and so on...
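The RAID and volume group checks above can also be scripted, so the same test can be repeated on every node. This is only a minimal sketch: the function names mdstat_healthy and missing_vgs are made up for illustration, and the volume group names are the ones from the partition layout above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeeds if no md array in the given mdstat file is degraded.
# A failed or missing member shows up as "_" in the status, e.g. [U_].
mdstat_healthy() {
    ! grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# Reads volume group names on stdin and prints every expected name
# (given as arguments) that is not among them.
missing_vgs() {
    found=$(tr -d ' \t')
    for vg in "$@"; do
        printf '%s\n' "$found" | grep -qxF "$vg" || printf '%s\n' "$vg"
    done
}

# Usage on a node (as root):
#   mdstat_healthy && echo "RAID OK"
#   vgs --noheadings -o vg_name | missing_vgs system ganeti unused
```

An array that is still resyncing, like md1 above, counts as healthy here, since both members are present.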


Basic Debian config

Update the OS, add a few utilities and users, and set up SSH. See: https://gist.github.com/doka/3df1fffb7ab331592c4c

Reboot and log in again with the temporary password, and start with the locales.

locales

The locales should be reconfigured first, before changing passwords. I select my local language as the second language next to English, and set English as the system language. For me the locales will be en_US.UTF-8 and hu_HU.UTF-8. Use either:

dpkg-reconfigure locales

or:

echo "
# This file lists locales that you wish to have built. You can find a list
# of valid supported locales at /usr/share/i18n/SUPPORTED, and you can add
# user defined locales to /usr/local/share/i18n/SUPPORTED. If you change
# this file, you need to rerun locale-gen.
en_US.UTF-8 UTF-8
hu_HU.UTF-8 UTF-8
" | tee /etc/locale.gen

# A quoted heredoc keeps the inner double quotes in the file
cat > /etc/default/locale <<'EOF'
# File generated by update-locale
LANG=en_US.UTF-8
LANGUAGE="en_US:en"
EOF

locale-gen
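To confirm that locale-gen really built both locales, the output of locale -a can be checked. A minimal sketch; locale_generated is a hypothetical helper name, and it assumes that locale -a prints the codeset as utf8 (without the dash), as glibc systems usually do.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeeds if the named locale appears in a "locale -a" style list on stdin.
# Case and dashes are ignored, so hu_HU.UTF-8 matches hu_HU.utf8.
locale_generated() {
    want=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    tr '[:upper:]' '[:lower:]' | sed 's/-//g' | grep -qxF "$want"
}

# Usage:
#   locale -a | locale_generated hu_HU.UTF-8 && echo "hu_HU is available"
```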

root password

Simply change the root password, but first check your locales, especially if the password contains special characters!

passwd

timezone

You can change the timezone by running:

  dpkg-reconfigure tzdata

default editor

I’ve found vim set as the default editor in /root/.bash_profile. It’s Hetzner specific, not Debian. I change it to nano:

  sed -i 's/EDITOR="vim"/EDITOR="nano"/' /root/.bash_profile

After a new login you’ll have nano.

hardening SSH

Copy your public key to the new server. First log out from the server, then run the following from the home directory on your notebook (not on the new server!):

# mkdir -p makes sure the .ssh directory exists on the node
cat .ssh/id_*sa.pub | ssh root@node1 'mkdir -p .ssh && chmod 700 .ssh && cat >> .ssh/authorized_keys'
ssh root@node1 'chmod 600 .ssh/authorized_keys;'

A simpler way, for example on an Ubuntu client:

  ssh-copy-id root@node1

If SSH warns you about a changed host key, remove the old record:

  ssh-keygen -f ".ssh/known_hosts" -R IP-of-the-newserver

Now you can log in without a password:

  ssh root@node1

Enable key-only authentication on the node, but do not restart SSH yet; we’ll do it a bit later!

sed -i 's/PermitRootLogin yes/PermitRootLogin without-password/' /etc/ssh/sshd_config
sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config

Again, do not restart the SSH server now, we’ll do it later!
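The two sed commands silently do nothing if the expected line is not in the file, so it may be worth confirming the result before the restart later on. A minimal sketch; ssh_hardened is a hypothetical helper name, and the usage line relies on sshd -t, the configuration test mode of OpenSSH's sshd.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeeds if the given sshd_config contains both hardened settings
# as active (uncommented) lines.
ssh_hardened() {
    cfg="${1:-/etc/ssh/sshd_config}"
    grep -Eq '^[[:space:]]*PermitRootLogin[[:space:]]+without-password([[:space:]]|$)' "$cfg" &&
    grep -Eq '^[[:space:]]*PasswordAuthentication[[:space:]]+no([[:space:]]|$)' "$cfg"
}

# Usage on the node (as root):
#   ssh_hardened && sshd -t && echo "config looks fine"
```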

update OS

Replace /etc/apt/sources.list with the Hetzner apt mirrors for Debian Wheezy:

echo "
###############################################################################
# Hetzner APT-Mirror
deb     http://mirror.hetzner.de/debian/packages wheezy main contrib non-free
deb     http://mirror.hetzner.de/debian/security wheezy/updates main contrib non-free

###############################################################################
# Backup mirrors
#
deb     http://cdn.debian.net/debian/ wheezy main non-free contrib
deb-src http://cdn.debian.net/debian/ wheezy main non-free contrib

deb     http://security.debian.org/  wheezy/updates  main contrib non-free
deb-src http://security.debian.org/  wheezy/updates  main contrib non-free
" | tee /etc/apt/sources.list

Then update and upgrade:

apt-get -y update && apt-get -y upgrade

Add first user in Debian

Add a user with sudo rights:

adduser doka
adduser doka sudo

Log out, and again copy your public key from your notebook to the new server, this time for the new user:

ssh doka@node1 'mkdir .ssh;chmod 700 .ssh;'
cat .ssh/id_*sa.pub | ssh doka@node1 'cat >> .ssh/authorized_keys'
ssh doka@node1 'chmod 600 .ssh/authorized_keys;'

reboot

Now log in to the new server as root. It should not ask for a password, and you can then restart the SSH service or reboot:

  /etc/init.d/ssh restart

Setup networking for Ganeti

We now have an up-to-date Debian Wheezy server with slightly hardened SSH. Let’s continue with networks, IP addresses, hostname, DNS and setting up the networking mode for Ganeti.

Each node will have an IP for host access and an additional public IP, which is used as the cluster IP on the master node and is otherwise unused. The load balancer and firewall will run on the node, everything else on virtual machines (instances). Hetzner also has some rules regarding its network architecture.

Config checks

Check a few relevant settings, like hostname and DNS resolution. Small but important: always use the FQDN in /etc/hostname. The default should be OK, but check it:

cat /etc/hostname

DNS resolution check:

dig google.com
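The hostname check can be automated with a simple pattern test. This is a rough sketch, not a full validator: is_fqdn is a hypothetical helper name, and the pattern only requires at least one dot-separated label followed by an alphabetic top-level domain.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeeds if the argument looks like a fully qualified domain name,
# e.g. node1.example.com, and fails for a short name like node1.
is_fqdn() {
    printf '%s\n' "$1" |
        grep -Eq '^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]{2,}$'
}

# Usage on the node:
#   is_fqdn "$(cat /etc/hostname)" || echo "/etc/hostname is not an FQDN"
```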

Settings in /etc/hosts

Manually define the name resolution for all the nodes and instances by updating the /etc/hosts file.

First back up the original hosts file:

cp /etc/hosts ~/.

Then change /etc/hosts accordingly:

### node 1
#  2014-10
#                 
127.0.0.1 localhost
#
# cluster - on node1
148.251.xxx.yyy  cluster1.dxhost.hu   cluster1
#
# node1
148.251.aaa.bbb  node1.dxhost.hu      node1
148.251.ccc.ddd  vm1.dxhost.hu        vm1
#
# node2
176.9.eee.fff    node2.dxhost.hu      node2
#
# instances
192.168.1.101   vm1.dxhost.hu         vm1
192.168.1.102   vm2.dxhost.hu         vm2
192.168.1.103   vm3.dxhost.hu         vm3
192.168.1.104   vm4.dxhost.hu         vm4
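A name that appears on more than one line with different addresses, as vm1 does in the listing above, can be surprising, because a lookup typically returns only one of them. The following sketch lists such names so that the double entries are at least a conscious choice. duplicate_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Prints every hostname or alias that a hosts file maps to more than
# one distinct IP address, together with the two addresses.
duplicate_hosts() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # rest of the line is a comment
                if (($i in ip) && ip[$i] != $1)
                    print $i ": " ip[$i] " and " $1
                else
                    ip[$i] = $1
            }
        }
    ' "${1:-/etc/hosts}"
}

# Usage:
#   duplicate_hosts /etc/hosts
```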

Settings in /etc/network/interfaces

First back up the original interfaces file:

cp /etc/network/interfaces ~/.

Then change /etc/network/interfaces as follows:

### Networking setup on Debian for Ganeti with KVM
#
# node1.dxhost.hu
# 2014-10

# Loopback device:
auto lo
iface lo inet loopback

# physical device of the host
auto  eth0
iface eth0 inet static
  address      148.251.aaa.bbb
  netmask      255.255.255.255
  gateway      148.251.aaa.ccc
  pointopoint  148.251.aaa.ccc


# Private bridge with NAT-ed access to the Internet
auto  vmbr0
iface vmbr0 inet static
  address      192.168.1.1
  netmask      255.255.255.0
  bridge_ports none
  bridge_stp   off
  bridge_fd    0
  # masquerading private IPs to get access to the Internet:
  post-up   iptables -t nat -A POSTROUTING -s '192.168.1.0/24' -o eth0 -j MASQUERADE
  post-down iptables -t nat -D POSTROUTING -s '192.168.1.0/24' -o eth0 -j MASQUERADE


# Private bridge for node internal traffic
auto  vmbr1
iface vmbr1 inet static
  address      10.10.1.1
  netmask      255.255.255.0
  bridge_ports none
  bridge_stp   off
  bridge_fd    0


# bridge for VMs with public IPs (DMZ)
auto  vmbr2
iface vmbr2 inet static
  address      148.251.aaa.bbb
  netmask      255.255.255.255
  bridge_ports none
  bridge_stp   off
  bridge_fd    0
  # route on the node for the additional public IP
  up   ip route add 148.251.ccc.ddd/32 dev vmbr2
  down ip route del 148.251.ccc.ddd/32 dev vmbr2
  # activate IP forwarding
  post-up echo 1 > /proc/sys/net/ipv4/ip_forward

Setup Ganeti

Reboot the node before continuing.

If something goes wrong, activate the rescue system at Hetzner, log in with the temporary root password, activate the volume group holding the system files, and repair the network setup:

vgchange -ay
lvscan
 ACTIVE            '/dev/system/swap' [16,00 GiB] inherit
 ACTIVE            '/dev/system/root' [103,93 GiB] inherit

mount /dev/system/root /mnt
nano /mnt/etc/network/interfaces

If the reboot goes well and you can log in to the server, check the network setup with ifconfig, ping google.com, and ping node1 from somewhere else. The ifconfig output should look something like this:

eth0      Link encap:Ethernet  HWaddr d4:3d:7e:ec:ff:ee  
          inet addr:148.251.aaa.bbb  Bcast:148.251.aaa.bbb  Mask:255.255.255.255
          inet6 addr: fe80::d63d:7eff:ffff:eeee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:64 errors:0 dropped:0 overruns:0 frame:0
          TX packets:63 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:10327 (10.0 KiB)  TX bytes:7750 (7.5 KiB)
          Interrupt:43 Base address:0x6000

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

vmbr0     Link encap:Ethernet  HWaddr 56:a0:30:40:ff:ee  
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::54a0:30ff:ffff:eeee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:86 (86.0 B)

vmbr1     Link encap:Ethernet  HWaddr 06:36:bf:be:ff:ee  
          inet addr:10.10.1.1  Bcast:10.10.1.255  Mask:255.255.255.0
          inet6 addr: fe80::436:bfff:ffff:eeee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:86 (86.0 B)

vmbr2     Link encap:Ethernet  HWaddr 0e:92:f4:8e:ff:ee  
          inet addr:148.251.aaa.bbb  Bcast:148.251.aaa.bbb  Mask:255.255.255.255
          inet6 addr: fe80::c92:f4ff:ffff:eeee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:180 (180.0 B)
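Instead of reading the ifconfig output by eye, the presence of the interfaces can be tested through /sys/class/net. A minimal sketch; missing_ifaces and the NET_DIR override are hypothetical names, the latter only there so the function can be tried against a test directory.

```shell
#!/bin/sh
# Hypothetical helper; the names are illustrative only.

# Prints every interface name given as argument that does not exist
# under /sys/class/net (or under $NET_DIR, if that variable is set).
missing_ifaces() {
    net_dir="${NET_DIR:-/sys/class/net}"
    for ifc in "$@"; do
        [ -e "$net_dir/$ifc" ] || printf '%s\n' "$ifc"
    done
}

# Usage on the node:
#   missing_ifaces eth0 vmbr0 vmbr1 vmbr2
#   cat /proc/sys/net/ipv4/ip_forward     # should print 1
```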

Now we are going to install and setup KVM, DRBD and the Ganeti packages.

Install KVM hypervisor

We install only KVM and do not need libvirt: libvirt-bin (libvirtd) is a separate management layer for KVM instances, whereas Ganeti uses KVM (the qemu-kvm backend) directly.

Install KVM:

apt-get install qemu-kvm

The following NEW packages will be installed:
  dbus ipxe-qemu libaio1 libasound2 libasyncns0 libbluetooth3 libbrlapi0.5 libcaca0 libcurl3-gnutls libdbus-1-3 libdirectfb-1.2-9 libflac8 libice6 libiscsi1 libjpeg8 libjson0 libogg0 libpixman-1-0 libpng12-0 libpulse0 libsdl1.2debian libsm6 libsndfile1 libspice-server1 libsystemd-login0 libts-0.0-0 libusbredirparser0 libvdeplug2 libvorbis0a libvorbisenc2 libx11-xcb1 libxi6 libxtst6 qemu-keymaps qemu-kvm qemu-utils seabios sharutils tsconf vgabios x11-common
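After the install it is worth checking that the CPU offers hardware virtualization at all, and that the kvm device is there. A minimal sketch; cpu_has_virt is a hypothetical helper name, and it only looks for the vmx (Intel) or svm (AMD) flag in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeeds if the flags line of the given cpuinfo file lists the
# Intel (vmx) or AMD (svm) hardware virtualization flag.
cpu_has_virt() {
    grep -Eq '^flags[[:space:]]*:.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Usage on the node:
#   cpu_has_virt && [ -e /dev/kvm ] && echo "KVM is usable"
```

If the flag is missing, virtualization may simply be disabled in the BIOS.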

Install and config DRBD

DRBD is RAID1 over the network, and is required for high availability. The stable (default) Debian Wheezy repository contains DRBD 8.3.13 and Ganeti 2.5.2, so we have to use the backports repository to get Ganeti 2.11.5.

Add the Debian backports repo to the source list /etc/apt/sources.list.

Debian backports:
# Debian backports repository
deb http://ftp.debian.org/debian wheezy-backports main contrib

Or all in one:

cat >> /etc/apt/sources.list <<-EOF
###############################################################################
# Debian backports repository
deb http://ftp.debian.org/debian wheezy-backports main contrib
EOF

Update packages:

apt-get update

Install DRBD from backports:

apt-get -t wheezy-backports install drbd8-utils

The following NEW packages will be installed:
  drbd8-utils heirloom-mailx

Configure DRBD:

echo drbd minor_count=128 usermode_helper=/bin/true >> /etc/modules
depmod -a
modprobe drbd minor_count=128 usermode_helper=/bin/true
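To see whether the module really picked up both options, the values can be read back. This sketch assumes that the drbd module exposes its parameters under /sys/module/drbd/parameters; drbd_param, drbd_params_ok and the DRBD_PARAM_DIR override are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.
# Assumption: the loaded drbd module lists its parameters as files
# under /sys/module/drbd/parameters.

# Prints the current value of one drbd module parameter.
drbd_param() {
    cat "${DRBD_PARAM_DIR:-/sys/module/drbd/parameters}/$1"
}

# Succeeds if both parameters have the values set with modprobe above.
drbd_params_ok() {
    [ "$(drbd_param minor_count)" = "128" ] &&
    [ "$(drbd_param usermode_helper)" = "/bin/true" ]
}

# Usage on the node:
#   drbd_params_ok && echo "DRBD module options are active"
```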

Change the active LVM filter to ignore DRBD devices. Open /etc/lvm/lvm.conf and set the filter line as shown:

nano /etc/lvm/lvm.conf
filter = [ "r|/dev/cdrom|", "r|/dev/drbd[0-9]+|" ]

Run vgscan after you change this parameter to ensure that the cache file gets regenerated.
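To get a feeling for what this filter does, the matching can be imitated in a few lines of shell. This is only an approximation of LVM's behaviour, under the assumption that the first matching pattern decides and that a device matching no pattern is accepted; lvm_filter_accepts is a hypothetical helper name and uses grep -E instead of LVM's own regex engine.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeeds if a device path would pass the two-rule filter from above.
# "r" rules reject, "a" rules accept; the first matching rule decides.
lvm_filter_accepts() {
    dev="$1"
    for rule in 'r|/dev/cdrom|' 'r|/dev/drbd[0-9]+|'; do
        action=$(printf '%s\n' "$rule" | cut -c1)          # "a" or "r"
        regex=$(printf '%s\n' "$rule" | cut -d'|' -f2)     # text between the bars
        if printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            [ "$action" = a ]
            return
        fi
    done
    return 0    # no rule matched: accepted by default
}

# Usage:
#   lvm_filter_accepts /dev/drbd0 || echo "drbd0 is ignored by LVM"
```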

Install and config Ganeti

Check what is actually available in backports:

apt-cache show ganeti2

First lines should be like:

Package: ganeti2
Source: ganeti
Version: 2.11.5-1~bpo70+1
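The same check can be done without reading the output by hand. A minimal sketch; pkg_version and is_backport are hypothetical helper names, and the test for a backports build simply looks for the ~bpo marker seen in the version string above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Prints the Version field of the first package stanza read from stdin.
pkg_version() {
    awk '$1 == "Version:" { print $2; exit }'
}

# Succeeds if a version string carries the ~bpo backports marker.
is_backport() {
    case "$1" in
        *"~bpo"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on the node:
#   v=$(apt-cache show ganeti2 | pkg_version)
#   is_backport "$v" && echo "ganeti2 $v comes from backports"
```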

Install Ganeti from backports:

apt-get -t wheezy-backports install ganeti2

It also installs the OS support package ganeti-instance-debootstrap, which provides the default OS install scripts for instances. Nevertheless, I’m not using it.

The following NEW packages will be installed:

debootstrap dump fping ganeti ganeti-2.11 ganeti-haskell-2.11 ganeti-htools-2.11 ganeti-instance-debootstrap ganeti2 iputils-arping javascript-common kpartx libgmp10   libjs-jquery libsysfs2 ndisc6 python-bitarray python-crypto python-fdsend python-ipaddr python-openssl python-paramiko python-pycurl python-pyinotify python-pyparsing python-simplejson python-support socat wwwconfig-common

If you see this warning, then it’s OK, since the cluster is not yet initialized.

    [....] Starting Ganeti cluster:Missing configuration file /var/lib/ganeti/server.pem
    [warn] Incomplete configuration, will not run. ... (warning).

Set kernel path

Set symlinks for the kernel and initrd. They are needed in order to add the node to the cluster, and also for ganeti-instance-image in case of dump-based instance creation.

cd /boot
ln -s vmlinuz-3.2.0-4-amd64 vmlinuz-3-kvmU
ln -s initrd.img-3.2.0-4-amd64 initrd-3-kvmU
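The kernel version above is hard-coded, so the links break after a kernel upgrade to a new ABI version. As a sketch, the links can be derived from the running kernel instead; link_kernel is a hypothetical helper name, and it assumes that the running kernel (uname -r) is the one the links should point to.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Creates (or replaces) the two symlinks in the boot directory.
# $1: boot directory (default /boot), $2: kernel release (default: running).
link_kernel() {
    boot="${1:-/boot}"
    kver="${2:-$(uname -r)}"
    # refuse to create dangling links
    [ -e "$boot/vmlinuz-$kver" ] && [ -e "$boot/initrd.img-$kver" ] || return 1
    ln -sfn "vmlinuz-$kver" "$boot/vmlinuz-3-kvmU" &&
    ln -sfn "initrd.img-$kver" "$boot/initrd-3-kvmU"
}

# Usage on the node (as root):
#   link_kernel && ls -l /boot/*kvmU
```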

Setup the OS templating tool

In Ganeti, unlike virsh, you can't just install any OS from an ISO image. There are some options:

  • ganeti-instance-debootstrap: for Debian and Ubuntu (default, included in ganeti2 package)
  • ganeti-instance-image (from ganeti-instance-image apt repository)

We will use ganeti-instance-image, since it simply works and any ISO image can be used to create instances. Ganeti Instance Image is the guest OS definition for Ganeti that uses either file system dumps or tarball images to deploy instances. See the howto: http://notes.ceondo.com/ganeti/

Install Ganeti Instance Image

First add the ganeti-instance-image repo to the source list in /etc/apt/sources.list

The ganeti-instance-image repo:

# ganeti-instance-image repo for wheezy
deb http://ftp.osuosl.org/pub/osl/ganeti-instance-image/apt/ wheezy main contrib non-free

Or add this repo to the sources file with a script:

cat >> /etc/apt/sources.list <<-EOF
#################################################################
## Ganeti Instance Image repo
deb http://ftp.osuosl.org/pub/osl/ganeti-instance-image/apt/ wheezy main contrib non-free
EOF

Update packages:

apt-get update

The apt-get update will now complain because of a missing public key for the repo, but that’s OK.

Reading package lists... Done
W: GPG error: http://ftp.osuosl.org wheezy Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 416FA15D27F4B742

Install it from ganeti-instance-image apt repository:

apt-get install ganeti-instance-image

The following NEW packages will be installed:
  ganeti-instance-image gawk libsigsegv2

It also complains during the install:

WARNING: The following packages cannot be authenticated!
  ganeti-instance-image
Install these packages without verification [y/N]?

Status

We now have the complete Ganeti stack installed on the node. The next step can be:

  • either initializing the Ganeti cluster and creating the first instance
  • or adding the fresh Ganeti node to an existing Ganeti cluster.

Choose the one you need...