Building a Proxmox Test Cluster in VirtualBox Part 3: Building The Cluster

In the last installment of this series, I discussed setting up the Proxmox VE hosts. Until now, you could do most of this configuring in triplicate with MobaXTerm. Now you can still use it to multicast, just be sure to disable it when you have to customize the configs for each host. This part of the process is a lethal combination of being really repetitive while also requiring a lot of attention to detail. This is also the point where it gets a bit like virtualization-inception: VirtualBox VMs which are PVE hosts to PVEVMs.

Network Adapter Configuration
I did my best to simplify the network design:

  • There are 3 PVE hosts with corresponding management IP’s:
    1. prox1 – 192.168.1.101
    2. prox2 – 192.168.1.102
    3. prox3 – 192.168.1.103
  • Each PVE host has 3 network adapters:
    1. Adapter 1: A Bridged Adapter that connects to the [physical] internal network.
    2. Adapter 2: Host only Adapter #2 that will serve as the [virtual] isolated cluster network.
    3. Adapter 3: Host only Adapter #3 that will serve as the [virtual] dedicated migration network.
  • Each network adapter plugs into a different [virtual] network segment with a different ip range:
    1. Adapter 1 (enp0s3) – 192.168.1.0/24
    2. Adapter 2 (enp0s8) – 192.168.2.0/24
    3. Adapter 3 (enp0s9) – 192.168.3.0/24
  • Each PVE hosts’ IP on each network roughly corresponds to its hostname:
    1. prox1 – 192.168.1.101, 192.168.2.1, 192.168.3.1
    2. prox2 – 192.168.1.102, 192.168.2.2, 192.168.3.2
    3. prox3 – 192.168.1.103, 192.168.2.3, 192.168.3.3

I have built this cluster a few times and my Ethernet adapter names (enp0s3, enp0s8, and enp0s9) have always been the same. That may be a product of all the cloning, so YMMV. Pay close attention here because this can get very confusing.

Open the network interface config file for each PVE host:

nano /etc/network/interfaces

You should see the entry for your first Ethernet adapter (the bridged adapter in VirtualBox), followed by the virtual machines' bridge interface with the static IP that you set when you installed Proxmox. This is a Proxmox virtual Ethernet adapter. The last two entries should be your two host only adapters, #2 and #3 in VirtualBox. These are the adapters that we need to modify. The file for prox1 probably looks like this:

nano /etc/network/interfaces

auto lo
iface lo inet loopback

iface enp0s3 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.101
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports enp0s3
        bridge_stp off
        bridge_fd 0

iface enp0s8 inet manual

iface enp0s9 inet manual

Ignore the entries for lo, enp0s3, and vmbr0. Delete the last two entries (enp0s8 and enp0s9) and replace them with this:

#cluster network
auto enp0s8
iface enp0s8 inet static
        address 192.168.2.1
        netmask 255.255.255.0
#migration network
auto enp0s9
iface enp0s9 inet static
        address 192.168.3.1
        netmask 255.255.255.0

Now repeat the process for prox2 and prox3, changing the last octet for the IP's to .2 and .3 respectively. When you are finished your nodes should be configured like so:

  1. prox1 - 192.168.1.101, 192.168.2.1, 192.168.3.1
  2. prox2 - 192.168.1.102, 192.168.2.2, 192.168.3.2
  3. prox3 - 192.168.1.103, 192.168.2.3, 192.168.3.3

Save your changes on each node. Then reboot each one:

shutdown -r now

Network Testing
When your PVE hosts are booted back up, SSH into them again. Have each host ping every other host IP to make sure everything is working:

ping -c 4 192.168.2.1;\
ping -c 4 192.168.2.2;\
ping -c 4 192.168.2.3;\
ping -c 4 192.168.3.1;\
ping -c 4 192.168.3.2;\
ping -c 4 192.168.3.3

The result should be 4 replies from each IP on each host with no packet loss. I am aware that each host is pinging itself twice. But you have to admit it look pretty bad ass.

Once you can hit all of your IP's successfully, now it's time to make sure that multicast is working properly. This isn't a big deal in VirtualBox because the virtual switches are configured to handle multicast correctly, but it's important to see the test run so you can do it on real hardware in the future.

First send a bunch of multicast traffic at once:

omping -c 10000 -i 0.001 -F -q 192.168.2.1 192.168.2.2 192.168.2.3

You should see a result that is 10000 sent with 0% loss. Like so:

192.168.2.1 :   unicast, xmt/rcv/%loss = 9406/9395/0%, min/avg/max/std-dev = 0.085/0.980/15.200/1.940
192.168.2.1 : multicast, xmt/rcv/%loss = 9406/9395/0%, min/avg/max/std-dev = 0.172/1.100/15.975/1.975
192.168.2.2 :   unicast, xmt/rcv/%loss = 10000/9991/0%, min/avg/max/std-dev = 0.091/1.669/40.480/3.777
192.168.2.2 : multicast, xmt/rcv/%loss = 10000/9991/0%, min/avg/max/std-dev = 0.173/1.802/40.590/3.794

Then send a sustained stream of multicast traffic for a few minutes:

omping -c 600 -i 1 -q 192.168.2.1 192.168.2.2 192.168.2.3

Let this test run for a few minutes. Then cancel it with CTRL+C.

The result should again be 0% loss, like so:

root@prox1:~# omping -c 600 -i 1 -q 192.168.2.1 192.168.2.2 192.168.2.3
192.168.2.2 : waiting for response msg
192.168.2.3 : waiting for response msg
192.168.2.3 : joined (S,G) = (*, 232.43.211.234), pinging
192.168.2.2 : joined (S,G) = (*, 232.43.211.234), pinging
^C
192.168.2.2 :   unicast, xmt/rcv/%loss = 208/208/0%, min/avg/max/std-dev = 0.236/1.488/6.552/1.000
192.168.2.2 : multicast, xmt/rcv/%loss = 208/208/0%, min/avg/max/std-dev = 0.338/2.022/7.157/1.198
192.168.2.3 :   unicast, xmt/rcv/%loss = 208/208/0%, min/avg/max/std-dev = 0.168/1.292/7.576/0.905
192.168.2.3 : multicast, xmt/rcv/%loss = 208/208/0%, min/avg/max/std-dev = 0.301/1.791/8.044/1.092

Now that your cluster network is up and running, you can finally build your cluster. Up to this point, you have been entering identical commands into a SSH sessions. At this point, you can stop using the multi-exec feature of your SSH client.

First, create the initial cluster node on Prox1, like so:

root@prox1:~# pvecm create TestCluster --ring0_addr 192.168.2.1 --bindnet0_addr 192.168.2.0

Then join Prox2 to the new cluster:

root@prox2:~# pvecm add 192.168.2.1 --ring0_addr 192.168.2.2

Followed by Prox3:

root@prox3:~# pvecm add 192.168.2.1 --ring0_addr 192.168.2.3

One final configuration on Prox1 is to set the third network interface as the dedicated migration network by updating the datacenter.cfg file, like so:

root@prox1:~# nano /etc/pve/datacenter.cfg

keyboard: en-us
migration: secure,network=192.168.3.0/24

Now that the cluster is set up, you can log out of your SSH sessions and switch to the web GUI. When you open the web GUI for Prox1 (https://192.168.1.101:8006) and you should see all 3 nodes in your TestCluster:

Now you can manage all of your PVE Hosts from one graphical interface. You can also do cool shit like migrating VMs from one host to another, but before we can do that we need to set up some PVEVMs. There are more things to set up, but to see them in action, we need to build a couple of PVEVMS to work with, which I will cover in the next installment: Building PVE Containers.

Building a Proxmox Test Cluster in VirtualBox Part 2: Configuring the Hosts

In the last installment of this series, I discussed setting up the Proxmox VE hosts in VirtualBox. At this stage in the exercise there should be 3 VirtualBox VMs (VBVMs) running, in headless mode. These VBVMs should be running Proxmox VE, ready to host their own VMs (PVEVMs).

Before you can set up the cluster, storage replication, and high availability, you need to do a bit of housekeeping on your hosts. In this post, I will go over those steps making sure that the hosts are up to date OS wise and that your storage is properly configured for later use. Most of these steps can be accomplished via the Web UI, but using SSH will be faster and more accurate. Especially when you use an SSH client like SuperPuTTY or MobaXTerm that lets you type and paste into multiple terminals at the same time.

Log in as root@ip-address for each PVE node. In the previous post, the IPs I chose were 192.168.1.101, 192.168.1.102, and 192.168.1.103.

I don’t want to bog this post down with a bunch of Stupid SSH Tricks, so just spend a few minutes getting acquainted with MobaXTerm and thank me later. The examples below will work in a single SSH session, but you will have to paste them into 3 different windows like a peasant, instead of feeling like some kind of superhacker:

Step 1 – Fix The Subscription Thing

No, not the nag screen that pops up when you log into the web UI, I mean the errors that you get when you try to update a PVE host with the enterprise repos enabled.

All you have to do is modify a secondary sources.list file. Open it with your editor, comment out the first line and add the second line:

nano  /etc/apt/sources.list.d/pve-enterprise.list

# old and busted
#deb https://enterprise.proxmox.com/debian/pve stretch pve-enterprise

# new hotness
deb http://download.proxmox.com/debian/pve stretch pve-no-subscription

Save the file, and run your updates:

apt-get update; apt-get -y upgrade

While you are logged in to all 3 hosts, you might as well update the list of available Linux Container templates:

pveam update

Finally, if you set up your virtual disk files correctly according to the last post, you can set up your ZFS disk pool:

  1. List your available disks, hopefully you see two 64GB volumes that aren’t in use on /dev/sdb and /dev/sdc:
    
    root@prox1:~# lsblk
    NAME               MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    sda                  8:0    0   32G  0 disk
    ├─sda1               8:1    0 1007K  0 part
    ├─sda2               8:2    0  512M  0 part
    └─sda3               8:3    0 31.5G  0 part
      ├─pve-swap       253:0    0  3.9G  0 lvm  [SWAP]
      ├─pve-root       253:1    0  7.8G  0 lvm  /
      ├─pve-data_tmeta 253:2    0    1G  0 lvm
      │ └─pve-data     253:4    0   14G  0 lvm
      └─pve-data_tdata 253:3    0   14G  0 lvm
        └─pve-data     253:4    0   14G  0 lvm
    sdb                  8:16   0   64G  0 disk
    sdc                  8:32   0   64G  0 disk
    sr0                 11:0    1  642M  0 rom
    root@prox1:~#
    

    Assuming you see those two disks, and they are in fact ‘sdb’ and ‘sdc’ then you can create your zpool. Which you can think of as a kind of software RAID array. There’s waaaay more to it than that, but that’s another post for another day when I know more about ZFS. For this exercise, I wanted to make a simulated RAID1 array, for “redundancy.” Set up the drives in a pool like so:

    
    zpool create -f -o ashift=12 z-store mirror sdb sdc
    zfs set compression=lz4 z-store
    zfs create z-store/vmdata
    

    In a later post we will use the zpool on each host for Storage Replication. The PVEVM files for each of your guest machines will copy themselves to the other hosts at regular intervals so when you migrate a guest from one node to another it won’t take long. This feature pairs very well with High Availability, where your cluster can determine if a node is down and spin up PVEVMs that are offline.

    Now that your disks are configured, it’s time to move on to Part 3: Building The Cluster.

Building a Proxmox Test Cluster in VirtualBox Part 1: Building The Hosts

In my last post, I set the stage for why I built the virtualbox cluster, and now it it time to discuss the how.

In researching the best way to design a network for a Proxmox cluster, the bare minimum is one network connection. This one link does the following:

  1. Hosts the web server for the management GUI – The web UI is pretty slick, and it’s great for viewing stats and checking for errors.
  2. Hosts the network bridge for guest VMs – This bridge acts as a kind of virtual network switch for your PVEVMs to talk to the outside world.
  3. Connects the host to the Internet – The PVE host needs to download security updates, Linux container templates, and install packages.

This one network interface is sort of the lifeline for a Proxmox host. It would be a shame if that link got bombed by incessant network traffic. As I discovered (the hard way) one possible source of incessant network traffic is the cluster communication heartbeat. Obviously, that traffic needs to go on its own network segment. Normally, that would be a VLAN or something, but I have some little dumb switches and the nodes have some old quad port NICs, so I wanted to just assign an IP to one port, and plug that port into a switch that is physically isolated from “my” network.

Once a cluster is working, migrating machines happens over the cluster network link. This is OK, but if your cluster network happens to suck (like when some jackass plugs it into a 10 year old switch) it can cause problems with determining if all the cluster nodes are online. So, now I want to set up an additional interface for VM migration. Migration seems like the kind of thing that happens only occasionally, but when you enable Storage Replication, the nodes are copying data every 15 minutes. Constant cluster chatter, plus constant file synchronization, has the potential to saturate a single network link. This gets even worse when you add High Availability, and there is a constant vote on if a PVEVM is up and running, followed by a scramble to get it going on another node.

So, at minimum we will need 3 network interfaces for the test cluster on VirtualBox. I didn’t want to spend a lot of time tinkering with firewall and NAS appliances, so I am leaving the “Prox management on its own network segment” and the “Dedicated network storage segment” discussions out of this exercise. I can’t decide if the management interface for my physical Proxmox cluster should sit on my internal network, or on its own segment. For this exercise, the management interface is going to sit on the internal network. My Synology NAS has 4 network ports, so I am definitely going to dedicate a network segment for the cluster to talk to the NAS, but that won’t be a part of this exercise.

[Virtual] Hardware Mode(tm)

Once you are booted up and VirtualBox is running, you can start building your VBVMs. I recommend building one VBVM to use as a template and then cloning it 3 times. I found that I kept missing important things and having to start over, so better to fix the master and then destroy the clones.

I called my master image “proxZZ” so it showed up last in the list of VBVMs. I also never actually started up the master image, so it was always powered off and the ZZ’s made it look like it was sleeping.

Create proxZZ with the following:

  • First, make sure that you have created 2 additional Host Only Network Adapters in VirtualBox. In this exercise you will only use two, but it can get confusing when you are trying to match en0s9 to something, so do yourself a favor and make three. Make sure to disable the DHCP server on both adapters.
  • Create a new virtual machine with the following characteristics :
    1. Name: ProxZZ
    2. Type: Linux
    3. Version: Debian 64bit (Proxmox is Debian under the hood.)
    4. Memory Size: 2048MB
    5. Hard drive: dynamically allocated, 32GB in size.
  • Make sure that you have created 3 total virtual hard disks as follows:
    1. SATA0: 32GB. This will be your boot drive and system disk. This is where Proxmox PVE will be installed. Static disks are supposed to be faster, but this isn’t even remotely about speed. My laptop has a 240gb SSD, so I don’t have a ton of space to waste.
    2. SATA1: 64GB, dynamically allocated. This will be one of your ZFS volumes.
    3. SATA2: 64GB, dynamically allocated. This will be your other ZFS volume. Together they will make a RAID1 array.
  • WHile you are in the storage tab, make sure to mount the Proxmox installer ISO
  • Make sure that you have created 3 network interfaces as follows:
    1. Adapter 1: Bridged Network – this will be your management interface.
    2. Adapter 2: Host Only Network Adapter #2 – this will be your cluster interface.
    3. Adapter 3: Host Only Network Adapter #3 – this will be your VM migration interface.
    4. You may be tempted to do something clever like unplugging virtual cables or something. Don’t. You will be cloning this machine in a minute and you will have a hard time keeping all of this straight.
  • Before you finish, make sure that the machine is set to boot from the hard drive first, followed by the CD/Optical drive. This seems stupid, but you will be booting these things in headless mode, and forgetting to eject the virtual CD rom is super annoying. So fix it here and stop being bothered with it.

When it’s done, it should look something like this:

Once you are sure your source VM is in good shape, make 3 clones of it. Don’t install Proxmox yet. SSH keys and stuff will play a major role in this exercise later, and I am not sure if VirtualBox is smart enough to re-create them when you clone it. I ran into this a few times so just clone the powered off VBVM. I called the clones prox1, prox2, and prox3.

[Virtual] Software Mode(tm)

Now it is time to start your 3 clones. This can get pretty repetitive, especially if you start the process over a couple of times. While you will appreciate cloning the servers, there isn’t really a simple way that I have discovered to mass produce the PVE hosts. In a few iterations of this exercise, I misnamed one of the nodes (like pro1 or prx2) and it’s super annoying later when you get the cluster set up and see one of the nodes named wrong. There is a procedure to fix the node name after you build it, but seriously just take your time and pay attention.

When you first start the install you will probably see an error about KVM Virtualization not being detected. Which should be impossible, right? The PVE host is *literally* a virtual machine. This is probably VBox for Windows not passing something to PVE Linux, or VBox misidentifying your CPU. Whatever it is, this isn’t a showstopper because our test systems that will run on the PVE cluster aren’t going to be VMs, they’ll be Linux Containers, but in the interest of not making my head hurt I am still going to call them VMs. The PVE hosts are VirtualBox VMs, and the Linux Containers running on Proxmox are Proxmox VE VMs. VBVMs are VMs running under VirtualBox, PVEVMs are Linux Containers running under Proxmox VE. *whew*

As you do the install, select your 32gb boot drive and configure your IP addresses.
I went with a sequence based on the hostname:
prox1 – 192.168.1.101
prox2 – 192.168.1.102
prox3 – 192.168.1.103
Like I said before, go slowly and pay attention. This part is super repetitive and it’s easy to make a stupid mistake that you have to troubleshoot later. At some point, I guarantee that you will give up, destroy the clones, and start over 🙂

Send In The Clones

Once your hosts are installed, it’s time to shut them down and boot them again, this time in headless mode. This is where fixing the boot order on ProxZZ pays off. With all 3 VBVMs are started up, you are ready for the next stage of the exercise: installing your PVE hosts.

Adventures in Proxmox Part 2: Building a Test Cluster with Virtualbox

If you have read my previous post about my first foray into Proxmox, you know that the infrastructure of my home network is, as the Irish would say, not the best. I have been tinkering with routers and smart switches, learning about VLANs and subnets and all kinds of other things that I thought I understood, but it turns out I didn’t.

Doing stuff with server and network gear at home is a challenge because the family just doesn’t get Hardware Mode(tm). Hardware means being sequestered in the workshop, possibly interfering with our access to the Internet. I have to wait for those rare occasions when I am: 1) at home and 2) not changing diapers and 3) not asleep and 4) no one is actively using the Internet. I have been putting things in place, one piece at a time, but my progress is, well, not the best.

Part of my networking woes are design. I don’t know how to build a network for a Proxmox cluster, because I don’t know the right way to build a Proxmox cluster. I also can’t spend hours in my basement lab tinkering. I need to be upstairs with the family. So I decided to build a little portable test cluster, on my laptop, using VirtualBox.

The network design at my house looks a bit like a plate of spaghetti, with old, unmanaged switches in random spots, like meatballs. Little switches plugged into big ones. No tiers, no plan, just hearty Italian improvisimo. Last year, when I fired up two Proxmox nodes, with no consideration for what might happen… Mamma mia!It took a couple of days before the network completely crashed, and a couple of more days to figure out the problem.

The great thing about VirtualBox is that you can build Host Only Networks. A host only network behaves like a physical switch with no uplink. VirtualBox virtual machines (VMs) can talk to each other, and to the physical host without talking to the outside world. This seemed like a decent facsimile of my plan to use a small unmanaged switch to isolate cluster traffic from the rest of the network.

The other great thing about VirtualBox is that you can add lots of network interfaces to a VM in order to simulate network interactions. You can build a router using a Linux or BSD distro and use it to connect your various host only networks to a bridge into your real physical network. I tried that at first, I am not sure that it’s necessary for this exercise.

And last, but not least, VirtualBox lets you clone a VM. As in, to make a procedurally generated copy of a VM, and then start it up along side it. This is a great feature for when you are screwing up configs and installs.

It is the combination of these features that allowed me to create a little virtual lab on a PC so I could figure out how to set up all the cool stuff that Proxmox can do, and figure out what kind of network I will need for it.

Phase 1: The plan

The plan for this exercise is to figure out how to use several features of Proxmox VE. The features are as follows:

  1. Online Backup and Restore – Proxmox has the ability to take and store snapshots of VMs and containers. This is a great feature for a home lab where you are learning about systems and you are likely to make mistakes. Obviously, I use this feature all the time.
  2. Clustering – Proxmox has the ability to run multiple hosts in tandem with the ability to migrate guest VMs and Linux containers from one host to another. In theory, using a NAS as shared storage you can migrate a VM without shutting it down. Since the point of this exercise is to build Proxmox hosts and not NAS appliances, we are going to focus on offline migrations where you either suspend the host or shut it down prior to migrating.
  3. Storage Replication – Proxmox natively supports ZFS, and can use the ZFS Send and Receive commands to make regular copies of your VMs onto the other cluster nodes. Having a recent copy of the VM makes migrations go much faster, and saves you from losing more than a few minutes worth of data or configuration changes. I wish I had this feature working when I was building my Swedish Internet router.
  4. High Availability – If you have 3 or more PVE nodes in your cluster, you can set some of your VMs to automatically migrate if there is an outage on the node the VM is hosted on. The decision to migrate is based on a kind of voting system that uses a quorum to decide if a host is offline. I want to use this feature to ensure that my access devices are up and running to support my remote access shenanigans.

Phase 2: Preparation

To build the lab, you will need the following:

  1. A desktop or laptop computer with 2 or more cores and at least 8gb of RAM. You could probably pull this off with 4gb if you are patient. My laptop has an old dual core i5 and 8gb and it was pretty much maxed out the whole time, so your mileage may vary.
  2. A working OS with a web browser and SSH client. Linux would probably be best, but my laptop was running Win10pro. I recommend a tabbed SSH client capable of sending keystrokes to multiple SSH sessionsLike Moba XTerm.
  3. VirtualBox installed and running.
  4. The latest Proxmox VE ISO file.

With the plan in place, and the necessary software gathered, it’s time to proceed to Building A Proxmox Test Cluster in VirtualBox, Part 1: Building The Nodes.