In my last post, I set the stage for why I built the virtualbox cluster, and now it it time to discuss the how.
In researching the best way to design a network for a Proxmox cluster, the bare minimum is one network connection. This one link does the following:
- Hosts the web server for the management GUI – The web UI is pretty slick, and it’s great for viewing stats and checking for errors.
- Hosts the network bridge for guest VMs – This bridge acts as a kind of virtual network switch for your PVEVMs to talk to the outside world.
- Connects the host to the Internet – The PVE host needs to download security updates, Linux container templates, and install packages.
This one network interface is sort of the lifeline for a Proxmox host. It would be a shame if that link got bombed by incessant network traffic. As I discovered (the hard way) one possible source of incessant network traffic is the cluster communication heartbeat. Obviously, that traffic needs to go on its own network segment. Normally, that would be a VLAN or something, but I have some little dumb switches and the nodes have some old quad port NICs, so I wanted to just assign an IP to one port, and plug that port into a switch that is physically isolated from “my” network.
Once a cluster is working, migrating machines happens over the cluster network link. This is OK, but if your cluster network happens to suck (like when some jackass plugs it into a 10 year old switch) it can cause problems with determining if all the cluster nodes are online. So, now I want to set up an additional interface for VM migration. Migration seems like the kind of thing that happens only occasionally, but when you enable Storage Replication, the nodes are copying data every 15 minutes. Constant cluster chatter, plus constant file synchronization, has the potential to saturate a single network link. This gets even worse when you add High Availability, and there is a constant vote on if a PVEVM is up and running, followed by a scramble to get it going on another node.
So, at minimum we will need 3 network interfaces for the test cluster on VirtualBox. I didn’t want to spend a lot of time tinkering with firewall and NAS appliances, so I am leaving the “Prox management on its own network segment” and the “Dedicated network storage segment” discussions out of this exercise. I can’t decide if the management interface for my physical Proxmox cluster should sit on my internal network, or on its own segment. For this exercise, the management interface is going to sit on the internal network. My Synology NAS has 4 network ports, so I am definitely going to dedicate a network segment for the cluster to talk to the NAS, but that won’t be a part of this exercise.
[Virtual] Hardware Mode(tm)
Once you are booted up and VirtualBox is running, you can start building your VBVMs. I recommend building one VBVM to use as a template and then cloning it 3 times. I found that I kept missing important things and having to start over, so better to fix the master and then destroy the clones.
I called my master image “proxZZ” so it showed up last in the list of VBVMs. I also never actually started up the master image, so it was always powered off and the ZZ’s made it look like it was sleeping.
Create proxZZ with the following:
- First, make sure that you have created 2 additional Host Only Network Adapters in VirtualBox. In this exercise you will only use two, but it can get confusing when you are trying to match en0s9 to something, so do yourself a favor and make three. Make sure to disable the DHCP server on both adapters.
- Create a new virtual machine with the following characteristics :
- Name: ProxZZ
- Type: Linux
- Version: Debian 64bit (Proxmox is Debian under the hood.)
- Memory Size: 2048MB
- Hard drive: dynamically allocated, 32GB in size.
- Make sure that you have created 3 total virtual hard disks as follows:
- SATA0: 32GB. This will be your boot drive and system disk. This is where Proxmox PVE will be installed. Static disks are supposed to be faster, but this isn’t even remotely about speed. My laptop has a 240gb SSD, so I don’t have a ton of space to waste.
- SATA1: 64GB, dynamically allocated. This will be one of your ZFS volumes.
- SATA2: 64GB, dynamically allocated. This will be your other ZFS volume. Together they will make a RAID1 array.
- WHile you are in the storage tab, make sure to mount the Proxmox installer ISO
- Make sure that you have created 3 network interfaces as follows:
- Adapter 1: Bridged Network – this will be your management interface.
- Adapter 2: Host Only Network Adapter #2 – this will be your cluster interface.
- Adapter 3: Host Only Network Adapter #3 – this will be your VM migration interface.
- You may be tempted to do something clever like unplugging virtual cables or something. Don’t. You will be cloning this machine in a minute and you will have a hard time keeping all of this straight.
- Before you finish, make sure that the machine is set to boot from the hard drive first, followed by the CD/Optical drive. This seems stupid, but you will be booting these things in headless mode, and forgetting to eject the virtual CD rom is super annoying. So fix it here and stop being bothered with it.
When it’s done, it should look something like this:
Once you are sure your source VM is in good shape, make 3 clones of it. Don’t install Proxmox yet. SSH keys and stuff will play a major role in this exercise later, and I am not sure if VirtualBox is smart enough to re-create them when you clone it. I ran into this a few times so just clone the powered off VBVM. I called the clones prox1, prox2, and prox3.
[Virtual] Software Mode(tm)
Now it is time to start your 3 clones. This can get pretty repetitive, especially if you start the process over a couple of times. While you will appreciate cloning the servers, there isn’t really a simple way that I have discovered to mass produce the PVE hosts. In a few iterations of this exercise, I misnamed one of the nodes (like pro1 or prx2) and it’s super annoying later when you get the cluster set up and see one of the nodes named wrong. There is a procedure to fix the node name after you build it, but seriously just take your time and pay attention.
When you first start the install you will probably see an error about KVM Virtualization not being detected. Which should be impossible, right? The PVE host is *literally* a virtual machine. This is probably VBox for Windows not passing something to PVE Linux, or VBox misidentifying your CPU. Whatever it is, this isn’t a showstopper because our test systems that will run on the PVE cluster aren’t going to be VMs, they’ll be Linux Containers, but in the interest of not making my head hurt I am still going to call them VMs. The PVE hosts are VirtualBox VMs, and the Linux Containers running on Proxmox are Proxmox VE VMs. VBVMs are VMs running under VirtualBox, PVEVMs are Linux Containers running under Proxmox VE. *whew*
As you do the install, select your 32gb boot drive and configure your IP addresses.
I went with a sequence based on the hostname:
prox1 – 192.168.1.101
prox2 – 192.168.1.102
prox3 – 192.168.1.103
Like I said before, go slowly and pay attention. This part is super repetitive and it’s easy to make a stupid mistake that you have to troubleshoot later. At some point, I guarantee that you will give up, destroy the clones, and start over 🙂
Send In The Clones
Once your hosts are installed, it’s time to shut them down and boot them again, this time in headless mode. This is where fixing the boot order on ProxZZ pays off. With all 3 VBVMs are started up, you are ready for the next stage of the exercise: installing your PVE hosts.