Proxmox Cluster (Tucana Cloud) - Part II

In our previous article, we learned how to enable VLANs on a bridge and create a trunk port.

However, most of the changes were implemented with the ip command, which does not persist across reboots.

We could have used the ifupdown configuration to run commands when an interface is brought up at boot time, but this approach is not scalable, so I decided to write a few bash scripts to configure the network for us.

Before we start, make sure that the package jq is installed, because we are going to be dealing with JSON quite a lot and this package is handy for iterating through it.

root@hv1:/etc/systemd/system# apt install jq
Reading package lists... Done
Building dependency tree       
Reading state information... Done
jq is already the newest version (1.5+dfsg-2+b1).
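
All of the scripts below rely on the same pattern: jq emits one compact JSON object per line and a shell loop reads those lines one at a time. A minimal sketch of that pattern (the file bridges.json and its contents are made up purely for illustration):

#!/bin/bash

# bridges.json contains: [{"br":"vmbr0"},{"br":"vmbr1"}]
jq -c '.[]' bridges.json |
        while read -r obj; do
                # pull a single field out of the current object
                name=$(jq -r '.br' <<< "$obj")
                echo "found bridge: $name"
        done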

The first script we are creating enables VLAN filtering on our bridge and deletes VLAN 1 from it:

#!/bin/bash

# This script sets the bridge parameters at startup for the Tucana Cloud,
# which is basically enabling VLAN filtering and excluding VLAN 1.

# However, more functions can be added in the future.

# The device should be added to the /etc/network/interfaces file,
# otherwise the script is going to fail to set up the specified bridge,
# because it is going to be executed after network.target.

# The bridges and their configuration should be set as JSON in the .conf file
# with the same name as this script.

# *** this script is still poorly implemented and will need revision as the Tucana cloud grows. ***

base_dir=/root/network/config
conf_file="${base_dir}/hv1_net_bridge.conf"

jq -c '.[]' "$conf_file" |
        while IFS=$'\n' read -r br; do

                br_name=$(jq -r '.br' <<< "$br")
                br_vlan=$(jq -r '.vlan_filtering' <<< "$br")

                # enabling (or disabling) vlan filtering as set in the config
                ip link set dev "$br_name" type bridge vlan_filtering "$br_vlan"

                # deleting VLAN 1 from the bridge itself
                bridge vlan del dev "$br_name" vid 1 self
        done
/root/network/config/hv1_net_bridge.sh
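
The script is silent on success, so it is worth checking the result by hand. A couple of read-only commands for that, assuming the bridge is vmbr0 as in the configuration file shown further down:

# 1 means VLAN filtering is enabled on the bridge
cat /sys/class/net/vmbr0/bridge/vlan_filtering

# lists the VLAN table for the bridge; vmbr0 itself should no longer carry vid 1
bridge vlan show dev vmbr0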

The next script creates the veth pair on which our Proxmox host is going to listen.

#!/bin/bash

# setting up the veth devices
# just one route per pair is supported atm

base_dir=/root/network/config
conf_file="${base_dir}/hv1_net_veth.conf"

errors=0

# looping over the veth devices in the config file
# -c : compact output, one JSON object per line
# process substitution (instead of a pipe) keeps the loop in the current
# shell, so the errors counter is still visible after the loop
while IFS=$'\n' read -r veth_pair; do
        # setting variables from the JSON object passed from jq
        veth_name=$(jq -r '.pair | .[0]' <<< "$veth_pair")
        veth_peer=$(jq -r '.pair | .[1]' <<< "$veth_pair")
        veth_br=$(jq -r '.bridge' <<< "$veth_pair")
        veth_vlan=$(jq -r -c '.vlan' <<< "$veth_pair")
        veth_ip=$(jq -r '.ip' <<< "$veth_pair")
        veth_rt_net=$(jq -r '.route | arrays | .[0]["network"]' <<< "$veth_pair")
        veth_rt_via=$(jq -r '.route | arrays | .[0]["via"]' <<< "$veth_pair")
        veth_rt_dev=$(jq -r '.route | arrays | .[0]["dev"]' <<< "$veth_pair")

        # if the veth device is added successfully
        if ip link add dev "$veth_name" type veth peer name "$veth_peer" 2> /dev/null ; then

                logger "Net CFG net_veth:29 - SUCCESS: veth pair $veth_name/$veth_peer added"

                bash "${base_dir}/hv1_add_vlan.sh" "${veth_br}" "${veth_peer}" "${veth_vlan}"

                # if an IP addr was passed
                if [ "$veth_ip" != "0" ]; then
                        # set IP addr
                        ip addr add "$veth_ip" dev "$veth_name"
                fi

                # bring the interfaces up
                ip link set dev "$veth_name" up
                ip link set dev "$veth_peer" up

                # if a route was passed
                if [ -n "$veth_rt_net" ]; then
                        ip route add "$veth_rt_net" via "$veth_rt_via" dev "$veth_rt_dev"
                fi

                logger "Net CFG net_veth:48 - SUCCESS: mgmt interface $veth_name/$veth_peer added."
        else
                logger "Net CFG net_veth:50 - ERROR: failed to add mgmt interface $veth_name/$veth_peer"
                ((errors++))
        fi

done < <(jq -c '.[]' "$conf_file")

# if errors were registered
if [ "$errors" -gt 0 ]; then
        exit 1
else
        exit 0
fi
/root/network/config/hv1_net_veth.sh

The code above is a bit more complex than the previous one. It iterates through the configuration file, creating the specified veth pairs and routes.
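
To make that concrete, this is roughly the sequence of commands the script ends up issuing for the sample entry of hv1_net_veth.conf listed at the end of this article (hv1_add_vlan.sh, which is not covered here, takes care of attaching the peer to the bridge and setting its VLANs):

# create the pair and give the Proxmox-facing end its address
ip link add dev veth0 type veth peer name veth1
ip addr add 192.168.1.100/24 dev veth0

# bring both ends up
ip link set dev veth0 up
ip link set dev veth1 up

# the single supported route
ip route add 192.168.200.0/24 via 192.168.1.1 dev veth0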

The last script creates our team interfaces. The teams were created with resilience in mind; I know that real resilience requires redundant physical network equipment, which is not an option at the moment.

We are using a combination of onboard and offboard NICs to achieve a certain degree of resilience on our network.

#!/bin/bash
# This script sets up the team interfaces for the TUCANA cloud at boot time.

# BUG - there is a kernel/systemd bug which prevents userspace virtual interfaces from initializing at boot; a workaround is described below.
# https://github.com/systemd/systemd/issues/3374

# It is essential that the interfaces to be used as ports within the team are DOWN.
# Since the script is going to be managed by systemd and run before ifupdown,
# a function to check if the interfaces are down is not needed.

errors=""
base_dir=/root/network/config

# team extra (layer 3) config
teams_cfg_file="${base_dir}/hv1_net_teams.conf"

# create a team interface for each config file
for file in "${base_dir}"/teams/*; do
        # saving the team name
        ifname=$(jq -r '.device' "$file")

        # if the team already exists skip further actions
        if ! bash "${base_dir}/hv1_dev_exists.sh" "$ifname"; then

                # the ip address to be set
                ip=$(jq --arg team "$ifname" -r '.[$team] | .ip' "$teams_cfg_file")

                # bringing the ports of the team down
                jq -r '.["ports"] | keys | .[]' "$file" |
                        while IFS=$'\n' read -r team_port; do
                                ip link set dev "$team_port" down
                        done

                # creating the team interface, keeping only stderr for the error log
                # -g debug messages
                # -f config file
                # -d run as a daemon
                output=$(teamd -g -f "$file" -d 2>&1 > /dev/null)

                # if teamd did not indicate an error
                if [ $? -eq 0 ]; then

                        # logging on success is not needed because teamd already logs a team creation to /var/log/messages

                        # if an IP address has been specified
                        if [ "$ip" != "0" ]; then
                                # setting an IP address on the recently created team
                                ip addr add "$ip" dev "$ifname"
                        fi

                        # if a bridge name has been passed
                        team_br=$(jq --arg ifname "$ifname" -r '.[$ifname] | .bridge' "$teams_cfg_file")

                        if [ "$team_br" != "0" ]; then

                                # saving the vlans as JSON
                                vlans=$(jq --arg ifname "$ifname" -r -c '.[$ifname] | .vlan' "$teams_cfg_file")

                                # script to add vlans
                                # usage: hv1_add_vlan.sh [bridge_name] [interface_name] [JSON_vlans]
                                bash "${base_dir}/hv1_add_vlan.sh" "$team_br" "$ifname" "$vlans"

                        fi
                        # bringing the team online
                        ip link set dev "$ifname" up
                else
                        errors="${errors}"$'\n'"${output}"
                fi
        else
                logger "Net CFG net_teams.sh:71 - ERROR: $ifname already exists."
        fi
done

# if there were errors exit with the right code and log for debug
if [ -z "$errors" ]
then
        exit 0
else
        # logging each line of the error variable
        while IFS= read -r line; do
                logger "Net CFG net_teams:83 - ERROR: $line"
        done <<< "$errors"
        exit 1
fi
/root/network/config/hv1_net_teams.sh

The script above is a bit complex, but we can break it down into two main parts.

In the first part, we point the script to a configuration file used when setting the layer 3 parameters, and in the second we loop through a folder containing our teams' layer 2 configuration files, which are fed to the teamd daemon.

Our layer 2 configuration for the teams uses the LACP runner, load balancing with a hash over MAC addresses, IPv4 addresses, VLAN tags and TCP/UDP ports.
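
Those layer 2 files live in /root/network/config/teams/ and follow the teamd.conf(5) format; the script above only reads their device and ports keys, everything else is consumed by teamd itself. A minimal sketch of what the file for team2 could look like (the port names enp1s0 and enp2s0 are placeholders, see teamd.conf(5) for the full list of options):

{
        "device": "team2",
        "runner": {
                "name": "lacp",
                "active": true,
                "tx_hash": ["eth", "vlan", "ipv4", "l4"]
        },
        "link_watch": { "name": "ethtool" },
        "ports": {
                "enp1s0": {},
                "enp2s0": {}
        }
}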


Let's briefly go through the configuration files.

These files are in JSON format, and you can list as many interfaces/bridges as you want; the scripts will loop through them.

[
        {
          "br":"vmbr0",
          "vlan_filtering":"1"
        }
]
hv1_net_bridge.conf
{
        "team0": {
                "ip":"192.168.100.1/24",
                "vlan":0,
                "bridge":"0"
        },
        "team1":{
                "ip":"192.168.101.1/24",
                "vlan":0,
                "bridge":"0"
        },
        "team2": {
                "ip":"0",
                "vlan":[
                        { "vid":"10","pvid":"0" },
                        { "vid":"20","pvid":"0" },
                        { "vid":"30","pvid":"0" }
                ],
                "bridge":"vmbr0"
        }
}
hv1_net_teams.conf
[
        {
          "pair":["veth0", "veth1"],
          "bridge": "vmbr0",
          "vlan": [{"vid":10, "pvid":"untagged"}],
          "ip":"192.168.1.100/24",
          "route": [{"network":"192.168.200.0/24", "via":"192.168.1.1", "dev":"veth0"}]
        }
]
hv1_net_veth.conf
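
Once the three scripts have run, a few read-only commands are enough to confirm the result (device names taken from the configuration files above, and assuming hv1_add_vlan.sh attached team2 and veth1 to vmbr0):

# runner, LACP and port state of the bridged team
teamdctl team2 state view

# team2 should carry VLANs 10/20/30, veth1 VLAN 10, and nothing should carry VLAN 1
bridge vlan show

# management address and the static route on the veth side
ip -brief addr show dev veth0
ip route show 192.168.200.0/24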

We have now created the scripts needed to automate the creation of our networks. However, we still have not achieved persistence across a system reboot; in the next article, we will explore how to run our scripts at boot time with systemd.


External Resources:

teamd.conf(5) [centos man page]
teamd(8) — Arch manual pages
teamd.conf(5) — Arch manual pages