
Building the Zerg Nest#

Installation is based on a fresh CentOS Stream minimal server. It has only been tested on CentOS, so it may or may not work on other distributions.

Prerequisites#

3 x nodes (bare-metal or VMs), each with:

  • A mainstream Linux OS (tested on CentOS 8/9 Stream)
  • At least 2GB RAM (you will need more Powaaaa for the stack I will present further along)
  • At least 50GB disk space (but it'll be tight)
  • Connectivity to each other within the same subnet, and on a low-latency link (i.e., no WAN links)
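
A quick way to compare a node against the minimums above is a small preflight script. This is only a sketch and not part of the original procedure: the function names (meets_minimum, total_ram_kb, fs_size_kb) are made up for illustration, and the thresholds simply restate the figures from the list. Keep in mind that the kernel reports slightly less memory than is installed, so a node with exactly 2GB may show up just under the limit.

```shell
#!/bin/sh
# Thresholds restating the prerequisites above, expressed in kB.
MIN_RAM_KB=$((2 * 1024 * 1024))    # 2GB
MIN_DISK_KB=$((50 * 1024 * 1024))  # 50GB

# meets_minimum LABEL ACTUAL_KB REQUIRED_KB
# Prints a one-line verdict and returns non-zero when below the minimum.
meets_minimum() {
    if [ "$2" -ge "$3" ]; then
        echo "$1: ok"
    else
        echo "$1: too low ($2 kB < $3 kB)"
        return 1
    fi
}

# Total RAM in kB as reported by the kernel (a bit less than installed RAM).
total_ram_kb() { awk '/^MemTotal:/ {print $2}' /proc/meminfo; }

# Size in kB of the filesystem holding the given path.
fs_size_kb() { df -Pk "$1" | awk 'NR==2 {print $2}'; }
```

Sourced on a node, it could be called as meets_minimum RAM "$(total_ram_kb)" "$MIN_RAM_KB" and meets_minimum Disk "$(fs_size_kb /)" "$MIN_DISK_KB".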

Installing Docker and Docker compose#

### Remove runc
dnf remove runc

### Adding docker repo


dnf install -y dnf-utils
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install -y docker-ce docker-ce-cli containerd.io


### Installing docker-compose


dnf install -y python3-pip
pip3 install --upgrade pip
pip3 install setuptools-rust
pip3 install docker-compose

Let's start the whale#

systemctl enable docker --now

And now create a zerg swarm from it#

From swarm1:

# docker swarm init
Swarm initialized: current node (ksdjlqsldjqsd2516685485) is now a manager.
To add a worker to this swarm, run the following command:

    docker swarm \
     join --token SWMTKN-1-5pykfhyfvtsij0tg4ewrtqk7hz2twuq21okeqv54p1gw2ufdde-814yer1z55vmyk2mwdhvjbob1 \
     10.0.0.51:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

We are going to add the two other nodes as managers:

docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-5pykfhyfvtsij0tg4ewrtqk7hz2twuq21okeqv54p1gw2ufdde-2k0vay9aub5eheikw7qi9v82o 10.0.0.51:2377
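
If you script the join step, the token can be fetched without the surrounding text: docker swarm join-token accepts a -q (quiet) flag that prints only the token. The sketch below assembles the join command from a token and a manager address. build_join_command is a hypothetical helper name, and 2377 is the port shown in the output above.

```shell
# build_join_command TOKEN MANAGER_ADDR
# Prints the command to run on a node that should join the swarm.
# (Hypothetical helper: it only formats the command, it does not run it.)
build_join_command() {
    printf 'docker swarm join --token %s %s:2377\n' "$1" "$2"
}
```

On swarm1, build_join_command "$(docker swarm join-token -q manager)" 10.0.0.51 would print the line to paste on the two other nodes.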

Run the command provided on your other nodes to join them to the swarm as managers. Once the nodes have joined, the output of docker node ls (run on any manager) should list all of them:

# docker node ls
ID                            HOSTNAME           STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
p424u0yvmu0vvc8nsnspv83zw *   swarm1.lab.local   Ready     Active         Leader           20.10.6
kg6w6ucpb2jf8v8xqai23pv3a     swarm2.lab.local   Ready     Active         Reachable        20.10.6
lam7mgs5wus40iaydvp8u3ss7     swarm3.lab.local   Ready     Active         Reachable        20.10.6
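
To check the same thing from a script, docker node ls can print a single column through --format. Below is a minimal sketch that counts the healthy managers; count_reachable_managers is a name chosen here for illustration, and it reads the statuses on stdin so it can be tried without a running swarm.

```shell
# count_reachable_managers
# Reads one manager status per line on stdin, as printed by
#   docker node ls --format '{{.ManagerStatus}}'
# and prints how many lines are "Leader" or "Reachable".
# Workers produce an empty line and are not counted.
count_reachable_managers() {
    # grep -c exits non-zero when the count is 0, so neutralise that.
    grep -c -E '^(Leader|Reachable)$' || true
}
```

For the cluster above, docker node ls --format '{{.ManagerStatus}}' | count_reachable_managers should print 3. With three managers, the swarm keeps its quorum if one of them goes down.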

You are now ready to swarm.

Official documentation: https://docs.docker.com/engine/swarm/

Little Network tweak#

When running Docker Swarm on RedHat or CentOS VMs under VMware you may run into issues with communication over the swarm node routing mesh. This issue is traced back to UDP packets being dropped by the source node. Disabling checksum offloading appears to resolve this issue.

Run the following on your VMs:

ethtool -K [interface] tx-checksum-ip-generic off
cat > /etc/NetworkManager/dispatcher.d/pre-up.d/10-tx-checksum-ip-generic <<'EOF'
#!/bin/sh
# Re-apply the setting each time the interface comes up (ens192 is an example name)
ethtool -K ens192 tx-checksum-ip-generic off
EOF

chmod +x /etc/NetworkManager/dispatcher.d/pre-up.d/10-tx-checksum-ip-generic

Note: [interface] is your network adapter (ens192 in the dispatcher script), so change it accordingly in both places.
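
Since the interface name differs from one VM to the next, the dispatcher script can also be generated from a parameter. The following is a sketch under that assumption: write_checksum_hook is a hypothetical helper, and the optional second argument (the target directory) exists only so the function can be tried outside /etc.

```shell
# write_checksum_hook IFACE [DIR]
# Writes an executable NetworkManager pre-up hook that disables
# tx-checksum-ip-generic on IFACE. DIR defaults to the pre-up.d directory.
write_checksum_hook() {
    iface=$1
    dir=${2:-/etc/NetworkManager/dispatcher.d/pre-up.d}
    hook="$dir/10-tx-checksum-ip-generic"
    mkdir -p "$dir"
    printf '#!/bin/sh\nethtool -K %s tx-checksum-ip-generic off\n' "$iface" > "$hook"
    chmod +x "$hook"
}
```

For example, write_checksum_hook ens192 recreates the file shown above. The current state of the setting can be read back with ethtool -k ens192 (lowercase k), which lists the offload features of the interface.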

Firewalling#



Base configuration#

Activate firewalld. You may want to look in /etc/firewalld/zones/ first to see what is going to happen ^^ . One issue that comes up quite often is a changed default SSH port: the ssh service declared in /usr/lib/firewalld/services/ssh.xml references port 22. If that is your case, copy ssh.xml into /etc/firewalld/services/ and change the port in the copy.

(And yes, this has happened to me a few times)

systemctl enable firewalld --now

Unmask the service if needed:

systemctl unmask firewalld

By default, firewalld comes with a public zone.

This public zone allows ssh, cockpit and dhcpv6-client.

cat /etc/firewalld/zones/public.xml
 <?xml version="1.0" encoding="utf-8"?>
 <zone>
  <short>Public</short>
  <description>For use in public areas. You do not trust the other computers on networks to not harm your computer. Only selected incoming connections are accepted.</description>
  <service name="ssh"/>
  <service name="dhcpv6-client"/>
  <service name="cockpit"/>  
</zone>

If you don't use cockpit or dhcpv6-client, you can remove them from the configuration.

For example, to delete the cockpit service:

firewall-cmd --permanent --zone=public --remove-service=cockpit
firewall-cmd --reload

Note here the parameters:

  • --permanent: the rule is written to the permanent configuration, so it survives a service restart (it only takes effect after a reload)

  • --zone: indicates which zone should be modified; by default it is the public one

  • --remove-service: removes the named service from the zone; the name is the service file name in /etc/firewalld/services/ or /usr/lib/firewalld/services/, without the .xml ending

firewall-cmd --reload reloads firewalld with the latest permanent configuration.

Let's add some services

firewall-cmd --permanent --zone=public --add-service=http

firewall-cmd --permanent --zone=public --add-service=https

firewall-cmd --reload

We are now going to create a new zone representing the nodes of our cluster and add sources to it (that is, the addresses incoming traffic is accepted from).

firewall-cmd --permanent --new-zone=swarm

firewall-cmd --permanent --zone=swarm --add-source=10.0.0.51
firewall-cmd --permanent --zone=swarm --add-source=10.0.0.52
firewall-cmd --permanent --zone=swarm --add-source=10.0.0.53

firewall-cmd --reload
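
With more nodes, the --add-source calls above are easier to keep in sync as a loop. This is a sketch, not the original procedure: add_swarm_sources is a made-up helper, and the FIREWALL_CMD variable is an assumption added so the commands can be previewed (for instance with FIREWALL_CMD=echo) before touching the firewall.

```shell
# Command used to talk to firewalld; override it for a dry run.
FIREWALL_CMD=${FIREWALL_CMD:-firewall-cmd}

# add_swarm_sources ZONE IP...
# Adds every IP as a permanent source of ZONE, then reloads firewalld.
add_swarm_sources() {
    zone=$1
    shift
    for ip in "$@"; do
        $FIREWALL_CMD --permanent --zone="$zone" --add-source="$ip" || return 1
    done
    $FIREWALL_CMD --reload
}
```

On a node, add_swarm_sources swarm 10.0.0.51 10.0.0.52 10.0.0.53 would run the same three additions followed by the reload.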

Let's check that the sources were added:

firewall-cmd --zone=swarm --list-sources

Add the services required by the cluster:

cp -a /usr/lib/firewalld/services/docker-swarm.xml /etc/firewalld/services/

firewall-cmd --zone=swarm --add-service=docker-swarm --permanent

firewall-cmd --reload

Docker Swarm#

Let's add the service to the swarm zone:

By default, firewalld comes bundled with some services. You can find them in /usr/lib/firewalld/services.

I like to copy them into /etc/firewalld/services when I use them, as it prevents them from being changed by an update. Firewalld looks for a service in /etc/firewalld/services/ first, then in /usr/lib/firewalld/services.
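
The lookup order described above can be mimicked in a few lines to see which copy of a service definition is in effect. This is only an illustration of the precedence rule, not how firewalld is implemented; effective_service_file is a hypothetical name, and the two optional directory arguments are there so it can be tried on test directories.

```shell
# effective_service_file NAME [ETC_DIR] [LIB_DIR]
# Prints the path of the service file that takes precedence:
# the copy in ETC_DIR if it exists, otherwise the one in LIB_DIR.
# Returns non-zero when the service is defined in neither directory.
effective_service_file() {
    name=$1
    etc_dir=${2:-/etc/firewalld/services}
    lib_dir=${3:-/usr/lib/firewalld/services}
    for dir in "$etc_dir" "$lib_dir"; do
        if [ -f "$dir/$name.xml" ]; then
            echo "$dir/$name.xml"
            return 0
        fi
    done
    return 1
}
```

Running effective_service_file docker-swarm before and after the copy shows the path switching from /usr/lib/firewalld/services to /etc/firewalld/services.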

So let's copy the docker-swarm service:

cp /usr/lib/firewalld/services/docker-swarm.xml /etc/firewalld/services/

cat /etc/firewalld/services/docker-swarm.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Docker integrated swarm mode</short>
  <description>Natively managed cluster of Docker Engines (>=1.12.0), where you deploy services.</description>
  <port port="2377" protocol="tcp"/>
  <port port="7946" protocol="tcp"/>
  <port port="7946" protocol="udp"/>
  <port port="4789" protocol="udp"/>
  <protocol value="esp"/>
</service>
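
To see at a glance which ports this definition opens, the port elements can be pulled out of the file. The sketch below assumes the layout shown above (one port element per line, with the port attribute before protocol); service_ports is a name invented for the example.

```shell
# service_ports FILE
# Prints one "port/protocol" pair per <port .../> element found in FILE.
# Assumes one element per line with port="..." before protocol="...".
service_ports() {
    sed -n 's|.*<port port="\([^"]*\)" protocol="\([^"]*\)".*|\1/\2|p' "$1"
}
```

Called as service_ports /etc/firewalld/services/docker-swarm.xml, it lists the four port entries of the file above. firewalld can report the same information itself with firewall-cmd --info-service=docker-swarm.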

Then add it to the zone:

firewall-cmd --permanent --zone=swarm --add-service=docker-swarm
firewall-cmd --reload

firewall-cmd --zone=swarm --list-services
docker-swarm

We have now allowed the ports and protocols that the nodes of our cluster need, and of course all of these actions have to be repeated on each host of the cluster.