Building the Zerg Nest#
Installation is based on a fresh CentOS Stream minimal server. It is therefore only adapted for CentOS; it may or may not work on other distributions.
Prerequisites#
3 x nodes (bare-metal or VMs), each with:
- A mainstream Linux OS (tested on CentOS 8/9 Stream)
- At least 2GB RAM (you will need more Powaaaa for the stack I will present further along)
- At least 50GB disk space (but it'll be tight)
- Connectivity to each other within the same subnet, and on a low-latency link (i.e., no WAN links)
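Before going further, it is worth checking that the nodes actually see each other. A quick sketch, using the lab addresses that appear later in this article (adjust to your own):
for ip in 10.0.0.51 10.0.0.52 10.0.0.53; do
  ping -c 1 -W 1 "$ip" >/dev/null 2>&1 && echo "$ip reachable" || echo "$ip UNREACHABLE"
done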
Installing Docker and Docker Compose#
### Remove runc (the distribution's runc package conflicts with containerd.io)
dnf remove runc
### Adding docker repo
dnf install -y dnf-utils
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install -y docker-ce docker-ce-cli containerd.io
### Installing docker-compose
dnf install -y python3-pip
pip3 install --upgrade pip
pip3 install setuptools-rust
pip3 install docker-compose
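A quick sanity check that both tools landed where expected (the exact versions will vary):
docker --version
docker-compose --version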
Let's start the whale#
systemctl enable docker --now
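If you want to be sure the whale actually woke up, check the daemon and run a throwaway container:
systemctl is-active docker
docker run --rm hello-world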
And now create a zerg swarm from it#
From swarm1:
# docker swarm init
Swarm initialized: current node (ksdjlqsldjqsd2516685485) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm \
join --token SWMTKN-1-5pykfhyfvtsij0tg4ewrtqk7hz2twuq21okeqv54p1gw2ufdde-814yer1z55vmyk2mwdhvjbob1 \
10.0.0.51:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
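Side note: if a node has more than one network interface, Docker may not advertise the address you expect. You can pin it explicitly when initializing (10.0.0.51 being the first node in this lab):
docker swarm init --advertise-addr 10.0.0.51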
We are going to add the two other nodes as managers:
docker swarm join-token manager
To add a manager to this swarm, run the following command:
docker swarm join --token SWMTKN-1-5pykfhyfvtsij0tg4ewrtqk7hz2twuq21okeqv54p1gw2ufdde-2k0vay9aub5eheikw7qi9v82o 10.0.0.51:2377
Run the command provided on your other nodes to join them to the swarm as managers. Once they are added, the output of docker node ls (on any of the managers) should list them all:
# docker node ls
ID                            HOSTNAME           STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
p424u0yvmu0vvc8nsnspv83zw *   swarm1.lab.local   Ready    Active         Leader           20.10.6
kg6w6ucpb2jf8v8xqai23pv3a     swarm2.lab.local   Ready    Active         Reachable        20.10.6
lam7mgs5wus40iaydvp8u3ss7     swarm3.lab.local   Ready    Active         Reachable        20.10.6
You are now ready to swarm.
Official Documentation: https://docs.docker.com/engine/swarm/
Little Network tweak#
When running Docker Swarm on Red Hat or CentOS VMs under VMware, you may run into communication issues over the swarm routing mesh. This traces back to UDP packets being dropped by the source node; disabling checksum offloading appears to resolve it.
Run the following on your VMs:
ethtool -K [interface] tx-checksum-ip-generic off
cat > /etc/NetworkManager/dispatcher.d/pre-up.d/10-tx-checksum-ip-generic <<'EOF'
#!/bin/bash
ethtool -K ens192 tx-checksum-ip-generic off
EOF
chmod +x /etc/NetworkManager/dispatcher.d/pre-up.d/10-tx-checksum-ip-generic
Note: [interface] is your network adapter, so change it accordingly (the example above uses ens192).
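To confirm the setting took effect, query the offload state (lowercase -k reads instead of writing); it should report the feature as off:
ethtool -k ens192 | grep tx-checksum-ip-generic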
Firewalling#
Base configuration#
Activate firewalld; you may want to look in /etc/firewalld/zones/ first to see what is about to happen ^^. One issue that comes up quite often is a changed default ssh port: the ssh service declared in /usr/lib/firewalld/services/ssh.xml references port 22, so if you moved sshd, copy ssh.xml into /etc/firewalld/services/ and change the port there.
(And yes, this has happened to me a few times.)
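For example, assuming sshd was moved to port 2222 (an arbitrary example port), the override would look like this:
cp /usr/lib/firewalld/services/ssh.xml /etc/firewalld/services/
sed -i 's/port="22"/port="2222"/' /etc/firewalld/services/ssh.xml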
systemctl enable firewalld --now
Unmask the service if needed:
systemctl unmask firewalld
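You can then confirm it is running and see which zone applies by default:
firewall-cmd --state
firewall-cmd --get-default-zone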
By default, firewalld comes with a public zone already created.
This public zone allows ssh, cockpit and dhcpv6-client:
cat /etc/firewalld/zones/public.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
  <short>Public</short>
  <description>For use in public areas. You do not trust the other computers on networks to not harm your computer. Only selected incoming connections are accepted.</description>
  <service name="ssh"/>
  <service name="dhcpv6-client"/>
  <service name="cockpit"/>
</zone>
If you don't use cockpit or dhcpv6-client, you can remove them from the configuration.
For example, to remove the cockpit service:
firewall-cmd --permanent --zone=public --remove-service=cockpit
firewall-cmd --reload
Note the parameters here:
- --permanent: makes the rule persist across service restarts
- --zone: indicates which zone to modify (public by default)
- --remove-service: removes the named service, as declared in /etc/firewalld/services/ without the .xml extension
firewall-cmd --reload reloads firewalld with the latest configuration.
Let's add some services:
firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-service=https
firewall-cmd --reload
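The public zone should now list the web services alongside whatever you kept from the defaults, something like:
firewall-cmd --zone=public --list-services
ssh http https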
We are now going to create a new zone representing the nodes of our cluster, and add their addresses to it as sources (i.e., the traffic coming in from them):
firewall-cmd --permanent --new-zone=swarm
firewall-cmd --permanent --zone=swarm --add-source=10.0.0.51
firewall-cmd --permanent --zone=swarm --add-source=10.0.0.52
firewall-cmd --permanent --zone=swarm --add-source=10.0.0.53
firewall-cmd --reload
Let's check that the sources were added:
firewall-cmd --zone=swarm --list-sources
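If everything went well, the three addresses should come back:
10.0.0.51 10.0.0.52 10.0.0.53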
Docker Swarm#
Let's add the docker-swarm service to the swarm zone.
By default firewalld comes bundled with a set of predefined services; you can find them in /usr/lib/firewalld/services.
I like to copy the ones I use into /etc/firewalld/services, as that prevents them from being changed by an update: firewalld prioritizes services in /etc/firewalld/services/ over those in /usr/lib/firewalld/services.
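If you just want to see what a service opens without reading the XML, firewall-cmd can print it:
firewall-cmd --info-service=docker-swarm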
So let's copy the docker-swarm service:
cp /usr/lib/firewalld/services/docker-swarm.xml /etc/firewalld/services/
cat /etc/firewalld/services/docker-swarm.xml
<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Docker integrated swarm mode</short>
  <description>Natively managed cluster of Docker Engines (>=1.12.0), where you deploy services.</description>
  <port port="2377" protocol="tcp"/>
  <port port="7946" protocol="tcp"/>
  <port port="7946" protocol="udp"/>
  <port port="4789" protocol="udp"/>
  <protocol value="esp"/>
</service>
Then add it to the zone:
firewall-cmd --permanent --zone=swarm --add-service=docker-swarm
firewall-cmd --reload
firewall-cmd --zone=swarm --list-services
docker-swarm
We have now allowed the ports and protocols the nodes need to operate as a cluster, and of course all of these actions have to be performed on each host of the cluster.
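As a final end-to-end test of the routing mesh, here is a minimal sketch (the service name web and the nginx image are arbitrary examples). It publishes a service on port 80, which we opened earlier via the http service, and curls it through a node that may not even run a task:
docker service create --name web --publish published=80,target=80 --replicas 3 nginx
curl -I http://10.0.0.52
docker service rm web
If curl answers on every node's address, the mesh and your firewall rules are doing their job.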