How to Load Balance Galera/MariaDB Servers on Ubuntu Server 18.04 LTS

Post Reply
User avatar
LHammonds
Site Admin
Site Admin
Posts: 712
Joined: Fri Jul 31, 2009 6:27 pm
Are you a filthy spam bot?: No
Location: Behind You
Contact:

How to Load Balance Galera/MariaDB Servers on Ubuntu Server 18.04 LTS

Post: # 722Post LHammonds
Mon Aug 05, 2019 11:39 am

------------- WORK-IN-PROGRESS -------------

Greetings and salutations,

I hope this thread will be helpful to those who follow in my foot steps as well as getting any advice based on what I have done / documented.

To discuss this thread, please participate here: << Insert Ubuntu Forums Link >>

High-level overview

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for very high traffic web sites and powers quite a number of the world's most visited ones

This tutorial will cover how to setup a redundant HAProxy system which will direct traffic to redundant database servers. This will provide high availability when a database or load balancing server goes offline...for things such as maintenance tasks.

This is an overview image of a highly-available web server platform.
This article covers the database load balancers.
Image

Tools utilized in this process
Helpful links

The list below are sources of information that was helpful in the creation of this document.
Assumptions

This documentation will need to make use of some very-specific information that will most-likely be different for each person / location. And as such, this information will be noted in this section. They will be highlighted in red throughout the document as a reminder that you should plug-in your own value rather than actually using these "place-holder" values.

Under no circumstance should you use the actual values listed below. They are place-holders for the real thing. This is just a checklist template you need to have answered before you start the install process.

Wherever you see RED in this document, you need to substitute it for you will use in your environment.
  • 192.168.107.166 (VirtualIP)
  • Load Balancer #1 server name: srv-lbdb1 (master)
  • Load Balancer #1 Internal IP address: 192.168.107.164
  • Load Balancer #2 server name: srv-lbdb2 (slave)
  • Load Balancer #2 Internal IP address: 192.168.107.165
  • Ubuntu Admin ID: administrator
  • Ubuntu Admin Password: myadminpass
  • Email Server Name (remote): srv-mail
  • Email Server Internal IP (remote): 192.168.107.25
  • DB Server #1 Internal IP Address: 192.168.107.123
  • DB Server #2 Internal IP Address: 192.168.107.124
  • DB Server #3 Internal IP Address: 192.168.107.125
Load Balancer Servers - Setup two Ubuntu servers for use as the HAProxy servers. This tutorial assumes the server was configured according to this tutorial: How to install and configure Ubuntu Server

Database Servers - Setup three Ubuntu servers for use as the database servers. This tutorial assumes the servers were configured according to these tutorials: How to Install and Configure Ubuntu Server, How to Install MariaDB Galera Cluster on Ubuntu 18.04 LTS

It is also assumed the reader knows how to use the VI editor. If not, you will need to beef up your skill set or use a different editor in place of it.

User avatar
LHammonds
Site Admin
Site Admin
Posts: 712
Joined: Fri Jul 31, 2009 6:27 pm
Are you a filthy spam bot?: No
Location: Behind You
Contact:

Install Keepalived

Post: # 723Post LHammonds
Mon Aug 05, 2019 11:47 am

Install Keepalived

Run this command on both load balance servers:

Code: Select all

sudo apt install keepalived
Master Load Balancer Firewall Rules

Edit the firewall script that was created during the initial setup of the server (if you followed my instructions):

Code: Select all

vi /var/scripts/prod/en-firewall.sh
Add (or enable) the following:

NOTE: If you have more than 1 backup load balancer (VRRP Router), be sure to add their IP addresses as well.

Code: Select all

echo "Adding Database Server rules"
ufw allow proto tcp to any port 3306 comment 'MariaDB' 1>/dev/null 2>&1
echo "Adding VRRP rules"
ufw allow to 224.0.0.18 comment 'VRRP Broadcast' 1>/dev/null 2>&1
ufw allow from 192.168.107.165 comment 'VRRP Router' 1>/dev/null 2>&1
Run the updated rules:

Code: Select all

/var/scripts/prod/en-firewall.sh
Slave Load Balancer Firewall Rules

Code: Select all

vi /var/scripts/prod/en-firewall.sh
Add the following:

Code: Select all

echo "Adding Database Server rules"
ufw allow proto tcp to any port 3306 comment 'MariaDB' 1>/dev/null 2>&1
echo "Adding VRRP rules"
ufw allow to 224.0.0.18 comment 'VRRP Broadcast' 1>/dev/null 2>&1
ufw allow from 192.168.107.164 comment 'VRRP Router' 1>/dev/null 2>&1
Run the updated rules:

Code: Select all

/var/scripts/prod/en-firewall.sh
Master Keepalive Config

On the master server (srv-lbdb1), create the keepalive configuration file:

Code: Select all

sudo touch /etc/keepalived/keepalived.conf
sudo chown root:root /etc/keepalived/keepalived.conf
sudo chmod 600 /etc/keepalived/keepalived.conf
Edit the configuration file:

Code: Select all

sudo vi /etc/keepalived/keepalived.conf
Add the following to the file (substituting for your own values):

Code: Select all

global_defs {
  notification_email {
    my_email@mydomain.com
  }
  notification_email_from keepalived@mydomain.com
  smtp_server 192.168.107.25
  smtp_connect_timeout 30
}
vrrp_script chk_haproxy {
  script "/usr/bin/killall -0 haproxy"
  interval 1            # check every second
  weight 2              # add 2 points of priority if OK
}
vrrp_instance VI_1 {
  interface ens32
  state MASTER
  smtp_alert
  virtual_router_id 51  # Should be same on all LBs
  priority 101          # 101 on master, 100 on slaves
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  virtual_ipaddress {
    192.168.107.166
  }
  track_script {
    chk_haproxy
  }
}
Restart the service:

Code: Select all

sudo systemctl restart keepalived
Slave Keepalive Config

On the slave server (srv-lbdb2), create the keepalive configuration file:

Code: Select all

sudo touch /etc/keepalived/keepalived.conf
sudo chown root:root /etc/keepalived/keepalived.conf
sudo chmod 600 /etc/keepalived/keepalived.conf
Edit the configuration file:

Code: Select all

sudo vi /etc/keepalived/keepalived.conf
Add the following to the file (substituting for your own values):

Code: Select all

global_defs {
  notification_email {
    my_email@mydomain.com
  }
  notification_email_from keepalived@mydomain.com
  smtp_server 192.168.107.25
  smtp_connect_timeout 30
}
vrrp_script chk_haproxy {
  script "/usr/bin/killall -0 haproxy"
  interval 1            # check every second
  weight 2              # add 2 points of priority if OK
}
vrrp_instance VI_1 {
  interface ens32
  state MASTER
  smtp_alert
  virtual_router_id 51  # Should be same on all LBs
  priority 100          # 101 on master, 100 on slaves
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  virtual_ipaddress {
    192.168.107.166
  }
  track_script {
    chk_haproxy
  }
}
Restart the service:

Code: Select all

sudo systemctl restart keepalived
Test Keepalived

You should be able to ping the virtual IP address at this point. If you reboot srv-lb1 while continuously pinging the virtual IP, you should only see 1 or maybe 2 drops in the ping when the slave takes over for the master. When the master comes back, another 1 or 2 drops in the ping will occur again as the virtual IP moves back from the slave to the master.

This is what the master NIC should look like (while active):

Code: Select all

# ip addr show ens32

Code: Select all

2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:50:56:bf:27:cc brd ff:ff:ff:ff:ff:ff
    inet 192.168.107.164/24 brd 192.168.107.255 scope global ens32
       valid_lft forever preferred_lft forever
    inet 192.168.107.166/32 scope global ens32
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:febf:27cc/64 scope link
       valid_lft forever preferred_lft forever
This is what the slave NIC should look like (while inactive):

Code: Select all

# ip addr show ens32

Code: Select all

2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:50:56:bf:6a:52 brd ff:ff:ff:ff:ff:ff
    inet 192.168.107.165/24 brd 192.168.107.255 scope global ens32
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:febf:6a52/64 scope link
       valid_lft forever preferred_lft forever
When the slave NIC is active with the virtual IP, you will see the virtual IP on it just like the master.

User avatar
LHammonds
Site Admin
Site Admin
Posts: 712
Joined: Fri Jul 31, 2009 6:27 pm
Are you a filthy spam bot?: No
Location: Behind You
Contact:

Install xinetd

Post: # 724Post LHammonds
Mon Aug 05, 2019 11:48 am

Install xinetd

Run this command on all database nodes in the cluster:

Code: Select all

sudo apt install xinetd
Firewall Rules

On each database node in the cluster, edit the firewall script that was created during the initial setup of the server (if you followed my instructions):

Code: Select all

vi /var/scripts/prod/en-firewall.sh
Add the following:

Code: Select all

echo "Adding xinetd rules"
ufw allow from 192.168.107.0/24 proto tcp to any port 9200 comment 'clustercheck' 1>/dev/null 2>&1
Run the updated rules:

Code: Select all

/var/scripts/prod/en-firewall.sh
Install Cluster Check Script

Run these commands on all database nodes in the cluster:

Code: Select all

cd /tmp
wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck -O /tmp/clustercheck.sh
sudo chown root:root /tmp/clustercheck.sh
sudo chmod 755 /tmp/clustercheck.sh
sudo mv /tmp/clustercheck.sh /var/scripts/prod/clustercheck.sh
Modify Cluster Check Script

We need to make some slight changes in the code so it will work correctly with the built-in credentials and reference the correct my.cnf file.

On each database node, edit the script:

Code: Select all

sudo vi /var/scripts/prod/clustercheck.sh
Change the following lines (lines 40, 41, 42):

Code: Select all

MYSQL_USERNAME="${MYSQL_USERNAME:=-clustercheckuser}"
MYSQL_PASSWORD="${MYSQL_PASSWORD-clustercheckpassword!}"
DEFAULTS_EXTRA_FILE=${DEFAULTS_EXTRA_FILE:-/etc/my.cnf}
to these:

Code: Select all

MYSQL_USERNAME="${MYSQL_USERNAME:=clustercheckuser}"
MYSQL_PASSWORD="${MYSQL_PASSWORD:=clustercheckpassword!}"
DEFAULTS_EXTRA_FILE=${DEFAULTS_EXTRA_FILE:-/etc/mysql/my.cnf}
If you want the output to include the name of the node, you can change these lines (lines 29,86,98,109):

Code: Select all

    echo -en "Percona XtraDB Cluster Node is manually disabled.\r\n"
            echo -en "Percona XtraDB Cluster Node is read-only.\r\n"
    echo -en "Percona XtraDB Cluster Node is synced.\r\n"
    echo -en "Percona XtraDB Cluster Node is not synced.\r\n"
to these:

Code: Select all

    echo -en "MariaDB Cluster Node ${HOSTNAME} is manually disabled.\r\n"
            echo -en "MariaDB Cluster Node ${HOSTNAME} is read-only.\r\n"
    echo -en "MariaDB Cluster Node ${HOSTNAME} is synced.\r\n"
    echo -en "MariaDB Cluster Node ${HOSTNAME} is not synced.\r\n"

Configure xinetd

Run these SQL commands on just one database node in the cluster:

Code: Select all

mysql -u root -p
GRANT PROCESS ON *.* TO 'clustercheckuser'@'localhost' IDENTIFIED BY 'clustercheckpassword!';
FLUSH PRIVILEGES;
exit
Run these commands on all database nodes in the cluster:

Code: Select all

sudo touch /etc/xinetd.d/mariadbchk
sudo chown root:root /etc/xinetd.d/mariadbchk
sudo chmod 644 /etc/xinetd.d/mariadbchk
sudo vi /etc/xinetd.d/mariadbchk
Add the following to the file:

Code: Select all

# default: on
# description: mariadbchk
service mariadbchk
{
  disable = no
  flags = REUSE
  socket_type = stream
  port = 9200
  wait = no
  user = nobody
  server = /var/scripts/prod/clustercheck.sh
  log_on_failure += USERID
  only_from = 0.0.0.0/0
  per_source = UNLIMITED
}
Run the following commands on each database node in the cluster:

Code: Select all

sudo touch /lib/systemd/system/mariadbchk.socket
sudo touch /lib/systemd/system/mariadbchk@.service
sudo chown root:root /lib/systemd/system/mariadbchk*
sudo chmod 644 /lib/systemd/system/mariadbchk*
Edit the socket file on each database node in the cluster:

Code: Select all

vi /lib/systemd/system/mariadbchk.socket
Add the following to the file:

Code: Select all

[Unit]
Description=MariaDB Galera Cluster node check socket

[Socket]
ListenStream=9200
Accept=yes

[Install]
WantedBy=sockets.target
Edit the service file on each database node in the cluster:

Code: Select all

vi /lib/systemd/system/mariadbchk@.service
Add the following to the file:

Code: Select all

[Unit]
Description=MariaDB Galera Cluster node check service

[Service]
ExecStart=-/var/scripts/prod/clustercheck.sh
StandardInput=socket
Run these commands on each of the database nodes in the cluster:

Code: Select all

sudo systemctl enable mariadbchk.socket
sudo systemctl start mariadbchk.socket
Test xinetd

On each of the database nodes in the cluster, run the script manually:

Code: Select all

sudo /var/scripts/prod/clustercheck.sh

Code: Select all

HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
Access the port locally via telnet and see if it responds as expected:

Code: Select all

telnet 127.0.0.1 9200

Code: Select all

Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
Connection closed by foreign host.
Now stop the database node on one of the servers and test the port again:

Code: Select all

sudo systemctl stop mariadb

Code: Select all

telnet 127.0.0.1 9200

Code: Select all

Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44

Percona XtraDB Cluster Node is not synced.
Connection closed by foreign host.
Do not forget to start the database node:

Code: Select all

sudo systemctl start mariadb
On each of the load balance servers (srv-lbdb1 and srv-lbdb2), make sure you can telnet to each database node and get the status (this tests your database firewall rules):

Code: Select all

telnet srv-db1 9200
telnet srv-db2 9200
telnet srv-db3 9200

User avatar
LHammonds
Site Admin
Site Admin
Posts: 712
Joined: Fri Jul 31, 2009 6:27 pm
Are you a filthy spam bot?: No
Location: Behind You
Contact:

Install HAProxy

Post: # 725Post LHammonds
Mon Aug 05, 2019 4:46 pm

Install HAProxy

Run this command on both load balance servers:

Code: Select all

sudo apt install haproxy
Name Resolution

You can modify your host file for name resolution and just use the names of servers in your configuration files rather than the IP addresses. If a server IP changes in the future, you only need to modify the host which is much easier than tracking down various application configuration files. You could do this with an internal DNS server but I prefer using the local host file for fastest resolution.

Edit the host file:

Code: Select all

sudo vi /etc/hosts
Add the database servers (substituting for your own values):

Code: Select all

192.168.107.123 srv-db1
192.168.107.124 srv-db2
192.168.107.125 srv-db3
Allow Virtual IP Binding

If you try to bind the inactive proxy to an IP it is not currently using (because the other proxy is using it) then it will fail to load the service. We need to modify the system to allow binding to a non-local IP that is not currently active.

On both load balance servers, edit the sysctl.conf file:

Code: Select all

sudo vi /etc/sysctl.conf
Add the following to the end of the file:

Code: Select all

net.ipv4.ip_nonlocal_bind=1
Activate the change:

Code: Select all

sysctl -p
HAProxy Database Users

Create the status check user for the load balance virtual IP on just one of the database nodes in the cluster (the change will replicate to the others). We need to set the host IP for every load balancing server IP we have (including the virtual IP):

Code: Select all

mysql
CREATE USER 'haproxy_check'@'192.168.107.164';
CREATE USER 'haproxy_check'@'192.168.107.165';
CREATE USER 'haproxy_check'@'192.168.107.166';
FLUSH PRIVILEGES;
exit
HAProxy Configuration

This is the default ownership and file permission settings of the configuration file:

Code: Select all

chown root:root /etc/haproxy/haproxy.cfg
chmod 644 /etc/haproxy/haproxy.cfg
Backup the original configuration file (on both servers):

Code: Select all

sudo cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.bak
Edit the configuration file (on all load balance servers):

Code: Select all

sudo vi /etc/haproxy/haproxy.cfg
Add the following to the bottom (substituting your own values)

Code: Select all

## Galera Cluster Frontend configuration
frontend galera_cluster_frontend
  ## Galera Cluster uses TCP connections.
  mode tcp
  ## Setup listener on port 3306 on the virtual IP.
  bind 192.168.107.166:3306
  ## Override the default log method to match tcp mode.
  option tcplog
  default_backend galera_cluster_backend

## Galera Cluster Backend configuration
backend galera_cluster_backend
  ## Galera Cluster uses TCP connections.
  mode tcp
  ## Enable keepalive function to maintain TCP connections.
  option tcpka
  ## Various policies for determining how to route traffic to the servers.
#  balance first
#  balance leastconn
#  balance roundrobin
  balance source
#  balance static-rr
  ## Defined backend database server check.
  option httpchk
  ## Node definitions for routing to database servers.
  server srv-db-cluster1 srv-db1:3306 check port 9200 rise 2 fall 3 inter 2000 fastinter 1000 downinter 5000
  server srv-db-cluster2 srv-db2:3306 check port 9200 rise 2 fall 3 inter 2000 fastinter 1000 downinter 5000
  server srv-db-cluster3 srv-db3:3306 check port 9200 rise 2 fall 3 inter 2000 fastinter 1000 downinter 5000

## HAProxy stats web gui - This entire section is optional.
listen stats
  ## Setup listener on port 9000 on any interface.
  bind *:9000
  ## http mode so a web browser can be used to access it.
  mode http
  ## Enable metrics to be recorded.
  stats enable
  ## Configure the URI.  Example: http://192.168.107.164:9000/stats
  stats uri /stats
  ## How long the browser waits before refreshing the page.
  stats refresh 30s
  ## Title for popup window.
  stats realm HAProxy\ Statistics
  ## Hide the HAProxy version. Ex: version 1.8.8-1ubuntu0.4, released 2019/01/24
  stats hide-version
  ## Define user credentials.
  stats auth haproxy:haproxy
  stats auth viewer:viewer
  ## Allows taking down and bringing up backend servers.
  stats admin if TRUE
Validate the changes to the configuration files:

Code: Select all

sudo haproxy -f /etc/haproxy/haproxy.cfg -c
On both servers, restart the proxy service:

Code: Select all

sudo systemctl restart haproxy
Verify that the service started and is running (active):

Code: Select all

systemctl status haproxy
You can also verify it is listening on the expected ports:

Code: Select all

sudo netstat -ntlp | grep haproxy
Output:

Code: Select all

tcp        0      0 0.0.0.0:9000            0.0.0.0:*               LISTEN      19475/haproxy
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      19475/haproxy
Open a web browser and have a look at the statistics page at http://192.168.107.166:9000/stats/

Image

Post Reply