MySQL 8.0.17 Group Replication & Cloning

This guide aims to get you running MySQL 8.0 with the group replication plugin, the new clone plugin, and TLS/SSL between the nodes enabled.

Link for myself to remember: how to recover a cluster that has lost its majority after a network failure.

Requirements

At the time of writing, this guide was made to be used with:

  • MySQL Community Server 8.0.17 (from the MySQL APT repository)
  • A Debian/Ubuntu-based system managed with apt

You will need 3 machines to have a fault-tolerant cluster (it allows 1 node to be down). Why 3? With only 1 or 2 nodes, the whole cluster is down as soon as any node goes down, because a majority can no longer be formed. With 3 or 4 nodes the group survives 1 node being down, and with 5 it survives 2. For more information on why you need a certain number of nodes, have a look at Fault-tolerance in the official docs.
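In general, the group needs a majority of its members reachable to keep accepting writes. As a rough rule of thumb (n = group size, f = failures tolerated):

majority = floor(n / 2) + 1
f        = n - majority = floor((n - 1) / 2)

So n = 3 tolerates 1 failure, n = 5 tolerates 2, and the maximum group size of 9 tolerates 4.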

In my case, I spun up 5 nodes at Hetzner's cloud service for this tutorial:

  • node1.finalx.dev (Nuremberg)
  • node2.finalx.dev (Nuremberg)
  • node3.finalx.dev (Falkenstein)
  • node4.finalx.dev (Falkenstein)
  • node5.finalx.dev (Helsinki)

Preparation

  1. Make sure your system is up to date with the new sources: run apt update && apt upgrade.
  2. Install mysql-apt-config.deb on each node individually, and configure it to serve MySQL 8.0 (default!).
  3. Install MySQL Server with apt install mysql-community-server.

Please really make sure you're running 8.0.17 or higher before continuing! 8.0.16 won't do!

This is because:

  • Primary election algorithm: electing the primary based on patch level (rather than just major/minor version) requires 8.0.17 or higher.
  • Using the clone-plugin with group replication requires 8.0.17 or higher.
  • Disabling the MyISAM storage engine and not running into problems with mysql_upgrade requires 8.0.16 or higher.
  • Using IPv6-addresses as local addresses with group replication requires 8.0.14 or higher.
  • Enabling transaction consistency guarantee (group_replication_consistency variable) for every transaction requires 8.0.14 or higher.
  • Changing a running cluster from multi-master to single-master requires 8.0.13 or higher.
  • Some settings have been omitted in this guide because their default values were improved for replication in 8.0.3.
  • Another setting has been omitted in this guide because it was changed to be the default in 8.0.2 (transaction_write_set_extraction=XXHASH64).
  • The mysql.session user requires an 8.0.2 system schema or higher.
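If you're not sure which version ended up installed, a quick check from the MySQL client:

SELECT VERSION();

It should report 8.0.17 or newer before you continue.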

/etc/hosts

If you are going to be using hostnames like me, update the /etc/hosts file on each node so that the nodes don't depend on DNS resolution.

For me, that'll look like this (since I'm using a private network between the hosts):

10.0.0.2 node1.finalx.dev node1
10.0.0.3 node2.finalx.dev node2
10.0.0.4 node3.finalx.dev node3
10.0.0.5 node4.finalx.dev node4
10.0.0.6 node5.finalx.dev node5

Single master

We will prepare one node first, and then add the rest later. You will be able to view the current active master in the PERFORMANCE_SCHEMA tables after everything is up and running.
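For example, once the group is running, this shows which member is currently the primary:

SELECT MEMBER_HOST, MEMBER_PORT
FROM performance_schema.replication_group_members
WHERE MEMBER_ROLE = 'PRIMARY';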

Before we start, generate a random UUID; you can do this with:

uuidgen

Or, alternatively, if you don't have it installed for some reason, directly with MySQL:

echo $(mysql -Ne 'SELECT UUID()')

This will give you a simple clean output, like 0b224cd7-e518-11e9-b8a7-96000031dba4. We will use this random UUID as cluster ID later.

Setting up SSL before nodes join the cluster

If you want this, have a look at the official website.

You will want this, because recovery with the new authentication plugin can only be done over encrypted channels. On any server running MySQL 8.0, a CA, client cert and server cert are automatically generated. More on that later.
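To check that those auto-generated certificates are actually being used, you can look at the SSL variables; by default they point at ca.pem, server-cert.pem and server-key.pem relative to the data directory:

SHOW GLOBAL VARIABLES LIKE 'ssl_%';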

Setting up a replication user

On each instance separately, open a MySQL-client with mysql, and execute the following:

SET SQL_LOG_BIN=0;
CREATE USER 'rec_ssl_user'@'%' REQUIRE SSL;
GRANT REPLICATION SLAVE ON *.* TO 'rec_ssl_user'@'%';
GRANT BACKUP_ADMIN ON *.* TO 'rec_ssl_user'@'%';
SET SQL_LOG_BIN=1;
CHANGE MASTER TO MASTER_USER="rec_ssl_user" FOR CHANNEL "group_replication_recovery";

This will disable the binary log before creating the user, and re-enable it afterwards. Then, it sets up the replication user for the group_replication_recovery channel.
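If you want to double-check the user before moving on, SHOW GRANTS lists both privileges:

SHOW GRANTS FOR 'rec_ssl_user'@'%';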

Server 1 (node1.finalx.dev)

Make sure you copy server 1's certificate files over to all other nodes:

  • /var/lib/mysql/ca.pem
  • /var/lib/mysql/ca-key.pem
  • /var/lib/mysql/client-cert.pem
  • /var/lib/mysql/client-key.pem
  • /var/lib/mysql/server-cert.pem
  • /var/lib/mysql/server-key.pem

This is needed because the CA has to be the same on every node, and the client/server certificates have to be signed by that CA. The following loop spits them out in an easily copy/pasteable format:

# Wrap each PEM file in a heredoc so the output can be pasted straight into a shell on another node
for a in /var/lib/mysql/{ca,ca-key,client-cert,client-key,server-cert,server-key}.pem; do
  echo "cat << EOF > $a"
  cat "$a"
  echo "EOF"
done

Just paste the entire output in the shell on the other nodes after installing MySQL.
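Depending on which user you paste this as, the files may end up owned by root; make sure they end up owned by the mysql user and that the key files are not world-readable. A minimal sketch, assuming the default data directory:

chown mysql:mysql /var/lib/mysql/*.pem
chmod 600 /var/lib/mysql/*-key.pem
chmod 644 /var/lib/mysql/ca.pem /var/lib/mysql/*-cert.pem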

Next, open up /etc/mysql/mysql.conf.d/mysqld.local.cnf and add the following:

[mysqld]
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"

gtid_mode=ON
enforce_gtid_consistency=ON
binlog_checksum=NONE

plugin-load-add=mysql_clone.so
clone=FORCE_PLUS_PERMANENT

plugin_load_add=group_replication.so
group_replication=FORCE_PLUS_PERMANENT
group_replication_start_on_boot=off
group_replication_bootstrap_group=off
group_replication_ssl_mode=REQUIRED

### Requires adapting per server/cluster
group_replication_group_name    = "0b224cd7-e518-11e9-b8a7-96000031dba4"
group_replication_ip_whitelist  = "127.0.0.1,::1,10.0.0.0/8"

server_id = 1
bind-address                    =  10.0.0.2
mysqlx_bind_address             =  10.0.0.2
report_host                     = "10.0.0.2"
group_replication_local_address = "10.0.0.2:33061"
group_replication_group_seeds   = "10.0.0.2:33061"

It is vitally important that you do not simply add all of your servers (especially when you have more than 3) to group_replication_group_seeds just yet! When a new node joins, it will try to connect to every single one of them, exhaust its connection retries, and fail to start. Add one node at a time, and only after the previous new node has finished joining!

It is important that you:

  • Leave group_replication_start_on_boot set to off on all nodes until the whole cluster is up and running. Then just edit the config and turn it on for the next reboot.
  • Leave group_replication_bootstrap_group set to off on all nodes. You will only need this if your whole cluster is down, and even then you can do it from a MySQL-client.

Getting it to work

Restart the MySQL-server on server 1:

systemctl restart mysql

Next, fire up the mysql client, and issue the following:

SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

Bootstrapping effectively marks the server as a new starting point for the cluster. You only enable it when you bring the cluster back up after all nodes have been down, and leave it off otherwise. That is also why it is turned back off again right after starting group replication.

You can use PERFORMANCE_SCHEMA.replication_group_members to see the status of your cluster:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | a68c19b4-e517-11e9-8aa4-96000031dba4 | 10.0.0.2    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)

Create a database and a table, and fill it with some data so you can see replication in action later:

CREATE DATABASE test;
USE test;
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 TEXT NOT NULL);
INSERT INTO t1 VALUES (1, 'Luis');

Check the content of table t1 and the binary log:

SELECT * FROM t1;

Some versions won't show the events properly with SHOW BINLOG EVENTS, but if you use the shell command mysqlbinlog on the binlog files in /var/lib/mysql, you'll see them regardless.
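For example, assuming the default binlog base name (binlog; check SHOW BINARY LOGS for the actual file names on your server):

mysqlbinlog --verbose /var/lib/mysql/binlog.000001 | less

You should see the CREATE DATABASE/CREATE TABLE statements and the row event for the INSERT, with GTIDs that use the group name as their UUID.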

Joining in other instances (node2.finalx.dev - node5.finalx.dev)

Right now, the cluster is active and running already, but with only one member. Let's expand it by adding our second node to it.

For every server you add (DO ONE BY ONE, NOT ALL AT ONCE, IT WILL BREAK), append the new server to the config of the other running servers, like so:

- group_replication_group_seeds   = "node1.finalx.dev:33061"
+ group_replication_group_seeds   = "node1.finalx.dev:33061,node2.finalx.dev:33061"

After that, do not restart MySQL. Instead, fire up mysql and use a query to make the change active in the running instance (if you restarted the only member, the group would go down and you would have to bootstrap it again):

SET GLOBAL group_replication_group_seeds = "node1.finalx.dev:33061,node2.finalx.dev:33061";
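You can verify that the running value was updated with:

SELECT @@GLOBAL.group_replication_group_seeds;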

Next, on the new server to add, open up /etc/mysql/mysql.conf.d/mysqld.local.cnf and add the following, similar to server 1:

[mysqld]
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"

gtid_mode=ON
enforce_gtid_consistency=ON
binlog_checksum=NONE

plugin-load-add=mysql_clone.so
clone=FORCE_PLUS_PERMANENT

plugin_load_add=group_replication.so
group_replication=FORCE_PLUS_PERMANENT
group_replication_start_on_boot=on # CHANGE THIS FROM SERVER 1'S CONFIG
group_replication_bootstrap_group=off
group_replication_ssl_mode=REQUIRED

### Requires adapting per server/cluster
group_replication_group_name    = "0b224cd7-e518-11e9-b8a7-96000031dba4"
group_replication_ip_whitelist  = "127.0.0.1,::1,10.0.0.0/8"

server_id = 2
bind-address                    =  10.0.0.3
mysqlx_bind_address             =  10.0.0.3
report_host                     = "10.0.0.3"
group_replication_local_address = "10.0.0.3:33061"
group_replication_group_seeds   = "10.0.0.2:33061,10.0.0.3:33061"

Note that here we already have group_replication_start_on_boot turned on (you can turn it on for server 1 now as well), and that group_replication_group_seeds contains only our current cluster plus the new server to add.

If you followed this guide earlier, you already set up the replication user here. If not, scroll up a bit and do it now.

All that's left is to restart the MySQL instance:

systemctl restart mysql
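After the restart, the new node should show up in the group and move from RECOVERING to ONLINE. Since both donor and joiner run 8.0.17 with the clone plugin loaded, distributed recovery may use a full clone instead of replaying binary logs. Both can be watched from the client (clone_status is only populated when a clone has actually run):

SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;

SELECT STATE, SOURCE, ERROR_NO
FROM performance_schema.clone_status;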