
MySQL Load Balancing: Migrating ProxySQL from On-Prem to AWS EC2


Migrations between different environments are not uncommon in the database world. Migrations from one provider to another, or moving from one datacenter to another - all of this happens on a regular basis. Organisations search for expense reduction, better flexibility, and velocity. Those who own their datacenters look forward to switching to one of the cloud providers, where they can benefit from better scalability and easier handling of capacity changes. Migrations touch all elements of the database environment - the databases themselves, but also the proxy and caching layer. Moving databases around is tricky, but it is also hard to manage multiple proxy instances and ensure that the configuration is in sync across all of them.

In this blog post we will take a look at the challenges related to one particular piece of the migration - moving the ProxySQL proxy layer from an on-prem environment to EC2. Please keep in mind this is just an example; the truth is, the majority of migration scenarios will look pretty similar, so the suggestions we are going to give in this blog post should apply to most cases. Let’s take a look at the initial setup.

Initial On-Prem Environment

The initial, on-prem setup is fairly simple - we have a three-node Galera Cluster and two ProxySQL instances which are configured to route the traffic to the backend databases. Each ProxySQL instance has Keepalived colocated. Keepalived manages a Virtual IP and assigns it to one of the available ProxySQL instances. Should that instance fail, the VIP will be moved to the other ProxySQL, which will commence serving the traffic. Application servers connect to the VIP and are not aware of the layout of the proxy and database tiers.

Migrating to an AWS EC2 Environment

There are a couple of prerequisites required before we can plan a migration, and migrating the proxy layer is no different in this regard. First of all, we have to have network access between the existing, on-prem environment and EC2. We will not go into details here as there are numerous options to accomplish that. AWS provides services like AWS Direct Connect or hybrid cloud integration in Amazon Virtual Private Cloud. You can use solutions like setting up an OpenVPN server, or even use SSH tunneling to do the trick. It all depends on the hardware and software options at your disposal and how flexible you want the solution to be. Once the connectivity is there, let’s stop for a bit and think about how the setup should look.
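As a quick illustration of the last option, a minimal sketch of an SSH tunnel between the two environments could look as follows (the addresses and host names are hypothetical):

## Forward local port 3307 on the on-prem host to MySQL (3306) on an EC2 instance,

## keeping the tunnel running in the background:

ssh -f -N -L 3307:10.0.1.15:3306 ec2-user@ec2-bastion.example.com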

From the ProxySQL standpoint, there is one main concern - how to ensure that the configuration of the ProxySQL instances in EC2 will be in sync with the configuration of the ProxySQL instances on-prem? This may not be a big deal if your configuration is pretty much stable - you are not adding query rules or tweaking the configuration. In that case it will be enough to just apply the existing configuration to the newly created ProxySQL instances in EC2. There are a couple of ways to do that. First of all, if you are a ClusterControl user, the simplest way will be to use the “Synchronize Instances” job, which is designed to do exactly this.

Another option could be to use the .dump command from SQLite: http://www.sqlitetutorial.net/sqlite-dump/
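ProxySQL keeps its configuration in an SQLite database, so a minimal sketch of this approach, assuming the default datadir location, could look like this:

## On the source ProxySQL instance, dump the configuration database:

sqlite3 /var/lib/proxysql/proxysql.db .dump > proxysql_config.sql

## On the new EC2 instance, stop ProxySQL, rebuild its database from the dump, and start it again:

systemctl stop proxysql

mv /var/lib/proxysql/proxysql.db /var/lib/proxysql/proxysql.db.bak

sqlite3 /var/lib/proxysql/proxysql.db < proxysql_config.sql

systemctl start proxysql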

If you store your configuration as a part of some sort of infrastructure orchestration tool (Ansible, Chef, Puppet), you can easily reuse those scripts to provision new ProxySQL instances with proper configuration.

What if the configuration changes quite often? Well, there are additional options to consider. First of all, most likely all the solutions above would still work, as long as the ProxySQL configuration is not changing every couple of minutes (which is highly unlikely) - you can always sync the configuration right before you do the switchover.

For cases where the configuration changes quite often, you can consider setting up a ProxySQL cluster. The setup has been explained in detail in our blog post: https://severalnines.com/blog/how-cluster-your-proxysql-load-balancers. If you would like to use this solution in a hybrid setup, over a WAN connection, you may want to increase cluster_check_interval_ms a bit, from the default 1 second to a higher value (5 - 10 seconds). The ProxySQL cluster will ensure that all the configuration changes made in the on-prem setup are replicated by the ProxySQL instances in EC2.
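For illustration, the variable could be changed from the ProxySQL admin interface like this (a sketch, assuming the default admin credentials and port 6032):

mysql -u admin -padmin -h 127.0.0.1 -P6032

mysql> UPDATE global_variables SET variable_value='5000' WHERE variable_name='admin-cluster_check_interval_ms';

mysql> LOAD ADMIN VARIABLES TO RUNTIME;

mysql> SAVE ADMIN VARIABLES TO DISK;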

The final thing to consider is how to switch to the correct servers in ProxySQL. The gist is - ProxySQL stores a list of backend MySQL servers to connect to. It tracks their health and monitors latency. In the setup we discuss, our on-prem ProxySQL servers hold a list of backend servers which are also located on-prem. This is the configuration we will sync to the EC2 ProxySQL servers. This is not a hard problem to tackle and there are a couple of ways to work around it.

For example, you can add the servers in OFFLINE_HARD mode in a separate hostgroup - this implies that the nodes are not available, and using a new hostgroup for them ensures that ProxySQL will not check their state as it does for the Galera nodes configured in the hostgroups used for read/write splitting.
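A sketch of what that could look like from the ProxySQL admin interface (the hostgroup number and EC2 addresses below are hypothetical):

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, port, status) VALUES (30, '10.0.1.11', 3306, 'OFFLINE_HARD');

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, port, status) VALUES (30, '10.0.1.12', 3306, 'OFFLINE_HARD');

mysql> LOAD MYSQL SERVERS TO RUNTIME;

mysql> SAVE MYSQL SERVERS TO DISK;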

Alternatively you can simply skip those nodes for now and, while doing the switchover, remove the existing servers and then run a couple of INSERT commands to add the backend nodes from EC2.
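In that case the switchover itself could be as simple as the sketch below (hostgroups 10 and 20 for writes and reads, and the EC2 addresses, are again hypothetical):

mysql> DELETE FROM mysql_servers WHERE hostgroup_id IN (10, 20);

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '10.0.1.11', 3306);

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (20, '10.0.1.12', 3306);

mysql> LOAD MYSQL SERVERS TO RUNTIME;

mysql> SAVE MYSQL SERVERS TO DISK;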

Conclusion

As you can see, the process of migrating ProxySQL from an on-prem setup to the cloud is quite easy to accomplish - as long as you have network connectivity, the remaining steps are far from complex. We hope this short blog post helped you to understand what’s required in this process and how to plan it.


A Guide to Automated Cloud Database Deployments


Complex, inflexible architectures, redundancy, and out-of-date technology are common problems for companies facing data-to-cloud migration.

We look to the “clouds,” hoping that we will find there a magic solution to improve operational speed and performance, better workload handling and scalability, and less failure-prone, less complicated architectures. We hope to make our database administrators' lives more comfortable. But is it really always the case?

As more enterprises are moving to the cloud, the hybrid model is actually becoming more popular. The hybrid model is seen as a safe model for many businesses. 

In fact, it's challenging to do a heart transplant and port everything over immediately. Many companies are doing a slow migration that usually takes a year, or maybe even lasts forever, until everything is migrated. The move should be made at an acceptable pace.

Unfortunately, hybrid means another puzzle piece, which does not necessarily reduce complexity. Perhaps, like many others walking this road before you, you will find out that some of the applications will actually not move.

Or you will find out that the other project team just decided to use yet another cloud provider. 

For instance, it is free, and relatively easy, to move any amount of data into an AWS EC2 instance, but you'll have to pay to transfer data out of AWS. The database services on Amazon are only available on Amazon. Vendor lock-in is there and should not be ignored.

Along the same lines, ClusterControl offers a suite of database automation and management functions to give you full control of your database infrastructure - on-prem, in the cloud, and with support for multiple vendors.

With ClusterControl, you can monitor, deploy, manage, and scale your databases, securely and with ease, through our point-and-click interface.

Utilizing the cloud enables your company and applications to profit from the cost savings and versatility that come with cloud computing.

Supported Cloud Platforms

ClusterControl allows you to run multiple databases on top of the most popular cloud providers without being locked in to any vendor. It has offered the ability to deploy (and back up) databases in the cloud since ClusterControl 1.6.

The supported cloud platforms are Amazon AWS, Microsoft Azure and Google Cloud. It is possible to launch new instances and deploy MySQL, MariaDB, MongoDB, and PostgreSQL directly from the ClusterControl user interface. 

The recent ClusterControl version (1.7.4) added support for MySQL Replication 8.0, PostgreSQL, and TimescaleDB on Amazon AWS, Google Cloud Platform, and Microsoft Azure.

ClusterControl: Supported Platforms

 

Cloud Providers Configuration

Before we jump into our first deployment, we need to connect ClusterControl with our cloud provider. This is done in the Integrations panel.

ClusterControl - Cloud Credential Management

The tool will walk you through the cloud integration with a straightforward wizard. As we can see in the screenshot below, we start with one of the three big players: Amazon Web Services (AWS), Google Cloud, and Microsoft Azure.

ClusterControl - Supported Cloud Platforms

In the next section, we need to provide the necessary credentials.

ClusterControl - Set Credentials

When all is set and ClusterControl can talk to your cloud provider, we can go to the deployment section.

ClusterControl: Deployment Options

Cloud Deployment Process

In this part, you select the supported cluster type: MySQL Galera Cluster, MySQL Replication, MongoDB Replica Set, PostgreSQL Streaming Replication, or TimescaleDB.

The next move is to pick the supported vendor for the selected cluster type. At the moment, the following vendors and versions are available:

  • MySQL Galera Cluster - Percona XtraDB Cluster 5.7, MariaDB 10.2, MariaDB 10.3

  • MySQL Replication Cluster - Percona Server 8.0, MariaDB Server 10.3, Oracle MySQL Server 8.0

  • MongoDB Replica Set - Percona Server for MongoDB 3.6, MongoDB 3.6, MongoDB 4.0

  • PostgreSQL Cluster - PostgreSQL 11.0

  • TimescaleDB 11.0

The deployment procedure is aware of the functionality and flexibility of cloud environments, such as dynamic IP and hostname allocation for VMs, NAT-ed public IP addresses, virtual private cloud networks, and storage.

The deployment continues in the following dialogs:

ClusterControl - Deploy MySQL Replication Cluster
ClusterControl - Select Credential

Most of the settings in this step are dynamically populated from the cloud provider based on the chosen credentials. You can configure the operating system, instance size, VPC setting, storage type and size, and also specify the SSH key location on the ClusterControl host. You can also let ClusterControl generate a new key specifically for these instances.

ClusterControl - Cloud Deployment, Select Virtual Machine

When all is set, you will see your configuration summarized. At this stage, you can also pick an additional subnet.

ClusterControl - Cloud Deployment Summary

Verify that everything is correct and hit the "Deploy Cluster" button to start the deployment.

You can then monitor the progress by clicking on the Activity -> Jobs -> Create Cluster -> Full Job Details:

ClusterControl - Full Job Details

Depending on the cluster size, it could take 10 to 20 minutes to complete. Once done, you will see a new database cluster listed under the ClusterControl dashboard.

ClusterControl - New Cluster Added
ClusterControl - Topology

Under the hood, the deployment process did the following:

  • Create SSH key
  • Create cloud VM instances
  • Configure security groups and networking (firewalls, subnets)
  • Verify the SSH connectivity from ClusterControl to all created instances
  • Prepare the VMs for a specific type of cluster (node configuration like package installation, kernel configuration, etc.)
  • Deploy a database on every instance
  • Configure the clustering or replication links
  • Register the deployment into ClusterControl

After the deployment, you can review the process and see exactly what was executed. With the extended logging, you can see each command, who triggered the job, and what the outcome was.
If at any point you want to extend your cluster, you can use the scaling feature, which is also integrated with your cloud provider.

The process is simple. In the first phase, you choose the desired VM type.

ClusterControl - Add Node

Finally, you can choose the master node and the remaining settings, which depend on your cluster type:

ClusterControl - Add Node

Conclusion

We showed you how to set up a MySQL Replication environment on Microsoft Azure; it only took a couple of clicks to build the virtual machines, the network, and finally a reliable master/slave replication cluster. With the new cloud scaling functionality, you can also easily expand the cluster whenever needed.

This is just the first step. If you want to see what to do next, check out our other blogs where we talk about auto-recovery, backups, security, and many other aspects of day-to-day administration with ClusterControl. Want to try it yourself? Give it a try.

Automated Deployment of MySQL Galera Cluster to Amazon AWS with Puppet


Deploying and managing your database environment can be a tedious task. It's very common nowadays to use tools for automating your deployment to make these tasks easier. Automation solutions such as Chef, Puppet, Ansible, or SaltStack are just some of the ways to achieve these goals.

This blog will show you how to use Puppet to deploy a Galera Cluster (specifically Percona XtraDB Cluster, or PXC) utilizing the ClusterControl Puppet Modules. This module makes the deployment, setup, and configuration easier than coding from scratch. You may also want to check out one of our previous blogs about deploying a Galera Cluster using Chef, “How to Automate Deployment of MySQL Galera Cluster Using S9S CLI and Chef.”

Our S9S CLI tools are designed to be used in the terminal (or console) and can be utilized to automatically deploy databases. In this blog, we'll show you how to deploy a Percona XtraDB Cluster on AWS using Puppet, using ClusterControl and its s9s CLI tools to help automate the job.

Installation and Setup for the Puppet Master and Agent Nodes

In this blog, I used Ubuntu 16.04 Xenial as the target Linux OS for this setup. It might be an old OS version for you, but we know it works with recent versions of RHEL/CentOS and Debian/Ubuntu as well. I used two nodes locally for this setup, with the following hosts/IPs:

Master Hosts:

     IP = 192.168.40.200

     Hostname = master.puppet.local

Agent Hosts:

     IP = 192.168.40.20

     Hostname = clustercontrol.puppet.local

Let's go through the steps.

1) Setup the Master

## Install the packages required

wget https://apt.puppetlabs.com/puppet6-release-xenial.deb

sudo dpkg -i puppet6-release-xenial.deb

sudo apt update

sudo apt install -y puppetserver

## Now, let's do some minor configuration for Puppet

sudo vi /etc/default/puppetserver

## edit from 

JAVA_ARGS="-Xms2g -Xmx2g -Djruby.logger.class=com.puppetlabs.jruby_utils.jruby.Slf4jLogger"

## to

JAVA_ARGS="-Xms512m -Xmx512m -Djruby.logger.class=com.puppetlabs.jruby_utils.jruby.Slf4jLogger"

## add alias hostnames in /etc/hosts

sudo vi /etc/hosts

## and add

192.168.40.20 clustercontrol.puppet.local

192.168.40.200 master.puppet.local

## edit the config for server settings.

sudo vi /etc/puppetlabs/puppet/puppet.conf

## This can depend on your setup, so you might approach it differently than below.

[master]

vardir = /opt/puppetlabs/server/data/puppetserver

logdir = /var/log/puppetlabs/puppetserver

rundir = /var/run/puppetlabs/puppetserver

pidfile = /var/run/puppetlabs/puppetserver/puppetserver.pid

codedir = /etc/puppetlabs/code



dns_alt_names = master.puppet.local,master



[main]

certname = master.puppet.local

server = master.puppet.local 

environment = production

runinterval = 15m

## Generate a root and intermediate signing CA for Puppet Server

sudo /opt/puppetlabs/bin/puppetserver ca setup

## start puppet server

sudo systemctl start puppetserver

sudo systemctl enable puppetserver

2) Setup the Agent/Client Node

## Install the packages required

wget https://apt.puppetlabs.com/puppet6-release-xenial.deb

sudo dpkg -i puppet6-release-xenial.deb

sudo apt update

sudo apt install -y puppet-agent

## Edit the config settings for puppet client

sudo vi /etc/puppetlabs/puppet/puppet.conf

And add the example configuration below:

[main]

certname = clustercontrol.puppet.local

server = master.puppet.local

environment = production

runinterval = 15m

3) Authenticating (or Signing the Certificate Request) for Master/Client Communication

## Go back to the master node and run the following to view the outstanding requests.

sudo /opt/puppetlabs/bin/puppetserver ca list

## The Result

Requested Certificates:

    clustercontrol.puppet.local   (SHA256) 0C:BA:9D:A8:55:75:30:27:31:05:6D:F1:8C:CD:EE:D7:1F:3C:0D:D8:BD:D3:68:F3:DA:84:F1:DE:FC:CD:9A:E1

## sign a request from agent/client

sudo /opt/puppetlabs/bin/puppetserver ca sign --certname clustercontrol.puppet.local

## The Result

Successfully signed certificate request for clustercontrol.puppet.local

## or you can also sign all requests

sudo /opt/puppetlabs/bin/puppetserver ca sign --all

## in case you want to revoke, just do

sudo /opt/puppetlabs/bin/puppetserver ca revoke --certname <AGENT_NAME>

## to list all unsigned,

sudo /opt/puppetlabs/bin/puppetserver ca list --all

## Then verify or test in the client node,

## verify/test puppet agent

sudo /opt/puppetlabs/bin/puppet agent --test

Scripting Your Puppet Manifests and Setting up the ClusterControl Puppet Module

Our ClusterControl Puppet module can be downloaded here: https://github.com/severalnines/puppet. Otherwise, you can also easily grab the Puppet module from Puppet Forge. We're regularly updating and modifying the Puppet module, so we suggest you grab the GitHub copy to ensure you have the most up-to-date version of the script.

You should also take into account that our Puppet module is tested on CentOS/Ubuntu running the most updated version of Puppet (6.7.x). For this blog, the Puppet module is tailored to work with the most recent release of ClusterControl (which as of this writing is 1.7.3). In case you missed it, you can check out our releases and patch releases over here.

1) Setup the ClusterControl Module in the Master Node

# Download from github and move the file to the module location of Puppet:

wget https://github.com/severalnines/puppet/archive/master.zip -O clustercontrol.zip; unzip -x clustercontrol.zip; mv puppet-master /etc/puppetlabs/code/environments/production/modules/clustercontrol

2) Create Your Manifest File and Add the Contents as Shown Below

vi /etc/puppetlabs/code/environments/production/manifests/site.pp

Now, before we proceed, we need to discuss the manifest script and the command to be executed. First, we'll have to define the type of ClusterControl and the variables we need to provide. ClusterControl requires every setup to have the token and SSH keys specified and provided accordingly. Hence, this can be done by running the two commands below:

## Generate the key

bash /etc/puppetlabs/code/environments/production/modules/clustercontrol/files/s9s_helper.sh --generate-key

## Then, generate the token

bash /etc/puppetlabs/code/environments/production/modules/clustercontrol/files/s9s_helper.sh --generate-token

Now, let's discuss what we'll have to input within the manifest file one by one.

node 'clustercontrol.puppet.local' { # Applies only to mentioned node. If nothing mentioned, applies to all.

        class { 'clustercontrol':

            is_controller => true,

            ip_address => '<ip-address-of-your-cluster-control-hosts>',

            mysql_cmon_password => '<your-desired-cmon-password>',

            api_token => '<api-token-generated-earlier>'

        }

Now, we'll have to define the <ip-address-of-your-cluster-control-hosts> of your ClusterControl node, which is clustercontrol.puppet.local in this example. Specify also the cmon password and then place the API token as generated by the command mentioned earlier.

Afterwards, we'll use the ClusterControl RPC interface to send a POST request to create an AWS credentials entry:

exec { 'add-aws-credentials':

    path  => ['/usr/bin', '/usr/sbin', '/bin'],

    command => "echo '{\"operation\" : \"add_credentials\", \"provider\" : \"aws\", \"name\" : \"<your-aws-credentials-name>\", \"comment\" : \"<optional-comment-about-credential-entry>\", \"credentials\":{\"access_key_id\":\"<aws-access-key-id>\",\"access_key_secret\" : \"<aws-key-secret>\",\"access_key_region\" : \"<aws-region>\"}}' | curl -sX POST -H\"Content-Type: application/json\" -d @- http://localhost:9500/0/cloud"

}

The placeholder variables I set are self-explanatory. You need to provide the desired credential name for your AWS account, provide a comment if you want to, and provide the AWS access key ID, your AWS key secret, and the AWS region where you'll be deploying the Galera nodes.

Lastly, we'll have to run the command using s9s CLI tools.

exec { 's9s':

    path        => ['/usr/bin', '/usr/sbin', '/bin'],

    onlyif      => "test -z \"$(/usr/bin/s9s cluster --list --cluster-format='%I' --cluster-name '<cluster-name>' 2> /dev/null)\"",

    command     => "/usr/bin/s9s cluster --create --cloud=aws --vendor percona --provider-version 5.7 --containers=<node1>,<node2>,<node3> --nodes=<node1>,<node2>,<node3> --cluster-name=<cluster-name> --cluster-type=<cluster-type> --image <aws-image> --template <aws-instance-type> --subnet-id <aws-subnet-id> --region <aws-region> --image-os-user=<image-os-user> --os-user=<os-user> --os-key-file <path-to-rsa-key-file> --vpc-id <aws-vpc-id> --firewalls <aws-firewall-id> --db-admin <db-user> --db-admin-passwd <db-password> --wait --log",

    timeout     => 3600,

    logoutput   => true

}

Let’s look at the key points of this command. First, "onlyif" is defined by a conditional check to determine whether such a cluster name already exists; if it does, the create command will not run, since the cluster has already been added. Then we proceed with the command which utilizes the s9s CLI tools. You'll need to specify the AWS IDs in the placeholder variables being set. Since the placeholder names are self-explanatory, their values can be taken from your AWS Console or by using the AWS CLI tools.

Now, let's check the succeeding steps remaining.

3) Prepare the Script for Your Manifest File

# Copy the example contents below (edit according to your desired values) and paste it into the manifest file, which is site.pp.

node 'clustercontrol.puppet.local' { # Applies only to mentioned node. If nothing mentioned, applies to all.

        class { 'clustercontrol':

            is_controller => true,

            ip_address => '192.168.40.20',

            mysql_cmon_password => 'R00tP@55',

            mysql_server_addresses => '192.168.40.30,192.168.40.40',

            api_token => '0997472ab7de9bbf89c1183f960ba141b3deb37c'

        }

        exec { 'add-aws-credentials':

            path  => ['/usr/bin', '/usr/sbin', '/bin'],

            command => "echo '{\"operation\" : \"add_credentials\", \"provider\" : \"aws\", \"name\" : \"paul-aws-sg\", \"comment\" : \"my SG AWS Connection\", \"credentials\":{\"access_key_id\":\"XXXXXXXXXXX\",\"access_key_secret\" : \"XXXXXXXXXXXXXXX\",\"access_key_region\" : \"ap-southeast-1\"}}' | curl -sX POST -H\"Content-Type: application/json\" -d @- http://localhost:9500/0/cloud"

        }

        exec { 's9s':

            path        => ['/usr/bin', '/usr/sbin', '/bin'],

            onlyif      => "test -z \"$(/usr/bin/s9s cluster --list --cluster-format='%I' --cluster-name 'cli-aws-repl' 2> /dev/null)\"",

            command     => "/usr/bin/s9s cluster --create --cloud=aws --vendor percona --provider-version 5.7 --containers=db1,db2,db3 --nodes=db1,db2,db3 --cluster-name=cli-aws-repl --cluster-type=galera --image ubuntu18.04 --template t2.small --subnet-id subnet-xxxxxxxxx --region ap-southeast-1 --image-os-user=s9s --os-user=s9s --os-key-file /home/vagrant/.ssh/id_rsa --vpc-id vpc-xxxxxxx --firewalls sg-xxxxxxxxx --db-admin root --db-admin-passwd R00tP@55 --wait --log",

            timeout     => 3600,

            logoutput   => true

        }

}

Let's Do the Test and Run Within the Agent Node

/opt/puppetlabs/bin/puppet agent --test  

The End Product

Now, let's have a look once the agent run has finished. Once you have this running, visit the URL http://<cluster-control-host>/clustercontrol, and you'll be asked by ClusterControl to register first.

Now, you might wonder where the result of the RPC request with resource name 'add-aws-credentials' in our manifest file ended up. It can be found in the Integrations section within ClusterControl. Let's see how it looks after Puppet performs the run.

You can modify this to your liking through the UI, but you can also modify it by using our RPC API.

Now, let's check the cluster,

From the UI view, it shows that it was able to create the cluster, display the cluster in the dashboard, and also show the job activities that were performed in the background.

Lastly, our AWS nodes are now present in our AWS Console. Let's check that out:

All nodes are running healthy, with their designated names and in their expected region.

Conclusion

In this blog, we were able to deploy a Galera/Percona XtraDB Cluster using automation with Puppet. We did not create the code from scratch, nor did we use any external tools that would have complicated the task. Instead, we used the ClusterControl Module and the S9S CLI tool to build and deploy a highly available Galera Cluster.

A Guide to MySQL Galera Cluster Restoration Using mysqldump


Using logical backup programs like mysqldump is a common practice performed by MySQL admins for backup and restore, or for moving a database from one server to another, and it is also the most efficient way to perform a database mass modification using a single text file.

When doing this for MySQL Galera Cluster, however, the same rules apply except for the fact that it takes a lot of time to restore a dump file into a running Galera Cluster. In this blog, we will look at the best way to restore a Galera Cluster using mysqldump.

Galera Cluster Restoration Performance

One of the most common misconceptions about Galera Cluster is that restoring a database into a three-node cluster is faster than doing it on a standalone node. This is definitely incorrect when talking about a stateful service like a datastore or filesystem. To keep in sync, every member has to keep up with whatever changes happen on the other members. This is where locking, certifying, applying, rolling back, and committing are forced into the picture to ensure no data loss along the way, because for a database service, data loss is a big no-no.

Let's make some comparisons to see and understand the impact. Suppose we have a 2 GB dump file for the database 'sbtest'. We usually would load the data into the cluster via two endpoints:

  • load balancer host 
  • one of the database hosts

As a control measurement, we are also going to restore on a standalone node. The variable pxc_strict_mode is set to PERMISSIVE on all Galera nodes.

The backup was created on one of the Galera nodes with the following command:

$ mysqldump --single-transaction sbtest > sbtest.sql

We are going to use 'pv' to observe the progress and measure the restoration performance. Thus, the restore command is:

$ pv sbtest.sql | mysql -uroot -p sbtest

The restorations were repeated 3 times for each host type as shown in the following table:

Endpoint Type     Database Server                                                      Restoration Time (3 runs)    Restoration Speed (MiB/s, 3 runs)
Standalone        MySQL 5.7.25                                                         3m 29s / 3m 36s / 3m 31s     8.73 / 8.44 / 8.64
Load Balancer     HAProxy -> PXC 5.7.25 (multiple DB hosts - all active, leastconn)    5m 45s / 6m 03s / 5m 43s     5.29 / 5.02 / 5.43
Load Balancer     ProxySQL -> PXC 5.7.25 (single DB host - single writer hostgroup)    6m 07s / 7m 00s / 6m 54s     4.97 / 4.34 / 4.41
Galera Cluster    PXC 5.7.25 (single DB host)                                          5m 22s / 6m 00s / 5m 28s     5.66 / 5.07 / 5.56

Note that the way pv measures the restoration speed is based on the mysqldump text file being passed through the pipe. It's not highly accurate, but good enough to give us some measurements to compare. All hosts have the same specs and run as virtual machines on the same underlying physical hardware.

The following column chart summarizes the average time it takes to restore the mysqldump:

The standalone host is the clear winner with 212 seconds, while ProxySQL is the worst for this workload - almost two times slower compared to standalone.

The following column chart summarizes the average speed pv measures when restoring the mysqldump:

As expected, restoration on the standalone node is way faster, with 8.6 MiB/s on average - 1.5x better than restoration directly on the Galera node.

To summarize our observation, restoring directly on a Galera Cluster node is way slower than a standalone host. Restoring through a load balancer is even worse.

Turning Off Galera Replication

Restoring a mysqldump onto a Galera Cluster will cause every single DML statement (INSERTs in this case) to be broadcasted, certified, and applied by the Galera nodes through its group communication and replication library. Thus, the fastest way to restore a mysqldump is to perform the restoration on a single node, with Galera Replication turned off, making it run as if in standalone mode. The steps are:

  1. Pick one Galera node as the restore node. Stop the rest of the nodes.
  2. Turn off Galera Replication on the restore node.
  3. Perform the restoration.
  4. Stop and bootstrap the restore node.
  5. Force the remaining nodes to re-join and re-sync via SST.

For example, let's say we choose db1 to be the restore node. Stop the other nodes (db2 and db3) one node at a time so they leave the cluster gracefully:

$ systemctl stop mysql #db2

$ systemctl stop mysql #db3

Note: For ClusterControl users, simply go to Nodes -> pick the DB node -> Node Actions -> Stop Node. Do not forget to turn off ClusterControl automatic recovery for cluster and nodes before performing this exercise.

Now, log in to db1 and turn the Galera node into a standalone node by setting the wsrep_provider variable to 'none':

$ mysql -uroot -p

mysql> SET GLOBAL wsrep_provider = 'none';

mysql> SHOW STATUS LIKE 'wsrep_connected';

+-----------------+-------+

| Variable_name   | Value |

+-----------------+-------+

| wsrep_connected | OFF   |

+-----------------+-------+

Then perform the restoration on db1:

$ pv sbtest.sql | mysql -uroot -p sbtest

1.78GiB 0:02:46 [  11MiB/s] [==========================================>] 100%

The restoration time improved 2x to 166 seconds (down from ~337 seconds) with 11 MiB/s (up from ~5.43 MiB/s). Since this node now has the most up-to-date data, we have to bootstrap the cluster based on this node and let the other nodes rejoin the cluster, forcing them to re-sync everything back.

On db1, stop the MySQL service and start it again in bootstrap mode:

$ systemctl status mysql #check whether mysql or mysql@bootstrap is running

$ systemctl status mysql@bootstrap #check whether mysql or mysql@bootstrap is running

$ systemctl stop mysql # if mysql was running

$ systemctl stop mysql@bootstrap # if mysql@bootstrap was running

$ systemctl start mysql@bootstrap

While on every remaining node, wipe out the datadir (or you can simply delete the grastate.dat file) and start the MySQL service:

$ rm /var/lib/mysql/grastate.dat  # remove this file to force SST

$ systemctl start mysql

Perform the startup process one node at a time. Once the joining node is synced, proceed with the next node, and so on.

Note: For ClusterControl users, you could skip the above step because ClusterControl can be configured to force SST during the bootstrap process. Just click on the Cluster Actions -> Bootstrap Cluster and pick the db1 as the bootstrap node and toggle on the option for "Clear MySQL Datadir on Joining nodes", as shown below:

We could also juice up the restoration process by allowing a bigger packet size for the mysql client:

$ pv sbtest.sql | mysql -uroot -p --max_allowed_packet=2G sbtest

At this point, our cluster should be running with the restored data. Take note that in this test case, the total restoration time for the cluster was actually longer than if we had performed the restoration directly on a Galera node, due to our small dataset. If you have a huge mysqldump file to restore, believe us, this is one of the best ways to do it.

That's it for now. Happy restoring!

 

Building a MySQL or MariaDB Database Cold Standby on Amazon AWS


High Availability is a must these days, as most organizations can’t allow themselves to lose their data. High Availability, however, always comes with a price tag (which can vary a lot). Any setup which requires nearly-immediate action would typically require an expensive environment which mirrors precisely the production setup. But there are other options that can be less expensive. These may not allow for an immediate switch to a disaster recovery cluster, but they will still allow for business continuity (and won’t drain the budget).

An example of this type of setup is a “cold-standby” DR environment. It allows you to reduce your expenses while still being able to spin up a new environment in an external location should disaster strike. In this blog post we will demonstrate how to create such a setup.

The Initial Setup

Let’s assume we have a fairly standard Master / Slave MySQL Replication setup in our own datacenter. It is a highly available setup with ProxySQL and Keepalived for Virtual IP handling. The main risk is that the datacenter will become unavailable. It is a small DC; maybe it has only one ISP with no BGP in place. In this situation, we will assume that it is acceptable if it takes hours to bring back the database, as long as it is possible to bring it back at all.

ClusterControl Cluster Topology

To deploy this cluster we used ClusterControl, which you can download for free. For our DR environment we will use EC2 (but it could also be any other cloud provider).

The Challenge

The main issue we have to deal with is: how do we ensure we have fresh data to restore our database in the disaster recovery environment? Of course, ideally we would have a replication slave up and running in EC2... but then we have to pay for it. If we are tight on budget, we could try to get around that with backups. This is not the perfect solution as, in the worst case scenario, we will never be able to recover all the data.

By “the worst case scenario” we mean a situation in which we won’t have access to the original database servers. If we are able to reach them, no data would be lost.

The Solution

We are going to use ClusterControl to set up a backup schedule to reduce the chance that the data will be lost. We will also use the ClusterControl feature to upload backups to the cloud. If the datacenter becomes unavailable, we can hope that the cloud provider we have chosen will be reachable.

Setting up the Backup Schedule in ClusterControl

First, we will have to configure ClusterControl with our cloud credentials.

ClusterControl Cloud Credentials

We can do this by using “Integrations” from the left side menu.

ClusterControl Add Cloud Credentials

You can pick Amazon Web Services, Google Cloud or Microsoft Azure as the cloud you want ClusterControl to upload backups to. We will go ahead with AWS where ClusterControl will use S3 to store backups.

Add Cloud Credentials in ClusterControl

We then need to pass key ID and key secret, pick the default region and pick a name for this set of credentials.

AWS Cloud Integration Successful - ClusterControl

Once this is done, we can see the credentials we just added listed in ClusterControl.

Now, we shall proceed with setting up the backup schedule.

Backup Scheduling ClusterControl

ClusterControl allows you to either create a backup immediately or schedule it. We’ll go with the second option. What we want is to create the following schedule:

  1. Full backup created once per day
  2. Incremental backups created every 10 minutes.

The idea here is as follows. In the worst case scenario we will lose only 10 minutes of traffic. If the datacenter becomes unavailable from outside but still works internally, we could try to avoid any data loss by waiting 10 minutes, copying the latest incremental backup onto a laptop, and then manually sending it towards our DR database, even using phone tethering and a cellular connection to work around the ISP failure. If we won’t be able to get the data out of the old datacenter for some time, this is intended to minimize the number of transactions we will have to manually merge into the DR database.

Create Backup Schedule in ClusterControl

We start with a full backup which will happen daily at 2:00 am. We will take the backup from the master and store it on the controller under the /root/backups/ directory. We will also enable the “Upload Backup to the cloud” option.

Backup Settings in ClusterControl

Next, we want to make some changes to the default configuration. We decided to go with an automatically selected failover host (in case our master becomes unavailable, ClusterControl will use any other node which is available). We also wanted to enable encryption, as we will be sending our backups over the network.

Cloud Settings for Backup Scheduling in ClusterControl

Then we have to pick the credentials, and select an existing S3 bucket or create a new one if needed.

Create Backup in ClusterControl

We basically repeat the process for the incremental backup; this time we use the “Advanced” dialog to run the backups every 10 minutes.

The rest of the settings are similar; we can also reuse the S3 bucket.

ClusterControl Cluster Details

The backup schedule looks as above. We don’t have to start the full backup manually; ClusterControl will run the incremental backups as scheduled, and if it detects there’s no full backup available, it will run a full backup instead of the incremental.

With such a setup, it is safe to say that we can recover the data on any external system with 10-minute granularity.

Manual Backup Restore

If it happens that you need to restore the backup on the disaster recovery instance, there are a couple of steps you have to take. We strongly recommend testing this process from time to time, ensuring it works correctly and that you are proficient in executing it.

First, we have to install the AWS command line tool on our target server:

root@vagrant:~# apt install python3-pip

root@vagrant:~# pip3 install awscli --upgrade --user

Then we have to configure it with proper credentials:

root@vagrant:~# ~/.local/bin/aws configure

AWS Access Key ID [None]: yourkeyID

AWS Secret Access Key [None]: yourkeySecret

Default region name [None]: us-west-1

Default output format [None]: json

We can now test if we have access to the data in our S3 bucket:

root@vagrant:~# ~/.local/bin/aws s3 ls s3://drbackup/

                           PRE BACKUP-1/

                           PRE BACKUP-2/

                           PRE BACKUP-3/

                           PRE BACKUP-4/

                           PRE BACKUP-5/

                           PRE BACKUP-6/

                           PRE BACKUP-7/

Now, we have to download the data. We will create a directory for the backups. Remember, we have to download the whole backup set, starting from the full backup up to the last incremental we want to apply.

root@vagrant:~# mkdir backups

root@vagrant:~# cd backups/

Now there are two options. We can either download backups one by one:

root@vagrant:~# ~/.local/bin/aws s3 cp s3://drbackup/BACKUP-1/ BACKUP-1 --recursive

download: s3://drbackup/BACKUP-1/cmon_backup.metadata to BACKUP-1/cmon_backup.metadata

Completed 30.4 MiB/36.2 MiB (4.9 MiB/s) with 1 file(s) remaining

download: s3://drbackup/BACKUP-1/backup-full-2019-08-20_113009.xbstream.gz.aes256 to BACKUP-1/backup-full-2019-08-20_113009.xbstream.gz.aes256



root@vagrant:~# ~/.local/bin/aws s3 cp s3://drbackup/BACKUP-2/ BACKUP-2 --recursive

download: s3://drbackup/BACKUP-2/cmon_backup.metadata to BACKUP-2/cmon_backup.metadata

download: s3://drbackup/BACKUP-2/backup-incr-2019-08-20_114009.xbstream.gz.aes256 to BACKUP-2/backup-incr-2019-08-20_114009.xbstream.gz.aes256

Alternatively, especially if you have a tight rotation schedule, we can sync all contents of the bucket with what we have locally on the server:

root@vagrant:~/backups# ~/.local/bin/aws s3 sync s3://drbackup/ .

download: s3://drbackup/BACKUP-2/cmon_backup.metadata to BACKUP-2/cmon_backup.metadata

download: s3://drbackup/BACKUP-4/cmon_backup.metadata to BACKUP-4/cmon_backup.metadata

download: s3://drbackup/BACKUP-3/cmon_backup.metadata to BACKUP-3/cmon_backup.metadata

download: s3://drbackup/BACKUP-6/cmon_backup.metadata to BACKUP-6/cmon_backup.metadata

download: s3://drbackup/BACKUP-5/cmon_backup.metadata to BACKUP-5/cmon_backup.metadata

download: s3://drbackup/BACKUP-7/cmon_backup.metadata to BACKUP-7/cmon_backup.metadata

download: s3://drbackup/BACKUP-3/backup-incr-2019-08-20_115005.xbstream.gz.aes256 to BACKUP-3/backup-incr-2019-08-20_115005.xbstream.gz.aes256

download: s3://drbackup/BACKUP-1/cmon_backup.metadata to BACKUP-1/cmon_backup.metadata

download: s3://drbackup/BACKUP-2/backup-incr-2019-08-20_114009.xbstream.gz.aes256 to BACKUP-2/backup-incr-2019-08-20_114009.xbstream.gz.aes256

download: s3://drbackup/BACKUP-7/backup-incr-2019-08-20_123008.xbstream.gz.aes256 to BACKUP-7/backup-incr-2019-08-20_123008.xbstream.gz.aes256

download: s3://drbackup/BACKUP-6/backup-incr-2019-08-20_122008.xbstream.gz.aes256 to BACKUP-6/backup-incr-2019-08-20_122008.xbstream.gz.aes256

download: s3://drbackup/BACKUP-5/backup-incr-2019-08-20_121007.xbstream.gz.aes256 to BACKUP-5/backup-incr-2019-08-20_121007.xbstream.gz.aes256

download: s3://drbackup/BACKUP-4/backup-incr-2019-08-20_120007.xbstream.gz.aes256 to BACKUP-4/backup-incr-2019-08-20_120007.xbstream.gz.aes256

download: s3://drbackup/BACKUP-1/backup-full-2019-08-20_113009.xbstream.gz.aes256 to BACKUP-1/backup-full-2019-08-20_113009.xbstream.gz.aes256

As you remember, the backups are encrypted. We have to have the encryption key, which is stored in ClusterControl. Make sure you have a copy stored somewhere safe, outside of the main datacenter. If you cannot reach it, you won’t be able to decrypt the backups. The key can be found in the ClusterControl configuration:

root@vagrant:~# grep backup_encryption_key /etc/cmon.d/cmon_1.cnf

backup_encryption_key='aoxhIelVZr1dKv5zMbVPLxlLucuYpcVmSynaeIEeBnM='

It is encoded using base64, thus we have to decode it first and store it in a file before we can start decrypting the backup:

echo "aoxhIelVZr1dKv5zMbVPLxlLucuYpcVmSynaeIEeBnM=" | openssl enc -base64 -d > pass

Now we can reuse this file to decrypt the backups. For now, let’s say we will restore one full and two incremental backups.

mkdir 1

mkdir 2

mkdir 3

cat BACKUP-1/backup-full-2019-08-20_113009.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:/root/backups/pass | zcat | xbstream -x -C /root/backups/1/

cat BACKUP-2/backup-incr-2019-08-20_114009.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:/root/backups/pass | zcat | xbstream -x -C /root/backups/2/

cat BACKUP-3/backup-incr-2019-08-20_115005.xbstream.gz.aes256 | openssl enc -d -aes-256-cbc -pass file:/root/backups/pass | zcat | xbstream -x -C /root/backups/3/

We have the data decrypted; now we have to proceed with setting up our MySQL server. Ideally, this should be exactly the same version as on the production systems. We will use Percona Server for MySQL:

cd ~
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb

sudo dpkg -i percona-release_latest.generic_all.deb

apt-get update

apt-get install percona-server-5.7

Nothing complex, just a regular installation. Once it’s up and ready, we have to stop it and remove the contents of its data directory.

service mysql stop

rm -rf /var/lib/mysql/*

To restore the backup we will need Xtrabackup - the tool ClusterControl uses to create it (at least for Percona and Oracle MySQL; MariaDB uses MariaBackup). It is important that this tool is installed in the same version as on the production servers:

apt install percona-xtrabackup-24

That’s all we have to prepare. Now we can start restoring the backup. With incremental backups, it is important to keep in mind that you have to prepare and apply them on top of the base backup. The base backup also has to be prepared. It is crucial to run the prepare with the ‘--apply-log-only’ option to prevent xtrabackup from running the rollback phase. Otherwise you won’t be able to apply the next incremental backup.

xtrabackup --prepare --apply-log-only --target-dir=/root/backups/1/

xtrabackup --prepare --apply-log-only --target-dir=/root/backups/1/ --incremental-dir=/root/backups/2/

xtrabackup --prepare --target-dir=/root/backups/1/ --incremental-dir=/root/backups/3/

In the last command we allowed xtrabackup to run the rollback of uncompleted transactions - we won’t be applying any more incremental backups afterwards. Now it is time to populate the data directory with the backup, start MySQL, and see if everything works as expected:

root@vagrant:~/backups# mv /root/backups/1/* /var/lib/mysql/

root@vagrant:~/backups# chown -R mysql.mysql /var/lib/mysql

root@vagrant:~/backups# service mysql start

root@vagrant:~/backups# mysql -ppass

mysql: [Warning] Using a password on the command line interface can be insecure.

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 6

Server version: 5.7.26-29 Percona Server (GPL), Release '29', Revision '11ad961'



Copyright (c) 2009-2019 Percona LLC and/or its affiliates

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.



Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.



Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.



mysql> show schemas;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| proxydemo          |

| sbtest             |

| sys                |

+--------------------+

6 rows in set (0.00 sec)



mysql> select count(*) from sbtest.sbtest1;

+----------+

| count(*) |

+----------+

|    10506 |

+----------+

1 row in set (0.01 sec)

As you can see, all is good. MySQL started correctly and we were able to access it (and the data is there!). We successfully managed to bring our database back up and running in a separate location. The total time required depends strictly on the size of the data - we had to download the data from S3, decrypt and decompress it, and finally prepare the backup. Still, this is a very cheap option (you have to pay for S3 data only) which gives you an option for business continuity should disaster strike.

 

Comparing Galera Cluster Cloud Offerings: Part One Amazon AWS


Running a MySQL Galera Cluster (either the Percona, MariaDB, or Codership build) is, unfortunately, not among the databases supported by Amazon RDS. Most of the databases supported by RDS use asynchronous replication, while Galera Cluster is a synchronous multi-master replication solution. Galera also requires InnoDB as its storage engine to function properly, and while you can use other storage engines such as MyISAM, it is not advised because of their lack of transaction handling.

Because of the lack of native support in RDS, this blog will focus on the offerings available when choosing and hosting your Galera-based cluster in an AWS environment.

There are certainly many reasons why you would choose or not choose the AWS cloud platform, but for this particular topic we’re going to go over the advantages and benefits of what you can leverage rather than why you would choose the AWS Platform.

The Virtual Servers (Elastic Compute Instances)

As mentioned earlier, MySQL Galera is not part of RDS, and InnoDB is a transactional storage engine for which you need the right resources for your application requirements. It must have the capacity to serve the demand of your client request traffic. At the time of this article, your sole choice for running Galera Cluster is EC2, Amazon's compute instance cloud offering.

Because you have the advantage of running your system on a number of EC2 instances, running a Galera Cluster on EC2 versus on-prem doesn’t differ much. You can access the server remotely via SSH, install your desired software packages, and choose the kind of Galera Cluster build you’d like to utilize.

Moreover, with EC2 this offering is more elastic and flexible, allowing you to deliver a simpler, more granular setup. You can take advantage of the web services to automate the building of a number of nodes if you need to scale out your environment, or, for example, to automate the building of your staging or development environment. It also gives you an edge to quickly build your desired environment, choose and set up your desired OS, and pick the right computing resources that fit your requirements (such as CPU, memory, and disk storage). EC2 eliminates the time spent waiting for hardware, since you can do this on the fly. You can also leverage the AWS CLI tool to automate your Galera cluster setup.
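For example, a sketch of launching three instances for Galera nodes with the AWS CLI could look like this (all the IDs and names below are placeholders):

## Launch three memory-optimized instances for the Galera nodes in one subcommand:

aws ec2 run-instances --image-id ami-0123456789abcdef0 --count 3 --instance-type r5.large --key-name my-galera-key --security-group-ids sg-0123456789abcdef0 --subnet-id subnet-0123456789abcdef0 --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=galera-node}]'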

Pricing for Amazon EC2 Instances

EC2 offers a number of selections which are very flexible for consumers who would like to host their Galera Cluster environment on AWS compute nodes. The AWS Free Tier includes 750 hours of Linux and Windows t2.micro instances, each month, for one year. You can stay within the Free Tier by using only EC2 Micro instances, but this might not be the best thing for production use. 

There are multiple types of EC2 instances you can deploy when provisioning your Galera nodes. The r4/r5/x1 family (memory optimized) and the c4/c5 family (compute optimized) are an ideal choice, and prices differ depending on how large your server resource needs are and the type of OS.

These are the types of paid instances you can choose...

On Demand 

Pay by compute capacity (per-hour or per-second), depending on the type of instances you run. For example, prices might differ when provisioning an Ubuntu instance vs a RHEL instance, aside from the type of instance. There are no long-term commitments or upfront payments needed. It also has the flexibility to increase or decrease your compute capacity. These instances are recommended for low cost and flexible environment needs, like applications with short-term, spiky, or unpredictable workloads that cannot be interrupted, or applications being developed or tested on Amazon EC2 for the first time. Check it out here for more info.

Dedicated Hosts

If you are looking for compliance and regulatory requirements such as the need to acquire a dedicated server that runs on a dedicated hardware for use, this type of offer suits your needs. Dedicated Hosts can help you address compliance requirements and reduce costs by allowing you to use your existing server-bound software license, including Windows Server, SQL Server, SUSE Linux Enterprise Server, Red Hat Enterprise Linux, or other software licenses that are bound to VMs, sockets, or physical cores, subject to your license terms. It can be purchased On-Demand (hourly) or as a Reservation for up to 70% off the On-Demand price. Check it out here for more info.

Spot Instances

These instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price. This is recommended for applications that have flexible start and end times, applications that are only feasible at very low compute prices, or users with urgent computing needs for large amounts of additional capacity. Check it out here for more info.

Reserved Instances

This type of payment offer provides you the option to grab up to a 75% discount and, depending on which instance you would like to reserve, you can acquire a capacity reservation giving you additional confidence in your ability to launch instances when you need them. This is recommended if your applications have steady state or predictable usage, applications that may require reserved capacity, or customers that can commit to using EC2 over a 1 or 3 year term to reduce their total computing costs. Check it out here for more info.

Pricing Note

One last thing about EC2: it also offers per-second billing, which takes the cost of unused minutes and seconds in an hour off of the bill. This is advantageous if you are scaling out for a minimal amount of time, just to handle traffic requests to a Galera node, or in case you want to try and test on a specific node for just a limited time.

Database Encryption on AWS

If you're concerned about the confidentiality of your data, or abiding by the laws required for your security compliance and regulations, AWS offers data-at-rest encryption. If you're using MariaDB Cluster version 10.2+, it has built-in plugin support to interface with the Amazon Web Services (AWS) Key Management Service (KMS) API. This allows you to take advantage of the AWS KMS key management service to facilitate separation of responsibilities and remote logging & auditing of key access requests. Rather than storing the encryption key in a local file, this plugin keeps the master key in AWS KMS.

When you first start MariaDB, the AWS KMS plugin will connect to the AWS Key Management Service and ask it to generate a new key. MariaDB will store that key on-disk in an encrypted form. The key stored on-disk cannot be used to decrypt the data; rather, on each startup, MariaDB connects to AWS KMS and has the service decrypt the locally-stored key(s). The decrypted key is stored in-memory as long as the MariaDB server process is running, and that in-memory decrypted key is used to encrypt the local data.
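A minimal my.cnf sketch for enabling this on MariaDB 10.2+ could look as follows, assuming a master key already exists in KMS (the key alias and region below are placeholders):

[mysqld]
# Load the AWS KMS encryption plugin and point it at the KMS master key
plugin_load_add = aws_key_management
aws_key_management_master_key_id = alias/mariadb-encryption
aws_key_management_region = us-east-1
# Encrypt InnoDB tablespaces using the keys managed by the plugin
innodb_encrypt_tables = ON
innodb_encryption_threads = 4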

Alternatively, when deploying your EC2 instances, you can encrypt your data storage volume with EBS (Elastic Block Storage) or encrypt the instance itself. Encryption is supported for all EBS volume types, and while it might have an impact, the latency is very minimal or even invisible to end users. For EC2 instance-type encryption, most of the large instances are supported, so if you're using compute or memory optimized nodes, you can leverage encryption.

Below is the list of supported instance types...

  • General purpose: A1, M3, M4, M5, M5a, M5ad, M5d, T2, T3, and T3a
  • Compute optimized: C3, C4, C5, C5d, and C5n
  • Memory optimized: cr1.8xlarge, R3, R4, R5, R5a, R5ad, R5d, u-6tb1.metal, u-9tb1.metal, u-12tb1.metal, X1, X1e, and z1d
  • Storage optimized: D2, h1.2xlarge, h1.4xlarge, I2, and I3
  • Accelerated computing: F1, G2, G3, P2, and P3

You can set up your AWS account to always enable encryption upon deployment of your EC2 instances. This means that AWS will encrypt new EBS volumes on launch and encrypt new copies of unencrypted snapshots.
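With the AWS CLI, this account-level setting can be toggled per region, for example:

## Turn on EBS encryption by default for the current account in a given region:

aws ec2 enable-ebs-encryption-by-default --region us-east-1

## Verify the setting:

aws ec2 get-ebs-encryption-by-default --region us-east-1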

Multi-AZ/Multi-Region/Multi-Cloud Deployments

Unfortunately, as of this writing, there's no such direct support in the AWS Console (nor any of their AWS API) that supports Multi-AZ/-Region/-Cloud deployments for Galera node clusters. 

High Availability, Scalability, and Redundancy

To achieve a multi-AZ deployment, it's recommended that you provision your Galera nodes in different availability zones. This prevents the cluster from going down or malfunctioning due to lack of quorum.

You can also set up AWS Auto Scaling and create an auto scaling group to monitor and do status checks, so your cluster always has redundancy, scalability, and high availability. Auto Scaling should solve your problem in the case that your node goes down for some unknown reason.

For multi-region or multi-cloud deployments, Galera has its own parameter called gmcast.segment, which you can set upon server start. This parameter is designed to optimize the communication between the Galera nodes and minimize the amount of traffic sent between network segments, including writeset relaying and IST and SST donor selection.
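A sketch of how the segments could be assigned in my.cnf, with one segment value per datacenter or region (the segment numbers are arbitrary labels you pick):

# On all nodes in the first region:
[mysqld]
wsrep_provider_options="gmcast.segment=0"

# On all nodes in the second region (if you already set other provider options,
# append gmcast.segment to the same wsrep_provider_options string):
[mysqld]
wsrep_provider_options="gmcast.segment=1"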

This type of setup allows you to deploy multiple nodes in different regions for your Galera Cluster. Aside from that, you can also deploy your Galera nodes with a different vendor; for example, if the cluster is hosted in Google Cloud and you want redundancy on Microsoft Azure.

I would recommend checking out our blogs Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB and Zero Downtime Network Migration With MySQL Galera Cluster Using Relay Node to gather more information on how to implement these types of deployments.

Database Performance on AWS

It depends on your application demand: if your queries are memory-consuming, the memory optimized instances are your ideal choice. If your application has high transaction volumes that require high performance for web servers or batch processing, then choose the compute optimized instances. If you want to learn more about optimizing your Galera Cluster, you can check out this blog How to Improve Performance of Galera Cluster for MySQL or MariaDB.

Database Backups on AWS

Creating backups can be difficult since there's no direct support within AWS that is specific for MySQL Galera technology. However, AWS provides you a disaster and recovery solution using EBS Snapshots. You can take snapshots of the EBS volumes attached to your instance, then either take a backup by schedule using CloudWatch or by using the Amazon Data Lifecycle Manager (Amazon DLM) to automate the snapshots. 

Take note that the snapshots taken are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved. You can store these snapshots in AWS S3 to save on storage costs. Alternatively, you can use external tools like Percona XtraBackup or mydumper (for logical backups) and store these to AWS EFS -> AWS S3 -> AWS Glacier.
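For a simple scheduled approach, a cron-driven sketch using the AWS CLI could look like the following; the volume ID is a hypothetical placeholder:

#!/bin/bash
# Take an EBS snapshot of the data volume attached to a Galera node (volume ID is a placeholder)
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Galera data volume snapshot $(date +%F)"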

You can also setup Lifecycle Management in AWS if you need your backup data to be stored in a more cost efficient manner. If you have large files and are going to utilize the AWS EFS, you can leverage their AWS Backup solution as this is also a simple yet cost-effective solution.

On the other hand, you can also use external services, such as ClusterControl, which provide both monitoring and backup solutions. Check this out if you want to know more.

Database Monitoring on AWS

AWS offers health checks and some status checks to provide you visibility into your Galera nodes. This is done through CloudWatch and CloudTrail.

CloudTrail lets you enable and inspect the logs and perform audits based on what actions and traces have been made. 

CloudWatch lets you collect and track metrics, collect and monitor log files, and set custom alarms. You can set it up according to your needs and gain system-wide visibility into resource utilization, application performance, and operational health. CloudWatch comes with a free tier as long as you stay within its limits.
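For example, a sketch of a custom alarm on a Galera node's CPU utilization, assuming a hypothetical instance ID and SNS topic ARN, might be:

# Alarm when average CPU of a Galera node exceeds 80% over two 5-minute periods
aws cloudwatch put-metric-alarm \
  --alarm-name galera-node1-high-cpu \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average --period 300 --evaluation-periods 2 \
  --threshold 80 --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts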

Beyond the free tier, CloudWatch pricing depends on the volume of metrics being collected. Check out its current pricing here.

Take note: there's a downside to using CloudWatch. It is not designed to monitor database health, especially for MySQL Galera cluster nodes. Alternatively, you can use external tools that offer high-resolution graphs or charts that are useful in reporting and are easier to analyze when diagnosing a problematic node.

For this you can use PMM by Percona, DataDog, Idera, VividCortex, or our very own ClusterControl (monitoring is FREE with ClusterControl Community). I would recommend that you use a monitoring tool that suits your needs based on your individual application requirements. It's very important that your monitoring tool be able to notify you promptly, or provide integration with instant messaging systems such as Slack or PagerDuty, or even send you an SMS when escalating a severe health status.

Database Security on AWS

Securing your EC2 instances is one of the most vital parts of deploying your database into the public cloud. You can set up a private subnet and configure security groups to allow only the required ports or source IPs, depending on your setup. You can set your database nodes without remote access and just set up a jump host, or an Internet Gateway if the nodes require internet access to install or update software packages. You can read our previous blog Deploying Secure Multicloud MySQL Replication on AWS and GCP with VPN on how we set this up.
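As a sketch, opening the Galera ports (3306, 4444, 4567, 4568) only to the cluster's private subnet might look like this with the AWS CLI; the security group ID and CIDR are hypothetical placeholders:

# Allow MySQL and Galera replication traffic only from the private subnet
for port in 3306 4444 4567 4568; do
  aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port $port \
    --cidr 10.0.1.0/24
done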

In addition to this, you can secure your data in transit by using a TLS/SSL connection, or encrypt your data at rest. If you're using ClusterControl, securing data in transit is simple and easy. You can check out our blog SSL Key Management and Encryption of MySQL Data in Transit if you want to try it out. For data at rest, data stored in S3 can be encrypted using AWS Server-Side Encryption, or you can use AWS KMS, which I discussed earlier. Check this external blog on how to set up and leverage a MariaDB Cluster using AWS KMS so you can store your data securely at rest.

Galera Cluster Troubleshooting on AWS

AWS CloudWatch can help, especially when investigating and checking system metrics. You can check the network, CPU, memory, and disk usage of your instances. This might not, however, meet your requirements when digging into a specific case.

CloudTrail can provide solid traces of the actions that have been performed under your specific AWS account. This will help you determine whether an occurrence isn't coming from MySQL Galera, but might instead be a bug or issue within the AWS environment (such as Hyper-V having issues on the host machine where your instance, as the guest, is being hosted).

If you're using ClusterControl, going to Logs -> System Logs, you'll be able to browse the captured error logs taken from the MySQL Galera node itself. Apart from this, ClusterControl provides real-time monitoring that would amplify your alarm and notification system in case of an emergency or if your MySQL Galera node(s) go down.

Conclusion

AWS does not have first-class support for a MySQL Galera Cluster setup, unlike AWS RDS which has MySQL compatibility. Because of this, most of the recommendations or opinions on running a Galera Cluster for production use within the AWS environment are based on experienced and well-tested environments that have been running for a very long time.

MariaDB Cluster is a productive choice here, as MariaDB consistently provides solid support for the AWS technology stack. In the upcoming MariaDB 10.5 release, they will offer support for an S3 Storage Engine, which may be worth the wait.

External tools can help you manage and control your MySQL Galera Cluster running on the AWS Cloud, so it's not a huge concern if you have some dilemmas or FUD about why you should run or shift to the AWS Cloud Platform.

AWS might not be the one-size-fits-all solution in some cases, but it provides a wide array of solutions that you can customize and tailor to fit your needs.

In the next part of our blog, we'll look at another public cloud platform, Google Cloud, and see how we can leverage it if we choose to run our Galera Cluster on their platform.

Comparing Failover Times for Amazon Aurora, Amazon RDS, and ClusterControl

$
0
0

If your IT infrastructure is running on AWS, you have probably heard about Amazon Relational Database Service (RDS), an easy way to set up, operate, and scale a relational database in the cloud. It provides cost-effective and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. There are a number of database engine offerings for RDS like MySQL, MariaDB, PostgreSQL, Microsoft SQL Server and Oracle Server.

ClusterControl 1.7.3 acts similarly to RDS as it supports database cluster deployment, management, monitoring, and scaling on the AWS platform. It also supports a number of other cloud platforms like Google Cloud Platform and Microsoft Azure. ClusterControl understands the database topology and is capable of performing automatic recovery, topology management, and many more advanced features to take control of your database.

In this blog post, we are going to compare automatic failover times for Amazon Aurora, Amazon RDS for MySQL, and a MySQL Replication setup deployed and managed by ClusterControl. The type of failover that we are going to do is slave promotion in case that the master goes down. This is where the most up-to-date slave takes over the master role in the cluster to resume the database service.

Our Failover Test

To measure the failover time, we are going to run a simple MySQL connect-update test, with a loop to count the SQL statement status that connect to a single database endpoint. The script looks like this:

#!/bin/bash
_host='{MYSQL ENDPOINT}'
_user='sbtest'
_pass='password'
_port=3306

j=1
while true
do
        echo -n "count $j : "
        num=$(od -A n -t d -N 1 /dev/urandom |tr -d ' ')

        timeout 1 bash -c "mysql -u${_user} -p${_pass} -h${_host} -P${_port} --connect-timeout=1 --disable-reconnect -A -Bse \
        \"UPDATE sbtest.sbtest1 SET k = $num WHERE id = 1\" > /dev/null 2> /dev/null"
        if [ $? -eq 0 ]; then
                echo "OK $(date)"
        else
                echo "Fail ---- $(date)"
        fi

        j=$(( $j + 1 ))
        sleep 1
done

The above Bash script simply connects to a MySQL host and performs an update on a single row, with a timeout of 1 second on both the Bash and mysql client commands. The timeout-related parameters are required so we can measure the downtime in seconds correctly, since the mysql client defaults to always reconnecting until it reaches the MySQL wait_timeout. We populated a test dataset with the following command beforehand:

$ sysbench \
/usr/share/sysbench/oltp_common.lua \
--db-driver=mysql \
--mysql-host={MYSQL HOST} \
--mysql-user=sbtest \
--mysql-db=sbtest \
--mysql-password=password \
--tables=50 \
--table-size=100000 \
prepare

The script reports whether the above query succeeded (OK) or failed (Fail). Sample outputs are shown further down.

Failover with Amazon RDS for MySQL

In our test, we use the lowest RDS offering with the following specs:

  • MySQL version: 5.7.22
  • vCPU: 4
  • RAM: 16 GB
  • Storage type: Provisioned IOPS (SSD)
  • IOPS: 1000
  • Storage: 100 GiB
  • Multi-AZ Replication: Yes

After Amazon RDS provisions your DB instance, you can use any standard MySQL client application or utility to connect to the instance. In the connection string, you specify the DNS address from the DB instance endpoint as the host parameter, and specify the port number from the DB instance endpoint as the port parameter.

According to Amazon RDS documentation page, in the event of a planned or unplanned outage of your DB instance, Amazon RDS automatically switches to a standby replica in another Availability Zone if you have enabled Multi-AZ. The time it takes for the failover to complete depends on the database activity and other conditions at the time the primary DB instance became unavailable. Failover times are typically 60-120 seconds.

To initiate a multi-AZ failover in RDS, we performed a reboot operation with "Reboot with Failover" checked, as shown in the following screenshot:

Reboot AWS DB Instance
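The same failover can also be triggered from the AWS CLI; a sketch, assuming a hypothetical instance identifier, would be:

# Reboot the RDS instance and force a Multi-AZ failover (identifier is a placeholder)
aws rds reboot-db-instance \
  --db-instance-identifier mysql-rds-test \
  --force-failover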

The following is what was observed by our application:

...
count 30 : OK Wed Aug 28 03:41:06 UTC 2019
count 31 : OK Wed Aug 28 03:41:07 UTC 2019
count 32 : Fail ---- Wed Aug 28 03:41:09 UTC 2019
count 33 : Fail ---- Wed Aug 28 03:41:11 UTC 2019
count 34 : Fail ---- Wed Aug 28 03:41:13 UTC 2019
count 35 : Fail ---- Wed Aug 28 03:41:15 UTC 2019
count 36 : Fail ---- Wed Aug 28 03:41:17 UTC 2019
count 37 : Fail ---- Wed Aug 28 03:41:19 UTC 2019
count 38 : Fail ---- Wed Aug 28 03:41:21 UTC 2019
count 39 : Fail ---- Wed Aug 28 03:41:23 UTC 2019
count 40 : Fail ---- Wed Aug 28 03:41:25 UTC 2019
count 41 : Fail ---- Wed Aug 28 03:41:27 UTC 2019
count 42 : Fail ---- Wed Aug 28 03:41:29 UTC 2019
count 43 : Fail ---- Wed Aug 28 03:41:31 UTC 2019
count 44 : Fail ---- Wed Aug 28 03:41:33 UTC 2019
count 45 : Fail ---- Wed Aug 28 03:41:35 UTC 2019
count 46 : OK Wed Aug 28 03:41:36 UTC 2019
count 47 : OK Wed Aug 28 03:41:37 UTC 2019
...

The MySQL downtime as seen from the application side lasted from 03:41:09 until 03:41:36, which is around 27 seconds in total. From the RDS events, we can see the multi-AZ failover only happened 15 seconds after the actual downtime started:

Wed, 28 Aug 2019 03:41:24 GMT Multi-AZ instance failover started.
Wed, 28 Aug 2019 03:41:33 GMT DB instance restarted
Wed, 28 Aug 2019 03:41:59 GMT Multi-AZ instance failover completed.

Once the new database instance restarted around 03:41:33, the MySQL service became accessible again around 3 seconds later.

Failover with Amazon Aurora for MySQL

Amazon Aurora can be considered a superior version of RDS, with a lot of notable features like faster replication with shared storage, no data loss during failover, and a storage limit of up to 64TB. Amazon Aurora for MySQL is based on the open source MySQL edition, but is not open source itself; it is a proprietary, closed-source database. It works similarly to MySQL replication (one and only one master, with multiple slaves) and failover is automatically handled by Amazon Aurora.

According to the Amazon Aurora FAQs, if you have an Amazon Aurora Replica, in the same or a different Availability Zone, when failing over, Aurora flips the canonical name record (CNAME) for your DB Instance to point at the healthy replica, which in turn is promoted to become the new primary. Start-to-finish, failover typically completes within 30 seconds.

If you do not have an Amazon Aurora Replica (i.e. single instance), Aurora will first attempt to create a new DB Instance in the same Availability Zone as the original instance. If unable to do so, Aurora will attempt to create a new DB Instance in a different Availability Zone. From start to finish, failover typically completes in under 15 minutes.

Your application should retry database connections in the event of connection loss.

After Amazon Aurora provisions your DB instance, you will get two endpoints, one for the writer and one for the reader. The reader endpoint provides load-balancing support for read-only connections to the DB cluster. The following endpoints are taken from our test setup:

  • writer - aurora-sysbench.cluster-cw9j4kdnvun9.ap-southeast-1.rds.amazonaws.com
  • reader - aurora-sysbench.cluster-ro-cw9j4kdnvun9.ap-southeast-1.rds.amazonaws.com

In our test, we used the following Aurora specs:

  • Instance type: db.r5.large
  • MySQL version: 5.7.12
  • vCPU: 2
  • RAM: 16 GB
  • Multi-AZ Replication: Yes

To trigger a failover, simply pick the writer instance -> Actions -> Failover, as shown in the following screenshot:

Amazon Aurora Failover with SysBench
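The equivalent can be done from the AWS CLI; a sketch using our cluster identifier (treat it as a placeholder) would be:

# Trigger an Aurora failover, promoting one of the replicas to writer
aws rds failover-db-cluster \
  --db-cluster-identifier aurora-sysbench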

The following output is reported by our application while connecting to the Aurora writer endpoint:

...
count 37 : OK Wed Aug 28 12:35:47 UTC 2019
count 38 : OK Wed Aug 28 12:35:48 UTC 2019
count 39 : Fail ---- Wed Aug 28 12:35:49 UTC 2019
count 40 : Fail ---- Wed Aug 28 12:35:50 UTC 2019
count 41 : Fail ---- Wed Aug 28 12:35:51 UTC 2019
count 42 : Fail ---- Wed Aug 28 12:35:52 UTC 2019
count 43 : Fail ---- Wed Aug 28 12:35:53 UTC 2019
count 44 : Fail ---- Wed Aug 28 12:35:54 UTC 2019
count 45 : Fail ---- Wed Aug 28 12:35:55 UTC 2019
count 46 : OK Wed Aug 28 12:35:56 UTC 2019
count 47 : OK Wed Aug 28 12:35:57 UTC 2019
...

The database downtime started at 12:35:49 and ended at 12:35:56, a total of 7 seconds. That's pretty impressive.

Looking at the database events from the Aurora management console, only these two events happened:

Wed, 28 Aug 2019 12:35:50 GMT A new writer was promoted. Restarting database as a reader.
Wed, 28 Aug 2019 12:35:55 GMT DB instance restarted

It doesn't take much time for Aurora to promote a slave to become a master and demote the master to become a slave. Note that all Aurora replicas share the same underlying volume with the primary instance, which means that replication can be performed in milliseconds as updates made by the primary instance are instantly available to all Aurora replicas. Therefore, it has minimal replication lag (Amazon claims 100 milliseconds or less). This greatly reduces the health check time and improves the recovery time significantly.

Failover with ClusterControl

In this example, we imitate a setup similar to Amazon RDS using m5.xlarge instances, with ProxySQL in between to automate the failover, giving the application single-endpoint access just like RDS. The following diagram illustrates our architecture:

ClusterControl with ProxySQL

Since we have direct access to the database instances, we will trigger an automatic failover by simply killing the MySQL process on the active master:

$ kill -9 $(pidof mysqld)

The above command triggered an automatic recovery inside ClusterControl:

[11:08:49]: Job Completed.
[11:08:44]: 10.15.3.141:3306: Flushing logs to update 'SHOW SLAVE HOSTS'
[11:08:39]: 10.15.3.141:3306: Flushing logs to update 'SHOW SLAVE HOSTS'
[11:08:39]: Failover Complete. New master is  10.15.3.141:3306.
[11:08:39]: Attaching slaves to new master.
[11:08:39]: 10.15.3.141:3306: Command 'RESET SLAVE /*!50500 ALL */' succeeded.
[11:08:39]: 10.15.3.141:3306: Executing 'RESET SLAVE /*!50500 ALL */'.
[11:08:39]: 10.15.3.141:3306: Successfully stopped slave.
[11:08:39]: 10.15.3.141:3306: Stopping slave.
[11:08:39]: 10.15.3.141:3306: Successfully stopped slave.
[11:08:39]: 10.15.3.141:3306: Stopping slave.
[11:08:38]: 10.15.3.141:3306: Setting read_only=OFF and super_read_only=OFF.
[11:08:38]: 10.15.3.141:3306: Successfully stopped slave.
[11:08:38]: 10.15.3.141:3306: Stopping slave.
[11:08:38]: Stopping slaves.
[11:08:38]: 10.15.3.141:3306: Completed preparations of candidate.
[11:08:38]: 10.15.3.141:3306: Applied 0 transactions. Remaining: .
[11:08:38]: 10.15.3.141:3306: waiting up to 4294967295 seconds before timing out.
[11:08:38]: 10.15.3.141:3306: Checking if the candidate has relay log to apply.
[11:08:38]: 10.15.3.141:3306: preparing candidate.
[11:08:38]: No errant transactions found.
[11:08:38]: 10.15.3.141:3306: Skipping, same as slave  10.15.3.141:3306
[11:08:38]: Checking for errant transactions.
[11:08:37]: 10.15.3.141:3306: Setting read_only=ON and super_read_only=ON.
[11:08:37]: 10.15.3.69:3306: Can't connect to MySQL server on '10.15.3.69' (115)
[11:08:37]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.
[11:08:37]: 10.15.3.69:3306: Failed to CREATE USER rpl_user. Error: 10.15.3.69:3306: Query  failed: Can't connect to MySQL server on '10.15.3.69' (115).
[11:08:36]: 10.15.3.69:3306: Creating user 'rpl_user'@'10.15.3.141.
[11:08:36]: 10.15.3.141:3306: Executing GRANT REPLICATION SLAVE 'rpl_user'@'10.15.3.69'.
[11:08:36]: 10.15.3.141:3306: Creating user 'rpl_user'@'10.15.3.69.
[11:08:36]: 10.15.3.141:3306: Elected as the new Master.
[11:08:36]: 10.15.3.141:3306: Slave lag is 0 seconds.
[11:08:36]: 10.15.3.141:3306 to slave list
[11:08:36]: 10.15.3.141:3306: Checking if slave can be used as a candidate.
[11:08:33]: 10.15.3.69:3306: Trying to shutdown the failed master if it is up.
[11:08:32]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.
[11:08:31]: 10.15.3.141:3306: Setting read_only=ON and super_read_only=ON.
[11:08:30]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.
[11:08:30]: 10.15.3.141:3306: ioerrno=2003 io running 0
[11:08:30]: Checking 10.15.3.141:3306
[11:08:30]: 10.15.3.69:3306: REPL_UNDEFINED
[11:08:30]: 10.15.3.69:3306
[11:08:30]: Failover to a new Master.
Job spec: Failover to a new Master.

While from our test application point-of-view, the downtime happened at the following time while connecting to ProxySQL host port 6033:

...
count 1 : OK Wed Aug 28 11:08:24 UTC 2019
count 2 : OK Wed Aug 28 11:08:25 UTC 2019
count 3 : OK Wed Aug 28 11:08:26 UTC 2019
count 4 : Fail ---- Wed Aug 28 11:08:28 UTC 2019
count 5 : Fail ---- Wed Aug 28 11:08:30 UTC 2019
count 6 : Fail ---- Wed Aug 28 11:08:32 UTC 2019
count 7 : Fail ---- Wed Aug 28 11:08:34 UTC 2019
count 8 : Fail ---- Wed Aug 28 11:08:36 UTC 2019
count 9 : Fail ---- Wed Aug 28 11:08:38 UTC 2019
count 10 : OK Wed Aug 28 11:08:39 UTC 2019
count 11 : OK Wed Aug 28 11:08:40 UTC 2019
...

By looking at both the recovery job events and the output from our application, we can see the MySQL database node was down for 4 seconds before the cluster recovery job started, and the application saw failures from 11:08:28 until 11:08:39, a total MySQL downtime of 11 seconds. One of the most impressive things about ClusterControl is that you can track the recovery progress and see exactly what actions were taken and performed by ClusterControl during the failover. It provides a level of transparency that you won't get with any database offering from the cloud providers.

For MySQL/MariaDB/PostgreSQL replication, ClusterControl allows you to have more fine-grained control over your databases with support for the following advanced configurations and parameters:

  • Master-master replication topology management
  • Chain replication topology management
  • Topology viewer
  • Whitelist/Blacklist slaves to be promoted as master
  • Errant transaction checker
  • Pre/post, success/fail failover/switchover events hook with external script
  • Automatic rebuild slave on error
  • Scale out slave from existing backup

Failover Time Summary

In terms of failover time, Amazon Aurora for MySQL is the clear winner with 7 seconds, followed by ClusterControl with 11 seconds, and Amazon RDS for MySQL with 27 seconds.

Note that this is just a simple test, with one client and one transaction per second, to measure the fastest recovery time. Large transactions or a lengthy recovery process can increase failover time; for example, long running transactions may take a long time to roll back when shutting down MySQL.

 

Database Load Balancing with ProxySQL & AWS Aurora

$
0
0

ProxySQL is a proven solution that helps database administrators deal with the requirements for high availability of their databases. Because it is SQL-aware, it can also be used for shaping the traffic heading towards the databases - you can route queries to particular nodes, you can rewrite queries should that be needed, you can also throttle the traffic, implement an SQL firewall, or create a mirror of your traffic and send it to a separate hostgroup.

ProxySQL 2.0.5 natively supports Galera Cluster, MySQL Replication, and MySQL Group Replication. Unfortunately it does not, by default, support AWS Aurora, but there is still a workaround you can use.

You may be asking yourself, why should I bother with ProxySQL when AWS provides me with an endpoint which does the read-write split for me? That's indeed the case, but it is just the r/w split. ProxySQL, on the other hand, gives you the opportunity not only to separate reads from writes but also to take control of your database traffic. ProxySQL can often save your databases from being overloaded by just rewriting a single query.

ProxySQL 2.0.5 and AWS Aurora

Should you decide to give ProxySQL a try, there are a couple of steps you have to take. First, you will need an EC2 instance to install ProxySQL on. Once you have the instance up and running, you can install the latest ProxySQL. We would recommend using the repository for that. You can set it up by following the steps in the documentation page: https://github.com/sysown/proxysql/wiki. For Ubuntu 16.04 LTS, which we used, you have to run:

apt-get install -y lsb-release
wget -O - 'https://repo.proxysql.com/ProxySQL/repo_pub_key' | apt-key add -
echo deb https://repo.proxysql.com/ProxySQL/proxysql-2.0.x/$(lsb_release -sc)/ ./ \
| tee /etc/apt/sources.list.d/proxysql.list

Then it’s time to install ProxySQL:

apt-get update
apt-get install proxysql

Then we have to verify that we do have the connectivity from our ProxySQL instance to AWS Aurora nodes. We will use direct endpoints for the connectivity.

We can easily test the connectivity using telnet to the correct endpoint on port 3306:

root@ip-10-0-0-191:~# telnet dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com 3306
Trying 10.0.0.53...
Connected to dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com.
Escape character is '^]'.
J
5.7.12_2>ZWP-&[Ov8NzJ:H#Mmysql_native_password^CConnection closed by foreign host.

The first one looks good. We'll proceed with the second Aurora node:

root@ip-10-0-0-191:~# telnet dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com 3306
Trying 10.0.1.90...
Connected to dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com.
Escape character is '^]'.
J
tr3'3rynMmysql_native_password^CConnection closed by foreign host.

That works too. If you cannot connect to the Aurora nodes, you need to ensure that all the security bits are aligned properly: check the VPC configuration, see if the ProxySQL node can access the VPC of Aurora, and check if the security groups allow the traffic to pass through. The AWS network security layer can be tricky to configure if you don't have the experience, but eventually you should be able to make it work.

With the connectivity sorted out, we need to create a user on Aurora. We will use that user for monitoring Aurora nodes in ProxySQL. First, we may have to install the MySQL client on the ProxySQL node:

root@ip-10-0-0-191:~# apt install mysql-client-core-5.7

Then we will use the endpoint of the cluster to connect to the writer and create user on it:

root@ip-10-0-0-191:~# mysql -h dbtest.cluster-cqb1vho43rod.eu-central-1.rds.amazonaws.com -u root -ppassword
mysql> CREATE USER 'monuser'@'10.0.0.191' IDENTIFIED BY 'mon1t0r';
Query OK, 0 rows affected (0.02 sec)

mysql> GRANT REPLICATION CLIENT ON *.* TO 'monuser'@'10.0.0.191';
Query OK, 0 rows affected (0.00 sec)

Having done this, we can log into the ProxySQL admin interface (by default on port 6032) to define the monitor user and its password:

root@ip-10-0-0-191:~# mysql -P6032 -u admin -padmin -h127.0.0.1
mysql> SET mysql-monitor_username='monuser';
Query OK, 1 row affected (0.00 sec)

mysql> SET mysql-monitor_password='mon1t0r';
Query OK, 1 row affected (0.00 sec)

mysql> LOAD MYSQL VARIABLES TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)

mysql> SAVE MYSQL VARIABLES TO DISK;
Query OK, 116 rows affected (0.00 sec)

Now it’s time to define Aurora nodes in ProxySQL:

mysql> INSERT INTO mysql_servers (hostgroup_id, hostname) VALUES (10, 'dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com'), (20, 'dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com');
Query OK, 2 rows affected (0.01 sec)

As you can see, we use their direct endpoints as the hostnames. Once this is done, we will use the mysql_replication_hostgroups table to define the reader and writer hostgroups. We also have to pass the correct check type - by default ProxySQL looks at the 'read_only' variable, while Aurora uses 'innodb_read_only' to differentiate between the writer and the readers.

mysql> SHOW CREATE TABLE mysql_replication_hostgroups\G
*************************** 1. row ***************************
       table: mysql_replication_hostgroups
Create Table: CREATE TABLE mysql_replication_hostgroups (
    writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
    reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>=0),
    check_type VARCHAR CHECK (LOWER(check_type) IN ('read_only','innodb_read_only','super_read_only')) NOT NULL DEFAULT 'read_only',
    comment VARCHAR NOT NULL DEFAULT '', UNIQUE (reader_hostgroup))
1 row in set (0.00 sec)

mysql> INSERT INTO mysql_replication_hostgroups VALUES (10, 20, 'innodb_read_only', 'Aurora');
Query OK, 1 row affected (0.00 sec)

mysql> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)

This is it, we can now see how ProxySQL configured the nodes in runtime configuration:

mysql> SELECT hostgroup_id, hostname, port  FROM runtime_mysql_servers;
+--------------+-----------------------------------------------------------------------------+------+
| hostgroup_id | hostname                                                                    | port |
+--------------+-----------------------------------------------------------------------------+------+
| 10           | dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com               | 3306 |
| 20           | dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com | 3306 |
| 20           | dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com               | 3306 |
+--------------+-----------------------------------------------------------------------------+------+
3 rows in set (0.00 sec)

As you can see, dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com is the writer. Let’s try the failover now:

mysql> SELECT hostgroup_id, hostname, port  FROM runtime_mysql_servers;
+--------------+-----------------------------------------------------------------------------+------+
| hostgroup_id | hostname                                                                    | port |
+--------------+-----------------------------------------------------------------------------+------+
| 10           | dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com | 3306 |
| 20           | dbtest-instance-1-eu-central-1a.cqb1vho43rod.eu-central-1.rds.amazonaws.com | 3306 |
| 20           | dbtest-instance-1.cqb1vho43rod.eu-central-1.rds.amazonaws.com               | 3306 |
+--------------+-----------------------------------------------------------------------------+------+
3 rows in set (0.00 sec)

As you can see, the writer (hostgroup 10) has changed to the second node.
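To complete the picture, the application still needs a ProxySQL user and read/write split query rules. A minimal sketch, assuming a hypothetical application user 'appuser', might look like this:

mysql> INSERT INTO mysql_users (username, password, default_hostgroup) VALUES ('appuser', 'apppassword', 10);
mysql> INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES (100, 1, '^SELECT.*FOR UPDATE', 10, 1), (200, 1, '^SELECT', 20, 1);
mysql> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
mysql> LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

With rules like these, plain SELECTs go to the reader hostgroup (20), while everything else, including SELECT ... FOR UPDATE, lands on the writer hostgroup (10).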

Conclusion

This is basically it - as you can see, setting up AWS Aurora nodes in ProxySQL is a pretty simple process.


Comparing Galera Cluster Cloud Offerings: Part Three Microsoft Azure

$
0
0

Microsoft Azure is known to many as an alternative public cloud platform to Amazon AWS. It's not easy to directly compare these two giant companies. Microsoft's cloud business -- dubbed commercial cloud -- includes everything from Azure to Office 365 enterprise subscriptions to Dynamics 365 to LinkedIn services. After LinkedIn was acquired by Microsoft it began moving its infrastructure to Azure. While moving LinkedIn to Azure could take some time, it demonstrates Microsoft Azure’s capabilities and ability to handle millions of transactions. Microsoft's strong enterprise heritage, software stack, and data center tools offer both familiarity and a hybrid approach to cloud deployments.

Microsoft Azure is built as an Infrastructure as a Service (IaaS) as well as a Platform as a Service (PaaS). The Azure Virtual Machine offers per-second billing and is currently multi-tenant compute. Microsoft has, however, recently previewed a new offering which allows virtual machines to run on single-tenant physical servers, called Azure Dedicated Hosts.

Azure also offers specialized large instances (such as for SAP HANA). There are multitenant blocks, file storage, and many other additional IaaS and PaaS capabilities. These include object storage (Azure Blob Storage), a CDN, a Docker-based container service (Azure Container Service), a batch computing service (Azure Batch), and event-driven "serverless computing" (Azure Functions). The Azure Marketplace offers third-party software and services. Colocation needs are met via partner exchanges (Azure ExpressRoute) offered from partners like Equinix and CoreSite.

With all of these offerings Microsoft Azure has stepped up its game to play a vital role in the public cloud market. The PaaS infrastructure offered to its consumers has garnered a lot of trust and many are moving their own infrastructure or private cloud to Microsoft Azure's public cloud infrastructure. This is especially advantageous for consumers who need integration with other Windows Services, such as Visual Studio.

So what’s different between Azure and the other clouds we have looked at in this series? Microsoft has focused heavily on AI, analytics, and the Internet of Things. AzureStack is another “cloud-meets-data center” effort that has been a real differentiator in the market.

Microsoft Azure Migration Pros & Cons

There are several things you should consider when moving your legacy applications or infrastructure to Microsoft Azure.

Strengths

  • Enterprises that are strategically committed to Microsoft technology generally choose Azure as their primary IaaS+PaaS provider. The integrated end-to-end experience for enterprises building .NET applications using Visual Studio (and related services) is unsurpassed. Microsoft is also leveraging its tremendous sales reach and ability to co-sell Azure with other Microsoft products and services in order to drive adoption.
  • Azure provides a well-integrated approach to edge computing and Internet of Things (IoT), with offerings that reach from its hyperscale data center out through edge solutions such as AzureStack and Data Box Edge.
  • Microsoft Azure’s capabilities have become increasingly innovative and open. 50% of the workloads are Linux-based alongside numerous open-source application stacks. Microsoft has a unique vision for the future that involves bringing in technology partners through native, first-party offerings such as those from VMware, NetApp, Red Hat, Cray and Databricks.

Cautions

  • Microsoft Azure’s reliability issues continue to be a challenge for customers, largely as a result of Azure’s growing pains. Since September 2018, Azure has had multiple service-impacting incidents, including significant outages involving Azure Active Directory. These outages leave customers with no ability to mitigate the downtime.
  • Gartner clients often experience challenges with executing on-time implementations within budget. This comes from Microsoft often setting unreasonably high expectations for customers. Much of this stems from Microsoft's field sales teams being "encouraged" to position and sell Azure within its customer base.
  • Enterprises frequently lament the quality of Microsoft technical support (along with the increasing cost of support) and field solution architects. This negatively impacts customer satisfaction, and slows Azure adoption and therefore customer spending.

Microsoft may not be your first choice as it has been seen as a "not-so-open-source-friendly" tech giant, but in fairness it has embraced a lot of activity and support within the open source world. Microsoft Azure offers fully-managed services for most of the top open source RDBMS databases like PostgreSQL, MySQL, and MariaDB.

Galera Cluster (Percona, Codership, or MariaDB) variants, unfortunately, aren't supported by Azure. The only way you can deploy your Galera Cluster to Azure is by means of a Virtual Machine. You may also want to check their blog on using MariaDB Enterprise Cluster (which is based on Galera) on Azure.

Azure's Virtual Machine

An Azure Virtual Machine is the equivalent of the compute instance offerings in GCP and AWS. It is an on-demand, high-performance computing server in the cloud and can be deployed in Azure using various methods. These might include the user interface within the Azure portal, using pre-configured images in the Azure marketplace, scripting through Azure PowerShell, deploying from a template that is defined by using a JSON file, or deploying directly through Visual Studio.

Azure uses a deployment model called the Azure Resource Manager (ARM), which defines all resources that form part of your overall application solution, allowing you to deploy, update, or delete your solution in a single operation.

Resources may include the storage account, network configurations, and IP addresses. You may have heard the term “ARM templates”, which essentially means the JSON template which defines the different aspects of your solution which you are trying to deploy.

Azure Virtual Machines come in different types and sizes, with names beginning with A-series to N-series. Each VM type is built with specific workloads or performance needs in mind, including general purpose, compute optimized, storage optimized or memory optimized. You can also deploy less common types like GPU or high performance compute VMs.

Similar to other public cloud offerings, you can do the following in your virtual machine instances...

  • Encrypt your disk on the virtual machine, although this does not come as easily as it does on GCP and AWS. Encrypting your virtual machine requires a more manual approach: you must complete the Azure Disk Encryption prerequisites. Since Galera does not support Windows, we're only talking here about Linux-based images. Basically, it requires you to have the dm-crypt and vfat modules present in the system. Once you get that piece right, you can encrypt the VM using the Azure CLI (see the sketch after this list). You can check out how to Enable Azure Disk Encryption for Linux IaaS VMs to see how to do it. Encrypting your disk is very important, especially if your company or organization requires that your Galera Cluster data follow the standards mandated by laws and regulations such as PCI DSS or GDPR.
  • Creating a snapshot. You can create a snapshot either using the Azure CLI or through the portal. Check their manual on how to do it.
  • Use auto scaling or Virtual Machine Scale Sets if you require horizontal scaling. Check out the overview of autoscaling in Azure or the overview of virtual machine scale sets.
  • Multi-Zone Deployment. Deploy your virtual machine instances into different availability zones to avoid a single point of failure.
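Following up on the disk encryption point above, a hedged Azure CLI sketch, with a hypothetical resource group, VM name, and Key Vault, might look like:

# Enable Azure Disk Encryption on a Linux VM (all names are placeholders)
az vm encryption enable \
  --resource-group galera-rg \
  --name galera-node1 \
  --disk-encryption-keyvault galera-keyvault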

You can also create (or get information from) your virtual machines in different ways. You can use the Azure portal, Azure PowerShell, REST APIs, Client SDKs, or with the Azure CLI. Virtual machines in the Azure virtual network can also easily be connected to your organization’s network and treated as an extended datacenter.

Microsoft Azure Pricing

Just like other public cloud providers, Microsoft Azure also offers a free tier with some free services. It also offers pay-as-you-go options and reserved instances to choose from. Pay-as-you-go starts at $0.008/hour - $0.126/hour.

Microsoft Azure Pricing

For reserved instances, the longer you commit and contract with Azure, the more you save on cost. Microsoft Azure claims to help subscribers save up to 72% of their billing costs compared to its pay-as-you-go model when subscribers sign up for a one to three year term for a Windows or Linux Virtual Machine. Microsoft also offers added flexibility in the sense that if your business needs change, you can cancel your Azure RI subscription at any time and return the remaining unused RI to Microsoft for an early termination fee.

Let's check out its pricing in comparison with GCP, AWS EC2, and an Azure Virtual Machine. This is based on the us-east1 region, and we will compare the price ranges for the compute instances required to run your Galera Cluster.

Shared instance types

  • Google Compute Engine: f1-micro, g1-small. Prices start at $0.006 - $0.019 hourly.
  • AWS EC2: t2.nano – t3a.2xlarge. Prices start at $0.0058 - $0.3328 hourly.
  • Microsoft Azure: B-Series. Prices start at $0.0052 - $0.832 hourly.

Standard instance types

  • Google Compute Engine: n1-standard-1 – n1-standard-96. Prices start at $0.034 - $3.193 hourly.
  • AWS EC2: m4.large – m4.16xlarge, m5.large – m5d.metal. Prices start at $0.1 - $5.424 hourly.
  • Microsoft Azure: Av2 Standard, D2-64 v3 latest generation, D2s-64s v3 latest generation, D1-5 v2, DS1-S5 v2, DC-series. Prices start at $0.043 - $3.072 hourly.

High Memory / Memory Optimized instance types

  • Google Compute Engine: n1-highmem-2 – n1-highmem-96, n1-megamem-96, n1-ultramem-40 – n1-ultramem-160. Prices start at $0.083 - $17.651 hourly.
  • AWS EC2: r4.large – r4.16xlarge, x1.16xlarge – x1.32xlarge, x1e.xlarge – x1e.32xlarge. Prices start at $0.133 - $26.688 hourly.
  • Microsoft Azure: D2a – D64a v3, D2as – D64as v3, E2-64 v3 latest generation, E2a – E64a v3, E2as – E64as v3, E2s-64s v3 latest generation, D11-15 v2, DS11-S15 v2, M-series, Mv2-series, Extreme Memory Optimized instances. Prices start at $0.043 - $44.62 hourly.

High CPU / Storage Optimized instance types

  • Google Compute Engine: n1-highcpu-2 – n1-highcpu-32. Prices start at $0.05 - $2.383 hourly.
  • AWS EC2: h1.2xlarge – h1.16xlarge, i3.large – i3.metal, i3en.large – i3en.metal, d2.xlarge – d2.8xlarge. Prices start at $0.156 - $10.848 hourly.
  • Microsoft Azure: Fsv2-series, F-series, Fs-series. Prices start at $0.0497 - $3.045 hourly.

Data Encryption on Microsoft Azure

Microsoft Azure does not offer encryption support directly for Galera Cluster (or vice-versa). There are, however, ways you can encrypt data either at-rest or in-transit.

Encryption in-transit is a mechanism for protecting data when it's transmitted across networks. With Azure Storage, you can secure data in transit by using transport-level encryption (HTTPS) and, for Azure file shares, wire encryption (SMB 3.0).

Microsoft uses encryption to protect customer data when it's in transit between customers' realms and Microsoft cloud services. More specifically, Transport Layer Security (TLS) is the protocol that Microsoft's data centers use to negotiate with client systems that connect to Microsoft cloud services.

Perfect Forward Secrecy (PFS) is also employed so that each connection between customers' client systems and Microsoft's cloud services uses unique keys. Connections to Microsoft cloud services also take advantage of RSA-based 2,048-bit encryption key lengths.

Encryption At-Rest

For many organizations, data encryption at-rest is a mandatory step towards achieving data privacy, compliance, and data sovereignty. Three Azure features provide encryption of data at-rest:

  • Storage Service Encryption is always enabled and automatically encrypts storage service data when writing it to Azure Storage. If your application logic requires your MySQL Galera Cluster database to store valuable data, then storing to Azure Storage can be an option.
  • Client-side encryption also provides the feature of encryption at-rest.
  • Azure Disk Encryption enables you to encrypt the OS disks and data disks that an IaaS virtual machine uses. Azure Disk Encryption also supports enabling encryption on Linux VMs that are configured with disk striping (RAID) by using mdadm, and on Linux VMs that use LVM for data disks.

Galera Cluster Multi-AZ/Multi-Region/Multi-Cloud Deployments on Azure

Similar to AWS and GCP, Microsoft Azure does not offer direct support for deploying a Galera Cluster in a Multi-AZ/-Region/-Cloud configuration. You can, however, deploy your nodes manually, or create scripts using PowerShell or the Azure CLI to do this for you. Alternatively, when you provision your Virtual Machine instances, you can place your nodes in different availability zones. Microsoft Azure also offers another type of redundancy, aside from availability zones, called Virtual Machine Scale Sets. You can check the differences between virtual machines and scale sets.
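As a sketch, placing each node in a different availability zone with the Azure CLI might look like the following; the resource group, VM names, and image are hypothetical placeholders:

# Provision three Galera nodes across availability zones 1, 2 and 3 (names are placeholders)
for zone in 1 2 3; do
  az vm create \
    --resource-group galera-rg \
    --name galera-node$zone \
    --image UbuntuLTS \
    --zone $zone
done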

Galera Cluster High Availability, Scalability, and Redundancy on Azure

One of the primary reasons for using a Galera node cluster is for high availability, redundancy, and its ability to scale. If you are serving traffic globally, it's best to serve your traffic by region, and your architectural design should include geo-distribution of your database nodes. In order to achieve this, multi-AZ, multi-region, or multi-cloud/multi-datacenter deployments are recommended. This prevents the cluster from going down or malfunctioning due to a lack of quorum.

As mentioned earlier, Microsoft Azure has an auto scaling solution which can be leveraged using scale sets. This allows you to scale out nodes automatically when a certain threshold has been met, based on the health status items you are monitoring. You can check out their tutorial on this topic here.

For multi-region or multi-cloud deployments, Galera has its own parameter called gmcast.segment, which can be set upon server start. This parameter is designed to optimize the communication between the Galera nodes and minimize the amount of traffic sent between network segments. This includes writeset relaying and IST and SST donor selection. This type of setup allows you to deploy multiple nodes in different regions. Aside from that, you can also deploy your Galera nodes across different cloud vendors, spanning GCP, AWS, Microsoft Azure, or an on-premise setup.

We recommend you check out our blogs Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB and Zero Downtime Network Migration With MySQL Galera Cluster Using Relay Node to gather more information on how to implement these types of deployments.

Galera Cluster Database Performance on Microsoft Azure

The underlying host machines used by virtual machines in Azure are, in fact, very powerful. The newest VMs in Azure come equipped with network optimization modules. You can check this in your kernel info by running (e.g. in Ubuntu):

uname -r|grep azure

Note: Make certain that the output contains the "azure" string.

For Centos/RHEL, installing any Linux Integration Services (LIS) since version 4.2 contains network optimization. To learn more about this, visit the page on optimizing network throughput.

If your application is very sensitive to network latency, you might be interested in looking at proximity placement groups. The feature is currently in preview (and not yet recommended for production use), but it helps optimize your network throughput.

The type of virtual machine you consume depends on the requirements of your application traffic and resource demands. For queries that are heavy on memory consumption, you can start with the Dv3 series. However, for memory-optimized workloads, start with the Ev3 series. For high CPU requirements, such as highly transactional databases or gaming applications, start with the Fsv2 series.

Choosing the right storage and the required IOPS for your database volume is a must. Generally, an SSD-based persistent disk is your ideal choice. Begin with Standard SSD, which is cost-effective and offers consistent performance. This decision, however, might depend on whether you need more IOPS in the long run. If that is the case, then you should go for Premium SSD storage.

We also recommend you check out our blog How to Improve Performance of Galera Cluster for MySQL or MariaDB to learn more about optimizing your Galera Cluster.

Database Backup for Galera Nodes on Azure

There's no native backup support for your MySQL Galera data in Azure, but you can take a snapshot. Microsoft Azure offers Azure VM Backup, which takes snapshots that can be scheduled and encrypted.

Alternatively, if you want to back up the data files from your Galera Cluster, you can also use external services like ClusterControl, use Percona XtraBackup for your binary backups, or use mysqldump or mydumper for your logical backups. These tools provide backup copies for your mission-critical data, and you can read this if you want to learn more.
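For instance, a minimal Percona XtraBackup sketch for a full binary backup, assuming a hypothetical backup user and target directory, would be:

# Take a full binary backup of the local Galera node (user and paths are placeholders)
xtrabackup --backup \
  --user=backupuser --password=backuppassword \
  --target-dir=/backups/full-$(date +%F)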

Galera Cluster Monitoring on Azure

Microsoft Azure has its own monitoring service named Azure Monitor. Azure Monitor maximizes the availability and performance of your applications by delivering a comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premise environments. It helps you understand how your applications are performing and proactively identifies issues affecting them (and the resources they depend on). You can set up or create health alerts and get notified of advisories and alerts detected in the services you deployed.

If you want monitoring specific to your database, then you will need to utilize external monitoring tools which have advanced, highly-granular database metrics. There are several choices, such as PMM by Percona, DataDog, Idera, VividCortex, or our very own ClusterControl (monitoring is FREE with ClusterControl Community).

Galera Cluster Database Security on Azure

As discussed in our previous blogs for AWS and GCP, you can take the same approach to securing your database in the public cloud. Once you create a virtual machine, you can specify which ports can be opened, or create and set up your Network Security Group in Azure. You can set up the ports that need to be open (particularly ports 3306, 4444, 4567, 4568), or create a Virtual Network in Azure and specify the private subnets if the nodes are to remain private. In addition, if you set up your VMs in Azure without a public IP, they can still make outbound connections because Azure uses SNAT and PAT. If you're familiar with AWS and GCP, you'll like this explanation to make it easier to comprehend.

Another feature available is Role-Based Access Control in Microsoft Azure. This gives you control over which people have access to the specific resources they need.

In addition to this, you can secure your data in transit by using a TLS/SSL connection or by encrypting your data at rest. If you're using ClusterControl, securing data in transit is simple and easy. You can check out our blog SSL Key Management and Encryption of MySQL Data in Transit if you want to try it out. For data at rest, you can follow the discussion from the Encryption section earlier in this blog.

Galera Cluster Troubleshooting on Azure

Microsoft Azure offers a wide array of log types to aid troubleshooting and auditing. Activity logs, Azure diagnostics logs, Azure AD reporting, virtual machine and cloud service logs, Network Security Group (NSG) flow logs, and Application Insights are all very useful when troubleshooting. It might not always be necessary to go into all of these, but they add more insights and clues when checking the logs.

If you're using ClusterControl, going to Logs -> System Logs, you'll be able to browse the captured error logs taken from the MySQL Galera node itself. Apart from this, ClusterControl provides real-time monitoring that would amplify your alarm and notification system in case of an emergency or if your MySQL Galera node(s) go down.

Conclusion

As we finish this three-part blog series, we have shown you the offerings and advantages of each of the tech giants serving the public cloud industry. There are advantages and disadvantages when selecting one over the other, but what matters most is your reason for moving to a public cloud, its benefits for your organization, and how it serves the requirements of your application.

The choice of provider for your Galera Cluster may involve financial considerations like "what's most cost-efficient" and what better suits your budgetary needs. It could also be due to privacy laws and regulation compliance, or even the technology stack you want to use. What's important is how your application and database will perform once in the cloud while handling large amounts of traffic. It has to be highly available, must be resilient, must have the right levels of scalability and redundancy, and must take backups to ensure data recovery.

Database Switchover and Failover for Drupal Websites Using MySQL or PostgreSQL

$
0
0

Drupal is a Content Management System (CMS) designed to create everything from tiny to large corporate websites. Over 1,000,000 websites run on Drupal and it is used to make many of the websites and applications you use every day (including this one). Drupal has a great set of standard features such as easy content authoring, reliable performance, and excellent security. What sets Drupal apart is its flexibility as modularity is one of its core principles. 

Drupal is also a great choice for creating integrated digital frameworks. You can extend it with the thousands of add-ons available. Modules expand Drupal's functionality, themes let you customize your content's presentation, and distributions (Drupal bundles) are packages you can use as starter kits. You can mix and match all of these functionalities to enhance Drupal's core abilities or to integrate Drupal with external services. It is content management software that is powerful and scalable.

Drupal uses databases to store its web content. When your Drupal-based website or application is experiencing a large amount of traffic it can have an impact on your database server. When you are in this situation you'll require load balancing, high availability, and a redundant architecture to keep your database online. 

When I started researching this blog, I realized there are many answers to this issue online, but the solutions recommended were very dated. This could be a result of the increase in market share by WordPress resulting in a smaller open source community. What I did find were some examples of implementing high availability using Master/Master (High Availability) or Master/Master/Slave (High Availability/High Performance) setups.

Drupal offers support for a wide array of databases, but it was initially designed using MySQL variants. Though using MySQL is fully supported, there are better approaches you can implement. Implementing these other approaches, however, if not done properly, can cause your website to experience large amounts of downtime, cause your application to suffer performance issues, and may result in write issues on your slaves. Performing maintenance would also be difficult, as you need failover to apply server upgrades or patches (hardware or software) without downtime. This is especially true if you have a large amount of data, where any impact could be major for your business.

These are situations you don't want to happen which is why in this blog we’ll discuss how you can implement database failover for your MySQL or PostgreSQL databases.

Why Does Your Drupal Website Need Database Failover?

From Wikipedia: "failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network. Failover and switchover are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention."

In database operations, switchover is also a term used for manual failover, meaning that it requires a person to operate the failover. Failover comes in handy for any admin as it isolates unwanted problems such as accidental deletes/dropping of tables, long hours of downtime causing business impact, database corruption, or system-level corruption. 

Database failover involves more than a single database node, either physical or virtual. Ideally, since failover requires you to switch over to a different node, you might as well switch to a different database server if a host is running multiple database instances on a single machine. That can still be either a switchover or a failover, but typically it's more about redundancy and high availability in case a catastrophe occurs on the current host.

MySQL Failover for Drupal

Performing a failover for your Drupal-based application requires that the data handled by the database does not diverge or become separated. There are several solutions available, and we have already discussed some of them in previous Severalnines blogs. You will likely want to read our Introduction to Failover for MySQL Replication - the 101 Blog.

The Master-Slave Switchover

The most common approach for MySQL failover is using the master-slave switchover, i.e. manual failover. There are two approaches you can take here:

  • You can implement your database with a typical asynchronous master-slave replication setup.
  • Or you can implement asynchronous master-slave replication using GTID-based replication.

Switching to another master can be quick and easy. This can be done with the following MySQL syntax:

mysql> SET GLOBAL read_only = 1; /* enable read-only */
mysql> CHANGE MASTER TO MASTER_HOST = '<hostname-or-ip>', MASTER_USER = '<user>', MASTER_PASSWORD = '<password>', MASTER_LOG_FILE = '<master-log-file>', MASTER_LOG_POS=<master_log_position>; /* master information to connect */
mysql> START SLAVE; /* start replication */
mysql> SHOW SLAVE STATUS\G /* check replication status */

or, with GTID, you can simply do:

...
mysql> CHANGE MASTER TO MASTER_HOST = '<hostname-or-ip>', MASTER_USER = '<user>', MASTER_PASSWORD = '<password>', MASTER_AUTO_POSITION = 1; /* master information to connect */
...


Using the non-GTID approach requires you to first determine the master's log file and log position. You can determine these by checking the master's status on the master node before switching over:

mysql> SHOW MASTER STATUS;

You may also consider hardening your server by adding sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1 since, in the event your master crashes, you'll have a higher chance that transactions from the master will be in sync with your slave(s). In that case, the promoted master has a higher chance of being a consistent data source node.
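A minimal my.cnf sketch of those durability settings would be:

[mysqld]
# Flush the binary log and the InnoDB redo log to disk at every transaction commit
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1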

This, however, may not be the best approach for your Drupal database, as it could impose long downtimes if not performed correctly, such as when the master is taken down abruptly. If your master database node experiences a bug resulting in a crash, you'll need your application to point to another database waiting on standby as your new master, or you will need to promote a slave to be the master. You will need to specify exactly which node should take over and then determine the lag and consistency of that node. Achieving this is not as easy as just doing SET GLOBAL read_only=1; CHANGE MASTER TO… (etc.); certain situations require deeper analysis, looking at the viable transactions that need to be present on the standby server or promoted master, to get it done.

Drupal Failover Using MHA

One of the most common tools for automatic and manual failover is MHA. It has been around for a long while now and is still used by many organizations. You can checkout these previous blogs we have on the subject, Top Common Issues with MHA and How to Fix Them or MySQL High Availability Tools - Comparing MHA, MRM and ClusterControl.

Drupal Failover Using Orchestrator

Orchestrator has been widely adopted and is used by large organizations such as GitHub and Booking.com. It not only allows you to manage a failover, but also topology management, host discovery, refactoring, and recovery. There's a nice external blog series which I found very useful for learning about Orchestrator's failover mechanism; see part one and part two.

Drupal Failover Using MaxScale

MaxScale is not just a load balancer designed for MariaDB Server; it also extends high availability, scalability, and security for MariaDB while, at the same time, simplifying application development by decoupling it from the underlying database infrastructure. If you are using MariaDB, then MaxScale could be a relevant technology for you. Check out our previous blogs on how you can use the MaxScale failover mechanism.

Drupal Failover Using ClusterControl

Severalnines' ClusterControl offers a wide array of database management and monitoring solutions. Part of what we offer is automatic failover, manual failover, and cluster/node recovery. This is very helpful, as it acts as your virtual database administrator, notifying you in real-time in case your cluster is in “panic mode,” all while the recovery is being managed by the system. You can check out the blog How to Automate Database Failover with ClusterControl to learn more about ClusterControl failover.

Other MySQL Solutions

Some of the older approaches are still applicable when you want to fail over. There's MMM or MRM, or you can check out Group Replication or Galera (note: Galera uses synchronous, not asynchronous, replication). Failover in a Galera Cluster does not work the same way as it does with asynchronous replication. With Galera you can write to any node or, if you implement a master-slave approach, you can direct your application to another node that will be the active writer for the cluster.

Drupal PostgreSQL Failover

Since Drupal supports PostgreSQL, we will also check out the tools to implement a failover or switchover process for PostgreSQL. PostgreSQL uses built-in streaming replication, however you can also set it up to use logical replication (added as a core element of PostgreSQL in version 10).
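As a quick sketch of how logical replication is set up (the object names and connection details here are hypothetical, and the subscriber must already have the table definitions in place), you create a publication on the source and subscribe to it from the target:

-- on the publisher
CREATE PUBLICATION drupal_pub FOR ALL TABLES;

-- on the subscriber
CREATE SUBSCRIPTION drupal_sub
    CONNECTION 'host=primary.local dbname=drupal user=replicator password=secret'
    PUBLICATION drupal_pub;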

Drupal Failover Using pg_ctlcluster

If your environment is Ubuntu, using pg_ctlcluster is a simple and easy way to achieve failover. For example, you can just run the following command:

$ pg_ctlcluster 9.6 pg_7653 promote

On RHEL/CentOS, you can use the pg_ctl command instead:

$ sudo -iu postgres /usr/lib/postgresql/9.6/bin/pg_ctl promote -D /data/pgsql/slave/data

server promoting

You can also trigger failover of a log-shipping standby server by creating a trigger file at the filename and path specified by the trigger_file setting in recovery.conf.
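A minimal recovery.conf sketch for such a standby might look as follows (the trigger file path and connection details are assumptions; in PostgreSQL 12 and later these settings moved into postgresql.conf):

standby_mode = 'on'
primary_conninfo = 'host=primary.local port=5432 user=replicator'
trigger_file = '/tmp/pg_failover_trigger'

Creating the file then promotes the standby:

$ touch /tmp/pg_failover_trigger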

You have to be careful with standby or slave promotion here, as you have to ensure that only one master is accepting read-write requests. This means that, while doing the switchover, you should make sure the previous master has been demoted or stopped.

Taking care of switchover or manual failover from primary to standby server can be fast, but it requires some time to re-prepare the failover cluster. Regularly switching from primary to standby is a useful practice as it allows for regular downtime on each system for maintenance. This also serves as a test of the failover mechanism, to ensure that it will really work when you need it. Written administration procedures are always advised. 

Drupal PostgreSQL Automatic Failover

Instead of a manual approach, you might require automatic failover. This is especially needed when a server goes down due to hardware failure or virtual machine corruption. You may also require an application to automatically perform the failover to lessen the downtime of your Drupal application. We'll now go over some of these tools which can be utilized for automatic failover.

Drupal Failover Using Patroni

Patroni is a template for creating your own customized, high-availability solution using Python and, for maximum accessibility, a distributed configuration store like ZooKeeper, etcd, Consul, or Kubernetes. Database engineers, DBAs, DevOps engineers, and SREs who are looking to quickly deploy HA PostgreSQL in the datacenter (or anywhere else) will hopefully find it useful.

Drupal Failover Using Pgpool

Pgpool-II is proxy software that sits between PostgreSQL servers and a PostgreSQL database client. Aside from automatic failover, it has multiple features that include connection pooling, load balancing, replication, and limiting excess connections. You can read more about this tool in our three-part blog series: part one, part two, part three.

Drupal Failover Using pglookout

pglookout is a PostgreSQL replication monitoring and failover daemon. pglookout monitors the database nodes and their replication status, and acts on that status, for example by calling a predefined failover command to promote a new master in case the previous one goes missing.

pglookout supports two different node types: ones that are installed on the DB nodes themselves, and observer nodes that can be installed anywhere. The purpose of having pglookout on the PostgreSQL DB nodes is to monitor the replication status of the cluster and act accordingly; the observers have a more limited remit and just watch the cluster status to provide another viewpoint on the cluster state.

Drupal Failover Using repmgr

repmgr is an open-source tool suite for managing replication and failover in a cluster of PostgreSQL servers. It enhances PostgreSQL's built-in hot-standby capabilities with tools to set up standby servers, monitor replication, and perform administrative tasks such as failover or manual switchover operations.

repmgr has provided advanced support for PostgreSQL's built-in replication mechanisms since they were introduced in 9.0. The current repmgr series, repmgr 4, supports the latest developments in replication functionality introduced from PostgreSQL 9.3 such as cascading replication, timeline switching and base backups via the replication protocol.

Drupal Failover Using ClusterControl

ClusterControl supports automatic failover for PostgreSQL. If you have an incident, your slave can be promoted to master status automatically. With ClusterControl you can also deploy standalone, replicated, or clustered PostgreSQL databases. You can also easily add or remove a node with a single action.

Other PostgreSQL Drupal Failover Solutions

There are certainly automatic failover solutions that I might have missed on this blog. If I did, please add your comments below so we can know your thoughts and experiences with your implementation and setup for failover especially for Drupal websites or applications.

Additional Solutions For Drupal Failover

While the tools mentioned earlier definitely handle failover itself, adding tools that make the failover easier and safer, with total isolation between your application and your database layer, can be worthwhile.

Drupal Failover Using ProxySQL

With ProxySQL, you can just point your Drupal websites or applications to the ProxySQL server host and it will designate which node receives the writes and which nodes receive the reads. The magic happens transparently within the TCP layer and no changes are needed to your application/website configuration. In addition to that, ProxySQL also acts as a load balancer for the write and read requests of your database traffic. This is only applicable if you are using MySQL database variants.
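As a minimal sketch of how such a read/write split can be defined in the ProxySQL admin interface (the hostgroup numbers, 10 for writes and 20 for reads, are assumptions for this example):

INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT .* FOR UPDATE', 10, 1),
       (2, 1, '^SELECT', 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

Any query not matching a rule is sent to the default hostgroup of the connecting user, which is typically the writer hostgroup.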

Drupal Failover Using HAProxy with Keepalived

Using HAProxy and Keepalived adds more high availability and redundancy to your Drupal database. If you want to fail over, it can be done without your application knowing what's happening within your database layer. Just point your application to the VRRP IP that you set up in Keepalived and everything will be handled with total isolation from your application. Automatic failover will be handled transparently, without your application being aware of it, so no changes have to occur when, for example, a disaster occurs and a recovery or failover is applied. The good thing about this setup is that it is applicable to both MySQL and PostgreSQL databases. I suggest you check out our blog PostgreSQL Load Balancing Using HAProxy & Keepalived to learn more about how to do this.
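A minimal keepalived.conf sketch for one of the load balancer hosts could look like this (the interface name, router id, and VRRP IP are assumptions; the second host would use state BACKUP and a lower priority):

vrrp_instance VI_DB {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        10.0.0.100
    }
}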

All of the options above are supported by ClusterControl. You can deploy or import the database and then deploy ProxySQL, MaxScale, or HAProxy & Keepalived. Everything will be managed, monitored, and set up automatically, without any further configuration needed on your end. It all happens in the background and automatically creates a production-ready setup.

Conclusion

Having an always-on Drupal website or application, especially if you are expecting a large amount of traffic, can be complicated to create. If you have the right tools, the right setup, and the right technology stack, however, it is possible to achieve high availability and redundancy.

And if you don’t? Well then ClusterControl will set it up and maintain it for you. Alternatively, you can create a setup using the technologies mentioned in this blog, most of which are open source, free tools that would cater to your needs.

Achieving MySQL Failover & Failback on Google Cloud Platform (GCP)


There are numerous cloud providers these days. They can be small or large, local or with data centers spread across the whole world. Many of these cloud providers offer some kind of a managed relational database solution. The databases supported tend to be MySQL or PostgreSQL or some other flavor of relational database. 

When designing any kind of database infrastructure it is important to understand your business needs and decide what kind of availability you would need to achieve. 

In this blog post, we will look into high availability options for MySQL-based solutions from one of the largest cloud providers - Google Cloud Platform.

Deploying a Highly Available Environment Using GCP SQL Instance

For this blog what we want is a very simple environment - one database, with maybe one or two replicas. We want to be able to fail over easily and restore operations as soon as possible if the master fails. We will use MySQL 5.7 as the version of choice and start with the instance deployment wizard:

We then have to create the root password, set the instance name, and determine where it should be located:

Next, we will look into the configuration options:

We can make changes in terms of the instance size (we will go with db-n1-standard-4), storage,  and maintenance schedule. What is most important for us in this setup are the high availability options:

Here we can choose to have a failover replica created. This replica will be promoted to a master should the original master fail.

After we deploy the setup, let’s add a replication slave:

Once the process of adding the replica is done, we are ready for some tests. We are going to run a test workload using Sysbench on our master, failover replica, and read replica to see how this will work out. We will run three instances of Sysbench, using the endpoints for all three types of nodes.

Then we will trigger the manual failover via the UI:

Testing MySQL Failover on Google Cloud Platform

I have got to this point without any detailed knowledge of how the SQL nodes in GCP work. I did have some expectations, however, based on previous MySQL experience and what I’ve seen in the other cloud providers. For starters, the failover to the failover node should be very quick. What we would like is to keep the replication slaves available, without the need for a rebuild. We would also like to see how fast we can execute the failover a second time (as it is not uncommon that the issue propagates from one database to another).

What we determined during our tests...

  1. While failing over, the master became available again in 75 - 80 seconds.
  2. Failover replica was not available for 5-6 minutes.
  3. Read replica was available during the failover process, but it became unavailable for 55 - 60 seconds after the failover replica became available.

What we’re not sure about...

What is happening when the failover replica is not available? Based on the time, it looks like the failover replica is being rebuilt. This makes sense, but then the recovery time would be strongly related to the size of the instance (especially I/O performance) and the size of the data file.

What is happening with the read replica after the failover replica has been rebuilt? Originally, the read replica was connected to the master. When the master failed, we would expect the read replica to provide an outdated view of the dataset. Once the new master shows up, it should reconnect via replication to the instance which used to be the failover replica and has been promoted to master. There is no need for a minute of downtime while CHANGE MASTER is being executed.

More importantly, during the failover process there is no way to execute another failover (which sort of makes sense):

It is also not possible to promote a read replica (which does not necessarily make sense - we would expect to be able to promote read replicas at any time).

It is important to note that relying on read replicas to provide high availability (without creating a failover replica) is not a viable solution. You can promote a read replica to become a master, however a new cluster would be created, detached from the rest of the nodes.

There is no way to slave your other replicas off the new cluster. The only way to do this would be to create new replicas, which is a time-consuming process. This makes the approach virtually unusable, leaving the failover replica as the only real option for high availability for SQL nodes in Google Cloud Platform.

Conclusion

While it is possible to create a highly-available environment for SQL nodes in GCP, the master will not be available for roughly a minute and a half. The whole process (including rebuilding the failover replica and some actions on the read replicas) took several minutes. During that time we weren’t able to trigger an additional failover, nor were we able to promote a read replica.

Do we have any GCP users out there? How are you achieving high availability?


Comparing DBaaS Failover Solutions to Manual Recovery Setups


We have recently written several blogs covering how different cloud providers handle database failover. We compared failover performance in Amazon Aurora, Amazon RDS and ClusterControl, tested the failover behavior in Amazon RDS, and also on Google Cloud Platform. While those services provide great options when it comes to failover, they may not be right for every application.

In this blog post we will spend a bit of time analysing the pros and cons of using the DBaaS solutions compared with designing an environment manually or by using a database management platform, like ClusterControl.

Implementing High Availability Databases with Managed Solutions

The primary reason to use existing solutions is ease of use. You can deploy a highly available solution with automated failover in just a couple of clicks. There’s no need for combining different tools together, managing the databases by hand, deploying tools, writing scripts, designing the monitoring, or any other database management operations. Everything is already in place. This can seriously reduce the learning curve and requires less experience to set up a highly-available environment for the databases; allowing basically everyone to deploy such setups.

In most of the cases with these solutions, the failover process is executed within a reasonable time. It may be blazing fast as with Amazon Aurora or somewhat slower as with Google Cloud Platform SQL nodes. For the majority of the cases, these types of results are acceptable. 

The bottom line: if you can accept 30 - 60 seconds of downtime, you should be OK using any of the DBaaS platforms.

The Downside of Using a Managed Solution for HA

While DBaaS solutions are simple to use, they also come with some serious drawbacks. For starters, there is always a vendor lock-in component to consider. Once you deploy a cluster in Amazon Web Services it is quite tricky to migrate out of that provider. There are no easy methods to download the full dataset through a physical backup. With most providers, only manually executed logical backups are available. Sure, there are always options to achieve this, but it is typically a complex, time-consuming process, which still may require some downtime after all.

Using a provider like Amazon RDS also comes with limitations. Some actions cannot be easily performed which would be very simple to accomplish in environments deployed in a fully user-controlled manner (e.g. AWS EC2). Some of these limitations have already been covered in other blogs, but to summarize: no DBaaS service gives you the same level of flexibility as regular MySQL GTID-based replication. You can promote any slave, you can re-slave every node off any other; virtually every action is possible. With tools like RDS you face design-induced limitations you cannot bypass.

There is also the problem of understanding performance details. When you design your own highly available setup, you become knowledgeable about potential performance issues that may show up. On the other hand, RDS and similar environments are pretty much “black boxes.” Yes, we have learned that Amazon RDS uses DRBD to create a shadow copy of the master, and we know that Aurora uses shared, replicated storage to implement very fast failovers. But that's just general knowledge; we cannot tell what the performance implications of those solutions are beyond what we might casually notice. What are the common issues associated with them? How stable are those solutions? Only the developers behind the solution know for sure.

What is the Alternative to DBaaS Solutions?

You may wonder, is there an alternative to DBaaS? After all, it is so convenient to run the managed service where you can access most of the typical actions via UI. You can create and restore backups, failover is handled automatically for you. The environment is easy-to-use which can be compelling for companies who do not have dedicated and experienced staff for dealing with databases.

ClusterControl provides a great alternative to cloud-based DBaaS services. It provides you with a graphical user interface, which can be used to deploy, manage, and monitor open source databases. 

In a couple of clicks you can easily deploy a highly-available database cluster, with automated failover (faster than most of the DBaaS offerings), backup management, advanced monitoring, and other features like integration with external tools (e.g. Slack or PagerDuty) or upgrade management. All this while completely avoiding vendor lock-in.

ClusterControl doesn’t care where your databases are located as long as it can connect to them using SSH. You can have setups in cloud, on-prem, or in a mixed environment of multiple cloud providers. As long as connectivity is there, ClusterControl will be able to manage the environment. Utilizing the solutions you want (and not the ones that you are not familiar nor aware of) allows you to take full control over the environment at any point in time. 

Whatever setup you deployed with ClusterControl, you can easily manage it in a more traditional, manual or scripted way. ClusterControl even provides you with command line interface, which will let you incorporate tasks executed by ClusterControl into your shell scripts. You have all the control you want - nothing is a black box, every piece of the environment would be built using open source solutions combined together and deployed by ClusterControl.
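For example, with the s9s command line client you can list clusters and trigger jobs from a shell script (the cluster id, node IP, and backup method below are assumptions for this sketch):

$ s9s cluster --list --long
$ s9s backup --create --cluster-id=1 --nodes=10.0.0.11 --backup-method=xtrabackupfull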

Let’s take a look at how easily you can deploy a MySQL Replication cluster using ClusterControl. Let’s assume you have the environment prepared with ClusterControl installed on one instance and all other nodes accessible via SSH from ClusterControl host.

ClusterControl Deployment Wizard

We will start with picking the “Deploy” wizard.

ClusterControl Deployment Wizard

In the first step we have to define how ClusterControl should connect to the nodes on which the databases are to be deployed. Both root access and sudo (with or without a password) are supported.

ClusterControl Deployment Wizard

Then, we want to pick a vendor and version, and set the password for the administrative user in our MySQL database.

ClusterControl Deployment Wizard

Finally, we want to define the topology for our new cluster. As you can see, this is already quite a complex setup, unlike anything you can deploy using AWS RDS or a GCP SQL node.

ClusterControl Jobs

All we have to do now is wait for the process to complete. ClusterControl will do its best to understand the environment it is deploying to and install the required set of packages, including the database itself.

ClusterControl Cluster List

Once the cluster is up-and-running, you can proceed with deploying the proxy layer (which will provide your application with a single point of entry into the database layer). This is more or less what happens behind the scenes with DBaaS, where you also have endpoints to connect to the database cluster. It is quite common to use a single endpoint for writes and multiple endpoints for reaching particular replicas.

Database Cluster Topology

Here we will use ProxySQL, which will do the dirty work for us - it will understand the topology, send writes only to the master, and load balance read-only queries across all the replicas that we have.

To deploy ProxySQL we will go to Manage -> Load Balancers.

Add Database Load Balancer ClusterControl

We have to fill in all the required fields: hosts to deploy on and credentials for the administrative and monitoring users. We may import existing users from MySQL into ProxySQL or create a new one. All the details about ProxySQL can easily be found in multiple posts in our blog section.

We want at least two ProxySQL nodes deployed to ensure high availability. Then, once they are deployed, we will deploy Keepalived on top of ProxySQL. This will ensure that a Virtual IP is configured and pointing to one of the ProxySQL instances, as long as at least one healthy node is available.

Add ProxySQL ClusterControl

The only potential problem here arises in cloud environments where routing works in a way that prevents you from simply bringing up a network interface with the virtual IP. In such a case you will have to modify the Keepalived configuration and introduce a ‘notify_master’ script that makes the necessary IP changes - in the case of EC2 it would have to detach the Elastic IP from one host and attach it to the other host.
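A hypothetical notify_master script for EC2 could use the AWS CLI to re-point an Elastic IP at the newly promoted host (the instance and allocation IDs are placeholders):

#!/bin/bash
# Called by Keepalived when this node transitions to MASTER
aws ec2 associate-address \
    --instance-id i-0123456789abcdef0 \
    --allocation-id eipalloc-0123456789abcdef0 \
    --allow-reassociation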

There are plenty of instructions on how to do that using widely-tested open source software in setups deployed by ClusterControl. You can easily find additional information, tips, and how-to’s which are relevant to your particular environment.

Database Cluster Topology with Load Balancer

Conclusion

We hope you found this blog post insightful. If you would like to test ClusterControl, it comes with a 30 day enterprise trial where you have available all the features. You can download it for free and test if it fits in your environment.

How to Troubleshoot MySQL Database Issues


As soon as you start running a database server and your usage grows, you are exposed to many types of technical problems, performance degradation, and database malfunctions.  Each of these could lead to much bigger problems, such as catastrophic failure or data loss. It’s like a chain reaction, where one thing can lead to another, causing more and more issues. Proactive countermeasures must be performed in order for you to have a stable environment as long as possible.

In this blog post, we are going to look at a bunch of cool features offered by ClusterControl that can greatly help us troubleshoot and fix our MySQL database issues when they happen.

Database Alarms and Notifications

For all undesired events, ClusterControl logs everything under Alarms, accessible via the Activity section (top menu) of the ClusterControl page. This is commonly the first step in troubleshooting when something goes wrong. From this page, we can get an idea of what is actually going on with our database cluster:

ClusterControl Database Alarms

The above screenshot shows an example of a server unreachable event, with severity CRITICAL, detected by two components, Network and Node. If you have configured the email notifications setting, you should get a copy of these alarms in your mailbox. 

When clicking on the “Full Alarm Details,” you can get the important details of the alarm like hostname, timestamp, cluster name and so on. It also provides the next recommended step to take. You can also send out this alarm as an email to other recipients configured under the Email Notification Settings. 

You may also opt to silence an alarm by clicking the “Ignore Alarm” button, and it will not appear in the list again. Ignoring an alarm might be useful if you have a low severity alarm and know how to handle or work around it - for example, if ClusterControl detects a duplicate index in your database which in some cases might be needed by your legacy applications.

By looking at this page, we can obtain an immediate understanding of what is going on with our database cluster and what to do next to solve the problem. In this case, one of the database nodes went down and became unreachable via SSH from the ClusterControl host. Even a beginner SysAdmin would now know what to do next if this alarm appears.

Centralized Database Log Files

This is where we can drill down into what went wrong with our database server. Under ClusterControl -> Logs -> System Logs, you can see all log files related to the database cluster. For MySQL-based database clusters, ClusterControl pulls the ProxySQL log, the MySQL error log, and backup logs:

ClusterControl System Logs

Click on "Refresh Log" to retrieve the latest log from all hosts that are accessible at that particular time. If a node is unreachable, ClusterControl will still view the outdated log in since this information is stored inside the CMON database. By default ClusterControl keeps retrieving the system logs every 10 minutes, configurable under Settings -> Log Interval. 

ClusterControl will trigger the job to pull the latest log from each server, as shown in the following "Collect Logs" job:

ClusterControl Database Job Details

A centralized view of log files allows us to understand faster what went wrong. For a database cluster, which commonly involves multiple nodes and tiers, this feature greatly improves log reading, as a SysAdmin can compare these logs side-by-side and pinpoint critical events, reducing the total troubleshooting time.

Web SSH Console

ClusterControl provides a web-based SSH console so you can access the DB server directly via the ClusterControl UI (as the SSH user is configured to connect to the database hosts). From here, we can gather much more information which allows us to fix the problem even faster. Everyone knows when a database issue hits the production system, every second of downtime counts.

To access the SSH console via web, simply pick the nodes under Nodes -> Node Actions -> SSH Console, or simply click on the gear icon for a shortcut:

ClusterControl Web SSH Console Access

Due to the security concerns this feature might impose, especially in a multi-user or multi-tenant environment, you can disable it by editing /var/www/html/clustercontrol/bootstrap.php on the ClusterControl server and setting the following constant to false:

define('SSH_ENABLED', false);

Refresh the ClusterControl UI page to load the new changes.

Database Performance Issues

Apart from monitoring and trending features, ClusterControl proactively sends you various alarms and advisors related to database performance, for example:

  • Excessive usage - Resource that passes certain thresholds like CPU, memory, swap usage and disk space.
  • Cluster degradation - Cluster and network partitioning.
  • System time drift - Time difference among all nodes in the cluster (including ClusterControl node).
  • Various other MySQL related advisors:
    • Replication - replication lag, binlog expiration, location and growth
    • Galera - SST method, scan GRA logfile, cluster address checker
    • Schema check - Non-transactional table existence on Galera Cluster.
    • Connections - Threads connected ratio
    • InnoDB - Dirty pages ratio, InnoDB log file growth
    • Slow queries - By default ClusterControl will raise an alarm if it finds a query running for more than 30 seconds. This is of course configurable under Settings -> Runtime Configuration -> Long Query.
    • Deadlocks - InnoDB transactions deadlock and Galera deadlock.
    • Indexes - Duplicate keys, table without primary keys.

Check out the Advisors page under Performance -> Advisors to get the details of things that can be improved as suggested by ClusterControl. For every advisor, it provides justifications and advice as shown in the following example for "Checking Disk Space Usage" advisor:

ClusterControl Disk Space Usage Check

When a performance issue occurs you will get a "Warning" (yellow) or "Critical" (red) status on these advisors. Further tuning is commonly required to overcome the problem. Advisors raise alarms, which means users will get a copy of these alarms in their mailbox if Email Notifications are configured accordingly. For every alarm raised by ClusterControl or its advisors, users will also get an email once the alarm has been cleared. These are pre-configured within ClusterControl and require no initial configuration. Further customization is always possible under Manage -> Developer Studio. You can check out this blog post on how to write your own advisor.

ClusterControl also provides a dedicated page regarding database performance under ClusterControl -> Performance. It provides all sorts of database insights following best practices, like a centralized view of DB Status, Variables, InnoDB status, Schema Analyzer, and Transaction Logs. These are pretty self-explanatory and straightforward to understand.

For query performance, you can inspect Top Queries and Query Outliers, where ClusterControl highlights queries whose performance differs significantly from their average. We have covered this topic in detail in the blog post MySQL Query Performance Tuning.

Database Error Reports

ClusterControl comes with an error report generator tool, to collect debugging information about your database cluster to help understand the current situation and status. To generate an error report, simply go to ClusterControl -> Logs -> Error Reports -> Create Error Report:

ClusterControl Database Error Reports

The generated error report can be downloaded from this page once ready. The report comes as a tarball (tar.gz) and you may attach it to a support request. Since a support ticket has a 10MB file size limit, if the tarball is bigger than that you can upload it to a cloud drive and share only the download link, with proper permissions, with us. You may remove it later once we have retrieved the file. You can also generate the error report via the command line, as explained in the Error Report documentation page.

In the event of an outage, we highly recommend that you generate multiple error reports during and right after the outage. Those reports will be very useful to try to understand what went wrong, the consequences of the outage, and to verify that the cluster is in-fact back to operational status after a disastrous event.

Conclusion

ClusterControl proactive monitoring, together with a set of troubleshooting features, provide an efficient platform for  users to troubleshoot any kind of MySQL database issues. Long gone is the legacy way of troubleshooting where one has to open multiple SSH sessions to access multiple hosts and execute multiple commands repeatedly in order to pinpoint the root cause.

If the above mentioned features do not help you solve the problem or troubleshoot the database issue, you can always contact the Severalnines Support Team to back you up. Our 24/7/365 dedicated technical experts are available to attend to your request at any time. Our average first reply time is usually less than 30 minutes.

What’s New in MySQL Galera Cluster 4.0


MySQL Galera Cluster 4.0 is the new kid on the database block with very interesting new features. Currently it is available only as a part of MariaDB 10.4 but in the future it will work as well with MySQL 5.6, 5.7 and 8.0. In this blog post we would like to go over some of the new features that came along with Galera Cluster 4.0.

Galera Cluster Streaming Replication

The most important new feature in this release is streaming replication. So far the certification process for the Galera Cluster worked in a way that whole transactions had to be certified after they completed. 

This process was not ideal in several scenarios...

  1. Hotspots in tables - rows which are very frequently updated on multiple nodes. Hundreds of fast transactions running on multiple nodes, modifying the same set of rows, result in frequent deadlocks and transaction rollbacks.
  2. Long running transactions - if a transaction takes significant time to complete, this seriously increases chances that some other transaction, in the meantime, on another node, may modify some of the rows that were also updated by the long transaction. This resulted in a deadlock during certification and one of the transactions having to be rolled back.
  3. Large transactions - if a transaction modifies a significant number of rows, it is likely that another transaction, at the same time, on a different node, will modify one of the rows already modified by the large transaction. This results in a deadlock during certification and one of the transactions has to be rolled back. In addition to this, large transactions will take additional time to be processed, sent to all nodes in the cluster and certified. This is not an ideal situation as it adds delay to commits and slows down the whole cluster.

Luckily, streaming replication can solve these problems. The main difference is that certification happens in chunks, so there is no need to wait for the whole transaction to complete. As a result, even if a transaction is large or long, the majority (or all, depending on the settings we will discuss in a moment) of its rows are locked on all of the nodes, preventing other queries from modifying them.

MySQL Galera Cluster Streaming Replication Options

There are two configuration options for streaming replication: 

wsrep_trx_fragment_size 

This tells how big a fragment should be (by default it is set to 0, which means that streaming replication is disabled).

wsrep_trx_fragment_unit 

This defines what a fragment unit really is. By default it is bytes, but it can also be set to ‘statements’ or ‘rows’.

Those variables can (and should) be set on a session level, making it possible for the user to decide which particular query should be replicated using streaming replication. Setting the unit to ‘statements’ and the size to 1 allows, for example, using streaming replication just for a single query which updates a hotspot.
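For example, to stream a single hotspot update statement-by-statement (the table and column names are hypothetical) and then disable streaming replication again for the session:

mysql> SET SESSION wsrep_trx_fragment_unit = 'statements';
mysql> SET SESSION wsrep_trx_fragment_size = 1;
mysql> UPDATE counters SET value = value + 1 WHERE id = 1;
mysql> SET SESSION wsrep_trx_fragment_size = 0; /* disable streaming replication again */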

You can configure Galera 4.0 to certify every row that you have modified and grab the locks on all of the nodes while doing so. This makes streaming replication great at solving problems with frequent deadlocks which, prior to Galera 4.0, were possible to solve only by redirecting all writes to a single node.

WSREP Tables

Galera 4.0 introduces several tables, which will help to monitor the state of the cluster:

  • wsrep_cluster
  • wsrep_cluster_members
  • wsrep_streaming_log

All of them are located in the ‘mysql’ schema. wsrep_cluster will provide insight into the state of the cluster. wsrep_cluster_members will give you information about the nodes that are part of the cluster. wsrep_streaming_log helps to track the state of the streaming replication.
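For example, you can inspect them like any other table:

mysql> SELECT * FROM mysql.wsrep_cluster\G
mysql> SELECT * FROM mysql.wsrep_cluster_members\G
mysql> SELECT * FROM mysql.wsrep_streaming_log\G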

Galera Cluster Upcoming Features

Codership, the company behind Galera, isn’t done yet. We were able to get a preview of the roadmap from CEO Seppo Jaakola, given at Percona Live earlier this year. Apparently, we are going to see features like XA transaction support and gcache encryption. This is really good news.

Support for XA transactions will be possible thanks to streaming replication. In short, XA transactions are distributed transactions which can run across multiple nodes. They utilize two-phase commit, which requires first acquiring all required locks to run the transaction on all of the nodes and then, once that is done, committing the changes. In previous versions Galera did not have the means to lock resources on remote nodes; with streaming replication this has changed.

Gcache is a file which stores writesets. Its contents are sent to joiner nodes which ask for a data transfer. If all required data is stored in the gcache, the joiner will receive just the missing transactions in a process called Incremental State Transfer (IST). If the gcache does not contain all required data, a State Snapshot Transfer (SST) will be required and the whole dataset will have to be transferred to the joining node.

Gcache contains information about recent changes, therefore it’s great to see its contents encrypted for better security. With stricter security standards being introduced through more and more regulations, it is crucial that software becomes better at achieving compliance.

Conclusion

We are definitely looking forward to seeing how Galera Cluster 4.0 will work out on databases other than MariaDB. Being able to deploy MySQL 5.7 or 8.0 with Galera Cluster will be really great. After all, Galera is one of the most widely tested synchronous replication solutions on the market.

How to Create a Clone of Your MySQL or PostgreSQL Database Cluster


If you are managing a production database, chances are high that you’ve had to clone your database to a different server other than the production server. The basic method of creating a clone is to restore a database from a recent backup onto another database server. Another method is by replicating from a source database while it is still running, in which case it is important that the original database be unaffected by any cloning procedure.

Why Would You Need to Clone a Database?

A cloned database cluster is useful in a number of scenarios:

  • Troubleshoot your cloned production cluster in the safety of your test environment while performing destructive operations on the database.
  • Patch/upgrade test of a cloned database to validate the upgrade process before applying it to the production cluster.
  • Validate backup & recovery of a production cluster using a cloned cluster.
  • Validate or test new applications on a cloned production cluster before deploying it on the live production cluster.
  • Quickly clone the database for audit or information compliance requirements for example by quarter or year end where the content of the database must not be changed.
  • A reporting database can be created at intervals in order to avoid data changes during the report generations.
  • Migrate a database to new servers, new deployment environment or a new data center.

When running your database infrastructure on the cloud, the cost of owning a host (shared or dedicated virtual machine) is significantly lower compared to the traditional way of renting space in a datacenter or owning a physical server. Furthermore, most of the cloud deployment can be automated easily via provider APIs, client software and scripting. Therefore, cloning a cluster can be a common way to duplicate your deployment environment for example, from dev to staging to production or vice versa.

We haven't seen this feature offered by anyone in the market, thus it is our privilege to showcase how it works with ClusterControl.

Cloning a MySQL Galera Cluster

One of the cool features of ClusterControl is that it allows you to quickly clone an existing MySQL Galera Cluster, so you have an exact copy of the dataset on another cluster. ClusterControl performs the cloning operation online, without locking or bringing downtime to the existing cluster. It's like a cluster scale-out operation, except both clusters are independent of each other after the syncing completes. The cloned cluster does not necessarily need to be the same size as the existing one; we could start with a one-node cluster and scale it out with more database nodes at a later stage.

In this example, we have a cluster called "Staging" that we want to clone into another cluster called "Production". The premise is that the staging cluster already stores the necessary data that is going into production soon. The production cluster consists of another three nodes, with production specs.

The following diagram summarizes final architecture of what we want to achieve:

How to Clone Your Database - ClusterControl

The first thing to do is to set up a passwordless SSH from ClusterControl server to the production servers. On ClusterControl server run the following:

$ whoami

root

$ ssh-copy-id root@prod1.local

$ ssh-copy-id root@prod2.local

$ ssh-copy-id root@prod3.local

Enter the root password of the target server if prompted.

From ClusterControl database cluster list, click on the Cluster Action button and choose Clone Cluster. The following wizard will appear:

Clone Cluster - ClusterControl

Specify the IP addresses or hostnames of the new cluster and make sure you get a green tick icon next to each specified host. The green icon means ClusterControl is able to connect to the host via passwordless SSH. Click on the "Clone Cluster" button to start the deployment.

The deployment steps are:

  1. Create a new cluster consisting of one node.
  2. Sync the new one-node cluster via SST. The donor is one of the source servers.
  3. The remaining new nodes will join the cluster after the first node of the cloned cluster has synced with the donor.

Once the deployment job completes, the new MySQL Galera Cluster will be listed under the ClusterControl cluster dashboard.

Note that the cluster cloning only clones the database servers and not the whole stack of the cluster. This means, other supporting components related to the cluster like load balancers, virtual IP address, Galera arbitrator or asynchronous slave are not going to be cloned by ClusterControl. Nevertheless, if you would like to clone as an exact copy of your existing database infrastructure, you can achieve that with ClusterControl by deploying those components separately after the database cloning operation completes.

Creating a Database Cluster from a Backup

Another similar feature offered by ClusterControl is "Create Cluster from Backup". This feature was introduced in ClusterControl 1.7.1, specifically for Galera Cluster and PostgreSQL clusters, where one can create a new cluster from an existing backup. Contrary to cluster cloning, this operation does not put additional load on the source cluster, with the tradeoff that the cloned cluster will not reflect the current state of the source cluster.

In order to create a cluster from a backup, you must have a working backup created. For Galera Cluster, all backup methods are supported, while for PostgreSQL only pgbackrest is not supported for new cluster deployment. From ClusterControl, a backup can be created or scheduled easily under ClusterControl -> Backups -> Create Backup. From the list of created backups, click on Restore Backup, choose the backup from the list, and pick "Create Cluster from Backup" as the restoration option:

Restore Backup with ClusterControl

In this example, we are going to deploy a new PostgreSQL Streaming Replication cluster for staging environment, based on the existing backup we have in the production cluster. The following diagram illustrates the final architecture:

Database Backup Restoration with ClusterControl

The first thing to do is to set up a passwordless SSH from ClusterControl server to the production servers. On ClusterControl server run the following:

$ whoami

root

$ ssh-copy-id root@prod1.local

$ ssh-copy-id root@prod2.local

$ ssh-copy-id root@prod3.local

When you choose Create Cluster From Backup, ClusterControl will open a deployment wizard dialog to assist you in setting up the new cluster:

Create Cluster from Backup - ClusterControl

A new PostgreSQL Streaming Replication instance will be created from the selected backup, which will be used as the base dataset for the new cluster. The selected backup must be accessible from the nodes in the new cluster, or stored in the ClusterControl host. 

Clicking on "Continue" will open the standard database cluster deployment wizard:

Create Database Cluster from Backup - ClusterControl

Note that the root/admin user password for this cluster must be the same as the PostgreSQL admin/root password included in the backup. Follow the configuration wizard accordingly and ClusterControl will then perform the deployment in the following order:

  1. Install the necessary software and dependencies on all PostgreSQL nodes.
  2. Start the first node.
  3. Stream and restore backup on the first node.
  4. Configure and add the rest of the nodes.

Once the deployment job completes, the new PostgreSQL Replication Cluster will be listed under the ClusterControl cluster dashboard.

Conclusion

ClusterControl allows you to clone or copy a database cluster to multiple environments with just a few clicks. You can download it for free today. Happy cloning!

Using MySQL Galera Cluster Replication to Create a Geo-Distributed Cluster: Part One


It is quite common to see databases distributed across multiple geographical locations. One scenario for doing this type of setup is disaster recovery, where your standby data center is located in a separate location from your main datacenter. It might also be required so that the databases are located closer to the users.

The main challenge in such a setup is designing the database in a way that reduces the chance of issues related to network partitioning. One of the solutions might be to use Galera Cluster instead of regular asynchronous (or semi-synchronous) replication. In this blog we will discuss the pros and cons of this approach. This is the first part in a series of two blogs. In the second part we will design the geo-distributed Galera Cluster and see how ClusterControl can help us deploy such an environment.

Why Galera Cluster Instead of  Asynchronous Replication for Geo-Distributed Clusters?

Let’s consider the main differences between Galera and regular replication. Regular replication provides you with just one node to write to; this means that every write from a remote datacenter has to be sent over the Wide Area Network (WAN) to reach the master. It also means that all proxies located in the remote datacenter have to be able to monitor the whole topology, spanning all data centers involved, as they have to be able to tell which node is currently the master.

This leads to a number of problems. First, multiple connections have to be established across the WAN; this adds latency and slows down any checks the proxy may be running. In addition, it adds unnecessary overhead on the proxies and databases. Most of the time you are interested only in routing traffic to the local database nodes. The only exception is the master, and because of this the proxies are forced to watch the whole infrastructure rather than just the part located in the local datacenter. Of course, you can try to overcome this by using proxies to route only SELECTs, while using some other method (a dedicated hostname for the master, managed by DNS) to point the application to the master, but this adds unnecessary levels of complexity and moving parts, which could seriously impact your ability to handle multiple node and network failures without losing data consistency.

Galera Cluster can support multiple writers. Latency is a factor, though: since all nodes in the Galera cluster have to coordinate and communicate to certify writesets, it can even be the reason you decide not to use Galera when latency is too high. Latency is also an issue in replication clusters, but there it affects only writes from the remote data centers, while connections from the datacenter where the master is located benefit from low latency commits.

In MySQL Replication you also have to keep the worst case scenario in mind and ensure that the application is fine with delayed writes. The master can always change, and you cannot be sure that you will be writing to a local node all the time.

Another difference between replication and Galera Cluster is the handling of replication lag. Geo-distributed clusters can be seriously affected by it: latency and the limited throughput of the WAN connection both impact the ability of a replicated cluster to keep up with the replication. Please keep in mind that replication generates one-to-all traffic.

Geo-Distributed Galera Cluster

All slaves have to receive the whole replication traffic - the amount of data you have to send to remote slaves over the WAN increases with every remote slave that you add. This may easily result in WAN link saturation, especially if you make plenty of modifications and the WAN link doesn’t have good throughput. As you can see on the diagram above, with three data centers and three nodes in each of them, the master has to send 6x the replication traffic over the WAN connection.

With Galera Cluster things are slightly different. For starters, Galera uses flow control to keep the nodes in sync. If one of the nodes starts to lag behind, it can ask the rest of the cluster to slow down and let it catch up. Sure, this reduces the performance of the whole cluster, but it is still better than not being able to use slaves for SELECTs because they tend to lag from time to time - in such cases the results you get might be outdated and incorrect.

Geo-Distributed Galera Cluster

Another feature of Galera Cluster which can significantly improve its performance when used over a WAN is segments. By default Galera uses all-to-all communication and every writeset is sent by the node to all other nodes in the cluster. This behavior can be changed using segments. Segments allow users to split a Galera cluster into several parts. Each segment may contain multiple nodes and elects one of them as a relay node. Such a node receives writesets from other segments and redistributes them across the Galera nodes local to the segment. As a result, as you can see on the diagram above, it is possible to reduce the replication traffic going over the WAN threefold - just two “replicas” of the replication stream are sent over the WAN: one per remote datacenter, compared to one per slave in MySQL Replication.
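As a sketch, a node is assigned to a segment through wsrep_provider_options in its MySQL configuration; the segment number itself is arbitrary, as long as all nodes within one datacenter share the same value:

# my.cnf on every node in the second datacenter
wsrep_provider_options = "gmcast.segment=1"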

Galera Cluster Network Partitioning Handling

Where Galera Cluster shines is its handling of network partitioning. Galera Cluster constantly monitors the state of the nodes in the cluster. Every node attempts to connect with its peers and exchange the state of the cluster. If a subset of nodes is not reachable, Galera attempts to relay the communication, so if there is a way to reach those nodes, they will be reached.

Galera Cluster Network Partitioning Handling

An example can be seen on the diagram above: DC1 lost connectivity with DC2, but DC2 and DC3 can connect. In this case one of the nodes in DC3 will be used to relay data from DC1 to DC2, ensuring that the intra-cluster communication can be maintained.

Galera Cluster Network Partitioning Handling

Galera Cluster is able to take actions based on the state of the cluster. It implements a quorum: a majority of the nodes have to be available in order for the cluster to operate. If a node gets disconnected from the cluster and cannot reach any other node, it will cease to operate.

As can be seen on the diagram above, there’s a partial loss of network communication in DC1 and the affected node is removed from the cluster, ensuring that the application will not access outdated data.
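You can check whether a given node still belongs to the primary component from its status counters, for example:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'; /* 'Primary' or 'non-Primary' */
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';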

Galera Cluster Network Partitioning Handling

This is also true on a larger scale. Here DC1 had all of its communication cut off. As a result, the whole datacenter has been removed from the cluster and none of its nodes will serve traffic. The rest of the cluster maintained a majority (6 out of 9 nodes are available) and reconfigured itself to keep the connection between DC2 and DC3. In the diagram above we assumed the write hits a node in DC2, but please keep in mind that Galera is capable of running with multiple writers.

MySQL Replication does not have any kind of cluster awareness, making it problematic to handle network issues. It cannot shut itself down upon losing connection with other nodes. There is no easy way of preventing the old master from showing up after the network split.

The only possibilities are limited to the proxy layer or even higher. You have to design a system which tries to understand the state of the cluster and take the necessary actions. One possible way is to use cluster-aware tools like Orchestrator and then run scripts that check the state of the Orchestrator RAFT cluster and, based on this state, take the required actions on the database layer. This is far from ideal, because any action taken on a layer higher than the database adds additional latency: it makes it possible for the issue to show up and data consistency to be compromised before the correct action can be taken. Galera, on the other hand, takes actions on the database level, ensuring the fastest reaction possible.

Using MySQL Galera Cluster Replication to Create a Geo-Distributed Cluster: Part Two


In the previous blog in the series we discussed the pros and cons of using Galera Cluster to create geo-distributed cluster. In this post we will design a Galera-based geo-distributed cluster and we will show how you can deploy all the required pieces using ClusterControl.

Designing a Geo-Distributed Galera Cluster

We will start with explaining the environment we want to build. We will use three remote data centers, connected via Wide Area Network (WAN). Each datacenter will receive writes from local application servers. Reads will also be only local. This is intended to avoid unnecessary traffic crossing the WAN. 

For this setup we assume the connectivity is in place and secured, but we won’t describe exactly how this can be achieved. There are numerous methods to secure the connectivity, ranging from proprietary hardware and software solutions through OpenVPN to SSH tunnels.

We will use ProxySQL as a load balancer. ProxySQL will be deployed locally in each datacenter. It will also route traffic only to the local nodes. Remote nodes can always be added manually, and we will explain cases where this might be a good solution. The application can be configured to connect to one of the local ProxySQL nodes using a round-robin algorithm. We can also use Keepalived and a Virtual IP to route the traffic to a single ProxySQL node, as long as a single ProxySQL node is able to handle all of the traffic.

Another possible solution is to collocate ProxySQL with the application nodes and configure the application to connect to the proxy on localhost. This approach works quite well under the assumption that it is unlikely for ProxySQL to become unavailable while the application on the same node keeps working fine. Typically what we see is either a node failure or a network failure, which would affect both ProxySQL and the application at the same time.

Geo-Distributed MySQL Galera Cluster with ProxySQL

The diagram above shows the version of the environment where ProxySQL is collocated on the same node as the application. ProxySQL is configured to distribute the workload across all Galera nodes in the local datacenter. One of those nodes is picked as the node to send the writes to, while SELECTs are distributed across all nodes. Having one dedicated writer node per datacenter helps to reduce the number of possible certification conflicts, typically leading to better performance. To reduce this even further we would have to start sending the traffic over the WAN connection, which is not ideal as the bandwidth utilization would significantly increase. Right now, with segments in place, only two copies of the writeset are being sent across datacenters - one per DC.

The main concern with geo-distributed Galera Cluster deployments is latency. This is something you always have to test prior to launching the environment. Are you ok with the commit time? At every commit, certification has to happen, so writesets have to be sent to and certified on all nodes in the cluster, including remote ones. It may be that the high latency makes the setup unsuitable for your application. In that case you may find multiple Galera clusters connected via asynchronous replication more suitable. That would be a topic for another blog post though.
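A simple way to gauge this before going live is to run a write-heavy benchmark against the cluster from each datacenter and watch the reported latency. A rough sysbench sketch, assuming an sbtest schema already exists and with hypothetical host and credentials:

# prepare the test tables, then run a write-only workload for five minutes
sysbench oltp_write_only --mysql-host=10.0.1.10 --mysql-user=sbtest --mysql-password=sbtest --tables=4 --table-size=100000 prepare
sysbench oltp_write_only --mysql-host=10.0.1.10 --mysql-user=sbtest --mysql-password=sbtest --tables=4 --table-size=100000 --time=300 --report-interval=10 run

The per-interval latency figures in the output will tell you what commit times your application would have to live with.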

Deploying a Geo-Distributed Galera Cluster Using ClusterControl

To clarify things, we will show here what a deployment may look like. We won't use an actual multi-DC setup; everything will be deployed in a local lab. We assume that the latency is acceptable and the whole setup is viable. What is great about ClusterControl is that it is infrastructure-agnostic. It doesn't care if the nodes are close to each other, located in the same datacenter, or distributed across multiple cloud providers. As long as there is SSH connectivity from the ClusterControl instance to all of the nodes, the deployment process looks exactly the same. That's why we can show it to you step by step using just a local lab.

Installing ClusterControl

First, you have to install ClusterControl. You can download it for free. After registering, you should access the page with a guide to download and install ClusterControl. It is as simple as running a shell script. Once you have ClusterControl installed, you will be presented with a form to create an administrative user:

Installing ClusterControl

Once you fill it, you will be presented with a Welcome screen and access to deployment wizards:

ClusterControl Welcome Screen

We’ll go with deploy. This will open a deployment wizard:

ClusterControl Deployment Wizard

We will pick MySQL Galera. We have to pass SSH connectivity details - either the root user or a sudo user is supported. In the next step we define the servers in the cluster.

Deploy Database Cluster

We are going to deploy three nodes in one of the data centers. Then we will be able to extend the cluster, configuring new nodes in different segments. For now all we have to do is to click on “Deploy” and watch ClusterControl deploying the Galera cluster.

Cluster List - ClusterControl

Our first three nodes are up and running, we can now proceed to adding additional nodes in other datacenters.

Add a Database Node - ClusterControl

You can do that from the action menu, as shown on the screenshot above.

Add a Database Node - ClusterControl

Here we can add additional nodes, one at a time. Importantly, you should change the Galera segment to a non-zero value (0 is used for the initial three nodes).
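Under the hood, the segment assignment maps to the gmcast.segment Galera provider option. A my.cnf fragment for a node in the second datacenter might look like the sketch below; the segment number is just an example, and your wsrep_provider_options will typically carry other settings as well:

[mysqld]
wsrep_provider_options="gmcast.segment=1"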

After a while we end up with all nine nodes, distributed across three segments.

ClusterControl Geo-Distributed Database Nodes

Now, we have to deploy the proxy layer. We will use ProxySQL for that. You can deploy it in ClusterControl via Manage -> Load Balancer:

Add a Load Balancer - ClusterControl

This opens a deployment field:

Deploy Load Balancer - ClusterControl

First, we have to decide where to deploy ProxySQL. We will use existing Galera nodes but you can type anything in the field so it is perfectly possible to deploy ProxySQL on top of the application nodes. In addition, you have to pass access credentials for the administrative and monitoring user.

Deploy Load Balancer - ClusterControl

Then we have to either pick one of the existing users in MySQL or create one right now. We also want to ensure that ProxySQL is configured to use only the Galera nodes located in the same datacenter.

When you have one ProxySQL ready in the datacenter, you can use it as a source of the configuration:

Deploy ProxySQL - ClusterControl

This has to be repeated for every application server that you have in all datacenters. Then the application has to be configured to connect to the local ProxySQL instance, ideally over a Unix socket, as this gives the best performance and the lowest latency.
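As a sketch, assuming ProxySQL was configured to listen on a Unix socket in addition to TCP port 6033 (the socket path below is a common default, but verify it on your installation), the application would connect like this:

# proxysql.cnf: interfaces="0.0.0.0:6033;/tmp/proxysql.sock" (under mysql_variables)
mysql -u appuser -p -S /tmp/proxysql.sock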

Reducing Latency - ClusterControl

After the last ProxySQL is deployed, our environment is ready. Application nodes connect to local ProxySQL. Each ProxySQL is configured to work with Galera nodes in the same datacenter:

ProxySQL Server Setup - ClusterControl

Conclusion

We hope this two-part series helped you understand the strengths and weaknesses of geo-distributed Galera Clusters and how ClusterControl makes it very easy to deploy and manage such clusters.

MySQL Cloud Backup and Restore Scenarios Using Microsoft Azure


Backups are a very important part of your database operations, as your business must be secured when catastrophe strikes. When that time comes (and it will), your Recovery Point Objective (RPO) and Recovery Time Objective (RTO) should be predefined, as they determine how much data you can afford to lose and how fast you can recover from the incident.

Most organizations vary their approach to backups, trying to have a combination of server image backups (snapshots), logical and physical backups. These backups are then stored in multiple locations, so as to avoid any local or regional disasters.  It also means that the data can be restored in the shortest amount of time, avoiding major downtime which can impact your company's business. 

Hosting your database with a cloud provider such as Microsoft Azure (which we will discuss in this blog) is no exception; you still need to prepare and define your disaster recovery policy.

Like other public cloud offerings, Microsoft Azure (Azure) offers an approach to backups that is practical, cost-effective, and designed to provide you with recovery options. Microsoft Azure's backup solutions are easy to configure and operate, using Azure Backup or a Recovery Services vault (if you are operating your database on virtual machines).

If you want a managed database in the cloud, Azure offers Azure Database for MySQL. This should be used only if you do not want to operate and manage the MySQL database yourself. This service offers a rich backup solution which allows you to create a backup of your database instance, either in the local region or in a geo-redundant location, which can be useful for data recovery. You can even restore a node to a specific point in time, which is useful for achieving point-in-time recovery. This can be done with just one click.

In this blog, we will cover all of these backup and restore scenarios using a MySQL database on the Microsoft Azure cloud.

Performing Backups on a Virtual Machine on Azure

Unfortunately, Microsoft Azure does not offer a MySQL-specific backup type solution (e.g. MySQL Enterprise Backup, Percona XtraBackup, or MariaDB's Mariabackup). 

Upon creation of your Virtual Machine (using the portal), you can set up a process to back up your VM using a Recovery Services vault. This will guard you against any incident, disaster, or catastrophe, and the data stored is encrypted by default. Backup is optional and, though recommended by Azure, it comes with a price. You can take a look at their Azure Backup Pricing page for more details.

To create and set up a backup, go to the left panel and click All Resources → Compute → Virtual Machine. Now set the required parameters in the text fields. Once you are on that page, go to the Management tab and scroll down. You'll be able to see how you can set up or create the backup. See the screenshot below:

Create a Virtual Machine - Azure

Then set up your backup policy based on your backup requirements. Just hit the Create New link in the Backup policy field to create a new policy. See below:

Define Backup Policy - Azure

You can configure your backup policy with weekly, monthly, and yearly retention.

Once you have your backup configured, you can check that you have a backup enabled on that particular virtual machine you have just created. See the screenshot below:

Backup Settings - Azure

Restore and Recover Your Virtual Machine on Azure

Designing your recovery in Azure depends on the policies and requirements of your application. It also depends on whether RTO and RPO must be low enough to be invisible to the user in case of an incident or during maintenance. You may set up your virtual machine with an availability set or in a different availability zone to achieve a higher recovery rate.

You may also set up disaster recovery for your VM to replicate your virtual machines to another Azure region for business continuity and disaster recovery needs. However, this might not be a good idea for your organization as it comes with a high cost. If a backup is in place, Azure offers you the option to restore or create a virtual machine from it.

For example, during the creation of your virtual machine, you can go to the Disks tab, then go to Data Disks. You can create or attach an existing disk to which you can attach an available snapshot. See the screenshot below, where you can choose between a snapshot and a storage blob:

Create a New Disk - Azure

You may also restore to a specific point in time, just like in the screenshot below:

Set Restore Point - Azure

Restoring in Azure can be done in different ways, but it uses the same resources you have already created.

For example, if you have created a snapshot or a disk image stored in an Azure Storage blob, you can use that resource when creating a new VM, as long as it's compatible and available. Additionally, you can even do file-level recovery, aside from restoring a whole VM, just like in the screenshot below:

File Recovery - Azure

During File Recovery, you can choose a specific recovery point, as well as download a script to browse and recover files. This is very helpful when you need only a specific file and not the whole system or disk volume.

Restoring from backup on an existing VM takes about three minutes. However, restoring from backup to spawn a new VM takes twelve minutes. This, however, could depend on the size of your VM and the network bandwidth available in Azure. The good thing is that, when restoring, it will provide you with details of what has been completed and how much time is remaining. For example, see the screenshot below:

Recovery Job Status - Azure

Backups for Azure Database For MySQL

Azure Database for MySQL is a fully-managed database service from Microsoft Azure. This service offers a very flexible and convenient way to set up your backup and restore capabilities.

Upon creation of your MySQL server instance, you can set the backup retention and choose your backup redundancy option: either locally redundant (local region) or geo-redundant (in a different region). Azure will provide you with the estimated monthly cost. See a sample screenshot below:

Pricing Calculator - Azure

Keep in mind that geo-redundant backup options are only available on the General Purpose and Memory Optimized types of compute nodes. They are not available on a Basic compute node, but you can have your redundancy in the local region (i.e. within the available availability zones).

Once you have a master set up, it's easy to create a replica by going to Azure Database for MySQL servers → Select your MySQL instance → Replication → and clicking Add Replica. Your replica can be used as the source or restore target when needed.

Keep in mind that in Azure, when you stop the replication between the master and a replica, this is permanent and irreversible, as it makes the replica a standalone server. A replica created through Azure is a managed instance, so you cannot stop and start the replication threads the way you would with normal master-slave replication; you can do a restart and that's all. If you created the replica manually, by restoring from the master or a backup (e.g. via point-in-time recovery), then you'll be able to stop/start the replication threads or set up a slave lag if needed.

Restoring Your Azure Database For MySQL From A Backup

Restoring is very easy and quick using the Azure portal. Just hit the Restore button on your MySQL instance node and follow the UI as shown in the screenshot below:

Restoring Your Azure Database For MySQL From A Backup

Then you can select a point in time and create/spawn a new instance based on the captured backup:

Restore - Azure Database For MySQL
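The same point-in-time restore can also be scripted with the Azure CLI. A sketch with hypothetical resource names (the timestamp must fall within your retention window):

az mysql server restore --resource-group myresourcegroup --name mydemoserver-restored --source-server mydemoserver --restore-point-in-time "2019-10-23T13:10:00Z"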

Once you have the node available, it will not yet be a replica of the master. You need to set this up manually using the available stored procedures (a concrete example follows the parameter list below):

CALL mysql.az_replication_change_master('<master_host>', '<master_user>', '<master_password>', 3306, '<master_log_file>', <master_log_pos>, '<master_ssl_ca>');

where:

  • master_host: hostname of the master server
  • master_user: username for the master server
  • master_password: password for the master server
  • master_log_file: binary log file name from running SHOW MASTER STATUS
  • master_log_pos: binary log position from running SHOW MASTER STATUS
  • master_ssl_ca: the CA certificate's context; if not using SSL, pass in an empty string
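For illustration, a call with hypothetical values might look like this; an empty string is passed as the CA certificate because SSL is not used in this sketch:

CALL mysql.az_replication_change_master('mydemoserver.mysql.database.azure.com', 'myadmin@mydemoserver', 'MyS3cretPassword', 3306, 'mysql-bin.000002', 120, '');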

Then you can start the replication threads as follows,

CALL mysql.az_replication_start;

or you can stop the replication threads as follows,

CALL mysql.az_replication_stop;

or you can remove the master as,

CALL mysql.az_replication_remove_master;

or skip SQL thread errors as,

CALL mysql.az_replication_skip_counter;

As mentioned earlier, when a replica is created using the Add Replica feature under a MySQL instance in Azure, these specific stored procedures aren't available. Only the mysql.az_replication_restart procedure is available, since you are not allowed to stop or start the replication threads of an Azure-managed replica. So the example above was restored from a master: it takes a full copy of the master but acts as a standalone node and needs manual setup to become a replica of an existing master.

Additionally, when you have a replica that you set up manually, you will not see it under Azure Database for MySQL servers → Select your MySQL instance → Replication, since you created or set up the replication manually.

Alternative Cloud and Restore Backup Solutions

There are certain scenarios where you want full control when taking a full backup of your MySQL database in the cloud. To do this you can create your own script or use open-source technologies. With these you can control exactly how the data in your MySQL database should be backed up and where it should be stored.

You can also leverage the Azure Command Line Interface (CLI) to create your custom automation. For example, you can create a snapshot using the following command with the Azure CLI:

az snapshot create -g myResourceGroup --source "$osDiskId" --name osDisk-backup
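If needed, such a snapshot can later be turned back into a managed disk that you attach to a new VM. A sketch with hypothetical names, assuming the snapshot lives in the same resource group:

az disk create -g myResourceGroup -n osDisk-restored --source osDisk-backup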

Or you can create your MySQL server replica with the following command:

az mysql server replica create --name mydemoreplicaserver --source-server mydemoserver --resource-group myresourcegroup

Alternatively, you can leverage an enterprise tool that offers both backup and restore options. Using open-source technologies or third-party tools requires the knowledge and skills to create your own implementation. Here's a list of tools you can leverage:

  • ClusterControl - While we may be a little biased, ClusterControl offers the ability to manage physical and logical backups of your MySQL database using battle-tested, open-source technologies (PXB, Mariabackup, and mydumper). It supports MySQL, Percona, MariaDB, and Galera databases. You can easily create your backup policy and store your database backups on any cloud (AWS, GCP, or Azure). Please note that the free version of ClusterControl does not include the backup features.
  • LVM Snapshots - You can use LVM to take a snapshot of your logical volume. This is only applicable to your VM since it requires access to block-level storage. Using this tool comes with a caveat, since it can render your database node unresponsive while the backup is running.
  • Percona XtraBackup (PXB) - An open-source technology from Percona. With PXB, you can create a physical backup copy of your MySQL database. You can also do a hot backup with PXB for the InnoDB storage engine, but it's recommended to run it on a slave or a non-busy MySQL server. This is only applicable to your VM instance since it requires file-level access to the database server itself.
  • Mariabackup - Like PXB, it's an open-source technology; it was forked from PXB and is maintained by MariaDB. If your database is using MariaDB, you should use Mariabackup in order to avoid incompatibility issues with tablespaces.
  • mydumper/myloader - These backup tools create logical backup copies of your MySQL database. You can use them with Azure Database for MySQL, though we haven't verified how well they work in that backup and restore scenario.
  • mysqldump - A logical backup tool which is very useful when you need to back up and dump (or restore) a specific table or database to another instance (see the example after this list). It is commonly used by DBAs, but you need to keep an eye on your disk space, as logical backup copies can be huge compared to physical backups.
  • MySQL Enterprise Backup - It delivers hot, online, non-blocking backups on multiple platforms including Linux, Windows, Mac & Solaris. It's not a free backup tool but offers a lot of features.
  • rsync - It's a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. Mostly in Linux systems, rsync is installed as part of the OS package.
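As a quick illustration of the do-it-yourself approach, a logical backup could be compressed and pushed to Azure blob storage with the CLI. The storage account, container, and paths below are hypothetical, and the upload assumes you are already authenticated against the storage account:

# dump all databases in a consistent snapshot and compress the output
mysqldump --single-transaction --routines --triggers --all-databases | gzip > /backups/mysql-$(date +%F).sql.gz
# upload the archive to a blob container
az storage blob upload --account-name mybackupaccount --container-name mysql-backups --name mysql-$(date +%F).sql.gz --file /backups/mysql-$(date +%F).sql.gz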

A Guide to MySQL Galera Cluster Streaming Replication: Part One

$
0
0

Streaming Replication is a new feature which was introduced with the 4.0 release of Galera Cluster. Galera replicates synchronously across the entire cluster, but before this release write-sets larger than 2GB were not supported. Streaming Replication now allows you to replicate large write-sets, which is perfect for bulk inserts or loading data into your database.

In a previous blog we wrote about Handling Large Transactions with Streaming Replication and MariaDB 10.4, but as of writing this blog Codership had not yet released their version of the new Galera Cluster. Percona has, however, released their experimental binary version of Percona XtraDB Cluster 8.0 which highlights the following features...

  • Streaming Replication supporting large transactions

  • The synchronization functions allow action coordination (wsrep_last_seen_gtid, wsrep_last_written_gtid, wsrep_sync_wait_upto_gtid)

  • More granular and improved error logging. wsrep_debug is now a multi-valued variable to assist in controlling the logging, and logging messages have been significantly improved.

  • Some DML and DDL errors on a replicating node can either be ignored or suppressed. Use the wsrep_ignore_apply_errors variable to configure.

  • Multiple system tables help you find out more about the state of the cluster.

  • The wsrep infrastructure of Galera 4 is more robust than that of Galera 3. It features a faster execution of code with better state handling, improved predictability, and error handling.

What's New With Galera Cluster 4.0?

The New Streaming Replication Feature

With Streaming Replication, transactions are replicated gradually in small fragments during transaction processing (i.e. before the actual commit, we replicate a number of small fragments). The replicated fragments are then applied in slave threads, preserving the transaction's state on all cluster nodes. Fragments hold locks on all nodes and cannot be aborted by conflicting transactions later.

Galera System Tables

Database Administrators and clients with access to the MySQL database may read these tables, but they cannot modify them as the database itself will make any modifications needed. If your server doesn’t have these tables, it may be that your server is using an older version of Galera Cluster.

#> show tables from mysql like 'wsrep%';

+--------------------------+

| Tables_in_mysql (wsrep%) |

+--------------------------+

| wsrep_cluster            |

| wsrep_cluster_members    |

| wsrep_streaming_log      |

+--------------------------+

3 rows in set (0.12 sec)

New Synchronization Functions 

This version introduces a series of SQL functions for use in wsrep synchronization operations. You can use them to obtain the Global Transaction ID which is based on either the last write or last seen transaction. You can also set the node to wait for a specific GTID to replicate and apply, before initiating the next transaction.
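A rough sketch of how these functions can be combined across two nodes; the GTID value shown is purely illustrative:

-- on node A, immediately after the write:
SELECT WSREP_LAST_WRITTEN_GTID();
-- on node B, before a read that depends on that write
-- (the optional second argument is a timeout in seconds):
SELECT WSREP_SYNC_WAIT_UPTO_GTID('d6c71d37-f00a-11e9-ab0c-9f6f7610fe5d:1423', 5);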

Intelligent Donor Selection

Some understated features that have been present since Galera 3.x include intelligent donor selection and cluster crash recovery. These were originally planned for Galera 4, but made it into earlier releases largely due to customer requirements. When it comes to donor node selection in Galera 3, the State Snapshot Transfer (SST) donor was selected at random. However with Galera 4, you get a much more intelligent choice when it comes to choosing a donor, as it will favour a donor that can provide an Incremental State Transfer (IST), or pick a donor in the same segment. As a Database Administrator, you can force this via setting wsrep_sst_donor.
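For example, preferring specific donors could be sketched like this in my.cnf; the node names are hypothetical, and the trailing comma tells Galera to fall back to any other donor if the preferred ones are unavailable:

[mysqld]
wsrep_sst_donor="galera-dc1-node2,galera-dc1-node3,"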

Why Use MySQL Galera Cluster Streaming Replication?

Long-Running Transactions

Galera's problems and limitations have always revolved around how it handles long-running transactions, which often cause the entire cluster to slow down due to large write-sets being replicated. Its flow control often kicks in, causing writes to slow down or even terminating the process in order to revert the cluster back to its normal state. This is a pretty common issue with previous versions of Galera Cluster.

Codership advises using Streaming Replication for your long-running transactions to mitigate these situations. Once the node replicates and certifies a fragment, it is no longer possible for other transactions to abort it.
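In practice, that advice boils down to wrapping the long-running transaction in session-level settings, roughly like the sketch below; the fragment size and the work inside the transaction are hypothetical:

SET SESSION wsrep_trx_fragment_unit='statements';
SET SESSION wsrep_trx_fragment_size=10;   -- replicate a fragment every 10 statements
START TRANSACTION;
-- ... the long-running work, e.g. a large archival DELETE ...
COMMIT;
SET SESSION wsrep_trx_fragment_size=0;    -- switch Streaming Replication back off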

Large Transactions

This is very helpful when loading data into your reporting or analytics database. Bulk inserts, deletes, updates, or using the LOAD DATA statement to load large quantities of data fall into this category, although it depends on how you manage your data for retrieval and storage. You must take into account that Streaming Replication has its limitations, such that certification keys are generated from record locks.

Without Streaming Replication, updating a large number of records would result in a conflict and the whole transaction would have to be rolled back. Slaves that replicate large transactions are also subject to flow control: once the threshold is hit, the entire cluster slows down its writes while the slaves catch up on the incoming transactions from the synchronous replication. Galera throttles replication until the write-set queue is manageable again and then allows replication to continue at full speed. Check this external blog by Percona to help you understand more about flow control within Galera.

With Streaming Replication, the node begins to replicate the data with each transaction fragment, rather than waiting for the commit. This means that there's no way for any conflicting transaction running on the other nodes to abort it, since the cluster has already certified the write-set for that particular fragment. The cluster is free to apply and commit other concurrent transactions without blocking, and to process the large transaction with minimal impact on the cluster.

Hot Records/Hot Spots

Hot records, or hot spots, are rows in your table that are constantly being updated. They may hold the most visited, highest-traffic data of your entire database (e.g. news feeds, counters such as number of visits, or logs). With Streaming Replication, you can force critical updates to the entire cluster.

As noted by the Galera Team at Codership:

“Running a transaction in this way effectively locks the hot record on all nodes, preventing other transactions from modifying the row. It also increases the chances that the transaction will commit successfully and that the client in turn will receive the desired outcome.”

This comes with limitations, as successful commits are still not guaranteed in every case. Without Streaming Replication, however, you'd end up with a high chance of rollbacks, which adds overhead that the end user experiences from the application's perspective.

Things to Consider When Using Streaming Replication

  • Certification keys are generated from record locks, therefore they don't cover gap locks or next-key locks. If the transaction takes a gap lock, it is possible that a transaction executed on another node will apply a write-set which encounters the gap lock and will abort the streaming transaction.
  • When Streaming Replication is enabled, write-set logs are written to the wsrep_streaming_log table in the mysql system database to preserve persistence in case a crash occurs, so this table is used during recovery. With the extra logging and elevated replication overhead, Streaming Replication degrades the transaction throughput rate. This can be a performance bottleneck when a high peak load is reached. As such, it's recommended that you only enable Streaming Replication at the session level, and then only for transactions that would not run correctly without it.
  • The best use case is to use Streaming Replication for cutting large transactions into fragments
  • Set the fragment size to ~10K rows
  • Fragment variables are session variables and can be set dynamically
  • An intelligent application can turn Streaming Replication on and off as needed

Conclusion

Thanks for reading. In part two we will discuss how to enable Galera Cluster Streaming Replication and what the results could look like for your setup.

 

A Guide to MySQL Galera Cluster Streaming Replication: Part Two


In the first part of this blog we provided an overview of the new Streaming Replication feature in MySQL Galera Cluster. In this blog we will show you how to enable it and take a look at the results.

Enabling Streaming Replication

It is highly recommended that you enable Streaming Replication at the session level for the specific transactions that interact with your application/client.

As stated in the previous blog, Galera logs its write-sets to the wsrep_streaming_log table in the mysql system database. This has the potential to create a performance bottleneck, especially when a rollback is needed. This doesn't mean that you can't use Streaming Replication, it just means you need to design your application client efficiently when using it so you'll get better performance. Still, it's best to have Streaming Replication for dealing with and cutting down large transactions.

Enabling Streaming Replication requires you to define the replication unit and the number of units to use when forming transaction fragments. Two parameters control this: wsrep_trx_fragment_unit and wsrep_trx_fragment_size.

Below is an example of how to set these two parameters:

SET SESSION wsrep_trx_fragment_unit='statements';

SET SESSION wsrep_trx_fragment_size=3;

In this example, the fragment is set to three statements. For every three statements from a transaction, the node will generate, replicate, and certify a fragment.

You can choose between a few replication units when forming fragments:

  • bytes - This defines the fragment size in bytes.
  • rows - This defines the fragment size as the number of rows the fragment updates.
  • statements - This defines the fragment size as the number of statements in a fragment.

Choose the replication unit and fragment size that best suits the specific operation you want to run.
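For instance, a bulk load that is easier to measure in volume than in statement count might use the bytes unit, as a sketch:

SET SESSION wsrep_trx_fragment_unit='bytes';
SET SESSION wsrep_trx_fragment_size=1048576;  -- replicate a fragment roughly every 1MB of write-set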

Streaming Replication In Action

As discussed in our other blog on handling large transactions in MariaDB 10.4, we tested how Streaming Replication performed when enabled, based on these criteria...

  1. Baseline, set global wsrep_trx_fragment_size=0;
  2. set global wsrep_trx_fragment_unit='rows'; set global wsrep_trx_fragment_size=1;
  3. set global wsrep_trx_fragment_unit='statements'; set global wsrep_trx_fragment_size=1;
  4. set global wsrep_trx_fragment_unit='statements'; set global wsrep_trx_fragment_size=5;

And the results were (in the same order as the criteria above):

Transactions: 82.91 per sec., queries: 1658.27 per sec. (100%)

Transactions: 54.72 per sec., queries: 1094.43 per sec. (66%)

Transactions: 54.76 per sec., queries: 1095.18 per sec. (66%)

Transactions: 70.93 per sec., queries: 1418.55 per sec. (86%)

For this example we're using Percona XtraDB Cluster 8.0.15 straight from their testing branch using the Percona-XtraDB-Cluster_8.0.15.5-27dev.4.2_Linux.x86_64.ssl102.tar.gz build. 

We then tried a 3-node Galera cluster with hosts info below:

testnode11 = 192.168.10.110

testnode12 = 192.168.10.120

testnode13 = 192.168.10.130

We pre-populated a table from the sysbench database and tried to delete a very large number of rows.

root@testnode11[sbtest]#> select count(*) from sbtest1;

+----------+

| count(*) |

+----------+

| 12608218 |

+----------+

1 row in set (25.55 sec)

At first, running without Streaming Replication,

root@testnode12[sbtest]#> select @@wsrep_trx_fragment_unit, @@wsrep_trx_fragment_size,  @@innodb_lock_wait_timeout;

+---------------------------+---------------------------+----------------------------+

| @@wsrep_trx_fragment_unit | @@wsrep_trx_fragment_size | @@innodb_lock_wait_timeout |

+---------------------------+---------------------------+----------------------------+

| bytes                     |                         0 |                      50000 |

+---------------------------+---------------------------+----------------------------+

1 row in set (0.00 sec)

Then run,

root@testnode11[sbtest]#> delete from sbtest1 where id >= 2000000;

However, we ended up getting a rollback...

---TRANSACTION 648910, ACTIVE 573 sec rollback

mysql tables in use 1, locked 1

ROLLING BACK 164858 lock struct(s), heap size 18637008, 12199395 row lock(s), undo log entries 11961589

MySQL thread id 183, OS thread handle 140041167468288, query id 79286 localhost 127.0.0.1 root wsrep: replicating and certifying write set(-1)

delete from sbtest1 where id >= 2000000

Using the ClusterControl dashboards to get an overview of flow control: since the transaction runs solely on the master (active-writer) node until commit time, there's no indication of flow control activity:

ClusterControl Galera Cluster Overview

In case you’re wondering, the current version of ClusterControl does not yet have direct support for PXC 8.0 with Galera Cluster 4 (as it is still experimental). You can, however, try to import it... but it needs minor tweaks to make your Dashboards work correctly. 

Back to the query process. It failed as it rolled back!

root@testnode11[sbtest]#> delete from sbtest1 where id >= 2000000;

ERROR 1180 (HY000): Got error 5 - 'Transaction size exceed set threshold' during COMMIT

regardless of the wsrep_max_ws_rows or wsrep_max_ws_size,

root@testnode11[sbtest]#> select @@global.wsrep_max_ws_rows, @@global.wsrep_max_ws_size/(1024*1024*1024);

+----------------------------+---------------------------------------------+

| @@global.wsrep_max_ws_rows | @@global.wsrep_max_ws_size/(1024*1024*1024) |

+----------------------------+---------------------------------------------+

|                          0 |                                      2.0000 |

+----------------------------+---------------------------------------------+

1 row in set (0.00 sec)

It did, eventually, reach the threshold.

During this time the system table mysql.wsrep_streaming_log is empty, which indicates that Streaming Replication is not happening or enabled,

root@testnode12[sbtest]#> select count(*) from mysql.wsrep_streaming_log;

+----------+

| count(*) |

+----------+

|        0 |

+----------+

1 row in set (0.01 sec)



root@testnode13[sbtest]#> select count(*) from mysql.wsrep_streaming_log;

+----------+

| count(*) |

+----------+

|        0 |

+----------+

1 row in set (0.00 sec)

and that is verified on the other 2 nodes (testnode12 and testnode13).

Now, let's try again with Streaming Replication enabled,

root@testnode11[sbtest]#> select @@wsrep_trx_fragment_unit, @@wsrep_trx_fragment_size, @@innodb_lock_wait_timeout;

+---------------------------+---------------------------+----------------------------+

| @@wsrep_trx_fragment_unit | @@wsrep_trx_fragment_size | @@innodb_lock_wait_timeout |

+---------------------------+---------------------------+----------------------------+

| bytes                     |                         0 |                      50000 |

+---------------------------+---------------------------+----------------------------+

1 row in set (0.00 sec)



root@testnode11[sbtest]#> set wsrep_trx_fragment_unit='rows'; set wsrep_trx_fragment_size=100; 

Query OK, 0 rows affected (0.00 sec)



Query OK, 0 rows affected (0.00 sec)



root@testnode11[sbtest]#> select @@wsrep_trx_fragment_unit, @@wsrep_trx_fragment_size, @@innodb_lock_wait_timeout;

+---------------------------+---------------------------+----------------------------+

| @@wsrep_trx_fragment_unit | @@wsrep_trx_fragment_size | @@innodb_lock_wait_timeout |

+---------------------------+---------------------------+----------------------------+

| rows                      |                       100 |                      50000 |

+---------------------------+---------------------------+----------------------------+

1 row in set (0.00 sec)

What to Expect When Galera Cluster Streaming Replication is Enabled? 

When the query is run on testnode11,

root@testnode11[sbtest]#> delete from sbtest1 where id >= 2000000;

What happens is that the transaction is fragmented piece by piece, depending on the value set for the variable wsrep_trx_fragment_size. Let's check this on the other nodes:

Host testnode12

root@testnode12[sbtest]#> pager sed -n '/TRANSACTIONS/,/FILE I\/O/p'; show engine innodb status\G nopager; show global status like 'wsrep%flow%'; select count(*) from mysql.wsrep_streaming_log;

PAGER set to 'sed -n '/TRANSACTIONS/,/FILE I\/O/p''

TRANSACTIONS

------------

Trx id counter 567148

Purge done for trx's n:o < 566636 undo n:o < 0 state: running but idle

History list length 44

LIST OF TRANSACTIONS FOR EACH SESSION:

..

...

---TRANSACTION 421740651985200, not started

0 lock struct(s), heap size 1136, 0 row lock(s)

---TRANSACTION 553661, ACTIVE 190 sec

18393 lock struct(s), heap size 2089168, 1342600 row lock(s), undo log entries 1342600

MySQL thread id 898, OS thread handle 140266050008832, query id 216824 wsrep: applied write set (-1)

--------

FILE I/O

1 row in set (0.08 sec)



PAGER set to stdout

+----------------------------------+--------------+

| Variable_name                    | Value |

+----------------------------------+--------------+

| wsrep_flow_control_paused_ns     | 211197844753 |

| wsrep_flow_control_paused        | 0.133786 |

| wsrep_flow_control_sent          | 633 |

| wsrep_flow_control_recv          | 878 |

| wsrep_flow_control_interval      | [ 173, 173 ] |

| wsrep_flow_control_interval_low  | 173 |

| wsrep_flow_control_interval_high | 173          |

| wsrep_flow_control_status        | OFF |

+----------------------------------+--------------+

8 rows in set (0.00 sec)



+----------+

| count(*) |

+----------+

|    13429 |

+----------+

1 row in set (0.04 sec)

 

Host testnode13

root@testnode13[sbtest]#> pager sed -n '/TRANSACTIONS/,/FILE I\/O/p'; show engine innodb status\G nopager; show global status like 'wsrep%flow%'; select count(*) from mysql.wsrep_streaming_log;

PAGER set to 'sed -n '/TRANSACTIONS/,/FILE I\/O/p''

TRANSACTIONS

------------

Trx id counter 568523

Purge done for trx's n:o < 567824 undo n:o < 0 state: running but idle

History list length 23

LIST OF TRANSACTIONS FOR EACH SESSION:

..

...

---TRANSACTION 552701, ACTIVE 216 sec

21587 lock struct(s), heap size 2449616, 1575700 row lock(s), undo log entries 1575700

MySQL thread id 936, OS thread handle 140188019226368, query id 600980 wsrep: applied write set (-1)

--------

FILE I/O

1 row in set (0.28 sec)



PAGER set to stdout

+----------------------------------+--------------+

| Variable_name                    | Value |

+----------------------------------+--------------+

| wsrep_flow_control_paused_ns     | 210755642443 |

| wsrep_flow_control_paused        | 0.0231273 |

| wsrep_flow_control_sent          | 1653 |

| wsrep_flow_control_recv          | 3857 |

| wsrep_flow_control_interval      | [ 173, 173 ] |

| wsrep_flow_control_interval_low  | 173 |

| wsrep_flow_control_interval_high | 173          |

| wsrep_flow_control_status        | OFF |

+----------------------------------+--------------+

8 rows in set (0.01 sec)



+----------+

| count(*) |

+----------+

|    15758 |

+----------+

1 row in set (0.03 sec)

Noticeably, the flow control just kicked in!

ClusterControl Galera Cluster Overview

And the WSREP send/receive queues have been active as well:

 
ClusterControl Galera Overview
Host testnode12 (192.168.10.120)
ClusterControl Galera Overview
Host testnode13 (192.168.10.130)

Now, let's look more closely at the result from the mysql.wsrep_streaming_log table,

root@testnode11[sbtest]#> pager sed -n '/TRANSACTIONS/,/FILE I\/O/p'|tail -8; show engine innodb status\G nopager;

PAGER set to 'sed -n '/TRANSACTIONS/,/FILE I\/O/p'|tail -8'

MySQL thread id 134822, OS thread handle 140041167468288, query id 0 System lock

---TRANSACTION 649008, ACTIVE 481 sec

mysql tables in use 1, locked 1

53104 lock struct(s), heap size 6004944, 3929602 row lock(s), undo log entries 3876500

MySQL thread id 183, OS thread handle 140041167468288, query id 105367 localhost 127.0.0.1 root updating

delete from sbtest1 where id >= 2000000

--------

FILE I/O

1 row in set (0.01 sec)

then taking the result of,

root@testnode12[sbtest]#> select count(*) from mysql.wsrep_streaming_log;

+----------+

| count(*) |

+----------+

|    38899 |

+----------+

1 row in set (0.40 sec)

This tells us how many fragments have been replicated using Streaming Replication. Now, let's do some basic math:

root@testnode12[sbtest]#> select 3876500/38899.0;

+-----------------+

| 3876500/38899.0 |

+-----------------+

|         99.6555 |

+-----------------+

1 row in set (0.03 sec)

I'm taking the undo log entries from the SHOW ENGINE INNODB STATUS\G result and dividing them by the total count of records in mysql.wsrep_streaming_log. As set earlier, I defined wsrep_trx_fragment_size=100, and the result confirms this: each fragment being replicated and processed by Galera covers roughly 100 rows.

It's important to note what Streaming Replication is trying to achieve... "the node breaks the transaction into fragments, then certifies and replicates them on the slaves while the transaction is still in progress. Once certified, the fragment can no longer be aborted by conflicting transactions."

The fragments are treated as transactions that are passed to the remaining nodes in the cluster, which certify the fragmented transaction and then apply the write-sets. This means that once your large transaction has been certified, all incoming connections that could possibly deadlock with it will need to wait until the transaction finishes.

Now, the verdict on deleting a huge number of rows?

root@testnode11[sbtest]#> delete from sbtest1 where id >= 2000000;

Query OK, 12034538 rows affected (30 min 36.96 sec)

It finishes successfully without any failure!

What does it look like on the other nodes? On testnode12,

root@testnode12[sbtest]#> pager sed -n '/TRANSACTIONS/,/FILE I\/O/p'|tail -8; show engine innodb status\G nopager; show global status like 'wsrep%flow%'; select count(*) from mysql.wsrep_streaming_log;

PAGER set to 'sed -n '/TRANSACTIONS/,/FILE I\/O/p'|tail -8'

0 lock struct(s), heap size 1136, 0 row lock(s)

---TRANSACTION 421740651985200, not started

0 lock struct(s), heap size 1136, 0 row lock(s)

---TRANSACTION 553661, ACTIVE (PREPARED) 2050 sec

165631 lock struct(s), heap size 18735312, 12154883 row lock(s), undo log entries 12154883

MySQL thread id 898, OS thread handle 140266050008832, query id 341835 wsrep: preparing to commit write set(215510)

--------

FILE I/O

1 row in set (0.46 sec)



PAGER set to stdout

+----------------------------------+--------------+

| Variable_name                    | Value |

+----------------------------------+--------------+

| wsrep_flow_control_paused_ns     | 290832524304 |

| wsrep_flow_control_paused        | 0 |

| wsrep_flow_control_sent          | 0 |

| wsrep_flow_control_recv          | 0 |

| wsrep_flow_control_interval      | [ 173, 173 ] |

| wsrep_flow_control_interval_low  | 173 |

| wsrep_flow_control_interval_high | 173          |

| wsrep_flow_control_status        | OFF |

+----------------------------------+--------------+

8 rows in set (0.53 sec)



+----------+

| count(*) |

+----------+

|   120345 |

+----------+

1 row in set (0.88 sec)

It stops at a total of 120345 fragments, and if we do the math again on the last captured undo log entries (the undo log count is the same as on the master),

root@testnode12[sbtest]#> select 12154883/120345.0;

+-------------------+

| 12154883/120345.0 |

+-------------------+

|          101.0003 |

+-------------------+

1 row in set (0.00 sec)

So the transaction was split into a total of 120345 fragments in order to delete 12034538 rows.

Once you're done using Streaming Replication, do not forget to disable it, as it will otherwise keep logging every transaction and add a lot of performance overhead to your cluster. To disable it, just run

root@testnode11[sbtest]#> set wsrep_trx_fragment_size=0;

Query OK, 0 rows affected (0.04 sec)

Conclusion

With Streaming Replication enabled, it's important that you are able to identify how large your fragment size can be and what unit you have to choose (bytes, rows, statements). 

It is also very important to run it at the session level and, of course, to identify when you actually need Streaming Replication.

While performing these tests, deleting a large number of rows from a huge table with Streaming Replication enabled noticeably caused a high peak of disk and CPU utilization. RAM was more stable, but this could be because the statement we performed did not cause much memory contention.

It's safe to say that Streaming Replication can cause performance bottlenecks when dealing with large records, so it should be used with proper planning and care.

Lastly, if you are using Streaming Replication, do not forget to disable it in the current session once you are done, to avoid unwanted problems.

 