Wednesday, April 12, 2006

MySQL Cluster Replication

MySQL QA

In the 5.1 release MySQL Cluster will support replication to another MySQL installation. The replication is handled using Row Bases Replication instead of Statement Based Replication.

I have been testing it for about a year now and it has gotten really hard for me to break at this point, but I am still trying.

So what is so cool about Cluster Replication? Good question!

One answer:

Cluster Replication really gives a company five 9's (99.999%) in up time. You can have a Master Cluster in the main site supporting the business and have another Slave Cluster in a total different site that can be used as a backup for times there are issues with main site. In addition, the Slave Cluster can also be used for reporting or data mining to take that traffic off of the main Cluster.

How hard is it to setup? Another great question.

Answer: Easy

So to set it up, you would create two set of Clusters configured the way that you wanted them. You configure the Master MySQLD to create a bin log and to connect to the Master Cluster. You configure the Slave MySQLD to read from that Master MySQLD and to connect to the Slave Cluster. That is basically it. Of course you have to grant replication permissions, but I am sure you knew that. Here is are some examples:
******************************************************

Master MySQLD my.cnf:

[mysqld]
server-id=1
log-bin = /space/var/master1
log = /space/var/master1.log
log-error = /space/var/master1.err
socket = /tmp/mysql.sock
port = 3306
pid-file = /space/var/hostname.pid
datadir = /space/var/
language = /usr/mysql/english/
ndbcluster # Use NDB engine
ndb-connectstring=master01:1234 # location of MGM node ndb-connectstring=host:port

Slave Cluster MySQLD my.cnf:
[mysqld]
server-id=8
relay-log=slave-relay-bin
replicate-ignore-db=mysql
master-user=rep
master-connect-retry=1
master-host=master01
master-password=test
master-port=3306
skip-slave-start
datadir=/space/var/
log-warnings
log-error=/space/var/slave.err
ndbcluster
ndb-connectstring=slave01:1234 # location of MGM node
******************************************************

Hey, in your example the slave has an ID of 8, why? You noticed.

Answer:
To really get the most out of the Cluster you need a lot feeding it. You can have more then one MySQLD process feeding the Master Cluster and producing bin logs for that matter.

Well if you have more then one MySQLD feeding the Master Cluster then how does replication work?

Answer: The injector thread.

The Master Cluster is responsible for updating the Bin Log, not the MySQLD process. This is done through the Cluster Bin Log Injector Thread.

Therefore you can have many MySQLD processes feeding the Master Cluster, but have the Slave only reading from one of them and not miss a beat (or data action).

One other questions is why do you have skip-slave-start in you Slave my.cnf? You saw that too, hmmm.

Answer:
Sometimes your Master Cluster will have been up a long time and you are wanting to bring a slave on-line, but you really don't want the slave in catch-up mode for days. So you can start the the MySQLD on the slave telling it to not start replication yet, restore a backup from the Master, run a couple of command, then start the slave and you will catch-up quickly and be in sync.

To read more about this, checkout:

http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-replication.html

If you have ideas for testing, or have different scenarios you would like to see tested, please send them to me. Always looking to improve.

Cheers!

9 comments:

Mike Kruckenberg said...

I'm tasked with testing the cluster in our environment and determining if/when it might meet our needs.

What's your environment like? Are you using high-end production machines?

What are you using to put load on the cluster and benchmark performance of things like memory vs. disk? How much load are you finding the cluster can support

For testing replication do you have a duplicate cluster that you're running the same kinds of load against?

jbm said...

Hi Michal.

"I'm tasked with testing the cluster in our environment and determining if/when it might meet our needs."

Pretty big task :-)

"What's your environment like? Are you using high-end production machines?"

I have a mix of systems that I am using. My most common is "2.6.12-1.1447_FC4smp #1 SMP Fri Aug 26 20:57:13 EDT 2005 i686 athlon i386 GNU/Linux"

Most testing is done on older systems with about 4GB of RAM, but running (as you see above) pretty recent FC release.

"What are you using to put load on the cluster and benchmark performance of things like memory vs. disk? How much load are you finding the cluster can support"

I use anything I can get my hands on for load. This include using performance tests to generate load. For Performance testing I have been using Atomics and TPC-B. I have just migrated DBT2 (TPC-C) scripts and starting to uses these for load and will be using them later for performance testing.

My main issue with load testing is, for example, that my MySQLD processes max out long before my NDB processes do.

An example of load testing I did a couple of weeks ago:

I had 7 host main cluster and a 2 host slave.

Out of the 7 hosts, 4 were Data Node hosts , 1 was ndb_mgmd & MySQLD, and the other 2 host contained 2 mysqld processes each. With all 5 MySQLD processes maxed, the NDB was only between 25 - 40% CPU.

The slave of 2 hosts had 2 data nodes, 1 ndb_mgmd and 1 MySQLD. IT kept up without issue.

I will run stress/load again soon and will need to figure out how to put more MySQLD processes against the cluster to get the load up higher.

"For testing replication do you have a duplicate cluster that you're running the same kinds of load against?"

No, see above. Of course I an not updating the cluster directly, only through replication.

Hope this helps/

jbm said...

"No, see above. Of course I an not updating the cluster directly, only through replication."

Want to be clear here. I am not updating the slave cluster directly, I am updating it through replication. I am running select type of statements and show statments.

Example "SHOW SLAVE STATUS\G"

Best wished

Webincome said...

Great Blog, I found your postings really interesting,
but did you know that if you dont back up your data base you could lose it all
belive me I kow, thats why I created a fully automatic script that will back up any mysql database, and I kept the cost low...
check it out today at http://www.databasebackup.co.uk

jbm said...

Webincome said...

Thanks for the comments, but the MySQL Cluster has real time backups built in that account for on going transaction while the back up takes place. Not sure a third party software could replace that.

Anonymous said...
This comment has been removed by a blog administrator.
rantsh said...

Hello, I can see you're a Senior Lead Quality Assurance Developer so I guess that makes you the perfect person for me to ask this...

Lately, I've been working on a MySQL Cluster implementation for the company I work for, I did set it up but using v5.0.45 which is (by my understanding) the main/most stable branch, now the heads of the company want the ability to replicate the cluster to a remote location for backup, this obviously can be done asynchronously... But I just bumped with the issue that MySQL Cluster 5.0.x does not work with replication.

So I guess my 2 questions would be:

1.- Is MySQL Cluster version 5.1.x suitable for production enviroments?
2.- How different is the implementation of the cluster between the 2 versions, I know 5.1 works with disk based tables (which is GOOD) but does is the installation and configuration as "straight-forward" as the installation of version 5.0.45?

I understand you may not have the time to reply to this soon, but it'd be nice to know since this questions are not really clear in the docs.

Either way thanks for any reply you can provide, and I hope to hear from you...

Roderick Smith

PS: If you prefer you can email me at rantsh [at] hotmail [dot] com

Anonymous said...

Hi there, I found your blog via Google while searching for first aid for a heart attack and your post looks very interesting for me.

Anonymous said...

Good dispatch and this post helped me alot in my college assignement. Thank you for your information.