Speed up large MySQL backup, dump, restore processes

by jagbir on April 19, 2011

In general, MySQL is quite fast at restoring data, but I observed that restoring a 20GB backup was taking much longer than usual. This can happen when you don't have enough memory, or if key_buffer_size is not set high enough, in which case re-indexing the data can take a very long time. On a CentOS 5.2 server with 6GB of RAM, I noticed key_buffer_size was set to just 800M, which is quite low; you should set it to at least 20-25% of total RAM. After increasing the value to 2GB, MySQL was able to reload the data quite fast.
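
For reference, this is roughly what that change looks like in my.cnf on the server above (a minimal sketch; the file location and the right value depend on your distribution and RAM, and MySQL needs a restart to pick it up):

[mysqld]
# key cache for MyISAM indexes; roughly 20-25% of RAM, here 2GB on a 6GB box
key_buffer_size = 2G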

Though this did the trick for me, there are other helpful suggestions you can try to speed up your backup/restore process. A few are as follows:

* Increase the value of the key_buffer_size variable in my.cnf to 20-25% of available RAM, as mentioned above. Be aware that if you make the value too large, say more than 50% of total RAM, your machine may start to page and become extremely slow. This is because MySQL relies on the operating system to perform file system caching for data reads, so you must leave some room for the file system cache. Also remember that on a 32-bit system the maximum value key_buffer_size can take is 4GB, and setting it any higher might crash your MySQL server.

* For the InnoDB engine, another very important setting you should consider is innodb_flush_log_at_trx_commit = 2 during the restore. Reset it to 1 again after the restore if you need full ACID durability (see the session-settings sketch after this list).

* Again for InnoDB and foreign keys, you can also disable foreign key checks and re-enable them at the end (using SET FOREIGN_KEY_CHECKS=0 and SET FOREIGN_KEY_CHECKS=1).

* A few Maatkit tools like mk-parallel-dump / mk-parallel-restore add the ability to dump/restore with multiple threads, utilizing the available CPUs, but beware that Maatkit itself does not recommend using these tools for backups.

* Try using extended inserts in dumps, but note that the client has a fixed-size buffer for each line it reads, and extended inserts may exceed that limit, so you may not be able to restore very large dumps (see the max_allowed_packet sketch after this list).

* Back up/dump data with the --tab format to make use of mysqlimport, which is faster than mysql < dumpfile (a sketch using mysqldump --tab and mysqlimport follows this list).

* You can try importing data with multiple threads, i.e., start an individual process to import each table.

* Change the database engine if possible. Importing into a heavily transactional engine like InnoDB is painfully slow; inserting into a non-transactional engine like MyISAM is much faster.

* You can also try turning off foreign key checks and disabling autocommit for the duration of the import, committing once at the end.

* Also check the usage of mysqldump's --disable-keys option (note that it only has an effect on MyISAM tables, not InnoDB).

* If you think the network may be a bottleneck, try --compress when using mysqldump against a remote server (see the small example after this list).

* Last but not least, you can take a directory (file-level) backup and restore it when required, provided the MySQL versions match. You just need to stop your MySQL server, copy the whole data directory, compress it and store it at your backup location, then restore it the same way when needed. It's one of the fastest methods available, and I would recommend it if you have a separate slave host taking regular backups. You can also look at LVM snapshot based backups for more efficiency. A sketch of the directory method follows below.
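
To tie the InnoDB-related tips together (innodb_flush_log_at_trx_commit, foreign key checks, autocommit), here is a minimal sketch of a restore wrapped in those settings; the database name mydb and the file dump.sql are just placeholders:

# relax log flushing for the duration of the restore (dynamic variable, needs SUPER)
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"

# load the dump in one session with FK checks off and autocommit off, committing once at the end
( echo "SET foreign_key_checks=0; SET autocommit=0;"; cat dump.sql; echo "COMMIT;" ) | mysql mydb

# restore full ACID behaviour afterwards
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 1;"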
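
For the extended-inserts point, the fixed-size buffer in question is max_allowed_packet. A rough example of raising it on both the server and the client while loading a big dump (256M is an arbitrary illustrative value, mydb and dump.sql are placeholders):

# raise the server-side limit for new connections (dynamic, needs SUPER)
mysql -e "SET GLOBAL max_allowed_packet = 256*1024*1024;"

# raise the client-side limit too while feeding in the dump
mysql --max_allowed_packet=256M mydb < dump.sql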
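
For the --tab / mysqlimport route (and the per-table parallelism mentioned above), a rough sketch; the dump directory must exist and be writable by the mysqld server, and mydb is a placeholder:

# writes a .sql schema file plus a tab-separated .txt data file per table
mysqldump --tab=/var/dumps/mydb mydb

# recreate the tables on the target, then bulk-load the data files;
# --use-threads imports several tables in parallel if your mysqlimport version supports it
cat /var/dumps/mydb/*.sql | mysql mydb
mysqlimport --use-threads=4 mydb /var/dumps/mydb/*.txt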
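
For dumps pulled over the network, a small illustration of --compress (hostname, user and database are placeholders); compressing the resulting file also keeps it small on disk:

# compressed client/server protocol for the transfer, gzip for the file itself
mysqldump --compress -h db.example.com -u backup -p mydb | gzip > mydb.sql.gz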
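
And a sketch of the directory-level backup from the last tip, assuming the default /var/lib/mysql data directory and a CentOS-style init script (service names and paths vary by distribution):

# stop the server so the files on disk are consistent, then archive the data directory
/etc/init.d/mysqld stop
tar czf /backup/mysql-data-$(date +%F).tar.gz /var/lib/mysql
/etc/init.d/mysqld start

# to restore on a host running the same MySQL version:
# stop mysqld, unpack the archive, fix ownership, start mysqld again
# tar xzf /backup/mysql-data-<date>.tar.gz -C /
# chown -R mysql:mysql /var/lib/mysql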

Do you know of any other tips/tricks that can speed up taking or restoring MySQL dumps? Please share by posting a comment below.

More related and helpful articles:
* Optimize MySQL on a large Database Server
* Recover or reset root password of MySQL and PostgreSQL Servers
* How to setup MySQL Cluster 7.0 in Redhat based Linux
* Optimize and fix MySQL Server running slow without any load
* Quick and simple php based script to check replication status of multiple MySQL Servers

  • http://www.nocturnal.org Matthew LEnz

    “* For InnoDB engine, another very important setting you should consider is to set innodb_flush_log_at_trx_commit = 2. Reset it to 1 again after restore if you need ACID.”

    Just tried this. Shaved 2 minutes off a 35-minute import (down to 33 min).

    Foreign key checks should be disabled automatically if you are using a modern version of MySQL and a dump from mysqldump; it has the hints at the beginning of the script.

  • Jeff B

    In my experience MySQL is horribly slow doing restores. It is also horribly slow doing schema alters, slower than any other database. For example, I have two identical databases over 200GB in size, one in PostgreSQL and one in MySQL. I have these two because I migrated from MySQL to PostgreSQL two years ago; the database now is quite a bit larger, but I kept these two copies for testing.

    MySQL will spend 20 hours on a restore of this database, and separately from a restore will spend 3 hours on an alter table operation on the biggest table. PostgreSQL will restore the whole thing in 25 minutes on the same hardware, and do an alter on the same table in 8 minutes. I have tuned MySQL extensively and it just doesn’t scale at the same rate as your data does. I mean, my server has 96GB of RAM and I have tried MySQL at insanely high levels and it really doesn’t give back as much as you give it.

    I would also add that the slow link in my experience is with InnoDB. I'm sure MySQL would fare better under MyISAM, but then you lose some important features.

    It is incredibly easy to migrate from MySQL to PostgreSQL. Just migrate your schema first (leaving foreign keys out until after restore) and then the mysqldump files are almost perfectly compatible with PostgreSQL when using:
    mysqldump --compatible=postgresql --no-create-db --no-create-info --add-locks=false --quote-names=false --complete-insert=true

    I just had to convert \’ to two single quotes and fix MySQL’s infamous null dates (0000-00-00 00:00:00) to be true nulls and was able to pipe the backup right into psql.
    sed -e "s/\\\'/\'\'/g" -e "s/\'0000-00-00 00:00:00\'/null/g" backup.sql > output.sql

  • Jeff B

    Bottom line, I would highly recommend migrating your 20GB MySQL database before it gets much bigger.

  • http://linuxadminzone.com jagbir

    @Matthew LEnz: thanks for the comment, yeah, it should do that automatically.

    @Jeff B: Thanks a lot for the useful information. I know MySQL will take a lot of time for 200GB of data, but it's a real eye opener that PostgreSQL is doing it in just 25 minutes on the same hardware.

    btw, did you try XtraDB from Percona? I must say it has lots of enhancements.

  • sanchez

    Hi Jagbir, I love your blog, thanks for all the information that you provide us. I have a question regarding MySQL. I use Rackspace Cloud servers, and the largest cloud server has 16GB of RAM and 630GB of space.

    My question is, what if my MySQL database grows larger than the 630GB of space? What is the solution for that?

    • http://linuxadminzone.com jagbir

      Hi Samuel, thanks for the appreciation. You need to keep an eye on the growth of the DB size and proactively take steps to avoid any issues. I don't have extensive experience with the Rackspace cloud, but there are a few things you can consider:
      - Cloud providers usually have a facility to attach another/extra drive to your host to increase capacity; check about it and store your DB on a separate drive. There should be some offering along the lines of the elastic storage AWS provides.
      - If possible, split your DB and store it on separate hosts; if that's not doable, check the possibility of archiving really old records to reduce the size of the DB.
      - Make sure not to store any extra backups or large files on your DB host, i.e., keep it for the DB only.
      - If you are really concerned and don't think there's a solution for the growing DB size, it's time to explore migrating the DB to another provider. Let me know how you proceed. :)

      • ali

        I use a database of 1.5GB, and for me the GUI tool dbForge for MySQL is enough, but the thing is the tool is Windows-only.

  • Sam

    Awesome, thanks! This shaved about 80% off my MySQL restore times: I was able to restore a 100MB+ dump in 2 minutes instead of the 10 minutes it would usually take. I used only the key_buffer_size setting, but I'm using a server with 16GB RAM and key_buffer_size set to 4GB.

  • Kimber Esplin

    My teams in two companies have now built solutions that used AWS and the Amazon Cloud Solution including systems that serve users with millions of page-views and mobile solutions serving the US, Mexico, China, Taiwan, Korea, ….

    • http://linuxadminzone.com jagbir

      Good to know that Kimber. What are these solutions?
