Backup and Disaster Recovery

From WeBWorK_wiki
Revision as of 15:48, 23 July 2015 by Dglin (talk | contribs) (Create the page. Still lacking detail on rsync to another server.)

Overview

The following provides some best-practice suggestions on setting up regular, automated backups of student data from WeBWorK. The idea is that in the case of a server failure, all student progress up to the last backup can be restored on a new machine.

What to Back Up

In order to restore student progress system-wide, there are two main components that need to be saved:

  • The WeBWorK database (stored in MySQL or MariaDB)
  • The WeBWorK courses directory (typically /opt/webwork/courses)

Backing up the Database

The easiest way to back up a database is to use the mysqldump command. This command creates a text file with all of the commands required to rebuild the database.

As with any mysql command, it must be run by an authenticated database user with sufficient privileges to read the entire database. Luckily, we can use the 'webworkWrite' user, which already exists to allow the WeBWorK web server to interact with the database. Since we plan to set mysqldump to run automatically, we will store the MySQL username and password in a configuration file which can be read by the system. You will first need to look up the password for this user. It can be found in the site.conf configuration file under the variable $database_password. In your home directory, edit the file .my.cnf (it may not yet exist) and add the following:

[client]
user=webworkWrite
password=[your-webworkWrite-password]

You can test this by running the command mysql webwork from the command line. It should now let you access the webwork database without having to type a password.
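
Because .my.cnf now holds a database password in plain text, it is worth making the file readable only by its owner. The following is a minimal sketch, assuming the file is at the default location ~/.my.cnf; the function name protect_cnf is hypothetical and introduced only for illustration.

```shell
#!/bin/bash
# protect_cnf: restrict a MySQL option file to its owner (mode 600).
# The function name is illustrative; it is not part of WeBWorK or MySQL.
protect_cnf() {
    local cnf="${1:-$HOME/.my.cnf}"
    # Refuse to continue if the file does not exist yet.
    [ -f "$cnf" ] || { echo "no such file: $cnf" >&2; return 1; }
    chmod 600 "$cnf"
}
```

After running protect_cnf, the command ls -l ~/.my.cnf should show the mode -rw-------.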

Once this is configured, you can use the following script to create a backup of the webwork database. Note that this assumes that you have created a directory called /webwork_backup and that it is writeable by your user. The script also zips the backup in order to save space. You can save this script in the bin subdirectory of your home directory. Perhaps call it ~/bin/wwmysqldump.

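#!/bin/bash
# Make sure mysqldump can find ~/.my.cnf when the script runs from cron
HOME=/path/to/your/home/directory
# Dump the webwork database and compress the result
/usr/bin/mysqldump --opt webwork | /bin/gzip -c > /webwork_backup/webwork.sql.gz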

Don't forget to make the file executable:

> chmod u+x ~/bin/wwmysqldump

You can now test this by running wwmysqldump. It should create the file /webwork_backup/webwork.sql.gz. Note that if you run the script again, it will overwrite the backup with a newer version. Later on we will use logrotate to keep multiple backups.
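
Because the dump is piped through gzip, a failed mysqldump can still leave behind a valid but empty compressed file. The sketch below checks that the backup is both intact and non-empty. The function name check_dump is hypothetical, and the default path is the one assumed by the script above.

```shell
#!/bin/bash
# check_dump: confirm that a gzipped SQL dump is intact and not empty.
# The function name is illustrative only.
check_dump() {
    local dump="${1:-/webwork_backup/webwork.sql.gz}"
    # gzip -t verifies the compressed stream without writing any output.
    gzip -t "$dump" 2>/dev/null || { echo "corrupt or missing: $dump" >&2; return 1; }
    # An empty dump usually means mysqldump failed before writing anything.
    [ "$(gzip -dc "$dump" | wc -c)" -gt 0 ] || { echo "empty dump: $dump" >&2; return 1; }
    echo "ok: $dump"
}
```

To restore on a new machine, the dump can be fed back into MySQL with gunzip -c /webwork_backup/webwork.sql.gz | mysql webwork, assuming an empty webwork database and the .my.cnf credentials have already been set up there.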

Backing up the Courses Directory

You can use the tar command to create an archive of the courses directory. You can do this using the following command, which we will save in a script called ~/bin/wwtar.

#!/bin/bash
# z = compress with gzip, c = create an archive, f = write it to the named file
/bin/tar zcf /webwork_backup/webwork-courses.tar.gz /opt/webwork/courses

Again, make the file executable:

> chmod u+x ~/bin/wwtar

Now if you run wwtar, it should create the file /webwork_backup/webwork-courses.tar.gz. Note that this may take a long time depending on the number of courses, and how many students they contain. As with wwmysqldump, running the command more than once will overwrite the previous archive.
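
It is worth confirming that the archive can actually be read before relying on it. The sketch below lists the archive's contents and reports whether tar could read it to the end. The function name check_archive is hypothetical, and the default path is the one used by wwtar above.

```shell
#!/bin/bash
# check_archive: confirm that a gzipped tar archive can be read in full.
# The function name is illustrative only.
check_archive() {
    local archive="${1:-/webwork_backup/webwork-courses.tar.gz}"
    # tar ztf lists every entry, so a damaged or missing file makes it fail.
    if tar ztf "$archive" > /dev/null 2>&1; then
        echo "ok: $archive"
    else
        echo "unreadable: $archive" >&2
        return 1
    fi
}
```

GNU tar strips the leading / from file names when creating an archive, so the courses directory can be restored to its original location with tar zxf /webwork_backup/webwork-courses.tar.gz -C /.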

Setting the Backups to Run Nightly

We will use the cron system to schedule the two backups above to run nightly. To do so, we must add entries to the cron table. Do this by running the command crontab -e as the user who will be performing the backups. The default editor for crontab is often vi; if you wish to use a different editor, first run the command export EDITOR=nano, replacing nano with your favourite text editor. You should then insert the following lines:

30 2 * * * /home/[your_username]/bin/wwmysqldump
30 3 * * * /home/[your_username]/bin/wwtar

The first two numbers are the minute and hour when the script should run. The three stars indicate that the script should run every day of the month, every month, and every day of the week. This means that wwmysqldump will run daily at 2:30am, and wwtar will run daily at 3:30am.

Keeping Several Backups Using logrotate

logrotate is a tool which is designed to manage log files. It takes care of making backups, and can delete files that are older than a specified range. We can use this in combination with the above scripts to keep several backups. logrotate configuration is stored in the /etc/logrotate.d directory. You should create one logrotate configuration for each of the two backups created above.

Create a file called /etc/logrotate.d/webworkdatabase with the following contents:

/webwork_backup/webwork.sql.gz {
daily
rotate 7
missingok
sharedscripts
}

Every day (daily), this configuration causes logrotate to back up the file webwork.sql.gz by moving it to a file with an index on the end (e.g. webwork.sql.gz.1 or webwork.sql.gz-[date] depending on the logrotate settings). It will keep the most recent 7 of these files, and delete any older files.

Similarly, create /etc/logrotate.d/webworkcourses:

/webwork_backup/webwork-courses.tar.gz {
daily
rotate 7
missingok
sharedscripts
}

For both of these files, you can change how many previous backups to keep by changing the number after rotate. For the courses directory, you may need to back this up less frequently. Theoretically student progress can be restored strictly from the database, but any questions that were authored or modified locally will be contained in the courses directory.

Copying Your Backups Off-Site

All of the above is useless if the hard drive on your WeBWorK server dies, so the next step is to make sure these backup files are stored on another computer. There are two ways to accomplish this:

  1. Mount an external drive at the /webwork_backup directory. This could be either an attached USB drive, or a remote file share.
  2. Set up a remote sync either to or from another server. This can be done using the rsync command over ssh, and can be automated by creating a key-pair to log in from one server to the other without requiring a password.
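
As a rough illustration of the second option, the sketch below pushes the backup directory to another machine with rsync over ssh. The function name push_backups and the default destination (backupuser@backup.example.edu:/srv/webwork_backup/) are placeholders invented for this example, not a real server or an established WeBWorK convention; substitute your own values.

```shell
#!/bin/bash
# push_backups: copy the local backup directory to another machine.
# The default destination (user, host, and path) is a placeholder.
push_backups() {
    local src="${1:-/webwork_backup/}"
    local dest="${2:-backupuser@backup.example.edu:/srv/webwork_backup/}"
    # -a preserves permissions and timestamps; -z compresses data in transit.
    # -e ssh makes the transport explicit for remote destinations.
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -az -e ssh "$src" "$dest"
}
```

For this to run unattended, the backup user needs a key-pair that the remote server accepts: ssh-keygen -t ed25519 creates one, and ssh-copy-id backupuser@backup.example.edu installs the public key on the remote side. A script that calls push_backups can then be added to the cron table at a time after both nightly backups have finished.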