Backup and Disaster Recovery

Overview

The following provides some best-practice suggestions on setting up regular, automated backups of student data from WeBWorK. The idea is that in the case of a server failure, all student progress up to the last backup can be restored on a new machine.

What to Back Up

In order to restore student progress system-wide, there are two main components that need to be saved:

  • The WeBWorK database (stored in MySQL or MariaDB)
  • The WeBWorK courses directory (typically /opt/webwork/courses)

Backing up the Database

The easiest way to back up a database is to use the mysqldump command. This command creates a text file with all of the commands required to rebuild the database.

As with any mysql command, it must be run by an authenticated database user with sufficient privileges to read the entire database. Luckily, we can use the 'webworkWrite' user, which already exists to allow the WeBWorK web server to interact with the database. Since we plan to set mysqldump to run automatically, we will store the MySQL username and password in a configuration file which the MySQL client tools read automatically. You will first need to look up the password for this user. It can be found in the site.conf configuration file under the variable $database_password. In your home directory, edit the file .my.cnf (it may not yet exist) and add the following:

[client]
user=webworkWrite
password=[your-webworkWrite-password]

You can test this by running the command mysql webwork from the command line. It should now let you access the webwork database without having to type a password.
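
Because .my.cnf now holds a database password in plain text, it is worth making sure that other users on the server cannot read it. A minimal sketch, assuming a small helper named lock_down (a name made up for this example, not part of WeBWorK or MySQL):

```shell
#!/bin/bash
# lock_down FILE: restrict FILE to read/write by its owner only (mode 600),
# then list it so the new permissions can be confirmed by eye.
# "lock_down" is a hypothetical helper name used only for illustration.
lock_down() {
    chmod 600 "$1" && ls -l "$1"
}
```

Running lock_down ~/.my.cnf should then list the file with permissions beginning -rw-------.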

Once this is configured, you can use the following script to create a backup of the webwork database. Note that this assumes that you have created a directory called /webwork_backup and that it is writeable by your user. The script also zips the backup in order to save space. You can save this script in the bin subdirectory of your home directory. Perhaps call it ~/bin/wwmysqldump.

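#!/bin/bash
# mysqldump reads its credentials from $HOME/.my.cnf, so set HOME explicitly.
HOME=/path/to/your/home/directory
# Dump the webwork database and compress it on the fly.
/usr/bin/mysqldump --opt webwork | /bin/gzip -c > /webwork_backup/webwork.sql.gz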

Don't forget to make the file executable:

> chmod u+x ~/bin/wwmysqldump

You can now test this by running wwmysqldump. It should create the file /webwork_backup/webwork.sql.gz. Note that if you run the script again, it will overwrite the backup with a newer version. Later on we will use logrotate to keep multiple backups.
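
One weakness of a plain redirect is that a failed dump (for example, when the database server is down) still replaces the previous backup with an empty or partial file. The following is a hedged sketch of one way around this, not the method used by the script above: safe_dump is a hypothetical helper, and the .partial suffix is a naming choice made for this example.

```shell
#!/bin/bash
# safe_dump OUTFILE CMD [ARGS...]
# Run CMD, gzip its output into OUTFILE.partial, and move that file over
# OUTFILE only if CMD succeeded. On failure the old OUTFILE is left untouched.
# "safe_dump" and the ".partial" suffix are illustrative names only.
safe_dump() {
    local out="$1"
    shift
    local tmp="$out.partial"
    # pipefail makes the pipeline fail if CMD fails, not only if gzip fails.
    if ( set -o pipefail; "$@" | gzip -c > "$tmp" ); then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Under these assumptions, the dump line of wwmysqldump could become safe_dump /webwork_backup/webwork.sql.gz /usr/bin/mysqldump --opt webwork.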

Backing up the Courses Directory

You can use the tar command to create a compressed archive of the courses directory. Save the following as a script called ~/bin/wwtar.

#!/bin/bash
/bin/tar zcf /webwork_backup/webwork-courses.tar.gz /opt/webwork/courses

Again, make the file executable:

> chmod u+x ~/bin/wwtar

Now if you run wwtar, it should create the file /webwork_backup/webwork-courses.tar.gz. Note that this may take a long time depending on the number of courses, and how many students they contain. As with wwmysqldump, running the command more than once will overwrite the previous archive.
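
Since both backups are gzip-compressed, either file can be given a quick sanity check after it is written. The sketch below assumes a hypothetical helper named check_backup; it only confirms that the file is non-empty and is a valid gzip stream, not that the contents are a complete dump or archive.

```shell
#!/bin/bash
# check_backup FILE: succeed only if FILE exists, is non-empty,
# and passes gzip's built-in integrity test (gzip -t).
# "check_backup" is an illustrative name, not an existing tool.
check_backup() {
    local file="$1"
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    gzip -t "$file" 2>/dev/null || { echo "not a valid gzip file: $file" >&2; return 1; }
    echo "ok: $file"
}
```

For example, check_backup /webwork_backup/webwork.sql.gz and check_backup /webwork_backup/webwork-courses.tar.gz could be run by hand after the first test of each script.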

Setting the Backups to Run Nightly

We will use the cron system to schedule the two backups above to run nightly. To do so, we must add entries to the cron table. Do this by running the command crontab -e as the user who will be performing the backups. The default editor for crontab is usually vi; if you wish to use a different editor, first run the command export EDITOR=nano, replacing nano with your favourite text editor. You should then insert the following lines:

30 2 * * * /home/[your_username]/bin/wwmysqldump
30 3 * * * /home/[your_username]/bin/wwtar

The first two numbers are the minute and hour when the script should run. The three stars indicate that the script should run every day of the month, every month, and every day of the week. This means that wwmysqldump will run daily at 2:30am, and wwtar will run daily at 3:30am.

Keeping Several Backups Using logrotate

logrotate is a tool which is designed to manage log files. It takes care of making backups, and can delete files that are older than a specified range. We can use this in combination with the above scripts to keep several backups. logrotate configuration is stored in the /etc/logrotate.d directory. You should create one logrotate configuration for each of the two backups created above.

Create a file called /etc/logrotate.d/webworkdatabase with the following contents:

/webwork_backup/webwork.sql.gz {
daily
rotate 7
missingok
sharedscripts
}

Every day (daily), this configuration will rotate the file webwork.sql.gz by moving it to a file with an index on the end (e.g. webwork.sql.gz.1 or webwork.sql.gz-[date] depending on the logrotate settings). It will keep the most recent 7 of these files, and delete any older files.

Similarly, create /etc/logrotate.d/webworkcourses:

/webwork_backup/webwork-courses.tar.gz {
daily
rotate 7
missingok
sharedscripts
}

For both of these files, you can change how many previous backups to keep by changing the number after rotate. The courses directory may not need to be backed up as frequently as the database. Theoretically, student progress can be restored strictly from the database, but any questions that were authored or modified locally will be contained in the courses directory.

Copying Your Backups Off-Site

All of the above is useless if the hard drive on your WeBWorK server dies, so the next step is to make sure these backup files are stored on another computer. There are two ways to accomplish this:

  1. Mount an external drive at the /webwork_backup directory. This could be either an attached USB drive, or a remote file share.
  2. Set up a remote sync either to or from another server. This can be done using the rsync command over ssh, and can be automated by creating a key-pair to log in from one server to the other without requiring a password.

We will provide instructions for the second method here. You must first decide whether you will be running your sync on the WeBWorK server or on your other server. Whichever server runs the sync will need to be able to log in to the other via ssh without having to type a password, which is what the key pair mentioned above provides. Make sure you set this up as the same user who will be performing the backups.
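
One common way to get a password-less login is an ssh key pair with an empty passphrase. The sketch below assumes a typical OpenSSH installation and is not necessarily the procedure the authors had in mind; make_backup_key is a hypothetical helper name, and the default key path is simply the usual OpenSSH location.

```shell
#!/bin/bash
# make_backup_key [KEYFILE]
# Create an ed25519 key pair with an empty passphrase for unattended logins,
# unless KEYFILE already exists. Defaults to the usual OpenSSH key location.
# "make_backup_key" is an illustrative name only.
make_backup_key() {
    local keyfile="${1:-$HOME/.ssh/id_ed25519}"
    # Create the containing directory (private to the owner) if it is missing.
    mkdir -p -m 700 "$(dirname "$keyfile")"
    [ -f "$keyfile" ] || ssh-keygen -q -t ed25519 -N "" -f "$keyfile"
}
```

After running make_backup_key on the server that performs the sync, the public key can be installed on the other server with ssh-copy-id -i ~/.ssh/id_ed25519.pub [username]@other.server.name. Because the private key has no passphrase, anyone who can read it can log in as that user, so it is worth using an account with limited privileges.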

On the server that will be running the sync, you will need to create a script, say ~/bin/webworkrsync, to run the synchronization.

If you are running the sync from the WeBWorK server, the script would look like:

/usr/bin/rsync -al --delete -e '/usr/bin/ssh -l [remote_username]' /webwork_backup remote.server.name:/path/to/remote/backup

where you will need to fill in your username on the remote server, as well as the location on the remote server where the synchronized copy should be stored.

Similarly, if you are running the sync on the remote server, the script would look like:

/usr/bin/rsync -al --delete -e '/usr/bin/ssh -l [webwork_username]' webwork.server.name:/webwork_backup /path/to/remote/backup

Once you have created this script and made it executable (chmod u+x ~/bin/webworkrsync), you will need to schedule it to run nightly using cron. On the server that runs the sync, run the command crontab -e and add the following line:

30 4 * * * /home/[your_username]/bin/webworkrsync

Make sure you leave enough time for the other two backups to complete before starting the rsync.

If you have set this up properly, then you will have a copy of your /webwork_backup directory on the remote server which is synchronized with the WeBWorK server nightly.

Disaster Recovery

If your WeBWorK server blows up, you should be able to restore to the last backup using the following process:

  • Build a new WeBWorK server, and install WeBWorK as you would for a fresh installation. It is probably wise to install the same version of WeBWorK as was running on the failed server. Do not start the apache web server yet. Warning: the following steps assume that this is a brand new server containing no valuable course or student data. Following these instructions will destroy any courses on the new server, and overwrite them from the backup.
  • Copy the most recent database and courses backups from your remote backup server (or external hard drive) to the new server.
  • Remove the courses directory:
rm -rf /opt/webwork/courses
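  • Unpack the courses backup. tar stores the archived paths without the leading /, so extract relative to the root directory:
tar zxvf webwork-courses.tar.gz -C /

(run this command as root so that file ownerships are preserved; --same-owner is the default for root, but can also be given explicitly)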

  • Restore the MySQL database from the backup:
gunzip webwork.sql.gz
mysql -u root -p webwork < webwork.sql

(you will require the MySQL root password to complete the above command)

  • Start the apache web server. You should see all of your courses with student data restored.
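
Before relying on the backups, it may be worth rehearsing the unpack step in a scratch directory. tar stores the archived paths without their leading /, so the files reappear relative to whatever directory the archive is extracted in. A minimal sketch, assuming a hypothetical helper named restore_courses:

```shell
#!/bin/bash
# restore_courses ARCHIVE [ROOT]
# Unpack a courses archive under ROOT (default: /), preserving permissions.
# ROOT must already exist. "restore_courses" is an illustrative name only.
restore_courses() {
    local archive="$1"
    local root="${2:-/}"
    tar -xzpf "$archive" -C "$root"
}
```

With an existing scratch directory such as /tmp/restore-test (a path chosen for this example), restore_courses webwork-courses.tar.gz /tmp/restore-test should place the courses under /tmp/restore-test/opt/webwork/courses, where they can be inspected without touching a live installation. Run as root with no second argument, the same helper performs the real restore into /opt/webwork/courses.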