Clean Out Temporary Files

From WeBWorK_wiki

Latest revision as of 17:13, 14 March 2012

The WeBWorK system creates and saves a large number of temporary files, such as png images of equations and pdf downloads. If, for example, a png image of an equation already exists, it is faster for WeBWorK to locate and use that image than to create a new one. However, the number of temporary files eventually becomes very large (especially for large installations), and over time only a very small number of them will be used. Thus it is a good idea, especially for large installations, to clean out temporary files, say, every semester. WeBWorK will recreate any temporary file it needs that does not already exist, so there is no harm in removing temporary files.

Removing temporary files when a wwtmp directory is set up

Removing temporary files is most important for large installations, and I assume all large installations will set up WeBWorK so that all temporary files are placed in a separate directory or partition (see Store WeBWorK's temporary files in a separate directory or partition). In this case, simply do the following:

cd to the wwtmp directory (e.g. cd /var/www/wwtmp)

and, as root, remove everything in that directory (before doing this, triple check that you are in the wwtmp directory!)

sudo rm -rf *
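
Because a mistyped directory makes the command above dangerous, the "triple check" can also be scripted. The following is only a minimal sketch: the function name clean_wwtmp and the rule that the directory must be named exactly wwtmp are assumptions made for illustration, not part of WeBWorK. Unlike rm -rf *, it also removes hidden files.

```shell
# Hypothetical helper: empty a wwtmp directory only after checking its name.
# The function name and the name check are illustrative assumptions.
clean_wwtmp() {
    dir=${1:?usage: clean_wwtmp /path/to/wwtmp}
    # Refuse to run unless the last path component is exactly "wwtmp".
    case "$(basename "$dir")" in
        wwtmp) ;;
        *) echo "refusing: $dir is not a wwtmp directory" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Delete the contents (dotfiles included) but keep the directory itself.
    # The trailing slash makes find descend even if wwtmp is a symlink.
    find "$dir/" -mindepth 1 -delete
}

# Example, run as root like the manual steps above:
# clean_wwtmp /var/www/wwtmp
```

The depths table still has to be cleaned afterwards, exactly as with the manual command.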

We still have to clean out the depths table in the MySQL database that saves information on the depths of the png images of equations so that they get displayed nicely. Fortunately this is very easy to do. Simply run the command

remove_stale_images --delete --days 0

That command will report

Removed 0 images.

since you have already removed all the temporary images (or on a very active server, it may remove a few that have just been created). The important thing is that it silently cleans out the depths table. In fact, remove_stale_images works by completely clearing out the depths table and then adding back the necessary information for any images that remain.

NOTE: There is a bug in the file PGcore.pm prior to revision 7138 that prevents the temporary directory from being recreated with the correct permissions if the course name contains a dash (i.e. a -). To check the revision number of PGcore.pm, go to the directory containing PGcore.pm (usually /opt/webwork/pg/lib) and run the command svn status -u -v PGcore.pm. If you have an older copy, you can update it with the command svn update PGcore.pm.

Using the remove_stale_images script

If you have a small installation and do not have a wwtmp directory, you can still easily remove all or some of the temporary equation images by running the remove_stale_images command:

remove_stale_images --delete 

will remove all images older than 7 days.

remove_stale_images --delete --days 0

will remove all images.

remove_stale_images --help

will list all options.

How often should temporary files be removed

It would not be a good idea to do this on a daily basis since some of these files, especially the image files, are reused, and recreating them takes a lot more resources. The default for remove_stale_images --delete is to remove images over 7 days old, since it is assumed that students will only be working on problems for 7 days. Running remove_stale_images --delete --days 14 every month would clean out the image files that are over 14 days old. There is a trade-off between removing everything and the extra resources it takes to recreate files that are needed again. Totally cleaning out wwtmp every 3 or 6 months is probably sufficient.
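
To help pick an age threshold, it can be useful to see how many files a given cutoff would affect before deleting anything. The following is a rough sketch using find; the function name count_stale is hypothetical, and it only looks at file modification times under a directory, so it is not a substitute for remove_stale_images and does not touch the depths table.

```shell
# Hypothetical read-only helper: count regular files under DIR whose
# modification time is older than DAYS days, as measured by find -mtime.
count_stale() {
    dir=${1:?usage: count_stale DIR DAYS}
    days=${2:?usage: count_stale DIR DAYS}
    # -mtime +N matches files modified more than N whole days ago.
    find "$dir/" -type f -mtime +"$days" | wc -l | tr -d ' '
}

# Example: compare two candidate cutoffs (the path is an assumption).
# count_stale /var/www/wwtmp 7
# count_stale /var/www/wwtmp 14
```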

Using Cron Jobs to remove temporary files

Instead of cleaning out temporary files on an ad hoc basis as above, it is probably better to clean them out on a regular automatic schedule. In addition, pdf copies of downloaded problem sets are saved in a temporary directory (wwtmp/.../hardcopy) so that they can be downloaded from the web. After the download, however, the pdf file remains and is visible from the web to anyone who knows the URL. For this reason we recommend deleting all such files that are over one hour old. Similarly, we recommend deleting all png, gif, and html links under wwtmp that are over 30 days old. Finally, we recommend deleting, every week, all equation images that are over 14 days old. The following cron jobs will accomplish this. The first runs every 30 minutes, the next three twice a month, and the last one weekly on Sunday morning. These cron jobs should be run as root. We use crontab -e to edit root's crontab file:

$ su
<root password>
# crontab -e

Now add the lines

WEBWORK_ROOT=/opt/webwork/webwork2
*/30 * * * * find /var/www/wwtmp/*/hardcopy/*  -mmin +60  -name "*" -delete
5 5 1,15 * *  find /var/www/wwtmp/*/gif/  -mtime +30  -name "*" -delete
5 5 2,16 * *  find /var/www/wwtmp/*/png/  -mtime +30  -name "*" -delete
5 5 3,17 * *  find /var/www/wwtmp/*/html/  -mtime +30  -name "*" -delete
4 5 * * 0 /opt/webwork/webwork2/bin/remove_stale_images --delete --days 14

and save the file and quit

# exit
$
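
Before installing the cron jobs, it may be worth previewing what the four find rules would delete. The sketch below repeats the same paths and age tests but replaces -delete with -print, so nothing is removed. The function name preview_cron_cleanup is hypothetical, and the default root of /var/www/wwtmp is taken from the crontab lines above; adjust it if your installation differs.

```shell
# Hypothetical dry run of the four find rules from the crontab:
# -print replaces -delete, so matching paths are listed and nothing is removed.
preview_cron_cleanup() {
    root=${1:-/var/www/wwtmp}
    # Globs are left unquoted on purpose: the shell expands one path per course.
    # Errors from globs that match nothing are discarded.
    find "$root"/*/hardcopy/* -mmin +60 -print 2>/dev/null
    for sub in gif png html; do
        find "$root"/*/"$sub"/ -mtime +30 -print 2>/dev/null
    done
}

# Example, run as root so that every course directory is readable:
# preview_cron_cleanup /var/www/wwtmp
```

If the listing contains anything unexpected, adjust the paths before adding the lines to the crontab.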