WeBWorK Main Forum

Serious webwork crash

by Rob Owen -
Number of replies: 3
Our local installation of webwork [v 2.x I believe] has always had some problems -- notably, it produces a series of warnings claiming that "" cannot be compared (via ==, <, etc.) in ContentGenerator.pm and a few other modules -- but it's been stable. Just last night, though, a macro I was debugging went a little haywire and now our webwork frontend (and maybe our server?) seems to be fried.

Specifically, the macro was supposed to generate random numbers according to a particular scheme and I mistakenly entered the wrong variable in the conditional, resulting in an infinite loop. This choked the system temporarily but such things had happened before without ill effect. Not this time. At first we got error messages saying "Cannot determine local time zone" (this was around midnight if that helps) whenever a WW problem set would load at all, which was rare. After I caught and fixed the error everything seemed fine but this afternoon webwork completely imploded. Even after restarting the mysql server -- it had apparently crashed some time during the festivities -- all we're getting now is about the following error message when we try to access a course:

Error messages

error instantiating DB driver WeBWorK::DB::Driver::SQL for table problem_user: DBI connect('webwork','webworkRead',...) failed: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111) at /usr/local/webwork2/lib/WeBWorK/DB/Driver/SQL.pm line 62
at /usr/local/webwork2/lib/WeBWorK.pm line 238

Call stack

The information below can help locate the source of the problem.

  • in Carp::croak called at line 221 of /usr/local/webwork2/lib/WeBWorK/DB.pm
  • in WeBWorK::DB::new called at line 238 of /usr/local/webwork2/lib/WeBWorK.pm

and now we're also getting:

error instantiating DB driver WeBWorK::DB::Driver::SQL for table permission: DBD driver has not implemented the AutoCommit attribute at /usr/local/lib/perl/5.8.4/DBI.pm line 670.
at /usr/local/webwork2/lib/WeBWorK.pm line 238


My questions are thus, in descending order of importance:

1) Can anyone figure out what this error means and suggest a fix? I'd like to avoid a reinstallation if possible, and we absolutely cannot afford to lose the data already online. Beyond that, I think we're open to pretty much anything (although I'm not sure how easy it will be to reboot the server).

2) How does one go about killing a webwork process (in particular, a problem set) that's locked in an infinite loop without killing the entirety of webwork? Sometimes webwork figured it out and stopped by itself, but that only happened about one in ten times.

3) What is causing all those ContentGenerator.pm warning messages about comparisons to empty strings? [I think there are also some from one of the problem set modules.] Are they dangerous? And how do we make them go away?

4) Can one "use strict;" or its equivalent in a .pg file? [I know you can in a .pl file, but I haven't been able to port this over.] Can one use the equivalent of perl -w? I've had enough problems with mismatched variable names that I'd like the option of automating that particular check if possible.

Any help you can give on the matter would be greatly appreciated.
In reply to Rob Owen

Re: Serious webwork crash

by Michael Gage -
(1) Your first error message suggests that your mysql database is not running. You should try to get that running first and protect your data. If the mysql server starts up easily then you can use the mysqldump command; you can also back up the directory where your mysql installation is storing its actual data (often /var/db/mysql). You will need to use the man pages and search the net for more detail on using mysqldump and repairing your mysql database if it is damaged.
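
A rough way to check point (1) from a script rather than by hand is sketched below in Python (WeBWorK itself is Perl, so this is only an illustration). It tries the socket path quoted in the error message and, if something answers, prints a mysqldump command line. The user name `backup_user` and the output file name are hypothetical placeholders; the database name `webwork` is taken from the DBI connect call in the error.

```python
import socket

# Socket path copied from the error message in this thread.
MYSQL_SOCKET = "/var/run/mysqld/mysqld.sock"


def mysql_socket_accepts(path=MYSQL_SOCKET, timeout=2.0):
    """Return True if something accepts connections on the Unix socket."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # Covers a missing socket file as well as "connection refused",
        # which is what the (111) in the error message means on Linux.
        return False
    finally:
        s.close()


def build_dump_command(database, user, outfile):
    """Build a mysqldump argument list.

    --password with no value makes mysqldump prompt for it interactively.
    """
    return [
        "mysqldump",
        "--user=" + user,
        "--password",
        "--result-file=" + outfile,
        database,
    ]


if __name__ == "__main__":
    if mysql_socket_accepts():
        # "backup_user" and the file name are hypothetical placeholders.
        cmd = build_dump_command("webwork", "backup_user", "webwork-backup.sql")
        print("MySQL is up; a dump could be taken with:")
        print(" ".join(cmd))
    else:
        print("MySQL is not accepting connections on", MYSQL_SOCKET)
```

If the check reports that nothing is listening, the server needs to be started (and possibly repaired) before any dump can be taken.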

(2) You will need to stop and restart the apache server to kill a runaway child process. Possibly apachectl stop followed by apachectl start will work, but sometimes the child process won't listen. Use top to find the process ids for the apache server and then use kill to kill the process. Again read the man pages for more details and get someone with some experience using the apache server to help out.

You need direct command line access to the server to accomplish the steps above. No data will be lost in killing the process. If you stop webwork and restart it in less than a minute most users will not even be aware that it was down -- they will attribute the delay to heavy traffic.
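
The "ask politely, then force" pattern behind kill can be sketched as follows. This is a Python illustration only: it manages a child process it started itself, standing in for a runaway Apache child, whereas on a real server you would take the process id from top and use kill directly. The five second grace period is an arbitrary assumption.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Ask a process to exit (SIGTERM); force it (SIGKILL) if it does not."""
    proc.terminate()  # polite request, comparable to a plain `kill PID`
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # cannot be ignored, comparable to `kill -9 PID`
        proc.wait()
        return "killed"


if __name__ == "__main__":
    # Stand-in for a runaway child: a process stuck in an infinite loop.
    runaway = subprocess.Popen([sys.executable, "-c", "while True: pass"])
    print(stop_process(runaway))
```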

Every WeBWorK problem is supposed to time out after about 60 seconds, so if it is simply a matter of an infinite loop it will kill itself eventually. (The setting is in webwork2/lib/WeBWorK/Constants.pm.) If it is eating up memory it's harder to say what will happen.
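
For readers unfamiliar with how such a time limit can work, here is a minimal sketch of an alarm-based timeout in Python. It is not WeBWorK's actual code (which is Perl), and whether WeBWorK uses this exact mechanism is an assumption; the names `run_with_timeout` and `runaway` are made up for the example. It relies on Unix signals and must run in the main thread.

```python
import signal


class ProblemTimeout(Exception):
    """Raised by the alarm handler when the time limit expires."""


def _on_alarm(signum, frame):
    raise ProblemTimeout()


def run_with_timeout(func, seconds):
    """Run func(); give up if it is still running after `seconds`."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    except ProblemTimeout:
        return "timed out"
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)


def runaway():
    # Stand-in for the buggy macro: the loop condition never becomes false.
    while True:
        pass


if __name__ == "__main__":
    print(run_with_timeout(runaway, 0.5))
```

A timer like this only interrupts code that is looping; it does nothing about a process that is exhausting memory, which fits the caveat above.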

(3) If you can give line numbers about the warning messages we may be able to diagnose the problem -- I suspect it's not dangerous but it would still be nice to make them go away or at least give more friendly error messages. Do you have undefined dates in some of the sets?

Hope this helps. I'll check into the use strict question.
In reply to Michael Gage

Re: Serious webwork crash

by Rob Owen -
For (1) and (2): the mysql database had indeed crashed; we restarted it shortly afterwards but it apparently took a while for webwork to figure this out. Not entirely sure what happened there. I've trapped against infinite loops in those routines, so hopefully this will be a thing of the past.

As for the warning messages, I too suspect they're not dangerous. Here are some samples:

Argument "" isn't numeric in numeric gt (>) at /usr/local/webwork2/lib/WeBWorK/ContentGenerator/Instructor/ProblemSetDetail.pm line 1424. [One error, found in listing many sets.]

Argument "" isn't numeric in numeric comparison (<=>) at /usr/local/webwork2/lib/WeBWorK/ContentGenerator/Instructor/ProblemSetList.pm line 1327. [Approximately 100 times]

Then, in a typical set, we get a slew of error messages like this:

Argument "" isn't numeric in numeric ge (>=) at /usr/local/webwork2/lib/WeBWorK/ContentGenerator/Problem.pm line 174.
Argument "" isn't numeric in numeric le (<=) at /usr/local/webwork2/lib/WeBWorK/ContentGenerator/Problem.pm line 173.
Argument "" isn't numeric in numeric gt (>) at /usr/local/webwork2/lib/WeBWorK/ContentGenerator/Problem.pm line 175.

And so forth.

I don't know if this is being caused by undefined dates in some of the sets; I'm not really sure what dates those would be so I can't check. Does this help anyone in identifying the problem?
In reply to Rob Owen

Re: Serious webwork crash

by Michael Gage -
The line numbers have changed in the newer versions of the files, so I can't precisely locate the error in ProblemSetDetail.pm.

The errors in Problem.pm and ProblemSetList.pm are most likely related and occur when a comparison is made to see if the current time is before, after, or between the open date, due date and/or answer date.
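
To make the idea concrete, here is a rough Python sketch of that kind of date comparison with a guard for blank values. It is not the code in Problem.pm; the function name, the status strings, and the assumption that dates are stored as epoch seconds (or an empty string when unset) are all illustrative.

```python
def set_status(now, open_date, due_date, answer_date):
    """Classify `now` relative to a set's dates.

    Dates are assumed to be epoch seconds, or '' / None when unset.
    """
    dates = {
        "open_date": open_date,
        "due_date": due_date,
        "answer_date": answer_date,
    }
    # Check for blanks first, instead of comparing "" numerically.
    missing = [name for name, value in dates.items() if value in ("", None)]
    if missing:
        return "undefined dates: " + ", ".join(missing)
    if now < open_date:
        return "not yet open"
    if now <= due_date:
        return "open"
    if now <= answer_date:
        return "closed, answers not yet available"
    return "answers available"


if __name__ == "__main__":
    print(set_status(150, 100, 200, 300))
    print(set_status(150, "", 200, 300))
```

Without the guard, a blank date reaches the numeric comparisons directly, which is the situation the "isn't numeric" warnings describe.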

If you check the problem sets for which these errors occur (by looking at them via Hmwk Sets Editor -> Edit problems; the title at the top of the page will say "Set Detail for set ....") you will probably find that the date information at the top of the page is either missing or badly formatted. The format should look like

01/10/1997 at 06:00am EST
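
If there are many sets to check, a small script can flag blank or malformed date strings. The Python sketch below assumes the format is exactly the one in the example above (two-digit month and day, 12-hour clock, lowercase am/pm, a three or four letter zone abbreviation); WeBWorK's own parser may accept more variants, and `parse_set_date` is a made-up name.

```python
import re
from datetime import datetime

# Pattern inferred from the single example "01/10/1997 at 06:00am EST".
DATE_RE = re.compile(
    r"^(\d{2})/(\d{2})/(\d{4}) at (\d{2}):(\d{2})(am|pm) ([A-Z]{3,4})$"
)


def parse_set_date(text):
    """Return (datetime, zone abbreviation), or None if blank or malformed."""
    match = DATE_RE.match(text.strip())
    if match is None:
        return None
    month, day, year, hour, minute = (int(g) for g in match.groups()[:5])
    meridiem, zone = match.group(6), match.group(7)
    if not 1 <= hour <= 12:
        return None
    # Convert the 12-hour clock to 24-hour: 12am -> 0, 12pm -> 12.
    hour24 = hour % 12 + (12 if meridiem == "pm" else 0)
    try:
        return datetime(year, month, day, hour24, minute), zone
    except ValueError:
        # Impossible calendar values such as month 13 or minute 99.
        return None


if __name__ == "__main__":
    for sample in ["01/10/1997 at 06:00am EST", "", "13/40/1997 at 06:00am EST"]:
        print(repr(sample), "->", parse_set_date(sample))
```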

Could you report back which version of webwork you are using (rel-2-3, rel-2-2?).

I think we've caught most instances in which the dates are accidentally left undefined -- but one can never be sure.
:-)