WeBWorK Problems

SQL error saving answer

SQL error saving answer

by Sean Fitzpatrick -
Number of replies: 27
On mobile, so I'll have to follow up later with the pg file if needed.
I have a question that's a variation on an APEX problem from Valdosta, asking for the derivative of a function of the form a*sin(x)/(b*cos(x)+c).

Students are getting an SQL error for incorrect string value in the last answer column. See attached screenshot.

What's odd is that their answer shows up as correct in past answers, but their problem still says they've done zero attempts, with no credit.

I haven't been able to reproduce the error yet.

I'm wondering if this is related to turning MathQuill on? The problem worked fine last year under 2.14.
I'm running 2.15 with one modification: I had to redefine the delimiter for \verb for PreTeXt compatibility. (This is an issue that Alex Jordan has reported on. I reverted back to the old delimiter.)
Attachment SQLerror.png
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Sean Fitzpatrick -
Confirming: with MathQuill turned on, I can reproduce the error. With it off, I don't get an error.
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Sean Fitzpatrick -
Following up with details.
With MathQuill turned on, the answer is submitted as

(((4cos(x)(-4cos(x)-6)-4sin(x)(4sin(x))))/((4cos(x)+6)^(2)))

This is copied and pasted from the past answers page, right after getting an error page for submitting it. It shows up as correct there, but clearly is not making it into the database.

Aside from extra parentheses this is the same as the manually inputted answer.
But the error message leads me to suspect that MathQuill is injecting some other special character that isn't visible here.

PG code and error page attached.


In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Glenn Rice -
The problem has to do with the text string that is being stored in the last_answer column for the sticky answers. I am not exactly sure what is happening yet, but it seems to have to do with the length of the text string that webwork is trying to store in the database, and not some special character being injected. It seems that if the encoded answer string exceeds 230 characters it is too long. Note that the encoded answer string contains both the text answer you displayed above, and the latex form of the answer used for MathQuill stored in a hidden input.

If you disable MathQuill answer boxes and type in 1 about 150 times and submit that the same error occurs.

What are the limits on the length of a text string in mysql?
In reply to Glenn Rice

Re: SQL error saving answer

by Michael Gage -
the answer string is:

answer_string => { type=>"VARCHAR(5012)"},


if you change it in the code new tables will be built with the new char length. You will still have to change the field in the database manually for existing tables or use "upgrade course tables" .
In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
However, the problem is with the last_answer column in the courseid_problem_user table defined in UserProblem.pm as:

last_answer => { type=>"TEXT" },

The past answer is successfully stored, but the last_answer is two big. Isn't type TEXT supposed to be able to store much longer strings?
In reply to Glenn Rice

Re: SQL error saving answer

by Sean Fitzpatrick -
Thanks. This makes some sense, and explains why students are getting the same error in a few other problems as well. At first I thought the problem was with having exponents in the denominator (as one gets from the quotient rule).
But it sounds like it's just that it's all the extra parentheses MathQuill injects to deal with this push the length too long?

The same error popped up in a problem asking for the derivative of a product of three functions.

I've temporarily disabled MathQuill to cut down on the number of support emails coming in from students.

Is this something I can fix locally? (Keeping in mind that I'm no expert on the inner workings of WeBWorK, but I could change a database setting.) Or is there a code change that needs to be applied remotely and then pulled to our server?
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Michael Gage -
This gives the number of bytes for the various storage types:

The other thing to remember is that the big change from 2.14 to 2.15 is to use 4 byte characters (full utf8 unicode -- or uft8mb4 in mysql speak) so that all languages can be represented.

At a guess you can change TEXT to LONGTEXT (in a test course) and then rebuild the data table for the test course -- and see what happens.

We anticipated that index sizes would have to be changed with the new encodings (and we adjusted them) but not what would happen to answers stored in the past_answer table. We'll have to think what the best solution is.

for more information:

In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
It seems to me that the TEXT data type should be able to contain the combined answer in this case (and every case discussed) with no problem. According to the stackoverflow link you referenced and every other resource I found the TEXT data type can hold 2^16-1=65535 characters. Even with multibyte encoding this should be more than enough. I tried changing the type to LONGTEXT as you suggested, but the problem still occurs.
In reply to Glenn Rice

Re: SQL error saving answer

by Michael Gage -
Thanks very much for your research Glenn.

I haven't had time to test this myself yet. It's still possible that multibyte characters might cause troubles eventually but from your investigation it appears that something else is causing this particular error.
In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
I have done a little more testing. I disabled MathQuill answer boxes and opened a problem that had three answer inputs. I then entered a 2^7-1 characters in each input. The resulting encoded answer string that is stored in the last_answer field has a string length of 430 characters, but there is no problem storing this in the database. I then added one more character to one of the inputs (it doesn't matter which of the three), and the error occurs that Sean posted the screen shot of above. So it is not the combined length that matters here. The TEXT data field can handle much longer strings. However, for some reason if a single input exceeds 2^7-1 characters then there is an issue. This is of course what is happening with Sean's example. The latex stored for the MathQuill answer box is too long.
In reply to Glenn Rice

Re: SQL error saving answer

by Sean Fitzpatrick -
Thanks for looking into this -- and the detailed debugging.
Is there anything I can do to work around right now, other than disabling MathQuill?
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Alex Jordan -
> Is there anything I can do to work around right now, other than disabling MathQuill?

You could tell the students to simplify their answers more :)
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Glenn Rice -
So it seems that changing the last_answer type in the database to BLOB fixes the problem.

Unfortunately there is no easy work around if you want to use MathQuill. A complicated work around is as follows.

Change line 48 of webwork2/lib/WeBWorK/DB/Record/UserProblem.pm from

last_answer => { type=>"TEXT" },

to

last_answer => { type=>"BLOB" },

Then in mysql for each course do:

ALTER TABLE courseid_problem_user CHANGE last_answer last_answer BLOB;

Of course replace courseid with the actual course id. Note that last_answer typed twice is not a typo.

Then restart apache2 (a reload may be enough).

In reply to Glenn Rice

Re: SQL error saving answer

by Glenn Rice -
Okay, so I have found something that indicates this does have something to do with the multibyte character encoding. If I go to a server that is running WeBWork 2.14 and enter an answer that is longer than 2^7-1 characters into a text input, it works with no problem.

It is worth noting that 2^7 is the cut off from standard utf8 to full unicode utf8. I think this has to do with the length bytes at the beginning of the string that are shown in the error screen shot Sean posted.
In reply to Glenn Rice

Re: SQL error saving answer

by Michael Gage -
I can duplicate this with a fresh docker installation of webwork 2.15. I've checked as best I can that the database is correctly configured to use uft8mb4 encoding everywhere.

It seems to object to the string length specification -- but I don't understand why yet -- I thought that in TEXT type the string is stored starting with a byte count of the length of the string.

BLOB stores things as binary rather than as characters so that might be why BLOB works. My understanding is that utf8 strings should be stored as TEXT rather than binary -- I confess I don't understand the reasons that this is preferable.
In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
In type TEXT a string is stored with a byte count at the beginning. However, that byte count as I understand it is internal to the MySQL storage and is not what we are seeing. I think the byte count that is causing the problem is a result of the Storable::nfreeze that is called on the data in the encodeAnswers call. There is a leading byte count for each answer that is in the returned string. MySQL is trying to interpret those as full utf8 characters when type TEXT is used, and they are not supposed to be. They are intended to by used by the thaw method to properly separate the frozen data. Basically, the data is not technically a utf8 string anymore because Storable::nfreeze is turning it into binary data, and then MySQL can't handle it properly as TEXT anymore. The Storable::nfreeze and Storable::thaw should preserve the full utf8 properties of the original strings. Perl 5.8 or higher supports full utf8 encoding and decoding with the nfreeze and thaw approach.
In reply to Glenn Rice

Re: SQL error saving answer

by Michael Gage -
That sounds very plausible. I'll bet we missed the freeze/thaw operation
when making things utf8mb4 friendly. Will take a look. Good catch.
In reply to Michael Gage

Re: SQL error saving answer

by Sean Fitzpatrick -
Well, I'm reassured that I've uncovered a bigger issue than I realized at first (OK, mostly that it's not due to some silly thing I did).
Thanks for your expert attention on this -- it's definitely beyond my pay grade :-)

I've turned off MathQuill temporarily. This weekend, if I have time, I'll put together some documentation for students detailing how they can turn it off in User Settings if they run into an error like this. (Turns out an instruction like "change it in user settings" doesn't suffice for what is supposed to be our digital generation...)

Then I'll turn it back on, and (as Alex suggests) advise them to also consider simplifying their answers. (Though I've always encouraged the idea that the great thing about WeBWorK is that you can worry about getting the calculus right, and let WeBWorK do the simplifying!)

With about 20 courses on our server right now I'm reluctant to start fiddling with the database mid-semester, in case I mess something up. We'll be launching a new server in January, so I'll deal with this when we set that up.
(I'm learning that students like MathQuill, now that it's turned off. Suddenly they're responsible for putting in parentheses correctly!)
In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
So it seems to me that we will need to change the type of the last_answer column to BLOB. This brings up another issue. In the change to utf8mb4 in the database several other columns in the tables had their types changed (mainly VARCHAR length). I noticed that if you "Upgrade Courses" from the webwork admin page, it does not change the types of these columns in the database. Don't we need to adapt the upgrade methods to do that? It will need to be done if we change from TEXT to BLOB.
In reply to Glenn Rice

Re: SQL error saving answer

by Nathan Wallach -
I think the core issue is in fact the binary byte count added by Storable::nfreeze.

I think the byte count that is causing the problem is a result of the Storable::nfreeze that is called on the data in the encodeAnswers call. There is a leading byte count for each answer that is in the returned string. MySQL is trying to interpret those as full utf8 characters when type TEXT is used, and they are not supposed to be. They are intended to by used by the thaw method to properly separate the frozen data. Basically, the data is not technically a utf8 string anymore because Storable::nfreeze is turning it into binary data, and then MySQL can't handle it properly as TEXT anymore.

Instead of fixing this by changing the field to a BLOB, I would like to propose that we look into using JSON to serialize the data for this database field, but try to do so in a backwards compatible way by having decodeAnswers() in lib/WeBWorK/Utils.pm decide whether to use the "old" approach or the JSON approach based on some regexp check of the contents of the field. All new data put into this field would be serialized/deserialized using standard Perl JSON tools, and only old style nfreeze data would be deserialized using the old code.

What do people think of this idea?
In reply to Nathan Wallach

Re: SQL error saving answer

by Michael Gage -
Preliminary report. I agree that changing encode and decode answers is the right approach. I just did a test in a fresh install of webwork 2.15 in docker and made these changes:
sub decodeAnswers($) {
 my $serialized = shift;
 return unless defined $serialized and $serialized;
 my $array_ref = eval{ Storable::thaw(decode_utf8_base64($serialized)) };

 if ($@ or !defined $array_ref) {
 # My hope is that this next warning is no longer needed since there are few legacy base64 days and the fix seems transparent.
 # warn "problem fetching answers -- possibly left over from base64 days. Not to worry -- press preview or submit and this will go away permanently for this question. $@";
 return ();
 } else {
 return @{$array_ref};
 }
}

and

sub encodeAnswers(\%\@) {
 my %hash = %{shift()};
 my @order = @{shift()};
 my @ordered_hash = ();
 foreach my $key (@order) {
 push @ordered_hash, $key, $hash{$key};
 }
 return encode_utf8_base64( Storable::nfreeze( \@ordered_hash)) ;

}

If I entered the string

0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789

before the change I would get a database error. After the change

webwork simply reports correctly that my answer is incorrect. I haven't thought how to make this backwards compatible yet. It might be easier to implement than switching to JSON but there may be other reasons to switch to JSON. In any case it seems to confirm that the basic source of this bug is the freeze/thaw operation.


In reply to Michael Gage

Re: SQL error saving answer

by Glenn Rice -
As I pointed out in a comment in the related issue, I don't think it is valid to store the result of Storable::nfreeze in a TEXT field in the database as it really is binary data. This way that you demonstrate may work around the issue by re-encoding the binary data in a format that works, but I don't think that is really a good way to go.

If the desire is to keep a TEXT data type in the database, then JSON is the way to go as that is text, not binary. Furthermore, it seems that it would be harder to find a valid distinction between the encoded_utf8_base64 result and the old type of data, than to find a distinction between JSON data and the old type of data for backwards compatibility.

By the way, I think we should move this discussion to the issue on GitHub. I think that is the more appropriate forum for this discussion, and this 30 minute delay on the actual posting of responses is kind of annoying.
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Glenn Rice -
Actually, the extra parentheses added by MathQuill are really not the issue here. Those are only added to the text string, and not to the latex string. The issue is simply the combined length of the text string and the latex string. The latex string for what you were typing in the example you gave is:

\frac{4\cos\left(x\right)\left(-4\cos\left(x\right)-6\right)-4\sin\left(x\right)\left(4\sin\left(x\right)\right)}{\left(4\cos\left(x\right)+6\right)^2}

As you can see, there are no additional parentheses to than the ones that you typed. For the example that you gave MathQuill is only adding two sets of unnecessary parentheses (4 characters) to the text string, and removing those is not enough to shorten the answer to a length that the database will take.
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Alex Jordan -
> I had to redefine the delimiter for \verb for PreTeXt compatibility. (This is an issue that Alex Jordan has reported on. I reverted back to the old delimiter.)

On this point, WW 2.15 will be sending PreTeXt (and LaTeX hardcopy) verbatim things where the delimiter is the carriage return character, 0xD, aka \r. And there is a pending PTX pull request that will make PTX able to handle the old 0x85, the temporary 0x1F, and the new 0xD.

So once that PTX pull request is merge, if you undo the reversion to the old delimiter 0x85, and then update your WW server to the full 2.15, I think things will be fine. Maybe best to wait until 2.15 becomes master though.
In reply to Sean Fitzpatrick

Re: SQL error saving answer

by Nathan Wallach -
I think we resolved the issue with https://github.com/openwebwork/webwork2/pull/1033 which was merged into the WeBWorK-2.15 branch several hours ago.
In reply to Nathan Wallach

Re: SQL error saving answer

by Sean Fitzpatrick -
Just to confirm here: I've pulled the changes (needed to reboot for them to take effect -- an Apache restart didn't seem to be enough) and everything is running smoothly. Thanks for the quick attention on this!