WeBWorK Main Forum

DBD mysql error and the database character set.

DBD mysql error and the database character set.

by Richard Humphrey -
Number of replies: 15

Hello-

 We're using webwork, using the Ubuntu 20.04 virtual machine image provided by webwork.

Users are getting an error on some problems, 

BD::mysql::st execute failed: Incorrect string value: '\xE2\x88\x92cos...' for column 'answer_string' at row 1 at

The system and the users' browsers are using UTF-8, but the character set of Mysql is Latin1. Converting the database to UTF-8 is straightforward from the database side. Would converting the database to UTF-8 solve this problem, and more importantly, will it break anything?

Thanks,

=Richard Mortimer Humphrey

Sysadmin,Math,Cornell

In reply to Richard Humphrey

Re: DBD mysql error and the database character set.

by Bianca Sosnovski -
Hi everyone,
I see that there is no answer posted for the above but I'm interested in a solutions.
We just had our Webwork server reinstalled from scratch because many error messages would appear and go to the user and the system was super slow at times.
After the reinstall, we used the backups for the courses and database. Everything was running ok until now. Some users are getting similar error messages:

DBD::mysql::st execute failed: Incorrect string value: '\\xCF\\x80 \\x0965...' for column 'answer_string' at row 1 at /opt/webwork/webwork2/lib/WeBWorK/DB/Schema/NewSQL/Std.pm line 837

A quick search shows that the issue  has something to do with  UTF-8 encode in the database.

Did anyone encounter and resolve the similar issue with Webwork's database? Thanks for any help
In reply to Bianca Sosnovski

Re: DBD mysql error and the database character set.

by Nathan Wallach -
"'\\xCF\\x80\\x0965" may be related to pi. It does not seem to be valid UTF-8. 

I would guess that this is something put in the table as Latin1, and now causing trouble. Hex "CF" is an accented character in iso-8859-1.

 http://unicode.scarfboy.com/?s=CF800965

Input does not describe a codepoint...
  ...describing intput string as-is (full input string)
Constituent codepoints:
   03C0   GREEK SMALL LETTER PI
   0009   (name not known)
   0065   LATIN SMALL LETTER E

http://unicode.scarfboy.com/?s=CF8009

Input does not describe a codepoint...
  ...describing intput string as-is (full input string)
Constituent codepoints:
   03C0   GREEK SMALL LETTER PI
   0009   (name not known)

I am still a bit stumped by why Chrome has the problem and Safari does not. Maybe the "past answer" which usually reappears on problem load is somehow related to why a new user does not have the problem, but an old one does. It could be that the "stick past answer" (which would come from a different DB table) is coming as invalid UTF-8 which Chrome is handling differently than the other browsers.
In reply to Richard Humphrey

Re: DBD mysql error and the database character set.

by Alex Jordan -

\xE2\x88\x92 is a special elongated negative sign character, as you can see here:

https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8704&number=128&names=2&utf8=string-literal

This suggests a user is copy-pasting their answer into the answer blank, and they should just be typing a keyboard hyphen.

Now, if a user does that, what should happen is that the character is marked as unrecognized, highlighted in yellow, and there is no scary database error message. So maybe something is not right with database text encoding settings with this server. The same thing occasionally appears on my own server, and I haven't had time to investigate this.

In reply to Alex Jordan

Re: DBD mysql error and the database character set.

by Bianca Sosnovski -
Alex,

Thank you so much for the clarification.

Is there a way of making this type of error to show like a message to the user to not copy characters to the answer field? 

Because the way it is faculty and students think that there is a problem with Webwork. We just reinstalled the server and we had a lot error messages with similar layout output instead of showing a Webwork recognizable page saying that the error is user related and not the system's. This way faculty and students would be sure there the system is running ok.
In reply to Bianca Sosnovski

Re: DBD mysql error and the database character set.

by Alex Jordan -

Hi Bianca,

I can reproduce this issue on my own school's server, version 2.15. I also manage WW servers for a few other locations, where the installation is a very clean version 2.15, and on those servers I cannot reproduce this issue. I suspect it has something to do with settings for the database management. It may be as simple as I kept an old localOverrides.conf in place without reviewing the changes that came with version 2.15, but I haven't had time to research this. For a time, I assumed it was a 2.15 issue in general (part of a slow conversion to universal support of UTF-8) and only recently realized that it is not an issue on some other 2.15 servers.

Some people, notably Tani Wallach, Glenn Rice, and Mike Gage, have done the recent work on WW relating to character encoding. I would like to see if any of them have an idea about what could be causing this.

But more directly to your question, if things are working as they are supposed to work, then users should be seeing a more reasonable error message. Not an error message at all actually, just a good regular feedback message.

Alex

In reply to Alex Jordan

Re: DBD mysql error and the database character set.

by Nathan Wallach -
When I submit an answer with that Unicode character to an input box not handled by MathQuill, I see the message: Unexpected character '−' and don't get any message about a DB error.

Note, however, that I could not paste that character into a MathQuill controlled input box at all.

Since my systems are using UTF-8 (mysql's utf8mb4) as the database character set, and don't have any problems, I would say that the problem is just an issue of the DB character set configuration.

WW 2.15 is certainly intended to have UTF-8 support in the database, and using Latin1 database tables will almost certainly cause problems with many multi-byte characters which could be typed into an answer.

I took a look at the installation instructions for WW 2.15 now, and there is no mention of the need to change the DB settings to use UTF-8 (really utf8mb4 is strongly recommended unless it is not available, as otherwise 4-byte UTF-8 characters will trigger the same sort of error messages when "utf8" is the database charset). That is a unfortunate omission.

That omission apparently also lead to the virtual machine images not having a suitable DB charset setting, as it seems that the OS default is still Latin1 (based on what was written in this thread).

Those of involved in testing WW 2.15 probably all used either Docker or had already long since made the necessary changes in their local DB settings, so a reminder about this apparently never made it to those people who prepared the installation instructions.

Mea cupla: as I do know that we made sure to override the default character set settings for the database in the Docker setup, and since all my systems are running via Docker - I too had forgotten how critical this setting is. See: https://github.com/openwebwork/webwork2/blob/master/docker-config/db/mariadb.cnf and https://github.com/openwebwork/webwork2/blob/master/docker-compose.yml . 

Note also: I have found a need to mount the config file also to /etc/mysql/mariadb.cnf and not just to /etc/mysql/conf.d/mariadb.cnf (for MariaDB 10.4 containers) while the other location as was apparently sufficient in some older MariaDB docker images (as I believe the old location worked when it was put into the GitHub repo.)

Reminder: In order to have WW use utf8mb4 (and avoid issues with the partial 3- byte mySQL "utf8" character set) it is also important to set $ENABLE_UTF8MB4 =1; in site.conf. That setting exists as there was a desire to allow "old" servers with old versions of mysql to use the somewhat broken "utf8" instead of the full "utf8mb4" character set which was added to mySQL somewhat later rather than fail due to a request for utf8mb4 which it fails to understand.

If someone works on converting WW database tables to utf8mb4 - writing up a post on how to do that efficiently would help the community. (New courses should automatically use utf8mb4 in their tables once things are properly configured.)
In reply to Nathan Wallach

Re: DBD mysql error and the database character set.

by Arnold Pizer -
Just a couple of points of clarification.

The installation instructions for WW 2.15 call for installing Mysql 8 and by default Mysql 8 uses utf8mb4.

Also the WeBWorK virtual image does in fact use utf8mb4.
In reply to Nathan Wallach

Re: DBD mysql error and the database character set.

by Alex Jordan -

I would chalk up my situation to not reviewing changes to all of the distribution config files when I upgraded from 2.14 to 2.15. It looks like I pulled and I probably did review the changes to localOverrides.conf.dist, but I let site.conf.dist go unreviewed. And the lines I need are the ones that you mention in site.conf. Now that I put those in place, I was able to create a new course that does not suffer form the issue. Existing courses still suffer from the issue, of course. I think I am content to let it ride and the courses that start in January will be OK.


In reply to Alex Jordan

Re: DBD mysql error and the database character set.

by Bianca Sosnovski -
Hi Alex, Arnold and Nathan,

Thank you for your suggestions and comments.

In our brand new reinstall of  the WW server  none of the old config files were used. I checked again the site.conf file and  we have the following configurations:

$ENABLE_UTF8MB4 =1;

$database_character_set=($ENABLE_UTF8MB4) ? 'utf8mb4' : 'utf8'

If I submit the  character '−' using Google Chrome, I get the error message "DBD::mysql::st execute failed: Incorrect string value: '\\xCF\\x80 \\x0965..."

But if I use the web browser Safari the expected reasonable message appears:
screenshot
These responses happen regardless of having  MathQuill or MathView enabled. I don't think it is relevant but I use a Mac Book Pro.

I also noticed that when MathQuill is enabled in the configuration of the course, it doesn't work properly with  Google Chrome. I can't see what is being entered in the answer fields when typing if MathQuill is enabled.

So I guess this is a problem is related to Google Chrome.

In reply to Bianca Sosnovski

Re: DBD mysql error and the database character set.

by Alex Jordan -

Interesting!

I just tried my new course where "−4" works fine in Firefox, and it also works in Chrome and Safari. (By "works", I mean there is the "Unexpected character" message.

Have to check to rule it out, but are you sure you were in the same course when you saw two different behaviors on two different browsers? Just in case something was initiated differently for one course's database tables than for the other.

In reply to Alex Jordan

Re: DBD mysql error and the database character set.

by Bianca Sosnovski -
Yes, I'm absolutely sure that the two different behaviors happen when using different web browsers in the same course and same assignment and problem.

I went over the copies I kept of the conf files from the previous install of our server. That instance of the server was up to date with version 2.15 (it was upgraded several times since version 2.12) but the conf files (site.conf, localOverrides.conf and defaults.config) don't have any configuration set for utf8 or utf8m4.
After the new instance was reinstalled, I imported the database from the previous one. Is there a possibility that this is what is causing it? 

I created a new account (not one the accounts imported from the old server instance). The issue doesn't happen.

I also checked it with a Windows laptop and the same issue happens with the account imported.
In reply to Bianca Sosnovski

Re: DBD mysql error and the database character set.

by Nathan Wallach -

My first impression was that this might be a Chrome issue. I have encountered some problems which occurred with Chrome on Linux and not with Chrome on Windows, so it is certainly possible that this is specific to Chrome on Mac.

Maybe there are records in the table which were created in a version of WW before 2.15 which were put in the database in some "bad" way, and now cannot be retrieved.

If someone there feels comfortable directly modifying the database, I would recommend trying to "removing" the records for the specific user and specific problem and testing to see if that helps. (I'm recommending changing the user_id field, as then in the future we could look in more depth at what is actually in the table which is making trouble).

Something like the lines below - fixing the table name, student user_id, set name and problem_id (a number)

select user_id,set_id,problem_id,count(*) from CN_past_answer group by user_id,set_id,problem_id;

select * from CN_past_answer where user_id='the account ID' and set_id='set name' and problem_id='N';

update CN_past_answer set user_id='REMOVED_FROM the account ID' where user_id='the account ID' and set_id='set name' and problem_id='N';

Warning: I typed that SQL code after running similar commands, but could have messed up the syntax when editing to make generic.

If removing the records fixes the problem, we would have a better understanding of the problem


In reply to Bianca Sosnovski

Re: DBD mysql error and the database character set.

by Nathan Wallach -
About the MathQuill issue - you wrote: "I also noticed that when MathQuill is enabled in the configuration of the course, it doesn't work properly with Google Chrome. I can't see what is being entered in the answer fields when typing if MathQuill is enabled."

Is that specific to the question where this is happening, or is MathQuill failing in Chrome for all questions. If MQ works in other problems, does it work on this problem for the new account you created?
In reply to Nathan Wallach

Re: DBD mysql error and the database character set.

by Bianca Sosnovski -

Nathan, 

Interesting enough I am not able to replicate anymore the issue of not being able to see what I enter in answer field when MathQuill is enabled. I updated my Google Chrome this morning and the issue is gone. And that is currently not happening with my Windows laptop.

But the error message 
"DBD::mysql::st execute failed: Incorrect string value: '\\xCF\\x80 \\x0965...' for column 'answer_string' at row 1 at /opt/webwork/webwork2/lib/WeBWorK/DB/Schema/NewSQL/Std.pm line 837"
still  appears with Google Chrome for some problems but not all problems when copying and pasting the unexpected symbol '-'.

Thanks for the suggestion of trying to fix the issue by changing records in the database. I will try to do what you suggested with the MySQL commands you provided when I have a chance and will report here.

  

In reply to Richard Humphrey

Re: DBD mysql error and the database character set.

by Nathan Wallach -
Richard -

Arnie wrote below that the virtual machine images should be using utf8mb4 by default, as they are using mySQL 8.

You write that the Is it possible that "character set of Mysql is Latin1".

As noted below, there is now a suspicion that this issue may be related to importing a course and it's DB tables from a version of WW before 2.15 into 2.15 and this triggering issues UTF-8 data which was improperly stored in Latin1 tables by the older version of WW making trouble with WW 2.15.
Could this have been a result of either (a) local configuration overriding the OS default for the mySQL character set, or (b) importing course tables which were created on a older version of WW (which would almost certainly retain the character set from the old server/course.)