Converting the webwork database from the latin1 to the utf8mb4 character set

From WeBWorK_wiki
Revision as of 17:24, 21 March 2021 by Apizer (talk | contribs) (→‎Step 10)

These instructions explain how to convert the webwork database from the latin1 to the utf8mb4 character set.

Terminal Window Notation

In a terminal window some commands will have to be run as root whereas others should be run as a regular user. We will use # to indicate that the command is to be run as root e.g.

# mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_setting' " | awk '{print "ALTER TABLE", $1, "MODIFY name varchar(240) NOT NULL; "}' | mysql --database=webwork

and $ to indicate that the command is to be run as a normal user e.g.

$ mysql -u root -p 

Of course you can use sudo to run most commands as root from a standard command prompt. In general we will use the same notation as the Installation Manual for 2.12 on Ubuntu 16.04.

Preliminaries

Check What the Current Character Set Is

Before we begin let's make sure the webwork database is using the latin1 character set.

Log into mysql. Depending on your OS and mysql version, you will either use the command

$ mysql -u root -p 
Enter Password: <mysql root password>

or

$ sudo mysql
[sudo] password for wwadmin: <wwadmin password>

You should see something very similar to

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
...

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

Now issue the following mysql commands:

mysql> Use webwork;
...
Database changed


mysql> SELECT @@character_set_database;

If the webwork database is using the latin1 character set you will see:

+--------------------------+
| @@character_set_database |
+--------------------------+
| latin1                   |
+--------------------------+
1 row in set (0.04 sec)

Independent of the above result, run the following command which will show the collation for every table. Each collation belongs to exactly one character set, and its name begins with the name of that character set.

mysql> SHOW TABLE STATUS FROM webwork;

Looking at the results you will see information on every table, e.g.

| Name                                 | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length   | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+--------------------------------------+--------+---------+------------+-------+----------------+-------------+-------------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| OPL_DBchapter                        | MyISAM |      10 | Dynamic    |   176 |             36 |        6348 |   281474976710655 |        16384 |         0 |            180 | 2021-03-07 16:25:08 | 2021-03-07 16:25:08 | 2021-03-07 16:25:08 | latin1_swedish_ci |     NULL |

From this we see that the OPL_DBchapter table uses the collation latin1_swedish_ci, which means it is using the latin1 character set. Below we will present two methods for converting the database from the latin1 to the utf8mb4 character set, either working table by table or working on the whole database at once. If you choose to use the table by table method, the above information will tell you what tables you need to convert.

Now exit MySQL

mysql> exit
Bye
$
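With many courses the SHOW TABLE STATUS listing gets long, so a per-collation summary can be easier to read. The following is a minimal sketch, not part of the original instructions: the function name count_by_collation is our own, and it assumes the tab-separated batch output of mysql -B -N, in which Collation is column 15 as in the header row shown above.

```shell
# Summarize how many tables use each collation.
# Input (stdin): tab-separated rows from
#   mysql -B -N -e "SHOW TABLE STATUS FROM webwork"
# Column 15 is assumed to be Collation, matching the header row above.
count_by_collation() {
  awk -F'\t' '{ count[$15]++ } END { for (c in count) print count[c], c }' | sort -k2
}

# Usage (needs a working mysql login, so it is left commented out here):
#   mysql -B -N -e "SHOW TABLE STATUS FROM webwork" | count_by_collation
```

Any line in the summary whose collation begins with latin1 means there are still tables left to convert.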

Assuming the webwork database is using the latin1 character set, continue reading these instructions.

Check what the default character set is for MySQL on your new or upgraded server

Log into mysql on your new or upgraded server. Depending on your OS and mysql version, you will either use the command

$ mysql -u root -p 
Enter Password: <mysql root password>

or

$ sudo mysql
[sudo] password for wwadmin: <wwadmin password>

You should see something very similar to

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
...

Now issue the following mysql commands:

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

and you should see

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_0900_ai_ci |
| collation_database       | utf8mb4_0900_ai_ci |
| collation_server         | utf8mb4_0900_ai_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)

mysql> 

Now exit MySQL

mysql> exit
Bye
$


If your version of MySQL is not using utf8mb4 as listed above, I would strongly suggest that you upgrade MySQL to version 8. You can find out the version of MySQL on your server with the command

$ mysql -V

Version 8 uses the character set utf8mb4 by default. If for whatever reason you cannot upgrade to version 8, then you should edit the my.cnf file, which is probably in the /etc/mysql/ directory. Actually my.cnf might be redirected to another file (e.g. mysql.cnf), so edit the appropriate file. You will probably have to be root to edit the file (e.g. "sudo gedit mysql.cnf"). At the end of the file add the following

[client]
default-character-set=utf8mb4     

[mysql]
default-character-set=utf8mb4

[mysqld]
init-connect='SET NAMES utf8mb4'
character_set_server=utf8mb4

Then save the file and exit. Restart mysql

$ sudo /etc/init.d/mysql restart

and then log into mysql again and repeat the command

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

to check that mysql is now using utf8mb4.
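To spot a leftover latin1 setting quickly, the output of that query can be filtered. This is only a sketch under stated assumptions: the function name non_utf8mb4_variables is hypothetical, the input is the same SHOW VARIABLES query run with mysql -B -N (tab-separated name and value), and character_set_filesystem and character_set_system are skipped because they are expected to differ, as in the sample output above.

```shell
# Print every character-set or collation variable that is not utf8mb4.
# Input (stdin): "name<TAB>value" rows from the SHOW VARIABLES query above,
# run in batch mode (mysql -B -N).
non_utf8mb4_variables() {
  awk -F'\t' '
    # These two are expected to be binary and utf8, so skip them.
    $1 == "character_set_filesystem" || $1 == "character_set_system" { next }
    $2 !~ /^utf8mb4/ { print $1 "=" $2 }
  '
}
```

If the function prints nothing, every relevant variable already uses utf8mb4.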

Now any new MySQL tables created in the webwork database will use the utf8mb4 character set. So if you create a new WeBWorK course, its associated tables will use the utf8mb4 character set. Similarly, if you update the OPL (Open Problem Library) by running the OPL-update command, this creates all new OPL tables, so they will all use the utf8mb4 character set. However, any existing courses will have associated tables using latin1. Likewise, if you use the admin course methods "Archive Course" and "Unarchive Course" to move a course from a server using the latin1 character set to a server using the utf8mb4 character set, the unarchived course will still have associated tables using latin1.

Backup the webwork database

IMPORTANT: Do not skip this step.

First we create a directory to hold the backup file and cd to the new directory

$ mkdir mysql_backups
$ cd mysql_backups

From now on in these instructions I will assume you are using the older method of logging into mysql as root using a password rather than the newer method of sudoing into mysql. If this is not the case, use sudo wherever the mysql password is used in these instructions.

Now use the mysqldump command to create the backup file.

$ mysqldump -u root -p webwork > webwork_backup.sql
Enter Password: <mysql root password>

After the process finishes, the file webwork_backup.sql will be located in the mysql_backups directory.
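Since everything that follows depends on this backup, it may be worth a quick sanity check before moving on. The sketch below is our own addition (the function name backup_looks_complete is hypothetical). It assumes mysqldump was run with its default comment output, in which case the last line of the dump starts with "-- Dump completed".

```shell
# Succeed only if the dump file is non-empty and ends with mysqldump's
# closing comment. Assumes default mysqldump options (comments enabled).
# $1: path of the dump file
backup_looks_complete() {
  [ -s "$1" ] && tail -n 1 "$1" | grep -q '^-- Dump completed'
}

# Usage:
#   backup_looks_complete webwork_backup.sql && echo "backup looks complete"
```

A passing check does not prove the dump can be restored, only that mysqldump ran to the end.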

Restore the webwork database

Hopefully this will not be necessary, but here is how to restore the backed-up webwork database.

First we have to log into MySQL and then drop and recreate the webwork database.

$ mysql -u root -p 
Enter password: <mysql root password>
mysql> drop database webwork;
mysql> CREATE DATABASE webwork;
mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP, LOCK TABLES ON webwork.* TO 'webworkWrite'@'localhost';

Now exit MySQL

mysql> exit
Bye
$

cd to the mysql_backups directory that contains the backup sql file.

$ cd
$ cd mysql_backups

and use the following command to restore the database

$ mysql -u root -p webwork < webwork_backup.sql
Enter password: <mysql root password>

Now connect to WeBWorK and everything should be restored.

Maximum Key Length Issue

WeBWorK uses the MyISAM engine for almost all of its MySQL tables, and these tables have a maximum key length of 1000 bytes. In a number of tables in earlier versions of WeBWorK (prior to version 2.15), the key length is set to 255 characters. With the latin1 character set at 1 byte per character, there is no issue. However, with the utf8mb4 character set at up to 4 bytes per character, the maximum length in bytes is 4 x 255 = 1020, which is greater than 1000 and is not allowed. Thus all the offending tables must be altered before converting from latin1 to utf8mb4.
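The arithmetic can be checked with a few lines of shell. This is a rough sketch only: the helper names are our own, and it counts characters times bytes per character while ignoring any per-column overhead the storage engine may add.

```shell
MAX_KEY_BYTES=1000   # MyISAM key limit quoted above

# Worst-case size in bytes of a key column: $1 characters at $2 bytes each.
key_bytes() { echo $(( $1 * $2 )); }

# Succeed if such a column fits within the key limit.
fits_in_key() { [ "$(key_bytes "$1" "$2")" -le "$MAX_KEY_BYTES" ]; }

key_bytes 255 1   # latin1:  prints 255
key_bytes 255 4   # utf8mb4: prints 1020, over the limit
key_bytes 240 4   # utf8mb4: prints 960, within the limit
```

This is why a key column of 240 characters is acceptable under utf8mb4 while one of 255 characters is not.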

The tables that need to be altered are location_addresses, coursename_setting for every course, and most of the OPL tables. Instead of altering the OPL tables directly, we will run OPL-update which will recreate the tables in the proper format. However, we may have to alter the location_addresses table (depending on your situation) and will have to alter the coursename_setting tables (for every course) individually.

Methodology

The reason you are converting from latin1 to utf8mb4 character sets is that you are upgrading from an earlier version of WeBWorK to WeBWorK version 2.15 or above and you want to move your courses (some or all) from the old version of WeBWorK to the new. There are basically three methods of doing this.

  1. Build a new server and use the admin course methods "Archive Course" and "Unarchive Course" to move courses from the old server to the new server. To build your new server, you can build it from scratch following the directions at Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, use a pre built Virtual Machine Image (see Installing_from_WW2.15_Ubuntu20.04_Server Virtual Machine Image) or use a pre built AWS image (see WeBWorK_2.15_Ubuntu_Server_20.04_LTS_Amazon_Machine_Image). This may be the preferred method since your old server will remain fully functioning until you switch over to the new server and you will end up with a new server with up to date versions of the OS, WeBWorK, MySQL and all other components.
  2. Build a new server as above and move the whole webwork database and courses directory over to the new server. This may cause some issues with the admin course and other things that are easy to address.
  3. Update WeBWorK on your current server using git, basically following the directions in Release notes for WeBWorK 2.14. The issues here include all the issues in 2 above and a few more.

From the standpoint of converting the webwork database from latin1 to utf8mb4, cases 2 and 3 are basically identical, so we will give instructions covering cases 1 and 2.

Method 1: Build a new server and use the admin course methods "Archive Course" and "Unarchive Course" to move courses from the old server to the new server

Step 1

Build your new server from scratch following the directions at Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, or use a pre built Virtual Machine Image (see Installing_from_WW2.15_Ubuntu20.04_Server Virtual Machine Image) or use a pre built AWS image (see WeBWorK_2.15_Ubuntu_Server_20.04_LTS_Amazon_Machine_Image). If you are not using Ubuntu, you can still follow Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, just make the obvious changes. Check that WeBWorK is working properly on your new server before transferring old courses to the new server.

Step 2

On your old server log into the admin course (Course Administration), select "Archive Course", select the courses you want to move to the new server and click on "Archive Courses"

Step 3

For each course, "Archive Courses" will create a coursename.tar.gz file in the /opt/webwork/courses directory.

Step 4

Transfer all the coursename.tar.gz files from your old server to your new server (e.g. by sftp) and put them in the /opt/webwork/courses directory on the new server.
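Course archives can be large, so you may want to confirm that they arrived intact. The following sketch is an optional extra, not part of the original procedure. It assumes the GNU sha256sum utility is available on both servers; the function names and the manifest path are hypothetical.

```shell
# On the old server: print one checksum line per course archive in directory $1.
make_manifest() {
  ( cd "$1" && sha256sum -- *.tar.gz )
}

# On the new server: check the archives in directory $1 against the
# manifest file $2 (give an absolute path). Fails if any file differs.
verify_manifest() {
  ( cd "$1" && sha256sum --check --quiet "$2" )
}

# Usage:
#   make_manifest /opt/webwork/courses > /tmp/courses.sha256      (old server)
#   verify_manifest /opt/webwork/courses /tmp/courses.sha256      (new server)
```

Copy the manifest file along with the archives; verify_manifest prints nothing when every archive matches.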

Step 5

On your new server log into the admin course (Course Administration), select "Unarchive Course", select a course you want to unarchive and click on "Unarchive Courses". This is a process you have to perform course by course. Follow any instructions. For example, you may have to edit the course.conf file, replacing the copyright symbol by '&copy;' to prevent warning messages.

Step 6

Next in the admin course (Course Administration), select "Upgrade Courses", and upgrade all courses that require upgrading (don't skip this step).

Step 7

You might want to log into some or all of your transferred courses just to make sure that everything is OK so far.

Step 8

Before we start manually altering tables, now would be an excellent time to backup the webwork database if you have not done so already.

Step 9

Now we will handle the Maximum Key Length Issue mentioned above, which occurs in the coursename_setting table and the coursename_past_answer table for every course we have transferred.

Log into mysql. Depending on your OS and mysql version, you will either use the command

$ mysql -u root -p 
Enter Password: <mysql root password>

or

$ sudo mysql
[sudo] password for wwadmin: <wwadmin password>

Now issue the following mysql commands:

mysql> Use webwork;
...
Database changed

Use the command SHOW TABLES to list all tables and the command DESC tablename; to describe any table. For example

mysql> DESC test1_setting;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| name  | varchar(255) | NO   | PRI | NULL    |       |
| value | text         | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> DESC myTestCourse_setting;
+-------+--------------+------+-----+---------+-------+
| Field | Type         | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| name  | varchar(240) | NO   | PRI | NULL    |       |
| value | text         | YES  |     | NULL    |       |
+-------+--------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

test1 above is a course transferred from an old version of WeBWorK and myTestCourse is a course created on our new server. Notice that the key length is 240 for myTestCourse but 255 for test1.

Similarly, we see

mysql> desc test1_past_answer;
+----------------+---------------+------+-----+---------+----------------+
| Field          | Type          | Null | Key | Default | Extra          |
+----------------+---------------+------+-----+---------+----------------+
| answer_id      | int           | NO   | PRI | NULL    | auto_increment |
| course_id      | varchar(100)  | NO   | PRI | NULL    |                |
| user_id        | varchar(100)  | NO   | PRI | NULL    |                |
| set_id         | varchar(100)  | NO   | PRI | NULL    |                |
| problem_id     | int           | NO   | PRI | NULL    |                |
| source_file    | text          | YES  |     | NULL    |                |
| timestamp      | int           | YES  |     | NULL    |                |
| scores         | tinytext      | YES  |     | NULL    |                |
| answer_string  | varchar(5012) | YES  |     | NULL    |                |
| comment_string | varchar(5012) | YES  |     | NULL    |                |
+----------------+---------------+------+-----+---------+----------------+
10 rows in set (0.01 sec)

and

mysql> desc myTestCourse_past_answer;
+----------------+---------------+------+-----+---------+----------------+
| Field          | Type          | Null | Key | Default | Extra          |
+----------------+---------------+------+-----+---------+----------------+
| answer_id      | int           | NO   | PRI | NULL    | auto_increment |
| course_id      | varchar(80)   | NO   | PRI | NULL    |                |
| user_id        | varchar(80)   | NO   | PRI | NULL    |                |
| set_id         | varchar(80)   | NO   | PRI | NULL    |                |
| problem_id     | int           | NO   | PRI | NULL    |                |
| source_file    | text          | YES  |     | NULL    |                |
| timestamp      | int           | YES  |     | NULL    |                |
| scores         | tinytext      | YES  |     | NULL    |                |
| answer_string  | varchar(5012) | YES  |     | NULL    |                |
| comment_string | varchar(5012) | YES  |     | NULL    |                |
+----------------+---------------+------+-----+---------+----------------+
10 rows in set (0.01 sec)

Notice that the key length is 80 for myTestCourse but 100 for test1.

Now exit MySQL (or use a new terminal session)

mysql> exit
Bye
$

Now we will edit all coursename_setting tables and all coursename_past_answer tables setting the correct key length.

How we do this depends on how you log into mysql. If you use

$ sudo mysql

to log into mysql, then you need to become root

$ sudo su
[sudo] password for wwadmin: <wwadmin password>

and then run the commands below as root

# mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_setting' " | awk '{print "ALTER TABLE", $1, "MODIFY name varchar(240) NOT NULL; "}' | mysql --database=webwork

# mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_past\_answer' " | awk '{print "ALTER TABLE", $1, "MODIFY course_id varchar(80) NOT NULL, MODIFY user_id varchar(80) NOT NULL, MODIFY set_id varchar(80) NOT NULL; "}' | mysql --database=webwork

On the other hand if you log into mysql with the command

$ mysql -u root -p 
Enter Password: <mysql root password>

then we have to make a temporary change to the my.cnf file, which is probably in the /etc/mysql/ directory. Actually my.cnf might be redirected to another file (e.g. mysql.cnf), so edit the appropriate file. You will probably have to be root to edit the file (e.g. "sudo gedit mysql.cnf"). First make a backup. For example do the following

$ cd /etc/mysql
$ sudo cp mysql.cnf mysql.cnf.bak
$ sudo gedit mysql.cnf

At the end of the client section add the following (make sure the password is enclosed in quotation marks but don't use angle brackets):

[client]
user=root
password="<mysql root password>"

Then save the file and quit. Now run the commands

$ mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_setting' " | awk '{print "ALTER TABLE", $1, "MODIFY name varchar(240) NOT NULL; "}' | mysql --database=webwork

$ mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_past\_answer' " | awk '{print "ALTER TABLE", $1, "MODIFY course_id varchar(80) NOT NULL, MODIFY user_id varchar(80) NOT NULL, MODIFY set_id varchar(80) NOT NULL; "}' | mysql --database=webwork
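Because these pipelines alter tables as soon as they run, you may prefer to look at the generated statements first. Here is a minimal sketch of the awk stage on its own; the function name setting_alter_statements is hypothetical. Unlike the commands above, it wraps each table name in backticks, which is our own addition intended to cope with table names that contain characters such as hyphens.

```shell
# Turn a list of table names (one per line on stdin) into ALTER statements
# that set the name column to varchar(240), as in the commands above.
# The backticks around the table name are an addition to those commands.
setting_alter_statements() {
  awk '{ print "ALTER TABLE `" $1 "` MODIFY name varchar(240) NOT NULL;" }'
}

# Usage (writes the statements to a file for review instead of running them):
#   mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_setting'" \
#     | setting_alter_statements > setting_alters.sql
```

After reviewing the file you can feed it to mysql in the same way the commands above pipe their output.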


NOTES:

1. If you are upgrading from WeBWorK version 2.12 or later, the above changes should suffice. If you are upgrading from an earlier version of WeBWorK, more tables may need to be modified. For example, if you are upgrading from WeBWorK version 2.7, you will run into an error in Step 10 below. Following the instructions there, you will see that in WeBWorK version 2.7 the _past_answer table has problem_id as Type varchar(100), but in WeBWorK version 2.15 the Type is int. So, in analogy with the above commands, we need to run

# mysql --database=webwork -B -N -e "SHOW TABLES LIKE '%\_past\_answer' " | awk '{print "ALTER TABLE", $1, "MODIFY problem_id int NOT NULL; "}' | mysql --database=webwork

2. These commands should run without errors but if there is an error, log into mysql and do the following:

mysql> SHOW TABLE STATUS FROM webwork;

Look down the list of tables checking which character set is being used. The first table in the list that is still using the latin1 character set is the one that caused the error. Look at its description or data to try to figure out what the problem may be.

Step 10

Now we are finally at the stage where we can convert all tables to the utf8mb4 character set. Note that if a table already uses utf8mb4, the command below will leave things unchanged. We assume the setup is as in Step 9 above where you are either acting as root or are using the modified my.cnf file. Run the command (either as root or as a regular user depending on your setup)

# mysql --database=webwork -B -N -e "SHOW TABLES" | awk '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8mb4 ; SET foreign_key_checks = 1; "}' | mysql --database=webwork

This should run without errors but if there is an error, log into mysql and do the following:

mysql> SHOW TABLE STATUS FROM webwork;

Look down the list of tables checking which character set is being used. The first table in the list that is still using the latin1 character set is the one that caused the error. Look at its description or data to try to figure out what the problem may be. After fixing the problem, run the above command again.
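On a server with many courses, scanning the SHOW TABLE STATUS listing by eye is tedious. As a sketch only (the function name first_latin1_table is hypothetical), the first table still using latin1 can be picked out from the batch output of mysql -B -N, assuming Collation is the 15th tab-separated column as in the header row shown in the Preliminaries.

```shell
# Print the name of the first table whose collation still starts with latin1.
# Input (stdin): tab-separated rows from
#   mysql -B -N -e "SHOW TABLE STATUS FROM webwork"
first_latin1_table() {
  awk -F'\t' '$15 ~ /^latin1/ { print $1; exit }'
}

# Usage (needs a working mysql login, so it is left commented out here):
#   mysql -B -N -e "SHOW TABLE STATUS FROM webwork" | first_latin1_table
```

If nothing is printed, no table in the listing is using latin1 any more.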

Now if you are acting as root return to a regular user

# exit
$

or if you made the temporary change to my.cnf or a file to which it is redirected, then let's return to the original file, e.g.

$ cd /etc/mysql
$ sudo mv mysql.cnf.bak mysql.cnf
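Steps 9 and 10 temporarily store the root password in the system-wide option file. An alternative, offered here as a sketch rather than as the procedure these instructions were tested with, is to keep the credentials in a private file and pass it to mysql with the --defaults-extra-file option, which must be the first option on the command line. The function name and file name below are hypothetical, and the simple quoting assumes the password contains no double quotes or backslashes.

```shell
# Write a client option file readable only by the current user.
# $1: path of the file to create. The password is read from stdin so that
# it does not end up in the shell history (it is echoed while you type).
write_client_cnf() {
  cnf_path=$1
  IFS= read -r cnf_password
  (
    umask 077   # new files are created without group/other permissions
    printf '[client]\nuser=root\npassword="%s"\n' "$cnf_password" > "$cnf_path"
  )
  chmod 600 "$cnf_path"   # also tighten a file that already existed
}

# Usage (left commented out because it needs mysql):
#   write_client_cnf "$HOME/.webwork_convert.cnf"    # type the password, press Enter
#   mysql --defaults-extra-file="$HOME/.webwork_convert.cnf" --database=webwork -B -N -e "SHOW TABLES"
#   rm "$HOME/.webwork_convert.cnf"                   # remove it when finished
```

With this approach every mysql invocation in a pipeline needs the same --defaults-extra-file option.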

Step 11

Check and possibly repair the database.

Run the command

$ sudo mysqlcheck webwork

and hopefully all tables will be OK. If not, first back up the database and then try

$ sudo mysqlcheck -r webwork table_name

where obviously you should replace table_name by the name of the table you are trying to repair. If this fails, search the internet for further help.
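With many courses the mysqlcheck report can run to hundreds of lines. The following sketch (the function name tables_not_ok is hypothetical) hides the healthy tables, assuming mysqlcheck's usual report format in which each healthy table appears on a single line ending in OK.

```shell
# Print only those lines of a mysqlcheck report that do not end in OK.
# Input (stdin): the output of "mysqlcheck webwork".
tables_not_ok() {
  awk '$NF != "OK"'
}

# Usage (needs mysql, so it is left commented out here):
#   sudo mysqlcheck webwork | tables_not_ok
```

Whatever remains names the tables that need attention, together with the messages reported for them.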

Method 2: Build a new server and move the whole webwork database and courses directory over to the new server