Difference between revisions of "Converting the webwork database from the latin1 to the utf8mb4 character set"

From WeBWorK_wiki
Jump to navigation Jump to search
Line 201: Line 201:
   
 
From the standpoint of converting the webwork database from from latin1 to utf8mb4 cases 2 and 3 are basically identical so we will give instructions covering cases 1 and 2.
 
From the standpoint of converting the webwork database from from latin1 to utf8mb4 cases 2 and 3 are basically identical so we will give instructions covering cases 1 and 2.
  +
  +
==Method 1: Build a new server and use the admin course methods "Archive Course" and "Unarchive Course" to move courses from the old server to the new server==
  +
# Build your new server from scratch following the directions at [[Installation_Manual_for_2.15_on_Ubuntu_20.04_Server]], or use a pre built Virtual Machine Image (see [[Installing_from_WW2.15_Ubuntu20.04_Server Virtual Machine Image]]) or use a pre built AWS image (see [[WeBWorK_2.15_Ubuntu_Server_20.04_LTS_Amazon_Machine_Image]]). If you are not using Ubuntu, you can still follow [[Installation_Manual_for_2.15_on_Ubuntu_20.04_Server]], just make the obvious changes. Check that WeBWorK is working properly on your new server before transferring old courses to the new server.
  +
# On your old server log into the admin course (Course Administration), select "Archive Course", select the courses you want to move to the new server and click on "Archive Courses"
  +
# For each course, "Archive Courses" will create a coursename.tar.gz file in the <code>/opt/webwork/courses</code> directory.
  +
# Transfer all the coursename.tar.gz files from your old server to your new server (e.g. by sftp) and put them in the <code>/opt/webwork/courses</code> directory on the new server.
  +
# # On your new server log into the admin course (Course Administration), select "Unarchive Course", select the courses you want to move to the new server and click on "Unarchive Courses"
  +
  +
  +
==Method 2: Build a new server and move the whole webwork database and courses directory over to the new server==

Revision as of 17:53, 15 March 2021

These instructions explain how to convert the webwork database from the latin1 to the utf8mb4 character set.

Terminal Window Notation

In a terminal window some commands will have to be run as root whereas others should be run as a regular user. We will use # to indicate that the command is to be run as root e.g.

# perl -MCPAN -e shell

and $ to indicate that the command is to be run as a normal user e.g.

$ mysql -u root -p 

Of course you can use sudo to run commands as root from a standard command prompt. In general we will use the same notation used in Installation Manual for 2.12 on Ubuntu 16.04.

Preliminaries

Check What the Current Character Set Is

Before we begin let's make use sure the webwork database is using the latin1 character set.

Log into mysql. Depending on your OS and mysql version, you will either use the command

$ mysql -u root -p 
Enter Password: <mysql root password>

or

$ sudo mysql
[sudo] password for wwadmin: <wwadmin password>

You should see something very similar to

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
...

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

Now issue the following mysql commands:

mysql> Use webwork;
...
Database changed


mysql> SELECT @@character_set_database;

If the webwork database is using the latin1 character set you will see:

+--------------------------+
| @@character_set_database |
+--------------------------+
| latin1                   |
+--------------------------+
1 row in set (0.04 sec)

Independent of the above result, run the following command which will show the collation for every table. The collation specifies the character set.

mysql> SHOW TABLE STATUS FROM webwork;

Looking at the results you will see information on every table, e.g.

| Name                                 | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length   | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+--------------------------------------+--------+---------+------------+-------+----------------+-------------+-------------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| OPL_DBchapter                        | MyISAM |      10 | Dynamic    |   176 |             36 |        6348 |   281474976710655 |        16384 |         0 |            180 | 2021-03-07 16:25:08 | 2021-03-07 16:25:08 | 2021-03-07 16:25:08 | latin1_swedish_ci |     NULL |

From this we see that the OPL_BDchapter table uses the collation latin1_swedish_ci which means it is using the latin1 character set. Below we will present two methods for converting the database from the latin1 to the utf8mb4 character sets, either working table by table or working on the whole database at once. If you chose to use the table by table method, the above information will tell you what tables you need to convert.

Now exit MySQL

mysql> exit
Bye
$

Assuming the webwork database is using the latin1 character set, continue reading these instructions.

Check what the default character set is for MySQL on your new or upgraded server

Log into mysql on your new or upgraded server. Depending on your OS and mysql version, you will either use the command

$ mysql -u root -p 
Enter Password: <mysql root password>

or

$ sudo mysql
[sudo] password for wwadmin: <wwadmin password>

You should see something very similar to

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
...

Now issue the following mysql commands:

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

and you should see

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_0900_ai_ci |
| collation_database       | utf8mb4_0900_ai_ci |
| collation_server         | utf8mb4_0900_ai_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)

mysql> 

Now exit MySQL

 mysql> exit
Bye
$


If your version of MySQL is not using utf8mb4 as listed above, I would strongly suggest that you upgrade MySQL to version 8. You can find out the version of MySQL on your server with the command

$ mysql -V

Version 8 uses the character set utf8mb4 by default. If for whatever reason you can not upgrade to version 8, then you should edit the my.cnf file which is probably in the /etc/mysql/ directory. Actually my.cnf might be redirected to another file (e.g. mysql.cnf) so edit the appropriate file. You will probably have to be root to edit the file (e.g. "sudo gedit mysql.cnf). At the end of the file add the following

[client]
default-character-set=utf8mb4     

[mysql]
default-character-set=utf8mb4

[mysqld]
init-connect='SET NAMES utf8mb4'
character_set_server=utf8mb4

Then save the file and exit. Restart mysql

$ sudo /etc/init.d/mysql restart

and then log into mysql again and repeat the command

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

to check that mysql is now using utf8mb4.

Now any new MySQL tables created in the webwork database will use the utf8mb4 character set. So if you create a new WeBWorK course, it's associated tables will use the utf8mb4 character set. Similarly if you update the OPL (Open Problem Library) by running the OPL-update command, this create all new OPL tables and so they will all the utf8mb4 character set. However any existing courses will have associated tables using latin1. Likewise if you use the admin course methods "Archive Course" and "Unarchive Course" to move a course from a server using the latin1 character set to a server using the utf8mb4 character set, the unarchived course will still have associated tables using latin1.

Backup the webwork database

IMPORTANT: Do not skip this step.

First we create a directory to hold the backup file and cd to the new directory

$ mkdir mysql_backups
$ cd mysql_backups

From now on in these instructions I will assume you are using the older method of logging into mysql as root using a password rather than the newer method of sudoing into mysql. If this is not the case, use sudo wherever the mysql password is used in these instructions.

Now use the mysqldump command to create the backup file.

 $ mysqldump -u root -p webwork > webwork_backup.sql
Enter Password: <mysql root password>

After the process finishes, the file webwork_backup.sql will be located in the mysql_backups directory.

Restore the webwork database

Hopefully this will not be necessary but here is how the restore the backedup webwork database.

First we have to log into MySQL and then drop and recreate the webwork database.

$ mysql -u root -p 
Enter password: <mysql root password>
mysql> drop database webwork;
mysql> CREATE DATABASE webwork;
mysql> GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP, LOCK TABLES ON webwork.* TO 'webworkWrite'@'localhost';

Now exit MySQL

mysql> exit
Bye
$

cd to the mysql_backups directory the contains the backup sql file.

$ cd
$ cd mysql_backups

and use the following command to restore the database

mysql -u root -p webwork < webwork_backup.sql
Enter password: <mysql root password>

Now connect to WeBWorK and everything should be restored.

Maximum Key Length Issue

WeBWorK uses the MyISAM Engine for almost all of its MySQL tables and these tables have a maximum key length of 1000 bytes. In a number of tables in earlier versions of WeBWorK (prior to version 2.15), the maximum key length is set to 255 characters. With the latin1 character set with 1 byte per character, there is no issue. However, with the utf8mb4 character set with 4 bytes per character, the maximum length in bytes is 4 x 255 which is greater than 1000 and is not allowed. Thus all the offending tables must be altered before converting from latin1 to utf8mb4.

The tables that need to be altered are location_addresses, coursename_setting for every course, and most of the OPL tables. Instead of altering the OPL tables directly, we will run OPL-update which will recreate the tables in the proper format. However, we may have to alter the location_addresses table (depending on your situation) and will have to alter the coursename_setting tables (for every course) individually.

Methodology

The reason you are converting from latin1 to utf8mb4 character sets is that you are upgrading from an earlier version of WeBWorK to WeBWorK version 2.15 or above and you want to move your courses (some or all) from the old version of WeBWorK to the new. There are basically three methods of doing this.

  1. Build a new server and use the admin course methods "Archive Course" and "Unarchive Course" to move courses from the old server to the new server. To build your new server, you can build it from scratch following the directions at Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, use a pre built Virtual Machine Image (see Installing_from_WW2.15_Ubuntu20.04_Server Virtual Machine Image) or use a pre built AWS image (see WeBWorK_2.15_Ubuntu_Server_20.04_LTS_Amazon_Machine_Image). This may be the preferred method since your old server will remain fully functioning until you switch over to the new server and you will end up with a new server with up to date versions of the OS, WeBWorK, MySQL and all other components.
  2. Build a new server as above and move the whole webwork database and courses directory over to the new server. This may cause some issues with the admin course and other things that are easy to address.
  3. Update WeBWorK on your current server using git basically following the directions in Release notes for WeBWorK 2.14. The issues here include all the issues in 2 above a a few more.

From the standpoint of converting the webwork database from from latin1 to utf8mb4 cases 2 and 3 are basically identical so we will give instructions covering cases 1 and 2.

Method 1: Build a new server and use the admin course methods "Archive Course" and "Unarchive Course" to move courses from the old server to the new server

  1. Build your new server from scratch following the directions at Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, or use a pre built Virtual Machine Image (see Installing_from_WW2.15_Ubuntu20.04_Server Virtual Machine Image) or use a pre built AWS image (see WeBWorK_2.15_Ubuntu_Server_20.04_LTS_Amazon_Machine_Image). If you are not using Ubuntu, you can still follow Installation_Manual_for_2.15_on_Ubuntu_20.04_Server, just make the obvious changes. Check that WeBWorK is working properly on your new server before transferring old courses to the new server.
  2. On your old server log into the admin course (Course Administration), select "Archive Course", select the courses you want to move to the new server and click on "Archive Courses"
  3. For each course, "Archive Courses" will create a coursename.tar.gz file in the /opt/webwork/courses directory.
  4. Transfer all the coursename.tar.gz files from your old server to your new server (e.g. by sftp) and put them in the /opt/webwork/courses directory on the new server.
  5. # On your new server log into the admin course (Course Administration), select "Unarchive Course", select the courses you want to move to the new server and click on "Unarchive Courses"


Method 2: Build a new server and move the whole webwork database and courses directory over to the new server