WeBWorK Main Forum

Primary keys

Primary keys

by Jeremy Lounds -
Number of replies: 10
I thought it would be best to start a new thread specifically about the primary keys issue, as discussed toward the end of this tread http://webwork.maa.org/moodle/mod/forum/discuss.php?d=4350#p13410

I have been looking at the files in webwork2/lib/WeBWorK/DB/Record and comparing to the tables in MySQL after using the web interface "Add Course" form.

How is the primary key specified in the DB/Record file? Obviously, I see the "key=>1" on certain fields/columns, but take the following example.

Setting.pm

name => { type=>"VARCHAR(255) NOT NULL", key=>1 }

User.pm

user_id => { type=>"TINYBLOB NOT NULL", key=>1 }

In MySQL, in table "course_name_setting", the "name" column is a primary key. In table "course_name_user", the "user_id" column is a unique index, but not a primary key.

Additionally, in UserSet.pm, it is more confusing to me

user_id => { type=>"TINYBLOB NOT NULL", key=>1 },
set_id => { type=>"TINYBLOB NOT NULL", key=>1 },
psvn => { type=>"INT UNIQUE NOT NULL AUTO_INCREMENT" }

The column "psvn" isn't set as a key in the definition above, but yet it is the primary key in the table.

I am happy to help, but I should probably understand what is going on currently before I try to do that. smile

Jeremy

In reply to Jeremy Lounds

Re: Primary keys

by Jeremy Lounds -
Related question -- how do I "upgrade" the 3 tables that are not part of the courses? (depths, locations and location_addresses)

E.g., I added a column by editing lib/WeBWorK/DB/Record/Locations.pm

uniq_id => { type=>"INT UNIQUE NOT NULL AUTO_INCREMENT" },

But I cannot find a script or page in the admin area to upgrade these tables.

So, I tried dropping all tables, and removing all folders (except modelCourse) from /opt/webwork/courses and ran the "addcourse admin" command from the installation guide. The tables were created, but the new column I added was not added.

What am I missing? Thank you!
In reply to Jeremy Lounds

Re: Primary keys

by Michael Gage -
I'll help where I can here.

First some history. The database infrastructure has not been overhauled in some time. It was created largely by Sam Hathaway, an undergraduate and later an alumnus of University of Rochester, working "under my direction" which means I have a vague idea of what he was doing and what he did-- but essentially all of the database design (and redesign) is his. At the time we were supporting at least two database systems GDBM, which we used until mysql was invented, and after that mysql. There remains more than a little legacy code which is no longer needed, to support GDBM for example, but being aware of the history helps explain some of the weirdest idiosyncrasies. I know that at one point Sam decided that he had over abstracted the design and the code was refactored to run more quickly -- that also means that some of the code .pm modules may not be used but were kept around in case they or the ideas in them were needed someday. (This was also before git -- so we didn't have easy access to old versions of the code.)

Two more places of interest: There is some documentation of the construction of the SQL code base in https://github.com/openwebwork/webwork2/tree/master/doc/devel (Sam kept good notes) -- but it should be taken with a grain of salt since there may have been subsequent changes by others that were not so well documented.

Secondly the schema is defined in webwork2/conf/database.conf.dist .
(You can create a database.conf in which case that will be read instead of database.conf.dist but so few people made local modifications that database.conf.dist is read automatically if there is no local database.conf file.)

You have probably figured all of this out already. I'm glad you are working on this project. Overhauling the SQL code has long been on the backburner -- both to clean out the cruft that is no longer used and to make it possible to use postgresql as a database -- another goal that got stalled. I hope that while you are working on your project you can keep your eyes open for ideas that would allow us to make it easier for the next person to make upgrades to the database code.

Now to your current question. Danny Glin and Gavin LaRose did more recent work (2007) on adding the location table and the proctor capability to WeBWorK. The code for running it is
in ContentGenerator/CourseAdmin.pm around line 2548. This handles the front end. WeBWorK/Utils/CourseManagement.pm contains the operational code.

I believe the location tables are created by the function initNonNativeTables in CourseManagement.pm You are not creating new tables so I don't think you need to adjust database.conf.dist. This function is called at line 71 of CourseAdmin. Your new fields should be added around line 1120 of CourseManagement. I think they are added the first time you run the admin page after a change in the schema.

Hope this helps. Good luck and respond back with results or if you have more questions.

--------------
Having dumped all of this information on you and rereading it -- it seems to me that perhaps the problem is the SQL entry. (Check the sql error logs -- I suspect that WW fails silently on this kind of thing -- another bug.)

For example I see "unique key" in examples creating databases but I don't
see "unique" by itself. Try taking out "unique" and see if the tables are built.
You will need to remove the field from the table manually or WW will not try to add the field.

Further information -- the phrase "unique key..." is added for other tables by code in DB/Schema/NewSQL/Std.pm The key fields to use are determined by a method in DB/Records.pm and triggered by the item key=>1 in the record specification.

So

uniq_id => { type=>INT NOT NULL AUTO_INCREMENT",key=>1}

might do exactly what you want.


-- Mike



In reply to Jeremy Lounds

Re: Primary keys

by Jeremy Lounds -
Well, I ran into an issue. After adding the following line to lib/WeBWorK/DB/Record/User.pm

uniq_id => { type=>"INT UNIQUE NOT NULL AUTO_INCREMENT" },

and upgrading the admin course, the Classlist Editor page gives the following error (and no classlist)

Error messages

Can't use an undefined value as a HASH reference at /opt/webwork/webwork2/lib/WeBWorK/ContentGenerator/Instructor/UserList2.pm line 1805.

Call stack

The information below can help locate the source of the problem.

  • in WeBWorK::ContentGenerator::Instructor::UserList2::printTableHTML called at line 554 of /opt/webwork/webwork2/lib/WeBWorK/ContentGenerator/Instructor/UserList2.pm
  • in WeBWorK::ContentGenerator::Instructor::UserList2::body called at line 155 of /opt/webwork/webwork2/lib/WeBWorK/Template.pm
  • in WeBWorK::Template::template called at line 562 of /opt/webwork/webwork2/lib/WeBWorK/ContentGenerator.pm
  • in WeBWorK::ContentGenerator::content called at line 232 of /opt/webwork/webwork2/lib/WeBWorK/ContentGenerator.pm
  • in WeBWorK::ContentGenerator::go called at line 384 of /opt/webwork/webwork2/lib/WeBWorK.pm
...I just noticed a reply to my other question, but have not read it yet, maybe that will answer this too :-)
In reply to Jeremy Lounds

Re: Primary keys

by Nathan Wallach -
lib/WeBWorK/ContentGenerator/Instructor/UserList2.pm
defines a constant (hash reference of hash references) FIELD_PROPERTIES which it uses to understand the data in the user table.

It seems you need to add another internal hash to tell this part of the code how to handle the new uniq_id field you just added.

Something like:

uniq_id => {
access => 'hidden',
},

should probably do what is needed.

Good luck
In reply to Nathan Wallach

Re: Primary keys

by Jeremy Lounds -
Thank you so much, that worked! (after I restarted Apache)
In reply to Jeremy Lounds

Re: Primary keys

by Michael Gage -
restarting apache is essential after changes in any .pm file. because .pm files are cached by mod_perl. Sometimes a restart is required after changes in .pl files as well, although most of those are read with every request).

In reply to Jeremy Lounds

Re: Primary keys

by Nathan Wallach -
I just took a look, out of curiosity, as I had looked at the database code recently when I wanted to get utf8mb4 working in the database.

1. I'm not sure how the primary keys are set. From what I can tell, no current regular WW code is declaring any primary key. I suspect that MariaDB/mySQL may be automatically upgrading the (first) "UNIQUE KEY" to be a primary key at some point, but cannot find any reference to how/why that would happen.

2. In practice, I do also see some primary keys in 3 of the course tables (per course):
  • courseName_setting
  • courseName_past_answer
  • courseName_set_user
but most tables do not have a primary key while several do have at least one unique key.

However, running "show create table tableName;" only shows a "UNIQUE KEY" in the "create table" command for these tables. The recorded "create table" syntax corresponds to what the _create_table_stmt() method in lib/WeBWorK/DB/Schema/NewSQL/Std.pm has. It only has code which makes the "first" key settings (all the key fields together) a "UNIQUE KEY".

3. There is older code in both lib/WeBWorK/Utils/DBUpgrade.pm and doc/devel/new-DB-sql which does include SQL code defining a primary key.

4. About "psvn" - from what I read (link below) - the UNIQUE constraint on a column automatically adds an index on the column. This at least explains why psvn is an index in courseName_set_user table. The other fields which are indexed on this table use repeated values, so cannot be in a primary key, while psvn is non-null and set with the UNIQUE constraint. Thus, if mySQL is automatically upgrading some index on this table to be a primary key, it can only be psvn.
5. About the 3 non-course tables, I recently discovered where the code is which triggers there creation. That happens in the initNonNativeTables() subroutine in lib/WeBWorK/Utils/CourseManagement.pm and when I changed some field lengths and had the tables recreated (probably by logging in to the admin course) the modified field definitions were used. (If you are testing on Docker, make sure you are mounting your "local" webwork2 and not just using what is inside the initial Docker image.)
In reply to Nathan Wallach

Re: Primary keys

by Jeremy Lounds -
This is all very helpful info.

I tried to move a course from a production server using the [ archive / unarchive / upgrade ] process, and it failed because some of the tables I didn't expect a primary key on had a primary key! (e.g., table "course_name_user", column "user_id")

This would support the idea that somewhere, somehow, the first unique key/index is being changed to an actual primary key in MySQL.

(for comparison, when I add a new course on the production server, which is stock 2.13, and the "course_name_user.user_id" is not a primary key)

So confusing! I am very hesitant to move ahead with our plans for a MySQL cluster until I can wrap my head around this.

Thanks all!

Jeremy

In reply to Jeremy Lounds

Re: Primary keys

by Michael Gage -
Here is the official reference from mysql about primary key/index, unique key/index, etc. (apparently key and index are interchangeable). Also primary key is sometimes just written as key.

I would like to have a field that only identifies a row -- and has no other meaning. I think PRIMARY key is the right idea for this but maybe UNIQUE key is enough. "If a table has a PRIMARY KEY or UNIQUE NOT NULL index that consists of a single column that has an integer type, you can use _rowid to refer to the indexed column in SELECT statements, as described in Unique Indexes."

As far as I can tell the primary key is just a unique, non-null key with a distinctive name. I want the properties, integer valued, single column, unique and non-null.

Some code in CourseManagement would have to modified so that the entry primary_key =>1 in a field definition would create the right sql code. (or just naming the field using primary_id => {INT NULL UNIQUE (ascending?) }
or something similar)

I would also like an automatic timestamp field.

There have been many times when the existence of these two items in a row in each data table would have simplified coding and perhaps sped up performance.


In reply to Jeremy Lounds

Re: Primary keys

by Michael Gage -
Can you explain the failure when you move a course from the production server -- why/how does it fail? Does adding a designation of PRIMARY cause a problem?
The archive/unarchive process uses mysqldump so that could be a source of the modification. I don't see anywhere else the term PRIMARY could be added outside of the OPL tables.

The OPL software uses PRIMARY key. As far as I can tell none of the current table creation software uses PRIMARY anywhere. There are legacy codes, e.g. in DBupgrade that use PRIMARY key. I don't know why the current versions for building course tables do not use a PRIMARY key.

The way PRIMARY key is used in the OPL software is not the way I would prefer in the sense that it designates a column with unique but meaningful entries (e.g. an author's email or the path to the problem) where as I would like in addition a primary field containing only a number that just indicates the row (an auto-inc type row).