DatabaseRequirements

From WeBWorK_wiki
Revision as of 14:19, 22 January 2012 by Aubreyja (talk | contribs)
This article has been retained as a historical document. It is not up-to-date and the formatting may be lacking. Use the information herein with caution.

Requirements for post-GDBM database architecture (with an increasing number of implementation notes)

Basic concepts

Expandability: We should be able to add columns to existing tables and add new tables without coding. (Coding may be required when there's some special behavior associated with the new element.) Also, existing client code should be able to remain ignorant of the new elements, and the database system should "do the right thing".

Lately, we want to be able to add new columns on-the-fly, without any explicit "upgrading" of existing tables. This is harder than it looks.

  • One way to do this would be to have some file somewhere that would list the fields in each table, and at startup time (for certain values of startup), the database system would compare this list to the list of fields in the actual table. If we connect to the database at child init time and leave that handle around for multiple requests, we can do this pretty efficiently. However, the cost would be related to the number of courses, and could get pretty big. I don't know how big, but maybe really big. :-)
  • If we wait for the request to do the check, we only have to do it for the course (and for any non-course-specific tables we happen to have), but I don't think it would be feasible, since we'd have to do the check before every request.

Either tactic above could be improved by recording the "version" of the tables in a particular course in a global course table, and requiring that the version is updated whenever the field list changes. Then we'd be down to finding low values at child init time, or checking a single integer at request time. So with that optimization, I chose updating all out-of-date courses at child init time.

I think we can prevent synchronization issues by initiating a transaction on the course table and only committing it when modifications to the rest of the tables are complete and the table_version field has been updated. I don't think we need to worry about locking the tables we're altering, because we're not changing data in them.
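The version-check idea above can be sketched roughly as follows. This is only an illustration, not the actual implementation: the course table layout, the UPGRADES map, and the per-course "_user" table name are all made up for the example. SQLite is used just because it ships with Python and lets ALTER TABLE run inside a transaction; whether the production database allows that would need checking.

```python
import sqlite3

CURRENT_VERSION = 2

# Hypothetical upgrade steps: the statements needed to reach each version.
UPGRADES = {
    2: ['ALTER TABLE "{course}_user" ADD COLUMN comment TEXT'],
}


def upgrade_out_of_date_courses(conn):
    """Upgrade every course whose table_version is below CURRENT_VERSION."""
    stale = conn.execute(
        "SELECT course_id FROM course WHERE table_version < ?",
        (CURRENT_VERSION,),
    ).fetchall()
    upgraded = []
    for (course_id,) in stale:
        # The transaction stays open until every ALTER is done and
        # table_version has been bumped, so no one sees a half-upgraded course.
        conn.execute("BEGIN IMMEDIATE")
        try:
            # Re-read the version inside the transaction in case another
            # child process upgraded this course in the meantime.
            (version,) = conn.execute(
                "SELECT table_version FROM course WHERE course_id = ?",
                (course_id,),
            ).fetchone()
            for target in range(version + 1, CURRENT_VERSION + 1):
                for statement in UPGRADES.get(target, []):
                    # Sketch only: course_id is assumed to be a safe identifier.
                    conn.execute(statement.format(course=course_id))
            conn.execute(
                "UPDATE course SET table_version = ? WHERE course_id = ?",
                (CURRENT_VERSION, course_id),
            )
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
        if version < CURRENT_VERSION:
            upgraded.append(course_id)
    return upgraded


def make_demo_db():
    """Build an in-memory database with one stale and one current course."""
    # isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE course (course_id TEXT PRIMARY KEY, table_version INTEGER)"
    )
    conn.execute("INSERT INTO course VALUES ('calc101', 1)")
    conn.execute("INSERT INTO course VALUES ('stat200', 2)")
    conn.execute('CREATE TABLE "calc101_user" (user_id TEXT)')
    conn.execute('CREATE TABLE "stat200_user" (user_id TEXT, comment TEXT)')
    return conn
```

Calling upgrade_out_of_date_courses(make_demo_db()) touches only the course whose version is low. A second call finds nothing to do, which is the "checking a single integer" cost described above.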

Specific design decisions

Permission system: Instead of having a numeric "permission level" for each user, we should have a list of logical privileges for each user (like the stuff that's in the %permissions hash in global.conf). Each user could have multiple privileges assigned to them. Privileges could be grouped into "privilege groups" that could be assigned all in one shot.
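As a rough sketch of what this might look like (the privilege and group names here are invented for illustration and are not taken from global.conf), a user's effective privileges would be the union of whatever they hold directly and whatever their groups grant:

```python
# Hypothetical privilege groups; the names are illustrative only.
PRIVILEGE_GROUPS = {
    "student": {"login", "submit_answers"},
    "ta": {"login", "submit_answers", "view_answers", "score_sets"},
}


class User:
    def __init__(self, user_id, groups=(), privileges=()):
        self.user_id = user_id
        self.groups = set(groups)          # groups assigned "in one shot"
        self.privileges = set(privileges)  # privileges assigned individually

    def effective_privileges(self):
        """Union of individually assigned and group-granted privileges."""
        result = set(self.privileges)
        for group in self.groups:
            result |= PRIVILEGE_GROUPS.get(group, set())
        return result

    def has_privilege(self, name):
        return name in self.effective_privileges()
```

With this shape, client code asks has_privilege("view_answers") instead of comparing a numeric level, so adding a new privilege doesn't disturb existing checks.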

Attempt recording: We'd like to move the "attempt log" into the database. Each attempt should be linked to a particular version of a problem, and should record a timestamp, what the user entered, and the score that the user received on that attempt.

Problem versions: We need the ability to provide students with multiple "versions" of a single problem. Users can get additional practice, or verify that they understand the concept underlying the problem, by attempting multiple versions.

Group selection: We want to be able to, on a per-version basis, select a PG source file from a group as well as generating a problem seed. Several ways to do this have been discussed, including:

  • We could handle this entirely from within the .pg language by adding an include function (which may already exist in some form). The .pg "group problem" file itself could randomly select which of a number of source files to render and return. -- Main.MichaelGage
    • This would work, but we want professors to be able to do group problem selection without looking at PG code at all. -- Main.SamHathaway - 21 Oct 2004
  • The solution we came up with is to allow a "problem" object to actually represent a group of problems, by using a specially-formatted string for the problem source. Each time a new version is created, a particular problem source is selected. So now we've got a situation in which a "problem" object represents a group of possible problems, not a single problem. This is primarily a philosophical problem. -- Main.SamHathaway
    • Instead of using a specially-formatted source file string, we can break it out into multiple fields. -- Main.SamHathaway - 21 Oct 2004
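A minimal sketch of the "problem object represents a group" approach, using a separate field for the list of sources rather than a specially-formatted string. The class names, field names, and seed range are assumptions for illustration:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    problem_id: int
    # One entry is an ordinary problem; several entries make it a group.
    source_files: tuple


@dataclass(frozen=True)
class Version:
    problem_id: int
    source_file: str
    seed: int


def create_version(problem, rng=random):
    """Pick a source file from the group and a seed for a new version."""
    source_file = rng.choice(problem.source_files)
    seed = rng.randint(0, 9999)  # the seed range is an arbitrary choice here
    return Version(problem.problem_id, source_file, seed)
```

An ordinary problem is just a group of one, so client code doesn't need a special case: create_version(Problem(2, ("only.pg",))) always selects "only.pg".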

Assignments vs Versions: We need to separate the idea of an "assignment" (which links a problem and a user) from the idea of a "version" (which represents a unique problem seed and results in a unique version of the problem). Once this is separated, we can do nifty things like generating multiple versions of an assigned problem on-the-fly or giving the user the ability to attempt multiple versions (for practice). This also makes it easier to "kick" a poorly-designed problem if a particular problem seed is causing problems. Each attempt would be linked to a particular version of the problem, so that if a version caused problems, those attempts could be invalidated (or deleted altogether). Each version could have a weight associated with it (or at least a "score this attempt" flag).
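One way the assignment/version/attempt split might look as tables is sketched below. The table and column names are assumptions, and taking the best attempt as the score is only one possible rule. The "scored" column plays the role of the "score this attempt" flag, and kicking a version just clears it, so the attempts are kept but ignored.

```python
import sqlite3

SCHEMA = """
CREATE TABLE assignment (
    assignment_id INTEGER PRIMARY KEY,
    user_id       TEXT NOT NULL,
    problem_id    INTEGER NOT NULL,
    UNIQUE (user_id, problem_id)
);
CREATE TABLE version (
    version_id    INTEGER PRIMARY KEY,
    assignment_id INTEGER NOT NULL REFERENCES assignment(assignment_id),
    problem_seed  INTEGER NOT NULL,
    scored        INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE attempt (
    attempt_id INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES version(version_id),
    timestamp  INTEGER NOT NULL,
    answer     TEXT NOT NULL,
    score      REAL NOT NULL
);
"""


def kick_version(conn, version_id):
    """Stop counting a bad version without deleting its attempts."""
    conn.execute("UPDATE version SET scored = 0 WHERE version_id = ?", (version_id,))


def best_score(conn, assignment_id):
    """Best score over attempts on versions that are still being scored."""
    row = conn.execute(
        """
        SELECT MAX(a.score)
        FROM attempt AS a
        JOIN version AS v ON v.version_id = a.version_id
        WHERE v.assignment_id = ? AND v.scored = 1
        """,
        (assignment_id,),
    ).fetchone()
    return row[0] if row[0] is not None else 0.0
```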

Multiple problem languages: We should be able to support problems written in multiple languages, perhaps with a language field that would determine which renderer would be used. This requires that any language-specific data (particularly sticky answers) either be stored opaquely or be stored in "extension" tables (i.e. problem would store data shared by problems of all languages, and "problem_pg" would store PG-specific problem data). Of course, this capability requires additional infrastructure elsewhere. PG.pm would be replaced by something like Renderer.pm, which would have subclasses for each language. Or something.

  • This doesn't have to be factored into the initial design -- it can be added later. -- Main.SamHathaway - 21 Oct 2004
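For what it's worth, the "Renderer with subclasses for each language" idea could be sketched like this. The class names, the dictionary keys, and the placeholder output are all hypothetical, and no real PG rendering happens here:

```python
class Renderer:
    """Base class; each subclass registers itself under its language name."""

    registry = {}
    language = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.language is not None:
            Renderer.registry[cls.language] = cls

    @classmethod
    def for_problem(cls, problem):
        """Choose a renderer from the problem's language field."""
        language = problem["language"]
        if language not in cls.registry:
            raise ValueError(f"no renderer for language {language!r}")
        return cls.registry[language]()

    def render(self, problem):
        raise NotImplementedError


class PGRenderer(Renderer):
    language = "pg"

    def render(self, problem):
        # Placeholder for handing the source file to the real PG translator.
        return f"<rendered PG from {problem['source_file']}>"
```

Supporting another language would then mean adding one subclass, with no change to the code that calls Renderer.for_problem.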

Lazy version creation: I'd like to make assignment a lightweight thing, involving a small number of database records (like "one"). Ideally, I'd like to see user-specific records created on demand, instead of as part of a "building" or "assigning" activity. Part of this is deciding problem seeds only when needed (see above), but another part might be having properties of a problem set "trickle down" to the problems in the set. If the set is assigned to a user, all the problems in the set will be as well, unless otherwise noted. A user-specific record could be created if needed (to override some global value, for example), but it wouldn't represent an "assignment".

  • The "on demand" set creation is essentially what we're doing with versioning as I'm trying to create it for Gateway tests. When the student enters the set and is allowed to create a version, Instructor::assignSetVersionToUser is called to create a new set version for her/him. In this case the student has to have a non-versioned set created for her/him before the version can be created, but the idea seems to me to be the same. -- Main.GavinLaRose
    • This seems reasonable. There should be some abstraction, of course, so a client module can ask for the current version and one will be created if appropriate. -- Main.SamHathaway - 21 Oct 2004
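The abstraction mentioned in the last comment could look something like the sketch below, where a client asks for the current version and one is created only if none exists yet. The VersionStore class and its in-memory storage are stand-ins for illustration, not the Instructor::assignSetVersionToUser code path:

```python
import random


class VersionStore:
    """In-memory stand-in for the version table, keyed by user and problem."""

    def __init__(self, rng=None):
        self._versions = {}  # (user_id, problem_id) -> list of seeds
        self._rng = rng or random.Random()

    def version_count(self, user_id, problem_id):
        return len(self._versions.get((user_id, problem_id), []))

    def new_version(self, user_id, problem_id):
        """Explicitly create another version, e.g. for extra practice."""
        seed = self._rng.randint(0, 9999)
        self._versions.setdefault((user_id, problem_id), []).append(seed)
        return seed

    def current_version(self, user_id, problem_id):
        """Return the latest seed, creating the first version on demand."""
        versions = self._versions.get((user_id, problem_id))
        if not versions:
            return self.new_version(user_id, problem_id)
        return versions[-1]
```

Nothing user-specific exists until the first call to current_version, which is the point: assigning a set wouldn't need to create a record per user per problem.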

Global override of user overrides: This is probably too complicated UI-wise, but it would be neat to be able to have global values override user-specific ones. For example, when a professor decides to give everyone credit for a particular problem, she can either drop the weight of the problem to 0, which is confusing to students, or give each student a score of 1, which clobbers their existing scores. If there were global overrides applied after user-specific ones, the individual students' scores could be preserved (but ignored) in favor of the global one. (I guess this is less of an issue if we have a complete attempt history.)

  • At the very least we need global resets -- for example to reset all due dates to some new value. The difference between global overrides and a global reset is that the latter involves inspecting each user record and either erasing it so that the global parameter takes effect or resetting it to the global value (the latter is what is done in WW1.9, but is probably not the preferred action). In the particular example above the data for the score and the data for number of correct and incorrect attempts are separate, so even resetting the score would not result in the loss of data -- there would still be a record of the number of correct and incorrect attempts. -- Main.MichaelGage
    • I think we can live with implementing global resets in the UI. -- Main.SamHathaway - 21 Oct 2004
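To make the difference between the two concrete, here is a small sketch. The function names and the dictionary-based records are assumptions for illustration. A global override is applied last and leaves the user's own value stored but ignored, while a global reset erases the user-specific values so the global default takes effect again.

```python
def effective_value(field, global_defaults, user_overrides, global_overrides):
    """Resolve a field: global override, then user override, then default."""
    if field in global_overrides:
        return global_overrides[field]
    if field in user_overrides:
        return user_overrides[field]
    return global_defaults[field]


def global_reset(field, all_user_overrides):
    """Erase every user's override of a field (destructive, unlike an override)."""
    for overrides in all_user_overrides.values():
        overrides.pop(field, None)
```

For example, with a global override of {"score": 1.0}, a student whose stored score is 0.4 is reported as 1.0 while the 0.4 stays in their record. After global_reset("due_date", users), every user falls back to the default due date and the old per-user dates are gone.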