DatabaseSchemaV3

From WeBWorK_wiki
Revision as of 10:54, 20 February 2008 by Sam (talk | contribs) (New page: {{Historical}} = New Database Schema (version 3 or so) = <nowiki> CREATE DATABASE wwdbv3; </nowiki> ''<b>Main features:</b>'' * Global ''<b>user</b>'' table and course-specific ''<b>par...)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search
This article has been retained as a historical document. It is not up-to-date and the formatting may be lacking. Use the information herein with caution.

New Database Schema (version 3 or so)

CREATE DATABASE wwdbv3;

Main features:

  • Global user table and course-specific participant tables eliminate duplication of user data.
  • Separation of user-specific data and global-override data.
  • version table allows set and problem versioning.
  • attempt table keeps a complete record of problem attempts, per-version.
  • section, recitation, and status tables enumerate the possible values for these fields.
  • role table represents a participant's role in the course (i.e. "instructor", "teaching assistant", "student").
  • Privileges are associated with roles via the role_privilege table (replacing the "permission level" mechanism).

Notes:

  • id fields are auto-increment serial numbers used for uniquely identifying records, defining relations between tables, etc. name fields are human-readable strings identifying a record, and must be unique w.r.t. their parent object. I'd like to be able to use name fields in URLs. With %nn-encoding, I don't think we need to restrict the characters used past just specifying that they be printable and deciding on a character set.
  • I'm going to talk a little bit about the algorithms for assignment, override, and version creation in the appropriate sections of this file. I know that's not strictly part of the schema, but it's useful to talk about.
  • In field names, a date field represents a time and date (type DATETIME), and a time field represents an abstract time in the record creation process (like "assignment time" or "version creation time").
  • NULL values are allowed only in cases where having an undefined value makes sense, i.e. if an override field is NULL, the abstract value will be used.
  • To easily extract just the verbatim lines from this file:
    perl -ne '/^<\/verbatim>/ and $p=0; $p and print $_; /^<verbatim>/ and $p=1;'

---

System stuff

setting

A simple key-value-pair store for system settings. There is some stuff that probably belongs here instead of in global.conf. CREATE TABLE setting ( name VARCHAR(255) PRIMARY KEY, val TEXT );

  • This table must have at least the fields name and val in all versions of the database schema. It must also contain a record in which name is 'db_version' and val is a positive integer representing the version of the database schema.

INSERT INTO setting (name,val) VALUES ('db_version','1');

equation_cache

A table used by the WeBWorK::PG::EquationCache module to map <nop>TeX strings to images rendered with dvipng. CREATE TABLE equation_cache ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, tex TEXT NOT NULL, width INT UNSIGNED NOT NULL, height INT UNSIGNED NOT NULL, depth INT NOT NULL, INDEX ( tex(255) ) );

  • Records are looked-up by tex string. Image names are based on id.
  • We can't make tex the PRIMARY KEY or make it UNIQUIE, because there may be two strings where the first 255 characters are identical.

---

Course management

course

Lists all courses on the system. CREATE TABLE course ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, visible BOOL NOT NULL DEFAULT 1, locked BOOL NOT NULL DEFAULT 0, archived BOOL NOT NULL DEFAULT 0, UNIQUE ( name(255) ), INDEX ( visible ) );

  • visible controls whether the course will be displayed in lists.
  • locked controls whether the course is accessible.
  • archived controls whether the records associated with this course are available in the main tables or are archived in a parallel table structure for archived courses. This allows inactive courses to be removed from consideration, speeding up access and modification of active courses.

---

User management

user

Keep track of informaton about particular individuals, like name, email address, password. This information is not course-specific, and this table avoids the problem of synchronizing passwords, email addresses, etc. among courses. It also allows people to login before selecting a course to view a list of courses in which they are enrolled. CREATE TABLE user ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, first_name VARCHAR(255) NOT NULL, last_name VARCHAR(255) NOT NULL, email_address TEXT NOT NULL, student_id VARCHAR(255) NOT NULL, login_id VARCHAR(255) NOT NULL, password VARCHAR(255) NOT NULL, display_mode VARCHAR(255), show_old_answers BOOL, INDEX ( last_name(32), first_name(32) ), UNIQUE ( login_id(255) ) );

  • email_address can't be VARCHAR since the domain part alone can be 255 characters.
  • student_id is an institution-dependent identification string. This can be used when importing scoring data into the institution's grading system, or for looking up passwords in the institution's authentication system.
  • display_mode and show_old_answers denote defaults for this user that override course defaults. If NULL the course defaults are used.

status

Enumerate the available statuses. Status control whether users are allowed to access a course, are included in statistics, etc. CREATE TABLE status ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED, name VARCHAR(255) NOT NULL, allow_course_access BOOL NOT NULL DEFAULT 1, include_in_assignment BOOL NOT NULL DEFAULT 1, include_in_stats BOOL NOT NULL DEFAULT 1, include_in_scoring BOOL NOT NULL DEFAULT 1, INDEX ( course ), UNIQUE ( course, name(255) ) INDEX ( include_in_assignment, course ), INDEX ( include_in_stats, course ), INDEX ( include_in_scoring, course ) );

  • If course is NULL, the status is valid for all courses. Otherwise, the status is valid only for the indicated course.

role

Enumerate the available roles. Roles are used to link sets of users to sets of privileges. CREATE TABLE role ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED, name VARCHAR(255) NOT NULL, privs TEXT, INDEX ( course ), UNIQUE ( course, name(255) ) );

  • If course is NULL, the role is valid for all courses. Otherwise, the role is valid only for the indicated course.
  • privileges is a comma-separated list of privileges interpreted by WeBWorK.
  • Roles can be used to get a list of all participants in a course or section or reciation and then display them by role, giving you "instructors", "TAs", "Students", etc.


section

Enumerate the sections available in a course. CREATE TABLE section ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED NOT NULL, name VARCHAR(255) NOT NULL, INDEX ( course ), UNIQUE ( course, name(255) ) );

recitation

Enumerate the recitations available in a course. CREATE TABLE recitation ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED NOT NULL, name VARCHAR(255) NOT NULL, INDEX ( course ), UNIQUE ( course, name(255) ) );

participant

Indicates that a user is a participant in a course. Contains course-specific user data. CREATE TABLE participant ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED NOT NULL, user INT UNSIGNED NOT NULL, status INT UNSIGNED NOT NULL, role INT UNSIGNED NOT NULL, section INT UNSIGNED, recitation INT UNSIGNED, last_access DATETIME NOT NULL, comment TEXT, INDEX ( course ), UNIQUE ( user, course ), INDEX ( status ), INDEX ( role ), INDEX ( section, course ), INDEX ( recitation, course ), INDEX ( last_access, course ) );

  • If section or rectation is NULL, the participant is not a member of a section or recitation.
  • last_access is a timestamp updated whenever this user interacts with this course. This can be used to get a list of "recently active" users on a per-course basis.

---

Abstract sets and problems

Abstract (formerly "global") sets and problems represent a homework set as defined by the instructor and are independent of any user.

abstract_set

CREATE TABLE abstract_set ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, course INT UNSIGNED NOT NULL, name VARCHAR(255) NOT NULL, set_header TEXT, hardcopy_header TEXT, open_date DATETIME NOT NULL, due_date DATETIME NOT NULL, answer_date DATETIME NOT NULL, published BOOL NOT NULL DEFAULT 1, problem_order TEXT NOT NULL, reorder_type ENUM('none','random') NOT NULL DEFAULT 'none', reorder_subset_size INT, reorder_time ENUM('assignment','version_creation') NOT NULL DEFAULT 'version_creation', atomicity ENUM('problem','set') NOT NULL DEFAULT 'problem', max_attempts_per_version INT UNSIGNED, version_creation_interval INT UNSIGNED, versions_per_interval INT UNSIGNED, version_due_date_offset INT UNSIGNED, version_answer_date_offset INT UNSIGNED, UNIQUE ( course, name(255) ), INDEX ( open_date ), INDEX ( due_date ), INDEX ( answer_date ), INDEX ( published ), INDEX ( atomicity ) );

  • A NULL value in set_header or problem_header indicates that the course/system default should be used. An empty string indicates that no header should be used.
  • set_header and problem_header can't be VARCHAR fields since they can potentially contain pathnames longer than 255 characters.
  • The problem_order field will contain a comma-separated list of abstract problem IDs, representing the natural ordering for the problems in the set.
  • Reordering is controlled by three fields:
  • reorder_type controlls whether problem_order is shuffled or not.
  • reorder_subset_size indicates the number of items to select from the possibly shuffled problem order.
  • reorder_time can be 'assignment' or 'version_creation', and controlls when reordering happens -- when a problem is assigned (each user will get a different ordering) or when each version is created (each version will be different).
  • atomicity determines if the entire set is to be treated as a unit or if each problem is dealt with separately for the following things:
    • Whether to display and accept attempts on problems individually or as a single unit.
    • Whether to use the max_attempts_per_version, version_creation_interval, versions_per_interval, version_due_date_offset, and version_answer_date_offset fields in the abstract_set record or in each abstract_problem record.
    • Whether to allow versions to be created for individual problems, or only for all problems at once.
  • A NULL value in reorder_subset_size indicates that all problems should be included.
  • A NULL value in max_attempts_per_version indicates an unlimited number of attempts per version.
  • A NULL value in version_creation_interval indicates that the version creation interval is infinite. That is, only versions_per_interval versions may be created throughout the lifetime of the problem set.
  • A NULL value in versions_per_interval indicates that an unlimited number of versions may be created per interval. (The value of version_creation_interval does not matter when this field is NULL.)
  • NULL values in version_due_date_offset or version_answer_date_offset mean that the due date or answer date for each version should simply be the abstract set's due or answer date.

abstract_problem

CREATE TABLE abstract_problem ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, abstract_set INT UNSIGNED NOT NULL, name VARCHAR(255), source_type ENUM('file','group') NOT NULL DEFAULT 'file', source_file TEXT, source_group_set_id INT UNSIGNED, source_group_select_time ENUM('assignment','version_creation'), weight FLOAT UNSIGNED NOT NULL DEFAULT 1, max_attempts_per_version INT UNSIGNED, version_creation_interval INT UNSIGNED, versions_per_interval INT UNSIGNED, version_due_date_offset INT UNSIGNED, version_answer_date_offset INT UNSIGNED, INDEX ( abstract_set ), INDEX ( source_group_set_id ), UNIQUE ( abstract_set, name(255) ) );

  • source_type controls where the problem source comes from.
  • If source_type is 'file', source_file contains the path to the file and cannot be NULL.
  • If source_type is 'group':
    • source_group_set_id contains the ID of another abstract set. (This field must be checked for loops.)
    • source_group_select_time indicates whether group selection is done at assignment time ('assignment') or version creation time ('version_creation').
    • Neither field can be NULL.
  • source_file can't be a VARCHAR field since it can potentially contain a pathname longer than 255 characters.
  • If the abstract problem's abstract set's atomicity is 'problem', the fields max_attempts_per_version, version_creation_interval, versions_per_interval, version_due_date_offset, and version_answer_date_offset are used in lieu of the corresponding fields in the abstract set. A NULL value in any of these fields has the same meaning as above.

---

Set and problem assignments

Record the association between an abstract set or problem and a participant. Indicates that the set is assigned to that participant and contains set data specific to a user.

set_assignment

CREATE TABLE set_assignment ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, abstract_set INT UNSIGNED NOT NULL, participant INT UNSIGNED NOT NULL, problem_order TEXT, UNIQUE ( abstract_set, participant ), INDEX ( participant ) );

  • If the associated abstract set's reorder_time field is set to 'assignment', this record's problem_order field will be set to the result of problem reordering (see above). Otherwise, this field will be NULL and the abstract set's problem order will be used.

problem_assignment

CREATE TABLE problem_assignment ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, set_assignment INT UNSIGNED NOT NULL, abstract_problem INT UNSIGNED NOT NULL, source_file TEXT, UNIQUE ( set_assignment, abstract_problem ), INDEX ( abstract_problem ) );

  • If the associated abstract problem's source_type field is 'group' and its source_group_select_time field is 'assignment', this record's source_file field will be set to the result of group selection (see above). Otherwise, this field will be NULL.

---

Set and problem overrides

Set and problem overides allow the data in an set assignment or problem assignment and its abstract set or problem to be overridden on a per-section, -recitation, or -participant basis. Only one of section, recitation, and participant may be non- NULL in a given record.

set_override

CREATE TABLE set_override ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, abstract_set INT UNSIGNED NOT NULL, section INT UNSIGNED, recitation INT UNSIGNED, participant INT UNSIGNED, set_header TEXT, hardcopy_header TEXT, open_date DATETIME, due_date DATETIME, answer_date DATETIME, published ENUM('false','true'), problem_order TEXT, reorder_type ENUM('none','random'), reorder_subset_size INT, atomicity ENUM('problem','set'), max_attempts_per_version INT, version_creation_interval INT, versions_per_interval INT, version_due_date_offset INT, version_answer_date_offset INT, INDEX ( abstract_set ), INDEX ( section ), INDEX ( recitation ), INDEX ( participant ), UNIQUE ( abstract_set, section, recitation, participant ) );

  • If any override field is NULL, the value in the corresponding abstract set will be used.
  • Overrides are applied after assignment, so a reorder_time field is not needed here, and reorder_type and reorder_subset_size are only used when the abstract set's reorder_time field is set to 'version_creation'.

problem_override

CREATE TABLE problem_override ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, abstract_problem INT UNSIGNED NOT NULL, section INT UNSIGNED, recitation INT UNSIGNED, participant INT UNSIGNED, source_type ENUM('file','group'), source_file TEXT, source_group_set_id INT UNSIGNED, weight FLOAT UNSIGNED, max_attempts_per_version INT UNSIGNED, version_creation_interval INT UNSIGNED, versions_per_interval INT UNSIGNED, version_due_date_offset INT UNSIGNED, version_answer_date_offset INT UNSIGNED, INDEX ( abstract_problem ), INDEX ( section ), INDEX ( recitation ), INDEX ( participant ), UNIQUE ( abstract_problem, section, recitation, participant ) );

  • If any override field is NULL, the value in the corresponding abstract problem will be used.
  • Overrides are applied after assignment, so a source_group_select_time field is not needed, and source_type, source_file, and source_group_set_id are only used when the abstract set's source_group_select_time field is set to 'version_creation'.

---

Set and problem versions

Set and problem versions allow for variations in the problem source file chosen, the seed used, and the order that problems are presented. They also facilitate time limits on the submission of attempts. Each attempt is associated with a problem version, not a problem assignment, so that if a source file or seed is changed, the attempt data remains consistent.

set_version

CREATE TABLE set_version ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, set_assignment INT UNSIGNED NOT NULL, problem_order TEXT, creation_date DATETIME NOT NULL, INDEX ( set_assignment ) );

  • problem_order is set to the result of problem reordering if the abstract set's reorder_time is 'version_creation'. Otherwise, it is NULL.
  • creation_date is used along with version_due_date_offset and version_answer_date_offset to determine if attempts may be submitted and answers shown at the current time.

problem_version

CREATE TABLE problem_version ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, set_version INT UNSIGNED NOT NULL, problem_assignment INT UNSIGNED NOT NULL, creation_date DATETIME NOT NULL, source_file TEXT, seed INT UNSIGNED NOT NULL, INDEX ( set_version, problem_assignment ), INDEX ( problem_assignment ) );

  • If the associated abstract problem's source_type field is 'group' and its source_group_select_time field is 'version_creation', this record's source_file field will be set to the result of group selection. Otherwise, it will be NULL.
  • seed is set to a random number.

---

Attempts

problem_attempt

An attempt record records a single recorded attempt on a problem. CREATE TABLE problem_attempt ( id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, problem_version INT UNSIGNED NOT NULL, creation_date DATETIME NOT NULL, score FLOAT UNSIGNED NOT NULL DEFAULT 0, data BLOB, INDEX ( problem_version, creation_date, score ), INDEX ( problem_version, score ), INDEX ( score ) );

  • data stores language-specific data about the attempt. (For the PG language, we could store a serialized AnswerHash object.)

-- Main.SamHathaway - 16 Nov 2004