Difference between revisions of "DatabaseSchemaV3"

From WeBWorK_wiki
Jump to navigation Jump to search
 
(One intermediate revision by the same user not shown)
Line 479: Line 479:
   
 
* <code>data</code> stores language-specific data about the attempt. (For the PG language, we could store a serialized AnswerHash object.)
 
* <code>data</code> stores language-specific data about the attempt. (For the PG language, we could store a serialized AnswerHash object.)
  +
[[Category:Developers]]

Latest revision as of 14:18, 22 January 2012

This article has been retained as a historical document. It is not up-to-date and the formatting may be lacking. Use the information herein with caution.

New Database Schema (version 3 or so)

<syntaxhighlight lang="sql"> CREATE DATABASE wwdbv3; </syntaxhighlight>

Main features:

  • Global user table and course-specific participant tables eliminate duplication of user data.
  • Separation of user-specific data and global-override data.
  • version table allows set and problem versioning.
  • attempt table keeps a complete record of problem attempts, per-version.
  • section, recitation, and status tables enumerate the possible values for these fields.
  • role table represents a participant's role in the course (i.e. "instructor", "teaching assistant", "student").
  • Privileges are associated with roles via the role_privilege table (replacing the "permission level" mechanism).

Notes:

  • id fields are auto-increment serial numbers used for uniquely identifying records, defining relations between tables, etc. name fields are human-readable strings identifying a record, and must be unique w.r.t. their parent object. I'd like to be able to use name fields in URLs. With %nn-encoding, I don't think we need to restrict the characters used past just specifying that they be printable and deciding on a character set.
  • I'm going to talk a little bit about the algorithms for assignment, override, and version creation in the appropriate sections of this file. I know that's not strictly part of the schema, but it's useful to talk about.
  • In field names, a date field represents a time and date (type DATETIME), and a time field represents an abstract time in the record creation process (like "assignment time" or "version creation time").
  • NULL values are allowed only in cases where having an undefined value makes sense, i.e. if an override field is NULL, the abstract value will be used.
  • To easily extract just the verbatim lines from this file:
    perl -ne '/^<\/verbatim>/ and $p=0; $p and print $_; /^<verbatim>/ and $p=1;'

---

System stuff

setting

A simple key-value-pair store for system settings. There is some stuff that probably belongs here instead of in global.conf.

<syntaxhighlight lang="sql"> CREATE TABLE setting (

  name VARCHAR(255) PRIMARY KEY,
  val  TEXT

); </syntaxhighlight>

  • This table must have at least the fields name and val in all versions of the database schema. It must also contain a record in which name is 'db_version' and val is a positive integer representing the version of the database schema.

<syntaxhighlight lang="sql"> INSERT INTO setting (name,val) VALUES ('db_version','1'); </syntaxhighlight>

equation_cache

A table used by the WeBWorK::PG::EquationCache module to map <nop>TeX strings to images rendered with dvipng. <syntaxhighlight lang="sql"> CREATE TABLE equation_cache (

  id     INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  
  tex    TEXT         NOT NULL,
  width  INT UNSIGNED NOT NULL,
  height INT UNSIGNED NOT NULL,
  depth  INT          NOT NULL,
  
  INDEX ( tex(255) )

); </syntaxhighlight>

  • Records are looked-up by tex string. Image names are based on id.
  • We can't make tex the PRIMARY KEY or make it UNIQUIE, because there may be two strings where the first 255 characters are identical.

---

Course management

course

Lists all courses on the system. <syntaxhighlight lang="sql"> CREATE TABLE course (

  id         INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  name       VARCHAR(255) NOT NULL,
  
  visible    BOOL         NOT NULL DEFAULT 1,
  locked     BOOL         NOT NULL DEFAULT 0,
  archived   BOOL         NOT NULL DEFAULT 0,
  
  UNIQUE ( name(255) ),
  INDEX ( visible )

); </syntaxhighlight>

  • visible controls whether the course will be displayed in lists.
  • locked controls whether the course is accessible.
  • archived controls whether the records associated with this course are available in the main tables or are archived in a parallel table structure for archived courses. This allows inactive courses to be removed from consideration, speeding up access and modification of active courses.

---

User management

user

Keep track of informaton about particular individuals, like name, email address, password. This information is not course-specific, and this table avoids the problem of synchronizing passwords, email addresses, etc. among courses. It also allows people to login before selecting a course to view a list of courses in which they are enrolled. <syntaxhighlight lang="sql"> CREATE TABLE user (

  id                   INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  
  first_name           VARCHAR(255) NOT NULL,
  last_name            VARCHAR(255) NOT NULL,
  email_address        TEXT         NOT NULL,
  student_id           VARCHAR(255) NOT NULL,
  login_id             VARCHAR(255) NOT NULL,
  password             VARCHAR(255) NOT NULL,
  display_mode         VARCHAR(255),
  show_old_answers     BOOL,
  
  INDEX ( last_name(32), first_name(32) ),
  UNIQUE ( login_id(255) )

); </syntaxhighlight>

  • email_address can't be VARCHAR since the domain part alone can be 255 characters.
  • student_id is an institution-dependent identification string. This can be used when importing scoring data into the institution's grading system, or for looking up passwords in the institution's authentication system.
  • display_mode and show_old_answers denote defaults for this user that override course defaults. If NULL the course defaults are used.

status

Enumerate the available statuses. Status control whether users are allowed to access a course, are included in statistics, etc. <syntaxhighlight lang="sql"> CREATE TABLE status (

  id                    INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course                INT UNSIGNED,
  name                  VARCHAR(255) NOT NULL,
  
  allow_course_access   BOOL         NOT NULL DEFAULT 1,
  include_in_assignment BOOL         NOT NULL DEFAULT 1,
  include_in_stats      BOOL         NOT NULL DEFAULT 1,
  include_in_scoring    BOOL         NOT NULL DEFAULT 1,
  
  INDEX ( course ),
  UNIQUE ( course, name(255) )
  INDEX ( include_in_assignment, course ),
  INDEX ( include_in_stats, course ),
  INDEX ( include_in_scoring, course )

); </syntaxhighlight>

  • If course is NULL, the status is valid for all courses. Otherwise, the status is valid only for the indicated course.

role

Enumerate the available roles. Roles are used to link sets of users to sets of privileges. <syntaxhighlight lang="sql"> CREATE TABLE role (

  id         INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course     INT UNSIGNED,
  name       VARCHAR(255) NOT NULL,
  
  privs      TEXT,
  
  INDEX ( course ),
  UNIQUE ( course, name(255) )

); </syntaxhighlight>

  • If course is NULL, the role is valid for all courses. Otherwise, the role is valid only for the indicated course.
  • privileges is a comma-separated list of privileges interpreted by WeBWorK.
  • Roles can be used to get a list of all participants in a course or section or reciation and then display them by role, giving you "instructors", "TAs", "Students", etc.


section

Enumerate the sections available in a course. <syntaxhighlight lang="sql"> CREATE TABLE section (

  id        INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course    INT UNSIGNED NOT NULL,
  name      VARCHAR(255) NOT NULL,
  
  INDEX ( course ),
  UNIQUE ( course, name(255) )

); </syntaxhighlight>

recitation

Enumerate the recitations available in a course. <syntaxhighlight lang="sql"> CREATE TABLE recitation (

  id        INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course    INT UNSIGNED NOT NULL,
  name      VARCHAR(255) NOT NULL,
  
  INDEX ( course ),
  UNIQUE ( course, name(255) )

); </syntaxhighlight>

participant

Indicates that a user is a participant in a course. Contains course-specific user data. <syntaxhighlight lang="sql"> CREATE TABLE participant (

  id            INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course        INT UNSIGNED NOT NULL,
  user          INT UNSIGNED NOT NULL,
  status        INT UNSIGNED NOT NULL,
  role          INT UNSIGNED NOT NULL,
  section       INT UNSIGNED,
  recitation    INT UNSIGNED,
  last_access   DATETIME     NOT NULL,
  comment       TEXT,
  
  INDEX ( course ),
  UNIQUE ( user, course ),
  INDEX ( status ),
  INDEX ( role ),
  INDEX ( section, course ),
  INDEX ( recitation, course ),
  INDEX ( last_access, course )

); </syntaxhighlight>

  • If section or rectation is NULL, the participant is not a member of a section or recitation.
  • last_access is a timestamp updated whenever this user interacts with this course. This can be used to get a list of "recently active" users on a per-course basis.

---

Abstract sets and problems

Abstract (formerly "global") sets and problems represent a homework set as defined by the instructor and are independent of any user.

abstract_set

<syntaxhighlight lang="sql"> CREATE TABLE abstract_set (

  id                         INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  course                     INT UNSIGNED NOT NULL,
  name                       VARCHAR(255) NOT NULL,
  
  set_header                 TEXT,
  hardcopy_header            TEXT,
  open_date                  DATETIME NOT NULL,
  due_date                   DATETIME NOT NULL,
  answer_date                DATETIME NOT NULL,
  published                  BOOL     NOT NULL DEFAULT 1,
  problem_order              TEXT     NOT NULL,
  reorder_type               ENUM('none','random') NOT NULL DEFAULT 'none',
  reorder_subset_size        INT,
  reorder_time               ENUM('assignment','version_creation') NOT NULL DEFAULT 'version_creation',
  atomicity                  ENUM('problem','set') NOT NULL DEFAULT 'problem',
  max_attempts_per_version   INT UNSIGNED,
  version_creation_interval  INT UNSIGNED,
  versions_per_interval      INT UNSIGNED,
  version_due_date_offset    INT UNSIGNED,
  version_answer_date_offset INT UNSIGNED,
  
  UNIQUE ( course, name(255) ),
  INDEX ( open_date ),
  INDEX ( due_date ),
  INDEX ( answer_date ),
  INDEX ( published ),
  INDEX ( atomicity )

); </syntaxhighlight>

  • A NULL value in set_header or problem_header indicates that the course/system default should be used. An empty string indicates that no header should be used.
  • set_header and problem_header can't be VARCHAR fields since they can potentially contain pathnames longer than 255 characters.
  • The problem_order field will contain a comma-separated list of abstract problem IDs, representing the natural ordering for the problems in the set.
  • Reordering is controlled by three fields:
  • reorder_type controlls whether problem_order is shuffled or not.
  • reorder_subset_size indicates the number of items to select from the possibly shuffled problem order.
  • reorder_time can be 'assignment' or 'version_creation', and controlls when reordering happens -- when a problem is assigned (each user will get a different ordering) or when each version is created (each version will be different).
  • atomicity determines if the entire set is to be treated as a unit or if each problem is dealt with separately for the following things:
    • Whether to display and accept attempts on problems individually or as a single unit.
    • Whether to use the max_attempts_per_version, version_creation_interval, versions_per_interval, version_due_date_offset, and version_answer_date_offset fields in the abstract_set record or in each abstract_problem record.
    • Whether to allow versions to be created for individual problems, or only for all problems at once.
  • A NULL value in reorder_subset_size indicates that all problems should be included.
  • A NULL value in max_attempts_per_version indicates an unlimited number of attempts per version.
  • A NULL value in version_creation_interval indicates that the version creation interval is infinite. That is, only versions_per_interval versions may be created throughout the lifetime of the problem set.
  • A NULL value in versions_per_interval indicates that an unlimited number of versions may be created per interval. (The value of version_creation_interval does not matter when this field is NULL.)
  • NULL values in version_due_date_offset or version_answer_date_offset mean that the due date or answer date for each version should simply be the abstract set's due or answer date.

abstract_problem

<syntaxhighlight lang="sql"> CREATE TABLE abstract_problem (

  id                         INT UNSIGNED   PRIMARY KEY AUTO_INCREMENT,
  abstract_set               INT UNSIGNED   NOT NULL,
  name                       VARCHAR(255),
  
  source_type                ENUM('file','group') NOT NULL DEFAULT 'file',
  source_file                TEXT,
  source_group_set_id        INT UNSIGNED,
  source_group_select_time   ENUM('assignment','version_creation'),
  weight                     FLOAT UNSIGNED NOT NULL DEFAULT 1,
  max_attempts_per_version   INT UNSIGNED,
  version_creation_interval  INT UNSIGNED,
  versions_per_interval      INT UNSIGNED,
  version_due_date_offset    INT UNSIGNED,
  version_answer_date_offset INT UNSIGNED,
  
  INDEX ( abstract_set ),
  INDEX ( source_group_set_id ),
  UNIQUE ( abstract_set, name(255) )

); </syntaxhighlight>

  • source_type controls where the problem source comes from.
  • If source_type is 'file', source_file contains the path to the file and cannot be NULL.
  • If source_type is 'group':
    • source_group_set_id contains the ID of another abstract set. (This field must be checked for loops.)
    • source_group_select_time indicates whether group selection is done at assignment time ('assignment') or version creation time ('version_creation').
    • Neither field can be NULL.
  • source_file can't be a VARCHAR field since it can potentially contain a pathname longer than 255 characters.
  • If the abstract problem's abstract set's atomicity is 'problem', the fields max_attempts_per_version, version_creation_interval, versions_per_interval, version_due_date_offset, and version_answer_date_offset are used in lieu of the corresponding fields in the abstract set. A NULL value in any of these fields has the same meaning as above.

---

Set and problem assignments

Record the association between an abstract set or problem and a participant. Indicates that the set is assigned to that participant and contains set data specific to a user.

set_assignment

<syntaxhighlight lang="sql"> CREATE TABLE set_assignment (

  id              INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  abstract_set    INT UNSIGNED NOT NULL,
  participant     INT UNSIGNED NOT NULL,
  
  problem_order   TEXT,
  
  UNIQUE ( abstract_set, participant ),
  INDEX ( participant )

); </syntaxhighlight>

  • If the associated abstract set's reorder_time field is set to 'assignment', this record's problem_order field will be set to the result of problem reordering (see above). Otherwise, this field will be NULL and the abstract set's problem order will be used.

problem_assignment

<syntaxhighlight lang="sql"> CREATE TABLE problem_assignment (

  id                  INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  set_assignment      INT UNSIGNED NOT NULL,
  abstract_problem    INT UNSIGNED NOT NULL,
  
  source_file         TEXT,
  
  UNIQUE ( set_assignment, abstract_problem ),
  INDEX ( abstract_problem )

); </syntaxhighlight>

  • If the associated abstract problem's source_type field is 'group' and its source_group_select_time field is 'assignment', this record's source_file field will be set to the result of group selection (see above). Otherwise, this field will be NULL.

---

Set and problem overrides

Set and problem overides allow the data in an set assignment or problem assignment and its abstract set or problem to be overridden on a per-section, -recitation, or -participant basis. Only one of section, recitation, and participant may be non- NULL in a given record.

set_override

<syntaxhighlight lang="sql"> CREATE TABLE set_override (

  id                         INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  abstract_set               INT UNSIGNED NOT NULL,
  section                    INT UNSIGNED,
  recitation                 INT UNSIGNED,
  participant                INT UNSIGNED,
  
  set_header                 TEXT,
  hardcopy_header            TEXT,
  open_date                  DATETIME,
  due_date                   DATETIME,
  answer_date                DATETIME,
  published                  ENUM('false','true'),
  problem_order              TEXT,
  reorder_type               ENUM('none','random'),
  reorder_subset_size        INT,
  atomicity                  ENUM('problem','set'),
  max_attempts_per_version   INT,
  version_creation_interval  INT,
  versions_per_interval      INT,
  version_due_date_offset    INT,
  version_answer_date_offset INT,
  
  INDEX ( abstract_set ),
  INDEX ( section ),
  INDEX ( recitation ),
  INDEX ( participant ),
  UNIQUE ( abstract_set, section, recitation, participant )

); </syntaxhighlight>

  • If any override field is NULL, the value in the corresponding abstract set will be used.
  • Overrides are applied after assignment, so a reorder_time field is not needed here, and reorder_type and reorder_subset_size are only used when the abstract set's reorder_time field is set to 'version_creation'.

problem_override

<syntaxhighlight lang="sql"> CREATE TABLE problem_override (

  id                         INT UNSIGNED   PRIMARY KEY AUTO_INCREMENT,
  abstract_problem           INT UNSIGNED   NOT NULL,
  section                    INT UNSIGNED,
  recitation                 INT UNSIGNED,
  participant                INT UNSIGNED,
  
  source_type                ENUM('file','group'),
  source_file                TEXT,
  source_group_set_id        INT UNSIGNED,
  weight                     FLOAT UNSIGNED,
  max_attempts_per_version   INT UNSIGNED,
  version_creation_interval  INT UNSIGNED,
  versions_per_interval      INT UNSIGNED,
  version_due_date_offset    INT UNSIGNED,
  version_answer_date_offset INT UNSIGNED,
  
  INDEX ( abstract_problem ),
  INDEX ( section ),
  INDEX ( recitation ),
  INDEX ( participant ),
  UNIQUE ( abstract_problem, section, recitation, participant )

); </syntaxhighlight>

  • If any override field is NULL, the value in the corresponding abstract problem will be used.
  • Overrides are applied after assignment, so a source_group_select_time field is not needed, and source_type, source_file, and source_group_set_id are only used when the abstract set's source_group_select_time field is set to 'version_creation'.

---

Set and problem versions

Set and problem versions allow for variations in the problem source file chosen, the seed used, and the order that problems are presented. They also facilitate time limits on the submission of attempts. Each attempt is associated with a problem version, not a problem assignment, so that if a source file or seed is changed, the attempt data remains consistent.

set_version

<syntaxhighlight lang="sql"> CREATE TABLE set_version (

  id                INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  set_assignment    INT UNSIGNED NOT NULL,
  
  problem_order     TEXT,
  creation_date     DATETIME NOT NULL,
  
  INDEX ( set_assignment )

); </syntaxhighlight>

  • problem_order is set to the result of problem reordering if the abstract set's reorder_time is 'version_creation'. Otherwise, it is NULL.
  • creation_date is used along with version_due_date_offset and version_answer_date_offset to determine if attempts may be submitted and answers shown at the current time.

problem_version

<syntaxhighlight lang="sql"> CREATE TABLE problem_version (

  id                    INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  set_version           INT UNSIGNED NOT NULL,
  problem_assignment    INT UNSIGNED NOT NULL,
  
  creation_date         DATETIME     NOT NULL,
  source_file           TEXT,
  seed                  INT UNSIGNED NOT NULL,
  
  INDEX ( set_version, problem_assignment ),
  INDEX ( problem_assignment )

); </syntaxhighlight>

  • If the associated abstract problem's source_type field is 'group' and its source_group_select_time field is 'version_creation', this record's source_file field will be set to the result of group selection. Otherwise, it will be NULL.
  • seed is set to a random number.

Attempts

problem_attempt

An attempt record records a single recorded attempt on a problem. <syntaxhighlight lang="sql"> CREATE TABLE problem_attempt (

  id                 INT UNSIGNED   PRIMARY KEY AUTO_INCREMENT,
  problem_version    INT UNSIGNED   NOT NULL,
  
  creation_date      DATETIME       NOT NULL,
  score              FLOAT UNSIGNED NOT NULL DEFAULT 0,
  data               BLOB,
  
  INDEX ( problem_version, creation_date, score ),
  INDEX ( problem_version, score ),
  INDEX ( score )

); </syntaxhighlight>

  • data stores language-specific data about the attempt. (For the PG language, we could store a serialized AnswerHash object.)