[ww-devel] Big Data Table

Danny Glin dlglin at ucalgary.ca
Mon Aug 18 13:41:43 EDT 2014


Excuse my ignorance, but what is the purpose of the “Big Data Table”?

Danny
On Aug 18, 2014, at 8:40 AM, Geoff Goehle <goehle at gmail.com> wrote:

> Mike wanted to get started on the Big Data Table design and I figured we
> could at least have a discussion about what the possible columns could
> be and what some of the challenges are.  In terms of challenges I see
> the following:
> 
> -  Table design:  Do we have one giant table for portability and ease of
> use, or do we have thinner but separate tables for performance and
> assume people know how to do joins? Having thinner tables also addresses
> the next question in some way. 
> -  Data purity:  A lot of data which people care about can be included
> but will not be very "pure".  For example fields like seed, due date,
> answer date can be included from the appropriate tables, but they may
> not actually be the seed, due date, answer date that the student had
> when the answer was recorded.  Most of the time they will be the same
> but there will certainly be times when they are not.  We could just use
> the table structure we have now under the theory that it's the most
> accurate reflection of the data we have. 
> -  Computed columns:  Do we want to have computed columns that do not
> reflect any data we actually store and may involve some educated
> guessing?  For example, we have the final number of incorrect attempts
> and correct attempts and the final score, but we don't have these values
> at the time of each attempt.  Do we want to try to compute these values
> for each answer row?  What happens when we fail (e.g. the question has
> auxiliary fields which were not recorded) or are just wrong (e.g. the
> seed changed)?  
> 
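To make the table design question concrete, here is a minimal sketch of the
"thinner tables plus a join" option, using Python's built-in sqlite3 module.
The table names (answer, problem), the view name (big_data_table), and the
sample rows are all hypothetical and are not the actual WeBWorK schema. The
view plays the role of the single wide table, so people who do not want to
write joins could still query one object.

```python
import sqlite3


def build_demo():
    """Create two thin tables and a wide view over them (all names hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE problem (
            problem_id   TEXT PRIMARY KEY,
            problem_path TEXT,
            max_attempts INTEGER
        );
        CREATE TABLE answer (
            answer_id        INTEGER PRIMARY KEY,
            student_id       TEXT,
            problem_id       TEXT REFERENCES problem(problem_id),
            answer_timestamp INTEGER,
            answer_string    TEXT
        );
        -- The "one giant table", expressed as a join over the thin tables.
        CREATE VIEW big_data_table AS
        SELECT a.answer_id, a.student_id, a.problem_id,
               a.answer_timestamp, a.answer_string,
               p.problem_path, p.max_attempts
        FROM answer AS a
        JOIN problem AS p ON p.problem_id = a.problem_id;
        """
    )
    conn.executemany(
        "INSERT INTO problem VALUES (?, ?, ?)",
        [("p1", "Library/demo/prob1.pg", 5)],
    )
    conn.executemany(
        "INSERT INTO answer VALUES (?, ?, ?, ?, ?)",
        [
            (1, "s1", "p1", 1408383600, "x+1"),
            (2, "s1", "p1", 1408383660, "x+2"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    for row in conn.execute("SELECT * FROM big_data_table ORDER BY answer_id"):
        print(row)
```

Note that the view always shows the current problem row. If max_attempts is
edited after an answer was recorded, every joined row shows the new value,
which is exactly the data purity concern raised in the second bullet.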
> I also put together a list of possible columns.  This list and the
> previous questions can be found at
> https://github.com/openwebwork/webwork2/wiki/Data-Export-Columns if
> people want to weigh in or change things. 
> 
> -  Answer ID: Salted hash or Unique int
> -  Course ID: Salted hash
> -  Student ID: Salted hash
> -  Set ID: Salted hash
> -  Problem ID: Salted hash
> -  Answer Timestamp: Unix time
> -  Answer String
> -  Answer Correct String: String of 1s and 0s corresponding to
> correctness of answers
> -  Problem Path: Also serves as unique identifier of problem
> -  Final Problem Status
> -  Total Incorrect Attempts
> -  Total Correct Attempts
> -  Problem Value:  Possibly impure
> -  Problem Max Attempts: Possibly impure
> -  Seed: Possibly impure
> -  Open Date: Unix time, Possibly impure
> -  Due Date: Unix time, Possibly impure
> -  Answer Date: Unix time, Possibly impure
> -  Set Type
> -  Library Subject: Possibly Missing
> -  Library Chapter: Possibly Missing
> -  Library Section: Possibly Missing
> -  Library Keywords: Possibly Missing
> -  Status of Attempt:  Post-Computed, Possibly Missing, Possibly Impure
> -  Final Set Grade:  Post-Computed, Possibly Missing, Possibly Impure
> -  Number Incorrect Previous Attempts: Post-Computed, Possibly Missing,
> Possibly Impure
> -  Number Correct Previous Attempts: Post-Computed, Possibly Missing,
> Possibly Impure
> 
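For the ID columns marked "Salted hash", one possible reading is a keyed
hash of the raw identifier, so the same student or course always maps to the
same opaque token but cannot be looked up without the secret. The sketch
below uses HMAC-SHA256 from the Python standard library. The function name
pseudonymize and the salt value are illustrative assumptions, not anything
the list above specifies.

```python
import hashlib
import hmac


def pseudonymize(raw_id: str, salt: bytes) -> str:
    """Return a stable, opaque token for raw_id (hypothetical helper).

    Uses HMAC-SHA256 with the salt as the key; this is one way to
    implement a "salted hash", chosen here as an assumption.
    """
    return hmac.new(salt, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    salt = b"example-salt"  # placeholder; a real export would keep this secret
    print(pseudonymize("jsmith", salt))
    # The same ID and salt give the same token, so joins across rows still work.
    print(pseudonymize("jsmith", salt) == pseudonymize("jsmith", salt))
```

Using a single salt for the whole export keeps Student ID, Set ID, and
Problem ID consistent across rows. Using a different salt per export would
prevent linking the same student across two separate exports.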
> 
> 
> 
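The post-computed "Number Incorrect Previous Attempts" and "Number Correct
Previous Attempts" columns could be derived by replaying each student's
answers in time order. This is only a sketch under stated assumptions: the
row keys below are hypothetical names for the listed columns, an attempt
counts as correct only when its Answer Correct String is non-empty and all
1s, and an empty string is counted as incorrect. It does not address the
failure cases mentioned above (unrecorded auxiliary fields, changed seeds).

```python
from collections import defaultdict


def add_previous_attempt_counts(rows):
    """Return rows in timestamp order with running per-problem attempt counts.

    Each row is a dict with the hypothetical keys student_id, problem_id,
    answer_timestamp, and answer_correct_string.
    """
    # (student_id, problem_id) -> [incorrect_so_far, correct_so_far]
    counts = defaultdict(lambda: [0, 0])
    out = []
    for row in sorted(rows, key=lambda r: r["answer_timestamp"]):
        key = (row["student_id"], row["problem_id"])
        incorrect, correct = counts[key]
        # Record the counts as they stood *before* this attempt.
        out.append(
            {
                **row,
                "num_incorrect_previous": incorrect,
                "num_correct_previous": correct,
            }
        )
        scores = row["answer_correct_string"]
        # Assumption: fully correct means non-empty and every part scored 1.
        if scores and all(ch == "1" for ch in scores):
            counts[key][1] += 1
        else:
            counts[key][0] += 1
    return out


if __name__ == "__main__":
    demo = [
        {"student_id": "s1", "problem_id": "p1", "answer_timestamp": 1, "answer_correct_string": "10"},
        {"student_id": "s1", "problem_id": "p1", "answer_timestamp": 2, "answer_correct_string": "11"},
        {"student_id": "s1", "problem_id": "p1", "answer_timestamp": 3, "answer_correct_string": "11"},
    ]
    for r in add_previous_attempt_counts(demo):
        print(r["answer_timestamp"], r["num_incorrect_previous"], r["num_correct_previous"])
```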


