WeBWorK Main Forum

Need assistance on resolving user data duplication in WeBWorK database

Need assistance on resolving user data duplication in WeBWorK database

by Lukas Fredriksson -
Number of replies: 1

Dear community,

We are currently using Moodle in combination with WeBWorK for a number of courses at our university. In this system, WeBWorK reports completed problem sets to Moodle, which students can track in a Moodle module. This module also tells our students which problem sets they have access to.

Unexpectedly, about 2.5 weeks ago, our Identity Provider (IdP) altered the format of the data they relayed to us when students accessed one of our course websites. In particular, the 'user_id' fields in our WeBWorK database previously adhered to the following format:

user_id: personal-identity-number@antagning.se

This format then became

user_id: personal-identity-number


We managed to solve the equivalent issue for Moodle by simply removing "@antagning.se" from the affected fields in our database. We also thought we had found a solution for WeBWorK by modifying the format of the data passed to the user_id fields in /lib/WeBWorK/Authen/LTIAdvanced.pm. Unfortunately, the latter change has been causing issues down the line with significant data duplication in our WeBWorK database over the past few weeks, which we need to deal with now.

In particular, students who have completed certain types of problem sets ("final exams") are apparently no longer able to progress to the next problem set, as Moodle no longer gets the correct data from WeBWorK.

We have partially cleaned up our WeBWorK database from overlapping user data, but we have two tables that we are unsure how to deal with. Any assistance would be greatly appreciated here:


Coursename_set_user:

Here we have typical "collisions" as shown below. Which one of the two rows below should we keep (before we possibly remove the @antagning.se suffix from 'user_id' to adhere to the new/desired 'user_id' format)?

*************************** 1. row ***************************
user_id: 123456789@antagning.se
set_id: Slutprov_2.3-2.4
psvn: 390972
set_header: NULL
hardcopy_header: NULL
open_date: 1677160589
due_date: 1677164189
answer_date: 1677164189
visible: NULL
enable_reduced_scoring: NULL
assignment_type: NULL
description: NULL
restricted_release: NULL
restricted_status: NULL
attempts_per_version: NULL
time_interval: NULL
versions_per_interval: NULL
version_time_limit: NULL
version_creation_time: NULL
problem_randorder: NULL
version_last_attempt_time: NULL
problems_per_page: NULL
hide_score: NULL
hide_score_by_problem: NULL
hide_work: NULL
time_limit_cap: NULL
restrict_ip: NULL
relax_restrict_ip: NULL
restricted_login_proctor: NULL
hide_hint: NULL
lis_source_did: {"data":{"instanceid":"11","userid":"24120","typeid":"1","launchid":661889568},"hash":"169919cb41ec87f4eefaac3fcc65244993f95e0fafc28f1ce6dbf0b84e553f83"} reduced_scoring_date: NULL
email_instructor: NULL
restrict_prob_progression: NULL

*************************** 2. row ***************************
user_id: 123456789
set_id: Slutprov_2.3-2.4
psvn: 396282
set_header: NULL
hardcopy_header: NULL
open_date: NULL
due_date: NULL
answer_date: NULL
visible: NULL
enable_reduced_scoring: NULL
assignment_type: NULL
description: NULL
restricted_release: NULL
restricted_status: NULL
attempts_per_version: NULL
time_interval: NULL
versions_per_interval: NULL
version_time_limit: NULL
version_creation_time: NULL
problem_randorder: NULL
version_last_attempt_time: NULL
problems_per_page: NULL
hide_score: NULL
hide_score_by_problem: NULL
hide_work: NULL
time_limit_cap: NULL
restrict_ip: NULL
relax_restrict_ip: NULL
restricted_login_proctor: NULL
hide_hint: NULL
lis_source_did: {"data":{"instanceid":"11","userid":"24120","typeid":"1","launchid":1889764647},"hash":"d5cc0a690a6ff0f149eab36c9a93628a124db7310c9b49671020e581dc0fe15a"} reduced_scoring_date: NULL
email_instructor: NULL
restrict_prob_progression: NULL


Coursename_problem_user:

Here we have a large number of rows that track student progress. There are various collisions when we examine the fields that share the same 'user_id' (with/without "@antagning.se"), 'set_id' and 'problem_id'. For example, here is one such collision:

*************************** 1. row ***************************
user_id: 123456789@antagning.se
set_id: Slutprov_3.6-3.8
problem_id: 6
source_file: NULL
value: NULL
max_attempts: NULL
problem_seed: 2344
status: 0
attempted: 0
last_answer: NULL
num_correct: 0
num_incorrect: 0
sub_status: 0
flags: NULL
showMeAnother: NULL
prCount: NULL
att_to_open_children: NULL
counts_parent_grade: NULL
showMeAnotherCount: NULL
prPeriod: NULL

*************************** 2. row ***************************
user_id: 123456789
set_id: Slutprov_3.6-3.8
problem_id: 6
source_file: NULL
value: NULL
max_attempts: NULL
problem_seed: 2150
status: 1
attempted: 1
last_answer: ["AnSwEr0001","18"]
num_correct: 1
num_incorrect: 0
sub_status: 1
flags: NULL
showMeAnother: NULL
prCount: NULL
att_to_open_children: NULL
counts_parent_grade: NULL
showMeAnotherCount: NULL
prPeriod: NULL


We would appreciate it if you could suggest any general rule for handling rows in this table. Your assistance in this matter would be invaluable to us.


Thank you for your time and expertise.


Best regards,


Lukas Fredriksson


In reply to Lukas Fredriksson

Re: Need assistance on resolving user data duplication in WeBWorK database

by Danny Glin -
Directly modifying data in the WeBWorK database is always dangerous, and any actions done based on advice you receive on the forums is done at your own risk. At the very least you should back up your database before trying anything suggested here.

It will be hard to troubleshoot your issue when you haven't described how you modified LTIAdvanced.pm, but I can provide some general info:

The coursename_set_user and coursename_problem_user tables will have a record for each set or problem (respectively) assigned to each user. When you see an entry with user_id: 123456789@antagning.se and set_id: Slutprov_2.3-2.4 it contains the data associated with the fact that the set with ID Slutprov_2.3-2.4 is assigned to the user with ID 123456789@antagning.se. Since you have entries with two different user_ids it implies that WeBWorK believes that this set is assigned to two different users: one with user_id: 123456789@antagning.se and one with user_id: 123456789.

Modifying LTIAdvanced.pm won't change the user_ids of any users who had already been created in WeBWorK. Most likely what happened is that when the user_ids changed in Moodle, the next time that user accessed WeBWorK a new WeBWorK account was created with the new user_id. This means that likely there are two WeBWorK accounts for each user: one with the old user_id and one with the new user_id.