Difference between revisions of "OPL Problem Statistics"
(explain planned change to script used to load the global statistics data for WW-2.15) |
|||
(99 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | {{UnderConstruction}} |
+ | <!-- {{UnderConstruction}} |
+ | --> |
||
+ | In WeBWorK (beginning with version 2.12) the Library Browser optionally displays local and global data about problems, specifically the number of individuals who have attempted the problem, the average number of attempts on the problem and the average status earned on the problem. The purpose is to provide instructors with useful information when selecting problems. Local data represents usage at your institution and global data represents the sum of all local data contributions. This page provides information on OPL Problem Statistics for Instructors and WeBWorK Administrators. |
||
− | In a future version of WeBWorK (probably 2.12) the Library Browser will optionally display local and global data about problems, specifically the number of individuals who have attempted the problem, the average number of attempts on the problem and the average status earned on the problem. Local data represents usage at your institution and global data represents the sum of all local data contributions. This page provides information on OPL Problem Statistics for Instructors and WeBWorK Administrators. |
||
+ | ==The Display== |
||
− | |||
− | ==The display== |
||
The display in the Library Browser looks like |
The display in the Library Browser looks like |
||
[[File:OPL_Statistics.png]] |
[[File:OPL_Statistics.png]] |
||
− | |||
− | where obviously the data displayed in the image above is not real. |
||
==Information for Instructors== |
==Information for Instructors== |
||
Line 18: | Line 15: | ||
===GLOBAL Usage=== |
===GLOBAL Usage=== |
||
− | Global data on problem usage is contributed by many institutions using WeBWorK all over the world. The Usage figure is the total number of individuals who have |
+ | Global data on problem usage is contributed by many institutions using WeBWorK all over the world. The Usage figure is the total number of individuals from contributing institutions who have attempted this problem at least once. A high figure represents a problem which has been assigned to many students and is both popular with instructors and likely bug free. |
+ | |||
===GLOBAL Attempts=== |
===GLOBAL Attempts=== |
||
The Attempts figure is the global average of the number of attempts (both correct and incorrect) individuals take on this problem. A high figure may represent a difficult problem. Note that problems with multiple parts may have higher average attempts since many students will submit an answer to each part before continuing and each such submittal counts as an attempt. |
The Attempts figure is the global average of the number of attempts (both correct and incorrect) individuals take on this problem. A high figure may represent a difficult problem. Note that problems with multiple parts may have higher average attempts since many students will submit an answer to each part before continuing and each such submittal counts as an attempt. |
||
Line 26: | Line 23: | ||
Reviewing a problem and looking at both the average Attempts and average Status should give instructors valuable information about the difficulty of the problem. |
Reviewing a problem and looking at both the average Attempts and average Status should give instructors valuable information about the difficulty of the problem. |
||
===LOCAL Usage=== |
===LOCAL Usage=== |
||
− | Local data on problem usage is generated and maintained by your institution. The Usage figure is the total number of local individuals who have |
+ | Local data on problem usage is generated and maintained by your institution. The Usage figure is the total number of local individuals who have attempted this problem at least once. A high figure represents a problem which has been assigned to many students and is both popular with instructors at your institution and likely bug free. Local data is generated when your systems admin runs the standalone script update-OPL-statistics or, assuming the display of local data is enabled, the script OPL-update. |
===LOCAL Attempts=== |
===LOCAL Attempts=== |
||
Line 33: | Line 30: | ||
The Status figure is the local average of the Status individuals at your institution have earned on this problem. The Status is the percentage correct (from 0% to 100%) recorded for the problem. A low figure may represent a difficult problem. The Status is often fairly high since many students will work on a problem until they get it correct or nearly so. |
The Status figure is the local average of the Status individuals at your institution have earned on this problem. The Status is the percentage correct (from 0% to 100%) recorded for the problem. A low figure may represent a difficult problem. The Status is often fairly high since many students will work on a problem until they get it correct or nearly so. |
||
− | Reviewing a problem and looking at both the average Attempts and average Status should give instructors valuable information about the difficulty of the problem. |
+ | Reviewing a problem and looking at both the local average Attempts and local average Status should give instructors valuable information about the difficulty of the problem for students at your institution. |
==Information for WeBWorK Administrators== |
==Information for WeBWorK Administrators== |
||
Line 44: | Line 41: | ||
Either or both can be disabled by setting the above values to zero (0) in the <code>localOverrides.conf</code> file. |
Either or both can be disabled by setting the above values to zero (0) in the <code>localOverrides.conf</code> file. |
||
− | Either or both can be enabled or disabled for an individual course by setting the above values to one (1) or zero (0) in the <code>course.conf</code> file |
+ | Either or both can be enabled or disabled for an individual course by setting the above values to one (1) or zero (0) in the course's <code>course.conf</code> file. |
+ | |||
+ | No OPL Problem statistics data will be displayed unless <code>update-OPL-statistics</code> is run. If you enable either of the above options and run <code>OPL-update</code> then <code>update-OPL-statistics</code> will automatically be run. |
||
+ | |||
+ | ===Generating Global OPL Problem statistics=== |
||
+ | Global OPL Problem statistics data is contained in a file <code>OPL_global_statistics.sql</code> which is distributed with the OPL. |
||
+ | |||
+ | Downloading the current version of the OPL with <code>git</code> will automatically retrieve the latest version of this file. |
||
+ | |||
+ | Assuming <code>$problemLibrary{showLibraryGlobalStats} = 1</code> is set for the server, then whenever the script <code>update-OPL-statistics</code> is run, the file <code>OPL_global_statistics.sql</code> will be processed and a MySQL table <code>OPL_global_statistics</code> will be created which contains the global data for display. |
||
+ | |||
+ | Changes planned for WeBWorK-2.15 will require running <code>load-OPL-global-statistics</code> to load the updated global statistics data instead of <code>update-OPL-statistics</code> which was used in older versions and did additional unrelated work. |
||
+ | ===Generating Local OPL Problem statistics=== |
||
− | defaults.config:$problemLibrary{showLibraryLocalStats} = 1; |
||
+ | This data is generated in one of two ways. First, if <code>$problemLibrary{showLibraryLocalStats} = 1</code> is set for the server, then whenever the script <code>OPL-update</code> is run to update the OPL, the OPL statistics will be generated by calling <code>update-OPL-statistics</code>. The second method is to run the script <code>update-OPL-statistics</code> directly. Either method creates a MySQL table <code>OPL_local_statistics</code> which contains the local data for display. All scripts are found in the standard directory <code>/opt/webwork/webwork2/bin/</code>. While the script <code>update-OPL-statistics</code> can be run at any time and as often as one wants, '''it is recommended that the script be run at the end of every term or semester''' after WeBWorK assignment due dates have passed. Note that no data will be collected for assignments whose due date has not passed at the time the script is run, nor will any data be collected from archived courses. This is the reason we recommend running the script <code>update-OPL-statistics</code> at the end of semesters. |
||
− | localOverrides.conf:$problemLibrary{showLibraryLocalStats} = 1; |
||
− | localOverrides.conf.dist:$problemLibrary{showLibraryLocalStats} = 1; |
||
− | wwadmin@wwserver:/opt/webwork/webwork2/conf$ grep -R showLibraryGlobalStats * |
||
− | defaults.config:$problemLibrary{showLibraryGlobalStats} = 1; |
||
− | localOverrides.conf:$problemLibrary{showLibraryGlobalStats} = 1; |
||
− | localOverrides.conf.dist:$problemLibrary{showLibraryGlobalStats} = 1; |
||
+ | ====How Accurate Local Data is Maintained==== |
||
+ | Data is collected from all closed (due date has passed) homework sets and this is done in such a way that old data (e.g. from courses deleted since the last time the script was run or from courses that are reused by deleting old students and adding new ones or just old courses that remain on the server) is saved and new data is properly appended. |
||
− | The data that we are requesting you generate and contribute goes in a MySQL table OPL_local_statistics where entries look like: |
||
+ | Specifically the script <code>update-OPL-statistics</code> first checks whether the MySQL table <code>OPL_problem_user</code> exists and creates it if it does not exist. If this table exists, new data will be appended to it. '''Maintaining this table is the key to maintaining accurate local OPL problem statistics data at your institution. If you bring a new server online, even if you choose not to bring over the whole <code>webwork</code> database, you should definitely transfer over the <code>OPL_problem_user</code> table.''' |
||
− | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
+ | The <code>OPL_problem_user</code> table contains rows with the following type of data |
||
− | | source_file | students_attempted | average_attempts | average_status | |
||
+ | +--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+ |
||
− | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
+ | | course_id | user_id | set_id | due_date | problem_id | source_file | status | attempted | num_correct | num_incorrect | |
||
− | | Library/LoyolaChicago/Precalc/Chap4Sec2/Q02.pg | 2 | 3 | .5 | |
||
+ | +--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+ |
||
− | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
+ | | myTestCourse | admin | test | 1447649940 | 1 | Library/Rochester/setDerivatives7Log/mec1.pg | 1 | 1 | 1 | 1 | |
||
+ | +--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+ |
||
− | so, as you can see, there is no student identifying data being requested. |
||
+ | This is a subset of the data contained in the CourseName_problem_user, CourseName_problem and CourseName_set tables. After the <code>OPL_problem_user</code> table is updated with new information, the <code>update-OPL-statistics</code> script reads through the |
||
+ | <code>OPL_problem_user</code> table and collects information on usage and average status and average number of attempts and puts this information in a new table <code>OPL_local_statistics</code> (first deleting an old version if it exists). For example, if the <code>OPL_problem_user</code> table contains only the information above, then the <code>OPL_local_statistics</code> table would |
||
+ | look like |
||
− | In order to get this project (which is a small and independent part of the WeBWorK "Big Data" project) off the ground, we need global data. Hence this personal request for data from your institution. |
||
+ | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
+ | | source_file | students_attempted | average_attempts | average_status | |
||
+ | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
+ | | Library/Rochester/setDerivatives7Log/mec1.pg | 1 | 2 | 1 | |
||
+ | +--------------------------------------------------+--------------------+------------------+----------------+ |
||
− | If you are interested in contributing (and we hope you are), the process is very easy. First download two scripts with the following commands: |
||
+ | This sample table shows only one student attempted the problem and they got it 100% correct after 2 attempts (not very realistic but it illustrates the point). The Library Browser uses the information in the <code>OPL_local_statistics</code> table to display the local information illustrated in the screen shot above. Global information comes from the <code>OPL_global_statistics</code> table which has exactly the same structure as the <code>OPL_local_statistics</code> table. |
||
− | wget --no-check-cert https://raw.githubusercontent.com/goehle/webwork2/dbscript/bin/update-OPL-statistics |
||
+ | ===Contributing your Local Data to WeBWorK's Global Database=== |
||
− | wget --no-check-cert https://raw.githubusercontent.com/goehle/webwork2/dbscript/bin/upload-OPL-statistics |
||
+ | The global statistics displayed come from voluntary contributions of institutions all over the world using WeBWorK. Obviously the more institutions that contribute data, the more accurate the data will be and the more valuable the data will be to instructors when selecting problems. Contributing your Local Data to WeBWorK's Global Database is very easy and consists of running the script |
||
+ | <code>upload-OPL-statistics</code> which basically sends a compressed version of your <code>OPL_local_statistics</code> table to WeBWorK's repository. |
||
+ | ====The upload-OPL-statistics script==== |
||
+ | The <code>upload-OPL-statistics</code> script resides in the standard directory <code>/opt/webwork/webwork2/bin/</code>. '''First run''' <code>update-OPL-statistics</code> '''to make sure your local statistics are up to date.''' The <code>upload-OPL-statistics</code> script creates three files: server_name-xxxx-data.tar.gz, server_name-xxxx-desc.txt, and server_name-xxxx-opl.sql so you should run the command in a directory for which you have write permission, e.g. a temp directory. In that directory run the command |
||
+ | <pre> |
||
+ | perl upload-OPL-statistics |
||
+ | </pre> |
||
+ | Note if you do not have <code>/opt/webwork/webwork2/bin/</code> in your environmental search path, you will have to use the full path name of the script, <code>perl /opt/webwork/webwork2/bin/upload-OPL-statistics</code> |
||
+ | The script will request some basic information |
||
+ | * your institution |
||
+ | * your department |
||
+ | * your name |
||
+ | * your email address |
||
+ | * if you have uploaded data from this server before |
||
+ | * approximately what years does this data span |
||
+ | * approximately how many classes are included |
||
+ | * additional comments |
||
+ | Probably the most important thing to mention as an additional comment is, if you have previously uploaded data, should this upload replace your previous upload (the usual case) or does this upload consist of all new data (e.g. if your old MySQL OPL_problem_user table was removed or not moved to a new server). See the above section [[OPL_Problem_Statistics#How Accurate Local Data is Maintained|How Accurate Local Data is Maintained]] for more information. |
||
− | You can put them in /opt/webwork/webwork2/bin/ if you want. They should run on any recent version of WeBWorK but have only been tested on 2.7, 2.9 and 2.10. Run |
||
+ | Finally you will be given the choice of |
||
− | perl update-OPL-statistics |
||
+ | * Upload Data |
||
− | which generates the OPL_local_statistics table. In the future you should probably run update-OPL-statistics at the end of every semester. It harvests data from all closed (due date has passed) homework sets and is written so that old data (e.g. from courses deleted since the last time the script was run or from courses that are reused by deleting old students and adding new ones or just old courses that remain on the server) is saved and new data is properly handled. It does not retrieve data from archived courses. |
||
+ | * Reenter above information |
||
+ | * Cancel |
||
+ | That's it. Assuming you choose to go ahead and upload data, your data will be sent to the global WeBWorK repository and will be combined with data from all other contributing institutions. |
||
− | Then maybe you should contact your IRB office and tell them that you are planning on contributing data and show them the OPL_local_statistics table (or the sample above) so they can see that no student identifying data is involved. After receiving their blessing, to upload the data, run the command |
||
+ | ====IRB (Institutional Review Board) considerations==== |
||
− | perl upload-OPL-statistics |
||
+ | The data that is being contributed consists of the data contained in your <code>OPL_local_statistics</code> table, namely gross usage and averages. '''Absolutely no individual data is being transferred, much less any individually identifiable data.''' In particular the individual student data in the <code>OPL_problem_user</code> table is never transferred to the WeBWorK repository. Under the guidelines published by the U.S. Department of Health & Human Services (see http://www.hhs.gov/ohrp/policy/checklists/decisioncharts.html#c1), this research is not research involving human subjects, and 45 CFR part 46 does not apply. Specifically, looking at chart 1, one sees "Is the information '''individually identifiable''' (i.e. the identity of the subject is or may readily be ascertained by the investigator or associated with the information)?" and if the answer is NO, the conclusion is "The research is not research involving human subjects, and 45 CFR part 46 does not apply". In our case, not only is the data not individually identifiable, there is '''no individual data''' at all. However, different IRB offices view things in their own and different ways. You should probably contact your IRB office and tell them that you are planning on contributing data and show them the OPL_local_statistics table (or the sample above) so they can see themselves that no student identifying data is involved. |
||
− | That's it. Your local data will get sent to a server that Geoff Goehle (who has written most of the code for this project) is maintaining. The resulting OPL_global_statistics table will be distributed as part of the OPL. |
||
+ | ==Contributing Data even if you are not yet using WeBWorK Version 2.12 or later== |
||
+ | We hope many institutions will choose to contribute data even if they are not yet using WeBWorK Version 2.12. Currently (May 2015) we have data on approximately 23,000 of the 30,000 problems in the OPL. This is a good start, but we obviously want and need more data. |
||
− | If you are interested in seeing how this actually works in the Library Browser, do the following on your development system. |
||
+ | ===Downloading the scripts=== |
||
+ | The <code>update-OPL-statistics</code> and <code>upload-OPL-statistics</code> scripts are contained in the standard WeBWorK 2.12 (and later) master branch distribution but not in earlier distributions. You can download them directly with the following commands: |
||
− | git checkout origin/develop |
||
+ | wget --no-check-cert https://raw.githubusercontent.com/openwebwork/webwork2/master/bin/update-OPL-statistics |
||
− | git pull http://github.com/goehle/webwork2.git dbscript |
||
+ | wget --no-check-cert https://raw.githubusercontent.com/openwebwork/webwork2/master/bin/upload-OPL-statistics |
||
− | Update localOverrides.conf from localOverrides.conf.dist and run |
||
− | apachectl restart |
||
− | Then run |
||
+ | Note that these scripts are also contained in the master branch of webwork2 (in the webwork2/bin/ directory) so you can get them from that location if you prefer. |
||
− | update-OPL-statistics |
||
− | if you haven't run that yet. Also it's a good idea to first update the OPL if you haven't done that in awhile. Either turn off the global data option (in localOverrides.conf) or create a fake OPL_global_statistics table, e.g a copy of OPL_local_statistics (maybe very soon with your help we will have a real OPL_global_statistics table available). Note that if no data is available for a problem you are looking at in the Library Browser, you will not see labels with 0's displayed. |
||
− | Thanks for any help you can give us with this and feel free to write with any questions, comments or suggestions. |
||
+ | ===Running the scripts=== |
||
+ | You can put the scripts in a temp directory. Also the user running the scripts has to have the WEBWORK_ROOT environmental variable set (see e.g. [[Installation_Manual_for_2.10_on_Ubuntu_14.04#Configuring_the_Shell]]). The scripts should work on any recent version of WeBWorK but have only been tested on 2.7 and later versions. Note that it is a good idea to first update the OPL if you haven't done that in awhile (see e.g. [[Installation_Manual_for_2.10_on_Ubuntu_14.04#Updating_the_OPL]]). After updating the OPL if necessary, '''first''' run the command |
||
+ | perl update-OPL-statistics |
||
+ | which generates the OPL_local_statistics table. |
||
− | Sincerely, |
||
+ | The <code>upload-OPL-statistics</code> script creates three files: server_name-xxxx-data.tar.gz, server_name-xxxx-desc.txt, and server_name-xxxx-opl.sql so you should run the command in a directory for which you have write permission, e.g. a temp directory. Now in that directory run the command |
||
+ | perl upload-OPL-statistics |
||
− | Arnie |
||
+ | That's it. Your local data will get sent to a server that Geoff Goehle (who has written most of the code for this project) is maintaining. |
||
+ | -- Main.ArnoldPizer - 18 May 2016 <br /> |
||
[[Category:Administrators]] |
[[Category:Administrators]] |
Latest revision as of 09:21, 16 August 2019
In WeBWorK (beginning with version 2.12) the Library Browser optionally displays local and global data about problems, specifically the number of individuals who have attempted the problem, the average number of attempts on the problem and the average status earned on the problem. The purpose is to provide instructors with useful information when selecting problems. Local data represents usage at your institution and global data represents the sum of all local data contributions. This page provides information on OPL Problem Statistics for Instructors and WeBWorK Administrators.
Contents
- 1 The Display
- 2 Information for Instructors
- 3 Information for WeBWorK Administrators
- 4 Contributing Data even if you are not yet using WeBWorK Version 2.12 or later
The Display
The display in the Library Browser looks like
Information for Instructors
Statistics are only displayed for OPL problems for which local and/or global data exists (you will not see all 0's). Data is not collected nor displayed for non OPL problems. The display of local and/or global data can be disabled for an individual course or for all courses. If you are not seeing any data please ask your WeBWorK administrator to enable the display of data.
Following is a description of the data displayed.
GLOBAL Usage
Global data on problem usage is contributed by many institutions using WeBWorK all over the world. The Usage figure is the total number of individuals from contributing institutions who have attempted this problem at least once. A high figure represents a problem which has been assigned to many students and is both popular with instructors and likely bug free.
GLOBAL Attempts
The Attempts figure is the global average of the number of attempts (both correct and incorrect) individuals take on this problem. A high figure may represent a difficult problem. Note that problems with multiple parts may have higher average attempts since many students will submit an answer to each part before continuing and each such submittal counts as an attempt.
GLOBAL Status
The Status figure is the global average of the Status individuals have earned on this problem. The Status is the percentage correct (from 0% to 100%) recorded for the problem. A low figure may represent a difficult problem. The Status is often fairly high since many students will work on a problem until they get it correct or nearly so.
Reviewing a problem and looking at both the average Attempts and average Status should give instructors valuable information about the difficulty of the problem.
LOCAL Usage
Local data on problem usage is generated and maintained by your institution. The Usage figure is the total number of local individuals who have attempted this problem at least once. A high figure represents a problem which has been assigned to many students and is both popular with instructors at your institution and likely bug free. Local data is generated when your systems admin runs the standalone script update-OPL-statistics or, assuming the display of local data is enabled, the script OPL-update.
LOCAL Attempts
The Attempts figure is the local average of the number of attempts (both correct and incorrect) individuals at your institution take on this problem. A high figure may represent a difficult problem. Note that problems with multiple parts may have higher average attempts since many students will submit an answer to each part before continuing and each such submittal counts as an attempt.
LOCAL Status
The Status figure is the local average of the Status individuals at your institution have earned on this problem. The Status is the percentage correct (from 0% to 100%) recorded for the problem. A low figure may represent a difficult problem. The Status is often fairly high since many students will work on a problem until they get it correct or nearly so.
Reviewing a problem and looking at both the local average Attempts and local average Status should give instructors valuable information about the difficulty of the problem for students at your institution.
Information for WeBWorK Administrators
Enabling and Disabling the display of OPL Problem statistics
The display of both global and local OPL Problem statistics in the Library Browser is enabled by default for all courses in the defaults.config file:
$problemLibrary{showLibraryGlobalStats} = 1;
$problemLibrary{showLibraryLocalStats} = 1;
Either or both can be disabled by setting the above values to zero (0) in the localOverrides.conf file.
Either or both can be enabled or disabled for an individual course by setting the above values to one (1) or zero (0) in the course's course.conf file.
No OPL Problem statistics data will be displayed unless update-OPL-statistics is run. If you enable either of the above options and run OPL-update then update-OPL-statistics will automatically be run.
Generating Global OPL Problem statistics
Global OPL Problem statistics data is contained in a file OPL_global_statistics.sql which is distributed with the OPL.
Downloading the current version of the OPL with git will automatically retrieve the latest version of this file.
Assuming $problemLibrary{showLibraryGlobalStats} = 1 is set for the server, then whenever the script update-OPL-statistics is run, the file OPL_global_statistics.sql will be processed and a MySQL table OPL_global_statistics will be created which contains the global data for display.
Changes planned for WeBWorK-2.15 will require running load-OPL-global-statistics to load the updated global statistics data instead of update-OPL-statistics, which was used in older versions and did additional unrelated work.
Generating Local OPL Problem statistics
This data is generated in one of two ways. First, if $problemLibrary{showLibraryLocalStats} = 1 is set for the server, then whenever the script OPL-update is run to update the OPL, the OPL statistics will be generated by calling update-OPL-statistics. The second method is to run the script update-OPL-statistics directly. Either method creates a MySQL table OPL_local_statistics which contains the local data for display. All scripts are found in the standard directory /opt/webwork/webwork2/bin/. While the script update-OPL-statistics can be run at any time and as often as one wants, it is recommended that the script be run at the end of every term or semester after WeBWorK assignment due dates have passed. Note that no data will be collected for assignments whose due date has not passed at the time the script is run, nor will any data be collected from archived courses. This is the reason we recommend running the script update-OPL-statistics at the end of semesters.
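The due-date rule above can be sketched in a few lines of Python. This is only an illustration, not the code of update-OPL-statistics: the function name closed_set_rows is hypothetical, and it assumes that due_date is stored as a Unix timestamp in seconds, which is what the sample value 1447649940 elsewhere on this page suggests.

```python
import time

def closed_set_rows(rows, now=None):
    """Keep only rows whose set due date has already passed.

    Assumption: due_date is a Unix timestamp in seconds.
    """
    if now is None:
        now = int(time.time())
    return [row for row in rows if row["due_date"] < now]

rows = [
    {"set_id": "test", "due_date": 1447649940},    # November 2015, already closed
    {"set_id": "future", "due_date": 4102444800},  # 1 January 2100 (UTC), still open
]

# Use a fixed "now" (a timestamp in August 2019) so the result is reproducible.
closed = closed_set_rows(rows, now=1565947260)
print([row["set_id"] for row in closed])
```

Only the set named test survives the filter, which is why running the script before due dates have passed collects nothing for those assignments.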
How Accurate Local Data is Maintained
Data is collected from all closed (due date has passed) homework sets and this is done in such a way that old data (e.g. from courses deleted since the last time the script was run or from courses that are reused by deleting old students and adding new ones or just old courses that remain on the server) is saved and new data is properly appended.
Specifically the script update-OPL-statistics first checks whether the MySQL table OPL_problem_user exists and creates it if it does not exist. If this table exists, new data will be appended to it. Maintaining this table is the key to maintaining accurate local OPL problem statistics data at your institution. If you bring a new server online, even if you choose not to bring over the whole webwork database, you should definitely transfer over the OPL_problem_user table.
The OPL_problem_user table contains rows with the following type of data
+--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+
| course_id    | user_id   | set_id | due_date   | problem_id | source_file                                      | status | attempted | num_correct | num_incorrect |
+--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+
| myTestCourse | admin     | test   | 1447649940 |          1 | Library/Rochester/setDerivatives7Log/mec1.pg     |      1 |         1 |           1 |             1 |
+--------------+-----------+--------+------------+------------+--------------------------------------------------+--------+-----------+-------------+---------------+
This is a subset of the data contained in the CourseName_problem_user, CourseName_problem and CourseName_set tables. After the OPL_problem_user table is updated with new information, the update-OPL-statistics script reads through the OPL_problem_user table and collects information on usage and average status and average number of attempts and puts this information in a new table OPL_local_statistics (first deleting an old version if it exists). For example, if the OPL_problem_user table contains only the information above, then the OPL_local_statistics table would look like
+--------------------------------------------------+--------------------+------------------+----------------+
| source_file                                      | students_attempted | average_attempts | average_status |
+--------------------------------------------------+--------------------+------------------+----------------+
| Library/Rochester/setDerivatives7Log/mec1.pg     |                  1 |                2 |              1 |
+--------------------------------------------------+--------------------+------------------+----------------+
This sample table shows that only one student attempted the problem and that they got it 100% correct after 2 attempts (not very realistic, but it illustrates the point). The Library Browser uses the information in the OPL_local_statistics table to display the local information illustrated in the screenshot above. Global information comes from the OPL_global_statistics table, which has exactly the same structure as the OPL_local_statistics table.
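The aggregation step described above can be sketched in Python. This is only an illustration, not the actual update-OPL-statistics code (which is a Perl script working against MySQL). The function name local_statistics, the in-memory row layout, the rule that a student's attempts equal num_correct + num_incorrect, and the choice to skip rows with attempted = 0 are all assumptions made for the example, chosen because they reproduce the sample tables.

```python
from collections import defaultdict

# Hypothetical in-memory rows using a subset of the OPL_problem_user columns:
# (source_file, status, attempted, num_correct, num_incorrect)
rows = [
    ("Library/Rochester/setDerivatives7Log/mec1.pg", 1.0, 1, 1, 1),
]


def local_statistics(rows):
    """Aggregate per-student rows into per-problem usage and averages."""
    totals = defaultdict(lambda: {"students": 0, "attempts": 0, "status": 0.0})
    for source_file, status, attempted, num_correct, num_incorrect in rows:
        if not attempted:
            continue  # assumption: students who never attempted are not counted
        t = totals[source_file]
        t["students"] += 1
        # assumption: attempts = correct submissions + incorrect submissions
        t["attempts"] += num_correct + num_incorrect
        t["status"] += status
    return {
        source_file: {
            "students_attempted": t["students"],
            "average_attempts": t["attempts"] / t["students"],
            "average_status": t["status"] / t["students"],
        }
        for source_file, t in totals.items()
    }


stats = local_statistics(rows)
print(stats)
```

With the single sample row, the sketch yields one student, an average of 2 attempts and an average status of 1, matching the sample OPL_local_statistics table.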
Contributing your Local Data to WeBWorK's Global Database
The global statistics displayed come from voluntary contributions of institutions all over the world using WeBWorK. The more institutions that contribute data, the more accurate the data will be and the more valuable it will be to instructors when selecting problems. Contributing your local data to WeBWorK's global database is very easy: it consists of running the script upload-OPL-statistics, which sends a compressed version of your OPL_local_statistics table to WeBWorK's repository.
The upload-OPL-statistics script
The upload-OPL-statistics script resides in the standard directory /opt/webwork/webwork2/bin/. First run update-OPL-statistics to make sure your local statistics are up to date. The upload-OPL-statistics script creates three files: server_name-xxxx-data.tar.gz, server_name-xxxx-desc.txt, and server_name-xxxx-opl.sql, so you should run the command in a directory for which you have write permission, e.g. a temp directory. In that directory run the command
perl upload-OPL-statistics
Note that perl looks for the script in the current directory rather than in your search path, so when working in another directory (such as a temp directory) you will have to use the full path name of the script:
perl /opt/webwork/webwork2/bin/upload-OPL-statistics
The script will request some basic information
- your institution
- your department
- your name
- your email address
- if you have uploaded data from this server before
- approximately what years does this data span
- approximately how many classes are included
- additional comments
If you have previously uploaded data, the most important thing to state in the additional comments is whether this upload should replace your previous upload (the usual case) or consists entirely of new data (e.g. if your old MySQL OPL_problem_user table was removed or not moved to a new server). See the section How Accurate Local Data is Maintained above for more information.
Finally you will be given the choice of
- Upload Data
- Reenter above information
- Cancel
That's it. Assuming you choose to go ahead and upload data, your data will be sent to the global WeBWorK repository and will be combined with data from all other contributing institutions.
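As an illustration of what "combined" can mean for averages, the sketch below merges the statistics for one problem from several institutions by weighting each institution's averages by its number of students. How the WeBWorK repository actually combines contributions is not described here, so the weighting rule, the function name combine and the sample figures are assumptions made for the example.

```python
def combine(contributions):
    """Merge per-institution statistics for one problem into global figures.

    Each contribution is a tuple
    (students_attempted, average_attempts, average_status).
    Averages are weighted by students_attempted (an assumed rule).
    """
    total = sum(n for n, _, _ in contributions)
    if total == 0:
        return (0, 0.0, 0.0)
    average_attempts = sum(n * a for n, a, _ in contributions) / total
    average_status = sum(n * s for n, _, s in contributions) / total
    return (total, average_attempts, average_status)


# Two hypothetical institutions reporting on the same problem.
global_row = combine([(1, 2.0, 1.0), (3, 4.0, 0.5)])
print(global_row)
```

Weighting by usage keeps a small class from counting as much as a large one. It also shows why the additional comments matter: an upload that repeats previously contributed data would be counted twice unless it replaces the earlier upload.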
IRB (Institutional Review Board) considerations
The data being contributed consists of the data contained in your OPL_local_statistics table, namely gross usage and averages. Absolutely no individual data is transferred, much less any individually identifiable data. In particular, the individual student data in the OPL_problem_user table is never transferred to the WeBWorK repository. Under the guidelines published by the U.S. Department of Health & Human Services (see http://www.hhs.gov/ohrp/policy/checklists/decisioncharts.html#c1), this research is not research involving human subjects, and 45 CFR part 46 does not apply. Specifically, chart 1 asks "Is the information individually identifiable (i.e. the identity of the subject is or may readily be ascertained by the investigator or associated with the information)?" and, if the answer is NO, concludes "The research is not research involving human subjects, and 45 CFR part 46 does not apply". In our case, not only is the data not individually identifiable, there is no individual data at all.
However, IRB offices differ in how they view such matters. You should probably contact your IRB office, tell them that you are planning on contributing data, and show them the OPL_local_statistics table (or the sample above) so they can see for themselves that no student identifying data is involved.
Contributing Data even if you are not yet using WeBWorK Version 2.12 or later
We hope many institutions will choose to contribute data even if they are not yet using WeBWorK Version 2.12. Currently (May 2015) we have data on approximately 23,000 of the 30,000 problems in the OPL. This is a good start, but we obviously want and need more data.
Downloading the scripts
The update-OPL-statistics and upload-OPL-statistics scripts are contained in the standard WeBWorK 2.12 (and later) master branch distribution but not in earlier distributions. You can download them directly with the following commands:
wget --no-check-cert https://raw.githubusercontent.com/openwebwork/webwork2/master/bin/update-OPL-statistics
wget --no-check-cert https://raw.githubusercontent.com/openwebwork/webwork2/master/bin/upload-OPL-statistics
Note that these scripts are also contained in the master branch of webwork2 (in the webwork2/bin/ directory) so you can get them from that location if you prefer.
Running the scripts
You can put the scripts in a temp directory. The user running the scripts must also have the WEBWORK_ROOT environment variable set (see e.g. Installation_Manual_for_2.10_on_Ubuntu_14.04#Configuring_the_Shell). The scripts should work on any recent version of WeBWorK but have only been tested on 2.7 and later versions. It is a good idea to first update the OPL if you haven't done that in a while (see e.g. Installation_Manual_for_2.10_on_Ubuntu_14.04#Updating_the_OPL). After updating the OPL if necessary, first run the command
perl update-OPL-statistics
which generates the OPL_local_statistics table.
The upload-OPL-statistics script creates three files: server_name-xxxx-data.tar.gz, server_name-xxxx-desc.txt, and server_name-xxxx-opl.sql, so you should run the command in a directory for which you have write permission, e.g. a temp directory. Now in that directory run the command
perl upload-OPL-statistics
That's it. Your local data will get sent to a server that Geoff Goehle (who has written most of the code for this project) is maintaining.
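The two prerequisites mentioned above (the WEBWORK_ROOT environment variable and a working directory with write permission) can be checked before running the scripts. The following Python sketch is not part of the WeBWorK distribution; the function name preflight and its messages are made up for illustration.

```python
import os
import tempfile


def preflight(env, workdir):
    """Return a list of problems that would prevent running the scripts."""
    problems = []
    if not env.get("WEBWORK_ROOT"):
        problems.append("WEBWORK_ROOT is not set")
    if not os.access(workdir, os.W_OK):
        problems.append("no write permission in " + workdir)
    return problems


# Example: check the current environment against the system temp directory.
for problem in preflight(os.environ, tempfile.gettempdir()):
    print(problem)
```

An empty list means both conditions hold. Otherwise each message names the condition to fix before running update-OPL-statistics and upload-OPL-statistics.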
-- Main.ArnoldPizer - 18 May 2016