## WeBWorK Main Forum

### Re: Item analysis and assessment in webwork?

by Aaron Wangberg
Hi Lars,

The issue you describe is complex, but it sounds similar to a project we've been developing to give individual calculus students customized assignments based on their struggles with precalculus material. There are some issues (marked with a *) with the underlying data in the approach described below, but if you or anyone else is interested in using, discussing, or generalizing our approach, please contact me (awangberg@winona.edu).

In order to associate specific assessment outcomes with specific WeBWorK problems, we originally defined a bunch of WeBWorK sets and included problems in each set that specifically addressed each outcome.  We then gave students a pre-test, customized practice, and customized post-test by randomly pulling a problem from each set for each student.
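A minimal Python sketch of this kind of per-student draw, for anyone who wants to experiment with the idea. It is not the code used in the project: the set names, problem file names, and the `build_test` function are all hypothetical, and seeding the random generator with the student ID and test phase is an assumption made here so that each student's test can be regenerated later.

```python
import random

# Hypothetical outcome-based sets; the names and .pg files are invented for illustration.
PROBLEM_SETS = {
    "linear_slope": ["slope_1.pg", "slope_2.pg", "slope_3.pg"],
    "linear_intercept": ["intercept_1.pg", "intercept_2.pg"],
    "rational_asymptote": ["asymptote_1.pg", "asymptote_2.pg", "asymptote_3.pg"],
}

def build_test(student_id, problem_sets, phase):
    """Pick one problem from each set for one student.

    The generator is seeded with the student ID and phase ("pre" or "post"),
    so the same student always gets the same draw for a given phase.
    """
    rng = random.Random(f"{student_id}:{phase}")
    return {
        set_name: rng.choice(problems)
        for set_name, problems in sorted(problem_sets.items())
    }

pre_test = build_test("s001", PROBLEM_SETS, "pre")
post_test = build_test("s001", PROBLEM_SETS, "post")
```

Each result maps a set name (one outcome) to the single problem drawn for that student, which is what lets a score on a problem be attributed back to an outcome.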

We used a separate database to hold the information about each set of problems, and this information was helpful in producing a couple of different assessment reports. The graph below shows the post-test vs. pre-test ability of students on problems involving linear and rational functions. Each dot represents the progress of one student, and the size of each dot indicates the amount of time that student spent practicing the material. We can provide similar plots for more finely defined concepts (e.g. graphing problems, graphing problems with transformations, graphing problems with horizontal transformations, etc.).

We could also provide assessment on each item across all students. The table below shows 7 different WeBWorK sets related to working with a linear equation. For each assessment item, the table shows the average pre-test score, the average post-test score, and the average amount of practice time spent by successful (green) and unsuccessful (red) post-test students (n=110). The green bars on the far right of the table show the success rate of each individual problem in each assessment set.

In terms of assessment, this approach has drawbacks. One issue is reliability. The green bars above indicate that some WeBWorK problems were easier than others within each set, even though I'd (incorrectly) thought they assessed the same concept*. This design also encouraged students to "practice for the post-test" to some degree.
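The reliability check can be sketched in a few lines of Python. This is only an illustration under assumed data: the `(set_name, problem, correct)` record layout and the function names are hypothetical, not the structure of our database. It computes the success rate of each problem within a set and the gap between the easiest and hardest problem, which is one simple way to spot sets whose problems are not interchangeable.

```python
from collections import defaultdict

def success_rates(attempts):
    """Return {set_name: {problem: fraction of attempts that were correct}}.

    `attempts` is an iterable of (set_name, problem, correct) tuples,
    a hypothetical record layout chosen for this sketch.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for set_name, problem, correct in attempts:
        tally = counts[set_name][problem]
        tally[0] += 1 if correct else 0  # number correct
        tally[1] += 1                    # number attempted
    return {
        set_name: {problem: right / total for problem, (right, total) in problems.items()}
        for set_name, problems in counts.items()
    }

def spread(rates):
    """Gap between the easiest and hardest problem in each set."""
    return {s: max(r.values()) - min(r.values()) for s, r in rates.items()}

# Made-up records, only to show the shape of the input.
sample = [
    ("linear_slope", "slope_1.pg", True),
    ("linear_slope", "slope_1.pg", True),
    ("linear_slope", "slope_2.pg", True),
    ("linear_slope", "slope_2.pg", False),
]
rates = success_rates(sample)
gaps = spread(rates)
```

A large gap within a set suggests its problems differ in difficulty, or assess different things, and should not be treated as equivalent draws.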

This fall semester, we tried a (better) approach.  I had previously ranked* the specific knowledge, skills, and concepts needed to solve the problems in each WeBWorK set, and this data helped us generate a list of the 3 "nearest WeBWorK set neighbors" for each assessment item on a 25-question pre-test.  Based upon a student's pre-test score, we used these associations to generate customized practice problems for each student.  We're still analyzing the effectiveness of this approach, but we can use past data to generate a "success" tree for the problem and its neighbors.  For example, the tree below shows how successful students were on one assessment item and its three nearest neighbors using both pre-test (top graph) and post-test (bottom graph) data.  Each neighbor is represented on a different level in the tree, and the right branch at each node indicates the number of students who successfully completed that question.  The data associated with WeBWorK problems could be used to help generate better neighbors for customized practice sets for students.
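One way to picture the "nearest neighbors" step is the Python sketch below. It rests on assumptions that go beyond what is described above: each set is represented by a vector of skill rankings, closeness is measured by Euclidean distance, and every set name and number is invented. The actual ranking data and distance measure may well differ.

```python
import math

# Hypothetical rankings of four skills per set (0 = not needed, 3 = central).
SKILLS = {
    "slope_from_points": [3, 1, 0, 0],
    "slope_from_graph": [3, 0, 2, 0],
    "line_equation": [2, 3, 1, 0],
    "graph_line": [2, 1, 3, 0],
    "rational_asymptote": [0, 1, 2, 3],
}

def nearest_neighbors(target, skills, k=3):
    """Return the k sets whose skill vectors are closest to the target set."""
    others = [name for name in skills if name != target]
    # Sort by distance, then by name so that ties are broken predictably.
    others.sort(key=lambda name: (math.dist(skills[target], skills[name]), name))
    return others[:k]

def practice_sets(missed_items, skills, k=3):
    """Collect the neighbors of every missed pre-test item, without duplicates."""
    chosen = []
    for item in missed_items:
        for name in nearest_neighbors(item, skills, k):
            if name not in chosen:
                chosen.append(name)
    return chosen

neighbors = nearest_neighbors("slope_from_points", SKILLS)
practice = practice_sets(["slope_from_points", "graph_line"], SKILLS)
```

With this toy data, the neighbors of `slope_from_points` are the other linear-function sets, and the unrelated `rational_asymptote` set is left out.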

Despite the issues with the *-ed items above, I think there are ways to associate some of the information above with problems in the National Problem Library. If anyone is interested in using our approach or in joining discussions about these issues, please contact me.

Aaron Wangberg
awangberg@winona.edu
Winona State University