## WeBWorK Problems

### Tolerance setting for answers is unreliable

by Tim Payer -
Number of replies: 2
Greetings,

I am having a problem with the tolerance for students' answers being much wider than what is set in the problem.

I have set the tolerance for correct answers at 0.1 with the command:

Context()->flags->set(tolerance => 0.1);

And yet the third and fourth answers in this problem accept student entries that are off by as much as plus or minus 1.4 from the correct value.

Tim

# DESCRIPTION
# Stat 109 Data summarization
# WeBWorK problem written by TimPayer <tsp1@humboldt.edu>
# ENDDESCRIPTION

## DBsubject(Statistics)
## DBchapter(Exploratory data analysis/descriptive statistics)
## DBsection(Summary statistics)
## Institution(Humboldt State University)
## Author(Tim Payer)
## KEYWORDS(summations, mean, sd)

DOCUMENT();
loadMacros(
"PGstandard.pl",
"PGbasicmacros.pl",
"MathObjects.pl",
"PGML.pl",
"niceTables.pl",
"parserPopUp.pl",
"PGcourse.pl",
"parserFormulaUpToConstant.pl"
);

Context("Numeric");
Context()->flags->set(tolerance => 0.1);
$showPartialCorrectAnswers = 1;

## The beginning of 3.1 Stat 109 problem:
## Generate 8 random values and sort. These values
## will form the "box" of the boxplot.
$parsum = 0;
@countpre = ();
foreach my $i (0..7) {
    $countpre[$i] = random(14.3, 16.5, 0.1);
    $sum1 = Compute("$countpre[$i] + $parsum");
    $parsum = $sum1;
}
sub num_sort { PGsort sub { $_[0] <=> $_[1] }, @_; }
@out = num_sort(@countpre);

## Create Quartiles and Outlier Threshold values
## based upon the 8 boxplot data points.
$Q1 = Compute("($out[0] + $out[1])/2");
$Q3 = Compute("($out[6] + $out[7])/2");
$step = Compute("1.5*($Q3 - $Q1)");
$LOT = Compute("$Q1 - $step");
$UOT = Compute("$Q3 + $step");

$LOTminus1 = $LOT - 0.6;
$LOTplus1 = $LOT + 0.6;
$LOTplus2 = $LOT + 1.3;
$Q1minus1 = $out[0] - 0.1;
$Q3plus1 = $out[7] + 0.1;
$UOTminus1 = $UOT - 0.2;
$UOTplus1 = $UOT + 0.2;
$LOTminus2 = $LOT - 0.8;
$UOTplus2 = $UOT + 0.8;
$UOTminus2 = $UOT - 0.8;

@count = ();

$cnt1 = random($LOTminus2, $LOTminus1, 0.1);   # low outlier
$cnt10 = floor($cnt1*10);
$count[0] = Compute("$cnt10/10");
$cnt2 = random($LOTplus1, $Q1minus1, 0.1);     # left whisker terminal.
$cnt20 = floor($cnt2*10);
$count[1] = Compute("$cnt20/10");
$cnt11 = random($Q3plus1, $UOTminus1, 0.1);    # right whisker terminal.
$cnt110 = ceil($cnt11*10);
$count[10] = Compute("$cnt110/10");
$cnt12 = random($UOTplus1, $UOTplus2, 0.1);    # high outlier
$cnt120 = ceil($cnt12*10);
$count[11] = Compute("$cnt120/10");

$sum = Compute("$sum1 + $count[0] + $count[1] + $count[10] + $count[11]");

##### Combine the separate box, outlier, whisker data into one array.
@boxa = ();
foreach my $i (0..7) {
    $boxa[$i] = $out[$i];
}
$boxa[8] = $count[0];     # Lower outlier
$boxa[9] = $count[1];     # Lower whisker terminal
$boxa[10] = $count[10];   # Upper whisker terminal
$boxa[11] = $count[11];   # Upper outlier

######################################################
## For a problem that gives each student a different random sample,
## un-comment(#) the code above. For a fixed problem that permits each student to share
## a common answer, comment out the loop above and un-comment the 12
## lines of code below.
#$count[0] = 14.1;
#$count[1] = 14.4;
#$out[0] = 14.7;
#$out[1] = 14.8;
#$out[2] = 15.0;
#$out[3] = 15.2;
#$out[4] = 15.5;
#$out[5] = 15.7;
#$out[6] = 15.9;
#$out[7] = 16.1;
#$count[10] = 17.5;
#$count[11] = 17.9;
#$parsum = 0;
#foreach my $i (0..7) {
#    $sum1 = Compute("$out[$i] + $parsum");
#    $parsum = $sum1;
#}
#$sum = Compute("$sum1 + $count[0] + $count[1] + $count[10] + $count[11]");
#
#$Q1 = Compute("($out[0] + $out[1])/2");
#$Q3 = Compute("($out[6] + $out[7])/2");
#$step = Compute("1.5*($Q3 - $Q1)");
#$LOT = Compute("$Q1 - $step");
#$UOT = Compute("$Q3 + $step");
##############################################################

## Answers generated for both forms (randomized or fixed) problem sets.

$ans1 = Compute("$sum/12");
$ans2 = Compute("($out[3] + $out[4])/2");
$ans3 = $Q1;
$ans4 = $Q3;
$ans5 = $LOT;
$ans6 = $UOT;
$ans7 = $count[1];
$ans8 = $count[10];
$outliers = List($count[0], $count[11]);
$dif1 = $out[4] - $out[3];
$ans9 = sprintf("%0.1f", $dif1);
$ans10 = $out[0] + $out[1];
$dif2 = $ans4 - $ans3;

BEGIN_PGML
*Drawn from lecture notes Week 2, Day 2.*

3.1) Beall et al. (2002) made an extensive survey of hemoglobin levels (g/dL) among male populations living at different elevations. Given that the concentration of oxygen at high elevations (above 18,000 feet) is less than half that found at sea level, Beall suspected that men living at higher elevations should have higher hemoglobin levels to compensate for the thinner air. As a baseline for comparison, hemoglobin levels from over 1700 men living at sea level in the USA were recorded. A random sample of twelve of these hemoglobin levels is sorted and tallied in the table below.

Determine the mean and the following eight boxplot values to summarize the twelve data points.

Source: Data set drawn from Beall et al. 2002, from the text: "The Analysis of Biological Data", Whitlock and Schluter, 2nd ed. 2015.

[@DataTable(
[
[ ["".PGML::Format('[`\text{Subject}`]')."",
halign  => 'r|',
rowcss  => 'border-bottom: 2px solid;',
cellcss => 'border-top: 2px solid; border-left: 2px solid; border-right: 2px solid; ',],
["".PGML::Format('[`\text{Hemoglobin: g/dL}`]')."",
rowcss  => 'border-bottom: 2px solid;',
cellcss => 'border-top: 2px solid; border-right: 2px solid; ',],
],
[ ["1",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$count[0]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["2",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$count[1]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
[ ["3",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$out[0]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["4",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$out[1]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
[ ["5",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$out[2]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["6",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$out[3]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
[ ["7",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$out[4]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["8",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$out[5]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
[ ["9",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$out[6]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["10",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$out[7]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
[ ["11",
midrule => '1',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-bottom: 1px solid; border-left: 1px solid; border-right: 1px solid; ',],
["".PGML::Format('[`[$count[10]]`]')."",
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-top: 1px solid;
border-right: 1px solid; ',],
],
[ ["12",
halign  => '|r|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-left: 1px solid;
border-right: 1px solid; ',],
["".PGML::Format('[`[$count[11]]`]')."",
colspan => '1',
halign  => 'c|',
rowcss  => 'border-bottom: 1px solid;',
cellcss => 'border-right: 1px solid; ',],
],
],
caption => 'Hemoglobin levels (g/dL) for twelve American men at sea level:',
align   => '|r|c|l|',
columnscss => ['border-left: 0px solid; ', 'border-right: 0px solid; ', ' ',],
);@]*

Determine the mean hemoglobin level for this sample of men living at sea level in the USA:

3.1a) [`\bar x`] = [__________]{"$ans1"}

Determine the median hemoglobin level for this sample of men living at sea level in the USA:

3.1b)    [`\quad \Large{\tilde x}`]  = [__________]{"$ans2"}

3.1c) Determine the first Quartile: [`Q_1`] = [__________]{"$ans3"}

3.1d) Determine the third Quartile: [`Q_3`]  = [__________]{"$ans4"}

3.1e) Determine the lower outlier threshold: [`LOT`] = [__________]{"$ans5"}

3.1f)  Determine the upper outlier threshold: [`UOT`]  = [__________]{"$ans6"}

3.1g) The lower whisker of a boxplot of this data set terminates at which data point?: = [__________]{"$ans7"}

3.1h)  The upper whisker of a boxplot of this data set terminates at which data point?:   = [__________]{"$ans8"}

3.1i) List any outlier(s) for this data set: = [__________]{"$outliers"}

END_PGML

BEGIN_PGML_SOLUTION
*SOLUTION*

3.1a) The mean is calculated by summing all the data points and dividing by the number of data points.

[` \bar x = \frac{1}{12} \sum_{i = 1}^{12}x_i =  \frac{1}{12} ([$count[0]] +[$count[1]] +[$out[0]] + [$out[1]] +[$out[2]] +[$out[3]] +[$out[4]] +[$out[5]] +[$out[6]] +[$out[7]] +[$count[10]] +[$count[11]])`]

[` \bar x =  \frac{[$sum]}{12}`]

[` \bar x = [$ans1]`]

3.1b)  The median of a sorted even numbered data set is calculated by averaging the middle two values.

[`\begin{aligned}&\\
\Large {\tilde x} &= \left(\frac{n + 1}{2}\right)^{th}   && \text{The general form. }\\
\Large {\tilde x} &= \left(\frac{12 + 1}{2}\right)^{th}   && \text{Substituting the sample size. }\\
\Large {\tilde x} &= \left(\frac{13}{2} \right)^{th} = (6.5)^{th}  && \text{Reducing. }\\
\Large {\tilde x} &= 6^{th} + 0.5(7^{th}-6^{th})   && \text{Account for the half jump.}\\
\Large {\tilde x} &= [$out[3]] + 0.5([$out[4]]-[$out[3]]) && \text{Substitute for the "th" value.}\\
\Large {\tilde x} &= [$out[3]]  + 0.5([$ans9]) && \text{Account for the half jump.}\\
\Large {\tilde x} &= [$ans2]    && \text{Reduce.}
\end{aligned}`]

3.1c)  Since [`N = 12`] is divisible by 4, the first quartile is calculated by the following.

[`\begin{aligned}&\\
Q_1 &= \frac{\left[\frac{n}{4}\right]^{th} +\left[\frac{n}{4} +1\right]^{th}}{2}   && \text{The }  Q_1 \text{ criterion when n is divisible by 4. }\\
Q_1 &= \frac{\left[\frac{12}{4}\right]^{th} +\left[\frac{12}{4} +1\right]^{th}}{2}   && \text{Substituting the sample size. }\\
Q_1 &= \frac{3^{rd} +4^{th}}{2}   && \text{Reducing fractions. }\\
Q_1 &= \frac{[$out[0]] +[$out[1]]}{2}   && \text{Substitute the 3rd and 4th values. }\\
Q_1 &= \frac{[$ans10]}{2} && \text{Sum numerator.}\\
Q_1 &= [$ans3]    && \text{Reduce.}
\end{aligned}`]

3.1d)  Since [`N = 12`] is divisible by 4, the third quartile is calculated by the following.

[`\begin{aligned}&\\
Q_3 &= \frac{\left[\frac{3n}{4}\right]^{th} +\left[\frac{3n}{4} +1\right]^{th}}{2}   && \text{The }  Q_3 \text{ criterion when n is divisible by 4. }\\
Q_3 &= \frac{\left[\frac{3 \cdot 12}{4}\right]^{th} +\left[\frac{3 \cdot12}{4} +1\right]^{th}}{2}   && \text{Substituting the sample size. }\\
Q_3 &= \frac{9^{th} +10^{th}}{2}   && \text{Reducing fractions. }\\
Q_3 &= \frac{[$out[6]] +[$out[7]]}{2}   && \text{Substitute the 9th and 10th values. }\\
Q_3 &= [$ans4]    && \text{Reduce.}
\end{aligned}`]

3.1e)  The lower outlier threshold (LOT) is calculated by the following.

[`\begin{aligned}&\\
LOT &= Q_1 - step = Q_1 - 1.5 \times (Q_3 - Q_1)   && \text{The general form for the  }LOT.\\
LOT &= [$ans3] - 1.5 \times ([$ans4] - [$ans3]) && \text{Substituting for the given values.}\\
LOT &= [$ans3] - 1.5 \times ([$dif2]) && \text{Reduce.}\\
LOT &= [$ans3] - [$step] && \text{Reduce.}\\
LOT &= [$ans5]  && \text{Reduced.}
\end{aligned}`]

3.1f)  The upper outlier threshold (UOT) is calculated by the following.

[`\begin{aligned}&\\
UOT &= Q_3 + step = Q_3 + 1.5 \times (Q_3 - Q_1)   && \text{The general form for the  }UOT.\\
UOT &= [$ans4] + 1.5 \times ([$ans4] - [$ans3]) && \text{Substituting for the given values.}\\
UOT &= [$ans4] + 1.5 \times ([$dif2]) && \text{Reduce.}\\
UOT &= [$ans4] + [$step] && \text{Reduce.}\\
UOT &= [$ans6]  && \text{Reduced.}
\end{aligned}`]

3.1g)  The lower whisker terminates at [$ans7] because it is the last data point that satisfies the restriction:

[`LOT < [$ans7] < Q_1`]

[`[$ans5] < [$ans7] < [$ans3]`]

3.1h) The upper whisker terminates at [$ans8] because it is the last data point that satisfies the restriction:

[`Q_3 < [$ans8] < UOT`]

[`[$ans4] <  [$ans8] < [$ans6]`]

3.1i)  There are two outliers in this boxplot which are [`[$count[0]]`], and [`[$count[11]]`], because:

[`[$count[0]] < LOT`]

[`[$count[0]] < [$ans5]`]

[`[$count[11]] > UOT`]

[`[$count[11]] > [$ans6]`]

END_PGML_SOLUTION

ENDDOCUMENT();

### Re: Tolerance setting for answers is unreliable

by Danny Glin -
By default the tolerance is relative, meaning that when you set it to 0.1, it will accept answers within 10% of the correct answer, so within about 1.4 is the expected behaviour for those answers.

It looks like you want to use absolute tolerance, so change your code to:
Context()->flags->set(tolerance => 0.1, tolType => "absolute");

See http://webwork.maa.org/wiki/NumericalTolerance for reference.
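
To make the difference concrete, here is a small sketch of the arithmetic, written in Python so it can be run outside WeBWorK. It is a simplified model of the two tolerance types, not WeBWorK's actual checker; the function name `within_tolerance` and the sample value 14.5 for Q1 are assumptions for illustration only.

```python
def within_tolerance(student, correct, tolerance=0.1, tol_type="relative"):
    """Simplified model of a numeric answer check (not WeBWorK's own code)."""
    error = abs(student - correct)
    if tol_type == "relative":
        # Allowed error scales with the size of the correct answer.
        return error <= tolerance * abs(correct)
    # Absolute: the allowed error is the tolerance itself.
    return error <= tolerance


q1 = 14.5  # hypothetical Q1, in the range this problem generates

# Relative: the allowed error is 0.1 * 14.5 = 1.45, so 15.9 passes.
print(within_tolerance(15.9, q1))
# Absolute: the allowed error is 0.1, so 15.9 fails and 14.55 passes.
print(within_tolerance(15.9, q1, tol_type="absolute"))
print(within_tolerance(14.55, q1, tol_type="absolute"))
```

With answers near 14 or 15, a relative tolerance of 0.1 allows roughly the plus or minus 1.4 described in the question.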

### Re: Tolerance setting for answers is unreliable

by Alex Jordan -
Danny has explained about the two tolerance types.

I want to point out something else that I have experienced with statistics questions and quartiles. While your class and your textbook may have one specific procedure for calculating them, the resources my students have turned to (internet, software, calculators) have at least four methods that I've catalogued.

Your data set has eight values. Consider a data set:
1 1 5 5 5 5 5 5

It looks like you would expect Q1 to be 3, the average of 1 and 5, which is quite a reasonable method. But you should know that even mainstream software like Excel has three quartile functions:

quartile() [which is deprecated],
quartile.inc(),
quartile.exc()

These output either 2 or 4 for Q1, depending on which function you use. The logic is about going 25% or 75% of the way between 1 and 5.

So students could reasonably (imho) find Q1 to be 2, 3, or 4. So now you have completely different tolerance issues if the answer is 3 and they enter 2 or 4.
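
The three values can be reproduced with a short sketch. This uses Python's standard `statistics` module rather than Excel; its "exclusive" and "inclusive" interpolation methods happen to give 2 and 4 for this data set, and treating them as stand-ins for the Excel functions named above is an assumption.

```python
import statistics

data = [1, 1, 5, 5, 5, 5, 5, 5]

# Method 1: Q1 as the median of the lower half of the sorted data.
lower_half = sorted(data)[: len(data) // 2]
q1_halves = statistics.median(lower_half)

# Methods 2 and 3: interpolate at a fractional position in the sorted data.
q1_exclusive = statistics.quantiles(data, n=4, method="exclusive")[0]
q1_inclusive = statistics.quantiles(data, n=4, method="inclusive")[0]

print(q1_halves, q1_exclusive, q1_inclusive)  # 3.0 2.0 4.0
```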

You could deal with this by being very clear with your students about your specifications on finding quartiles. Or, what I did was to use parserOneOf.pl and make it so the answer was OneOf("2,3,4"). Then any answer within the usual tolerance of any one of these would be OK.

====

Not that you asked, but if you are coding statistics questions, other tolerance issues that are going to require special care that I have catalogued fall into four categories:
• The quartiles issue
• Conversion of z-scores (and t-scores, etc.) to probabilities (and vice versa) sometimes relies on tables, where excessive rounding happens; sometimes on more accurate decimal values, say from calculators; and sometimes on approximations coming from the normal approximation to the binomial distribution [with or without the continuity correction of 0.5]. Because of this, it is sometimes reasonable for four students to have all done something "right" and yet have decimal answers that are slightly out of tolerance from each other. As with quartiles, I just use OneOf to deal with this.
• With regression lines covering a range of x-value data that is far from x=0, a small rounding error in slope can result in a large relative error for x-values near where the data is. So with these, I make sure to change the domain for comparison to surround x-bar, not be the default [-2,2].
• What should be done when a probability answer is something like 0.9999? With default tolerances, 1 is acceptable. Maybe that is OK, but there is a big conceptual difference between probability 1 and probability 0.9999. Even worse, 1.0008 would be acceptable too. [I have not yet come up with a context-based solution for this issue.]
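
The last item can be checked with a little arithmetic. The sketch below is in Python and uses a simplified relative-tolerance test, not WeBWorK's actual comparison; the default relative tolerance of 0.001 and the name `relative_ok` are assumptions chosen to match the behaviour described above.

```python
def relative_ok(student, correct, tolerance=0.001):
    """Simplified relative-tolerance check (assumed model, not WeBWorK's code)."""
    return abs(student - correct) <= tolerance * abs(correct)


correct = 0.9999
for entry in (1, 1.0008, 1.002):
    # The allowed error is 0.001 * 0.9999, just under 0.001.
    print(entry, relative_ok(entry, correct))
```

Under this model both 1 and 1.0008 pass, even though neither is the intended probability and 1.0008 is not a valid probability at all.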