R in WeBWorK

R integration lets problem authors run arbitrary computations in R and make the results available to the rest of the PG code as if they had been constructed using the standard PG functions. (In our running example, we could construct the random vector in R with `sample`, and/or calculate its mean with `mean`.) The reason to do this is R's rich library of high-quality statistical functions and its graphical abilities. While in theory both could be replicated in PG, that would take a huge effort that is better spent simply using the functionality already available in R.

Required setup

Use a compatible version of WeBWorK

Since version 2.13, WeBWorK has had the capability to use R code in authoring problems.

Once you have a working WeBWorK server, there are three distinct steps that need to happen:

Set up the R server.
Install the Perl module for the Perl-R bridge.
Configure WeBWorK with the location of the R server.

Set up the R server

Your R server (which can be on the same machine as WeBWorK) is quite easy to set up:

Install R following OS-specific instructions.
Install the Rserve server, which allows remote clients to execute R code on the Rserve server and returns the result in the response. The easiest way is to run (as an administrative user) Rscript -e 'install.packages("Rserve", repos="https://cran.rstudio.com")'. Note that this package includes C source that needs to be compiled, so the basic developer tools must be present on the server. On Ubuntu there is also a package, r-cran-rserve, which is even easier to install.
Run the Rserve service as appropriate to your system. The command you need to run is R CMD Rserve. This starts the Rserve daemon, which listens on port 6311 and only accepts connections from localhost. If you run WeBWorK and Rserve on separate servers, read the section "Additional configuration when WeBWorK and R are on separate hosts" below for the extra configuration steps. For systems using systemd (RHEL/CentOS 7 or newer, Ubuntu 15.04 or newer), you may use the following instructions to have Rserve start on boot:
- Download this file, and place it in /usr/lib/systemd/system. Now you can start Rserve with the command (as a superuser) systemctl start rserve. You can also set Rserve to start at startup with the command systemctl enable rserve.
- Note: if the folder /usr/lib/systemd/system does not exist, your system may expect you to place the file in /lib/systemd/system instead. You should also check to see where R is installed on your system. The file linked to above assumes that R is installed in /usr/lib64/R but on some systems (for example, any recent version of Ubuntu) it might be in /usr/lib/R, in which case you'll have to edit the file to correct this path (in two places).
- Note: if you run the service with a high number of users and do a lot of temporary file I/O (see the last example below), you might eventually find that Rserve is still running but no longer responds to requests from WeBWorK. This seems to be related to the total number of open file handles, although the cause isn't entirely clear. It can therefore be useful to force a restart of the daemon every few days. You can do this by adding the following to the [Service] section of the unit file:

   Restart=always
   RuntimeMaxSec=7d

If you then run

   systemctl daemon-reload
   systemctl restart rserve

then the Rserve service will restart itself once a week, which can avoid this sort of behaviour.

Install the Perl-R bridge

The Perl module Statistics::R::IO implements Rserve's communication protocol in Perl and provides translation from R data structures to Perl's. It is available on CPAN and can be installed in the standard manner for Perl modules, e.g., by running (as an admin user) cpan Statistics::R::IO.
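
Once the module is installed, a short standalone Perl script can confirm that the bridge reaches Rserve before WeBWorK is involved at all. The sketch below assumes that Rserve is already running on localhost at the default port 6311, and it relies on the new, eval, to_pl, and close methods shown in the Statistics::R::IO::Rserve synopsis. The script itself is only an illustration and is not part of WeBWorK.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Statistics::R::IO::Rserve;

# Assumption: Rserve is running locally on its default port (6311).
my $rserve = Statistics::R::IO::Rserve->new('localhost');

# Evaluate a trivial R expression and convert the result to Perl data.
my $result = $rserve->eval('pi');
my $value  = $result->to_pl;

# Vectors are expected to come back as an array reference;
# fall back to a plain scalar just in case.
my @values = ref $value eq 'ARRAY' ? @$value : ($value);

print "R says pi = $values[0]\n";

$rserve->close;
```

If the script dies with a connection error, the problem is most likely on the Rserve side (daemon not running, or not reachable from this host) rather than in the WeBWorK configuration.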

Configure WeBWorK with the location of the R server

The PG macro that communicates with R needs to know the location (host name) of the R server. You can set it by modifying conf/localOverrides.conf:

$pg{specialPGEnvironmentVars}{Rserve} = {host => "localhost"};

The value of this variable should be a reference to a hash with at least the key host. Note that running Rserve on a non-standard port (i.e., not on 6311) is not supported at this point.

You should now be able to load questions which call R. There are a number of such questions already in the OPL, for example Library/UBC/STAT/STAT300/hw07/stat300_hw07_q02.pg (which can be found under Statistics -> Simple linear regression -> Hypothesis tests). If your R server is working properly, you should see a scatterplot in this question.

Additional configuration when WeBWorK and R are on separate hosts

If you run WeBWorK and R on separate hosts, you can either set up a tunnel to forward port 6311 from WeBWorK to R's host, or do the following:

Set up Rserve to listen on all network interfaces, not just localhost, by adding the line "remote enable" to the file "/etc/Rserv.conf":
```
cat <<'EOF' >> /etc/Rserv.conf
remote enable
EOF
```
Use the correct host name (instead of "localhost") in the "Rserve host" line of "localOverrides.conf". For example:
```
$pg{specialPGEnvironmentVars}{Rserve} = {host => "www.example.com"};
```

Note that in this case you will also want to set up the Rserve host's firewall to allow connections only from the WeBWorK host(s); otherwise Rserve will happily execute arbitrary code from any Rserve client anywhere on the internet!

Troubleshooting Installation Problems

Please note that you must be running at least version 2.12-r (preferably 2.13 or newer) of WeBWorK in order for the integration to work. It's mentioned at the top of the page, but it's easy to miss.

If your server is running CentOS 7, there is a special configuration set for root which causes the CPAN install of Statistics::R::IO to not be recognized by Apache and WeBWorK. See this forum post for more details on disabling this. Another forum thread which goes through the troubleshooting step-by-step is here.

If you are using R 3.5.0 or newer, there is an incompatibility issue with version 1.7 of Rserve. If you receive errors like "Unrecognized response type" when trying to load problems involving R, then this could be the issue. The solution is to install a newer version of Rserve. If a newer version is not available for your distribution, then a newer version can still be installed by running the following command in the R console (as root):

install.packages('Rserve',,"http://rforge.net/",type="source")

Authoring problems with R code

R integration works as follows: WeBWorK uses a Perl module that talks to a server running the Rserve software, which allows remote clients to execute R code on the Rserve server and returns the result in the response. The Perl module converts this response from R's native values (e.g., a generic vector, aka "list") to values understandable to Perl (e.g., an array), making them available to the rest of the PG code.

Loading the macros

To use R code in a problem, include "RserveClient.pl" in the "loadMacros" call at the start of the question. For example:

loadMacros(
  "PGstandard.pl",     # Standard macros for PG language
  "PGML.pl",
  "RserveClient.pl"    # <--- R integration
);

Basic Rserve macros

The Rserve software creates an R session for each remote client. This means that clients' interactions with R are kept separate from each other, just as if you started R twice on your local computer. A session persists as long as the client is connected, so that multiple calls from the client using the same session see the objects created in previous calls. (This behaviour mirrors what happens in a local session, where each R command you execute at the console after pressing ENTER sees the results of earlier commands.) When a session is *closed*, its contents are wiped without a trace, just like quitting an R application run locally.

The RserveClient offers macros to start and finish a session, and execute R commands in the current session:

rserve_eval("some R code"): this function sends the R code given as its string argument to Rserve for execution. It returns *an array* representation of the R code's result. This means that the value of rserve_eval("pi") is an array with a single element, 3.14159265358979. If you want to keep this value and use it in the rest of the problem, assign it to an array variable. For example:
```
@pi = rserve_eval("pi");
```
Note: Multiple calls within the same problem share the R session and the object workspace, so you can break up your R code in as many rserve_eval statements as you'd like.
rserve_start(), rserve_finish(): Start up and close the current connection to the Rserve server. In normal use, these functions are completely optional because the first call to rserve_eval will start the Rserve session if one is not already open. Similarly, the current session will be automatically closed at the end of the problem. Other than backward compatibility, the only reason for using these functions is to start a new clean session within a single problem, which shouldn't be a common occurrence.
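
As a small illustration of the shared session, the following sketch of a complete problem creates an object in one rserve_eval call and reads it back in later calls. It uses only rserve_eval as described above; the sample values and the problem text are made up for illustration.

```perl
DOCUMENT();
loadMacros("PGstandard.pl", "PGML.pl", "RserveClient.pl");

# First call: create a vector x in this problem's R session.
rserve_eval('x <- c(2, 4, 6, 8)');

# Later calls in the same problem still see x.
# rserve_eval returns an array, so use list assignment to take the first element.
($mean) = rserve_eval('mean(x)');

# A vector result can be kept whole in a Perl array.
@data = rserve_eval('x');

BEGIN_PGML
The sample is [@ join(', ', @data) @]. What is its mean? [_]{$mean}{5}
END_PGML

ENDDOCUMENT();
```

No call to rserve_start or rserve_finish is needed here, because the session is opened by the first rserve_eval and closed when the problem ends.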

A note on Perl quoting rules

Beware of Perl's quoting rules when writing R code. Text in double quotes is interpreted for escape sequences (e.g., "\n" represents a newline) and variables (e.g., "The value of pi is $pi[0]" will be interpolated into "The value of pi is 3.14159265358979", given the code above). This is a problem if you're trying to extract an element of a list by name using the "$" operator in R, because the text following it will be interpreted as a Perl variable. For instance, running rserve_eval("cars$speed") will not return the "speed" column of the standard "cars" dataset, because "$speed" in the string will be replaced by the value of the PG variable $speed, which, if not yet defined, is the empty string, so that the R code that actually gets executed is simply "cars". Instead, use single quotes, which prevent variable and escape sequence interpolation and keep the string exactly as entered: rserve_eval('cars$speed').

On the other hand, sometimes you might actually want variable interpolation, for instance to construct R code that uses the values of variables created with PG functions. For instance:

Context("Numeric");

$pi = Real("pi");
@difference = rserve_eval("pi - $pi");

will calculate the difference between R's and PG's values of "pi" and put the result in the @difference array. Note that the same R code can be constructed using Perl's string concatenation operator dot ("."): rserve_eval('pi - ' . $pi). Personally, I recommend sticking with single quotes to prevent unwanted surprises, and using the dot operator when you need to include the value of a PG variable.
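
The quoting behaviour can be checked in plain Perl, without WeBWorK or Rserve. The following sketch only builds and prints the strings that would be sent to R; the variables $speed and $mean and their values are hypothetical and exist only for this demonstration.

```perl
use strict;
use warnings;

# Hypothetical PG-side values, defined only for this demonstration.
my $speed = 42;
my $mean  = 1.5;

# Double quotes interpolate: R would receive "cars42", not "cars$speed".
my $interpolated = "cars$speed";

# Single quotes keep the text exactly as typed.
my $literal = 'cars$speed';

# Escaping the dollar sign inside double quotes also keeps it literal.
my $escaped = "cars\$speed";

# Concatenation keeps the R code literal and inserts the Perl value explicitly.
my $r_code = 'curve(dnorm(x, mean=' . $mean . '), xlim=c(-4, 4))';

print "$interpolated\n";    # prints: cars42
print "$literal\n";         # prints: cars$speed
print "$escaped\n";         # prints: cars$speed
print "$r_code\n";          # prints: curve(dnorm(x, mean=1.5), xlim=c(-4, 4))
```

Only the single-quoted and escaped forms reach R with the "$" operator intact, which is why single quotes plus the dot operator are the safer default.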

Displaying R graphics

R has excellent facilities for creating production-quality statistical graphics, from simple scatter plots to complex spatial visualizations overlaid on geographical maps. These graphics can be produced in a variety of formats (in R parlance, devices), from the user's monitor to PDF or JPG files. The RserveClient allows the author to present these graphics in the question by calling the rserve_start_plot method before executing the R graphing code, calling the rserve_finish_plot method afterward, and then inserting the produced image into the problem.

The following code is a complete example:

DOCUMENT();

loadMacros(
   "PGstandard.pl",
   "PGML.pl",
   "RserveClient.pl",
);

$mean = random(-2, 2, .5);

$img = rserve_start_plot('png');
rserve_eval('curve(dnorm(x, mean=' . $mean . '), xlim=c(-4, 4)); 0');
$image_path = rserve_finish_plot($img);

BEGIN_PGML
What is the mean of the normal distribution shown in the figure below: [_]{$mean}{5}

[!plot of normal distribution!]{$image_path}{300}
END_PGML

ENDDOCUMENT();

The four key lines are as follows:

$img = rserve_start_plot('png'): sets up R to plot to a 'PNG' file and returns a unique plot identifier to be used later.
rserve_eval('curve(...)'): runs plotting commands on the R server
$image_path = rserve_finish_plot($img): completes the plotting to the PNG file and transfers it to a location on the WeBWorK server. Returns the path of the file, which is stored in the Perl variable $image_path.
[!plot of normal distribution!]{$image_path}{300}: inserts the image into the problem.

As of WeBWorK/PG 2.20, this can be simplified a bit by using the rserve_plot method instead, as in the following example.

DOCUMENT();
loadMacros("PGstandard.pl", "PGML.pl", "RserveClient.pl");

$mean  = random(-2, 2, .5);
$image = rserve_plot('curve(dnorm(x, mean=' . $mean . '), xlim=c(-4, 4)); 0');

BEGIN_PGML
What is the mean of the normal distribution shown in the figure below: [_]{$mean}{5}

[!plot of normal distribution!]{$image}{300}
END_PGML

ENDDOCUMENT();

Transferring files from the R server

Sometimes it may be convenient to make a file from the R server available to the student via a link in WeBWorK (for instance, using R to generate a random data file that the student can download). The macro rserve_get_file($remote_name) transfers the file $remote_name from the R server to WeBWorK's temporary html directory for the current course and returns the name of the local file, which can then be used with the htmlLink macro.

The following code is a complete example:

DOCUMENT();

loadMacros(
   "PGstandard.pl",
   "PGML.pl",
   "RserveClient.pl",
);

($intercept, $slope) = rserve_eval('coef(lm(log(dist)~log(speed), data = cars))');

($remote_file) = rserve_eval('filename <- tempfile(fileext=".csv"); write.csv(cars, filename); filename');
$local_file = rserve_get_file($remote_file);

$local_url = alias($local_file);

BEGIN_PGML
What is the slope of the linear regression of log-transformed stopping distance vs. car speed in the dataset linked below:
[_]{$slope}{5}

[@ htmlLink($local_url, "Download", download => 'dataset.csv') @]* the problem data (CSV file).
END_PGML

ENDDOCUMENT();

The four key lines are as follows:

($remote_file) = rserve_eval('filename <- tempfile(fileext=".csv"); write.csv(cars, filename); filename'): stores the desired dataset into a temporary CSV file on the R server and returns its path, which is stored in Perl variable $remote_file.
$local_file = rserve_get_file($remote_file): transfers the file from the R server to WeBWorK's temporary file area and returns its path, which is stored in Perl variable $local_file.
$local_url = alias($local_file): converts the local file path into a URL that can be used as an argument to the htmlLink macro, saving it in Perl variable $local_url.
[@ htmlLink($local_url, "Download", download => 'dataset.csv') @]*: inserts the link to the downloaded file into the problem. Note that adding download => 'dataset.csv' sets the name of the file that will be offered for download to the student (instead of the obscure file alias that is used for the file on the server).

As of WeBWorK/PG 2.20, this can also be simplified a bit by using the rserve_data_url method instead, as in the following example.

DOCUMENT();
loadMacros("PGstandard.pl", "PGML.pl", "RserveClient.pl");

($intercept, $slope) = rserve_eval('coef(lm(log(dist)~log(speed), data = cars))');
$url = rserve_data_url('cars');

BEGIN_PGML
What is the slope of the linear regression of log-transformed stopping distance vs. car speed in the dataset linked below:
[_]{$slope}{5}

[@ htmlLink($url, "Download", download => 'dataset.csv') @]* the problem data (CSV file).
END_PGML

ENDDOCUMENT();
