Basic Perl syntax


The PG Language is based on Perl. It helps to have a basic grasp of Perl before you start authoring PG problems. This article covers the minimum of Perl you need to know to work with PG.


Important changes from vanilla Perl

If you already know Perl, this is all you need to know:

Because backslashes have a special meaning within BEGIN_TEXT...END_TEXT blocks, PG does not use the backslash for its normal Perl meaning. Wherever Perl would use a backslash, PG uses ~~ (double tilde) instead:

Perl                          PG
print "Hello, $name\n";       print "Hello, $name~~n";
$aref = \@array;              $aref = ~~@array;
$href = \%hash;               $href = ~~%hash;
$sref = \$scalar;             $sref = ~~$scalar;

Perl Syntax Overview


Each statement in Perl must end with a semicolon. Statements can be spread over several lines. Spacing and indentation are for the most part ignored by Perl and should be used to make the program more readable. (See the <<EOF construction for the one case where spacing can't be ignored.)

$a = 1;

Comment lines

Comments start with # and continue to the end of the line.

Data types


Scalar variable names start with a dollar sign (e.g. $scalar_variable). Scalar variables can contain an integer, a real number, a string or advanced types (pointers to objects).

Array or list variable names start with an @ sign (e.g. @list_variable). Lists contain a sequence of scalar variables indexed by integers starting with zero. They have an implied order.

Hash or associative array variable names start with a % sign (e.g. %hash_variable). Hashes also contain a set of scalar variables, but the indices can be any string, and they have no implied order. The hash is stored as key-value pairs.


A number: e.g. 3.1415926 or -543.

A string: e.g. 'How now brown cow' or "How now brown cow".

An array: e.g. ( 1, 1, 2, 3, 5, 8, 13 ).

A hash: e.g. ('ssn'=>'123-34-5676', 'name'=>'Jane Doe'). The => symbol is translated into a comma, but using => implicitly quotes the word to its left, meaning that (ssn=>'123-34-5676', name=>'Jane Doe') works too.
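The three data types can be seen side by side in a short sketch. The variable names and values here are illustrative only and do not come from any PG macro.

```perl
use strict;
use warnings;

# Scalars hold a single value: a number or a string.
my $pi    = 3.1415926;
my $title = 'How now brown cow';

# Arrays hold an ordered list, indexed from zero.
my @primes = (2, 3, 5, 7, 11);
my $third  = $primes[2];        # a single element is a scalar, so it takes the $ sigil
my $count  = scalar @primes;    # number of elements in the array

# Hashes hold key-value pairs with no implied order.
my %student = (ssn => '123-34-5676', name => 'Jane Doe');
my $name    = $student{name};   # look up a value by its key
```

Note that the sigil follows the kind of value being fetched: $primes[2] and $student{name} are scalars even though they live inside an array and a hash.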


Understanding how to create strings using quotes is important for writing WeBWorK PG problems. The two important concepts are interpolation of variables and temporarily redefining quote tokens.

There are two basic types of quoted string constants. Double quoted strings "interpolate" variables embedded in the string — they replace variable names by the contents of the variable. Single quoted strings do not do this.

Assuming that the value of $a_variable is 3 we have:

"Let a = $a_variable" becomes Let a = 3.
'Let a = $a_variable' remains Let a = $a_variable.

A perpetual problem is how to insert a quote inside a quoted string. A standard method is to escape the quote. Remember that the escape character in PG is ~~, not the backslash \ as in normal Perl.

'Don~~'t forget to escape single quotes within "single" quotes!'

Perl has a more elegant method for handling this. You can temporarily redefine the quote character using the operator qq for double quotes, or q for single quotes. Almost any punctuation character can serve as the quote character.

qq!Any character but the exclamation point can occur in this quoted string!
q~The variable $A will not be interpolated in this single quoted string~
qq{Parentheses, brackets and braces { must be balanced } when used as quote tokens.}
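
As a small sketch of how the two kinds of quoting behave, the following compares ordinary quotes with q and qq. It assumes, as in the text above, a variable $a_variable holding 3; the other variable names are made up for the example.

```perl
use strict;
use warnings;

my $a_variable = 3;

# Double quotes interpolate; single quotes do not.
my $double = "Let a = $a_variable";    # Let a = 3
my $single = 'Let a = $a_variable';    # Let a = $a_variable

# qq behaves like double quotes, so embedded " characters need no escaping.
my $with_qq = qq{He said "a = $a_variable"};   # He said "a = 3"

# q behaves like single quotes, so the apostrophe is safe and $ stays literal.
my $with_q = q{It's spelled $a_variable};      # It's spelled $a_variable
```

Choosing q or qq with a delimiter that does not occur in the text avoids escape characters altogether, which is convenient in PG where the escape character differs from plain Perl.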

More on interpolation and on quoting text blocks can be found in the section on the BEGIN_TEXT...END_TEXT construction.

"HERE" documents

For quoting large blocks of material use a HERE document (borrowed from unix).

$string = <<'END_QUOTE';
   All of the material in this section following the HERE document's opening line is "quoted" and 'placed' in the variable $string.
   The quoted section is terminated by a line containing only the phrase END_QUOTE -- which must be left justified.
END_QUOTE

The single quotes around END_QUOTE indicate that no interpolation takes place. Using double quotes or no quotes allows interpolation.
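
The difference between the two quoting styles for the terminator can be sketched as follows. The variable $n and its value are assumptions made for the example.

```perl
use strict;
use warnings;

my $n = 4;

# Single-quoted terminator: the text is taken literally.
my $literal = <<'END_QUOTE';
The value is $n
END_QUOTE

# Double-quoted terminator: variables are interpolated.
my $filled = <<"END_QUOTE";
The value is $n
END_QUOTE

print $literal;   # prints: The value is $n
print $filled;    # prints: The value is 4
```

Each string includes the newline at the end of its last line, and the terminator must start in the first column.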

This is what the BEGIN_TEXT/END_TEXT construction translates to:

  The material in this section is collected into a string which is then passed to the EV3 (evaluation 3) routine, which handles
  processing of the sections in curly braces, then does interpolation, and finally processes the LaTeX commands.


Function names that are not followed by parentheses must have & in front. (The words function, subroutine and macro are used interchangeably in this document.)

The function beginproblem takes no arguments, so it can be written as &beginproblem or as beginproblem(), but not as a bare beginproblem, which will usually cause a compiler warning message about "barewords" appearing in the problem.
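
The two accepted call forms can be tried on any subroutine that takes no arguments. In this sketch, greeting is a hypothetical stand-in defined only for the example; it is not a PG macro.

```perl
use strict;
use warnings;

# greeting is a hypothetical no-argument subroutine used for illustration.
sub greeting {
    return 'Hello from a subroutine';
}

my $with_parens    = greeting();   # the parentheses mark this as a function call
my $with_ampersand = &greeting;    # the & also marks it as a function call
```

Both assignments call the same subroutine and store the same string.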

Mathematical operations

The Perl symbol for taking some value to a power is **. To take variable $a to the power 2, write $a**2, and not $a^2. In Perl code, ^ is the bitwise exclusive-or operator, so $a^2 runs without complaint but gives the wrong number. (When using MathObjects to specify a formula as a string you can use either ** or ^. Students can use ^ as well.)
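
A short sketch shows why the distinction matters in Perl code. The variable $base is an assumption made for the example.

```perl
use strict;
use warnings;

my $base = 3;

my $power = $base**2;    # exponentiation: 3 squared is 9
my $xor   = $base ^ 2;   # bitwise exclusive-or of binary 011 and 010, which is 1
```

Because ^ is a valid operator, the mistake produces no error message, only a wrong value.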

References — more complicated data types


To handle objects, which are more complicated structures than scalars, lists and hashes, we use references. A reference is a scalar which is not the object itself, but the address of the object. For example, $ml = new_match_list() can't really fit an entire matching list object into the scalar variable $ml; instead it stores the address of the place where the object structure is kept. An object is a structure which contains both data and functions (called methods) which operate on that data. You can tell that $ml is a reference to an object and not just a number, because ref($ml) returns Match (the type of a matching list object). In general, ref($var) returns the kind of object pointed to by a scalar variable, and returns an empty string if the variable holds an ordinary number or string.

Calling an object's methods

$ml->choose(4) tells the match list object $ml to perform its choose method with the argument 4. The beauty of objects and methods over standard subroutines is that you don't have to give the choose subroutine a list of questions and answers to choose from: it already 'knows' what questions and answers it has available and will choose 4 of them. The arrow construction is the same as the period construction in Java, where you would write ml.choose(4). Java's typography is much neater, but unfortunately the period was already being used for string concatenation by Perl, so we're stuck with the arrow construction, which takes up more space.
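
To make the idea of an object reference concrete, here is a minimal sketch using a hypothetical class, QuestionPool. It is not PG's match list: the class name, its constructor and its choose method are assumptions for illustration, and this choose simply returns the first few questions rather than selecting at random.

```perl
use strict;
use warnings;

# QuestionPool is a hypothetical class, far simpler than PG's match list.
package QuestionPool;

sub new {
    my ($class, @questions) = @_;
    my $self = { questions => [@questions] };   # the object's data
    return bless $self, $class;                 # tie the data to the class
}

# Return the first $n questions; the object supplies its own data.
sub choose {
    my ($self, $n) = @_;
    my @all = @{ $self->{questions} };
    return @all[0 .. $n - 1];
}

package main;

my $pool   = QuestionPool->new('q1', 'q2', 'q3', 'q4', 'q5');
my $kind   = ref($pool);         # 'QuestionPool'
my @chosen = $pool->choose(2);   # ('q1', 'q2')
my $plain  = ref(42);            # '' because 42 is not a reference
```

The caller never passes the list of questions to choose; the method reaches them through the object it was called on.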

Pointers to arrays and hashes

You can use references to arrays and to hashes too if you want:

ref($ra_foo) returns ARRAY when $ra_foo is a reference to an array variable. You write $ra_foo->[1] to get the second item stored in the array.
$ra_foo->[1]->[0] can be shortened to $ra_foo->[1][0] to simplify addressing multidimensional arrays. You can also write @array = @{$ra_foo}, which stores a copy of the array pointed to by $ra_foo in the new array variable @array.
ref($rh_foo) returns HASH when $rh_foo is a reference to a hash variable. $rh_foo->{first_name} gets the value associated with the key 'first_name'. %hash = %{$rh_foo} stores the referenced hash in the new hash variable %hash.
ref($rf_foo) returns CODE when $rf_foo is a reference to a function or subroutine. The construction &{$rf_foo}(34) takes the value 34 and uses it as an argument for the subroutine referenced by $rf_foo.

Note that the array reference variable started with $ra_, the hash reference variable started with $rh_ and the function reference variable started with $rf_. This is a naming convention PG uses (voluntary, not enforced) to help keep track of the kind of reference stored in a scalar variable. The convention is useful for macros, and is probably overkill for most short PG problems.
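
The constructions above can be tried together in a short sketch. This is plain Perl, so references are taken with a backslash; inside a PG problem you would write ~~ in its place. The data values and the anonymous subroutine are assumptions made for the example.

```perl
use strict;
use warnings;

# A reference to an array of rows (each row is itself an array reference).
my @rows   = ([1, 2], [3, 4]);
my $ra_foo = \@rows;                 # in PG: ~~@rows
my $item   = $ra_foo->[1][0];        # 3, the same as $ra_foo->[1]->[0]
my @array  = @{$ra_foo};             # copy of the referenced array

# A reference to a hash.
my %person = (first_name => 'Jane');
my $rh_foo = \%person;               # in PG: ~~%person
my $first  = $rh_foo->{first_name};  # 'Jane'
my %hash   = %{$rh_foo};             # copy of the referenced hash

# A reference to an anonymous subroutine that adds one to its argument.
my $rf_foo = sub { return $_[0] + 1 };
my $result = &{$rf_foo}(34);         # 35

my @kinds = (ref($ra_foo), ref($rh_foo), ref($rf_foo));
print "@kinds";                      # prints: ARRAY HASH CODE
```

The prefixes $ra_, $rh_ and $rf_ make it clear at the point of use which dereferencing syntax applies.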
