Difference between revisions of "Basic Perl syntax"

Line 1: Line 1:
The PG Language is based on Perl. It helps to have a basic grasp of Perl before you start authoring PG problems. This article covers the minimum of Perl you need to know to work with PG. (The following is excerpted from the <code>[http://perldoc.perl.org/perlintro.html perlintro]</code> manual page.)
+
The PG Language is based on Perl. It helps to have a basic grasp of Perl before you start authoring PG problems. This article covers the minimum of Perl you need to know to work with PG.
   
 
== Important changes from vanilla Perl ==
 
== Important changes from vanilla Perl ==
Line 5: Line 5:
 
If you already know Perl, this is all you need to know:
 
If you already know Perl, this is all you need to know:
   
Because backslashes have a special meaning within <code>BEGIN_TEXT...END_TEXT</code> blocks, backslashes are not used for their normal Perl meaning. Instead, <code>~~</code> (double-tilde):
+
Because backslashes have a special meaning within BEGIN_TEXT...END_TEXT blocks, backslashes are not used for their normal Perl meaning. Instead, use ~~ (double tilde):
   
 
{|border="1"
 
{|border="1"
 
! Perl
 
! Perl
! PG
+
!PG
 
|-
 
|-
 
|
 
|
Line 25: Line 25:
 
$sref = ~~$scalar;
 
$sref = ~~$scalar;
 
|}
 
|}
  +
  +
== Perl Syntax Overview ==
  +
  +
=== Statements ===
  +
  +
Each statement in Perl must end with a semicolon. A statement can be spread over several lines. Perl ignores spacing and indentation for the most part, so use them to make the program more readable. (See the <code>&lt;&lt;EOF</code> construction for the one case where spacing can't be ignored.)
  +
  +
$a = 1;
  +
Context("Numeric");
  +
  +
=== Comment lines ===
  +
  +
Comments start with # and continue to the end of the line.
  +
  +
=== Data types ===
  +
  +
==== Variables ====
  +
  +
'''Scalar variable''' names start with a dollar sign (e.g. <code>$scalar_variable</code>). Scalar variables can contain an integer, a real number, a string or advanced types (pointers to objects).
  +
  +
'''Array or list variable''' names start with an @ sign (e.g. <code>@list_variable</code>). Lists contain a sequence of scalar variables indexed by integers starting with zero. They have an implied order.
  +
  +
'''Hash or associative array variable''' names start with a % sign (e.g. <code>%hash_variable</code>). Hashes also contain a set of scalar variables, but the indices can be any string, and they have no implied order. The hash is stored as key-value pairs.
  +
  +
==== Constants ====
  +
  +
A number: e.g. <code>3.1415926</code> or <code>-543</code>.
  +
  +
A string: e.g. <code>'How now brown cow'</code> or <code>"How now brown cow"</code>.
  +
  +
An array: e.g. <code>( 1, 1, 2, 3, 5, 8, 12 )</code>.
  +
  +
A hash: e.g. <code>('ssn'=>'123-34-5676', 'name'=>'Jane Doe')</code>. The <code>=></code> symbol behaves like a comma, but it also implicitly quotes the word to its left, so <code>(ssn=>'123-34-5676', name=>'Jane Doe')</code> works too.
  +
  +
=== Quotes ===
  +
  +
Understanding how to create strings using quotes is important for writing WeBWorK PG problems. The two important concepts are interpolation of variables and temporarily redefining quote tokens.
  +
  +
There are two basic types of quoted string constants. Double quoted strings "interpolate" variables embedded in the string {{--}} they replace variable names by the contents of the variable. Single quoted strings do not do this.
  +
  +
Assuming that the value of $a_variable is 3 we have:
  +
  +
:<code>"Let a = $a_variable"</code> becomes <code>Let a = 3</code>.
  +
  +
:<code>'Let a = $a_variable'</code> remains <code>Let a = $a_variable</code>.
  +
  +
A perpetual problem is how to insert a quote inside a quoted string. A standard method is to escape the quote. Remember that the escape character in PG is <code>~~</code>, ''not'' the backslash <code>\</code> as in normal Perl.
  +
  +
'Don~~'t forget to escape single quotes within "single" quotes!'
  +
  +
Perl has a more elegant method for handling this. You can temporarily redefine the quote character using the operator <code>qq</code> for double quotes, or <code>q</code> for single quotes. Almost any non-whitespace character can serve as the quote character.
  +
  +
qq!Any character but the exclamation point can occur in this quoted string!
  +
q~The variable $A will not be interpolated in this single quoted string~
  +
qq{Parentheses, brackets and braces { must be balanced } when used as quote tokens.}
  +
  +
More on interpolation and on quoting text blocks can be found in the section on the <code>BEGIN_TEXT...END_TEXT</code> construction.
  +
  +
=== Functions ===
  +
  +
Function names that are not followed by parentheses must have & in front. (The words function, subroutine and macro are used interchangeably in this document.)
  +
  +
The function <code>beginproblem</code> takes no arguments, so we can write it as <code>&beginproblem</code> or as <code>beginproblem()</code>, but not as a bare <code>beginproblem</code>, which will usually cause a compiler warning message about "barewords" appearing in the problem.
  +
  +
=== Mathematical operations ===
  +
  +
The Perl symbol for raising a value to a power is <code>**</code>. To take variable <code>$a</code> to the power <code>2</code>, write <code>$a**2</code>, and not <code>$a^2</code>; in Perl, <code>^</code> is the bitwise exclusive-or operator. (When using [[MathObjects]] to specify a formula as a string you can use either <code>**</code> or <code>^</code>. Students can use <code>^</code> as well.)
  +
  +
=== References {{--}} more complicated data types ===
  +
  +
==== References ====
  +
  +
To handle objects, which are more complicated structures than scalars, lists and hashes, we use references. A '''reference''' is a scalar which is not the object itself, but the address of the object. For example, <code>$ml = new_match_list()</code> can't really fit an entire matching list object into the scalar variable <code>$ml</code>; instead it just stores the address of where the object structure is stored. An '''object''' is a structure which contains both data and functions (called methods) which operate on that data. You can tell that <code>$ml</code> is a reference to an object and not just a number, because <code>ref($ml)</code> will return <code>Match</code> (the type of a matching list object). In general <code>ref($var)</code> returns the kind of object pointed to by a scalar variable, and returns an empty string if the variable holds just an ordinary number or string.
  +
  +
==== Calling an object's methods ====
  +
  +
<code>$ml->choose(4)</code> tells the match list object <code>$ml</code> to perform its <code>choose</code> method with <code>4</code> as its argument. The beauty of objects and methods over standard subroutines is that you don't have to give the choose subroutine a list of questions and answers to choose from {{--}} it already 'knows' what questions and answers it has available and will choose 4 of them. The arrow construction is the same as the period construction in Java, where you would write <code>ml.choose(4)</code>. Java's typography is much neater, but unfortunately the period was already being used for string concatenation by Perl, so we're stuck with the arrow construction, which takes up more space.
  +
  +
==== Pointers to arrays and hashes ====
  +
  +
You can use references to arrays and to hashes too if you want:
  +
  +
:<code>ref($ra_foo)</code> returns <code>ARRAY</code> when <code>$ra_foo</code> is a reference to an array variable. You write <code>$ra_foo->[1]</code> to get the second item stored in the array.
  +
  +
:<code>$ra_foo->[1]->[0]</code> can be shortened to <code>$ra_foo->[1][0]</code> to simplify addressing multidimensional arrays. You can also write <code>@array = @{$ra_foo}</code>, which copies the array pointed to by <code>$ra_foo</code> into the new array variable <code>@array</code>.
  +
  +
:<code>ref($rh_foo)</code> returns <code>HASH</code> when <code>$rh_foo</code> is a reference to a hash variable. <code>$rh_foo->{first_name}</code> gets the value associated with the key 'first_name'. <code>%hash = %{$rh_foo}</code> copies the referenced hash into the new hash variable <code>%hash</code>.
  +
  +
:<code>ref($rf_foo)</code> returns <code>CODE</code> when <code>$rf_foo</code> is a reference to a function or subroutine. The construction <code>&{$rf_foo}(34)</code> takes the value <code>34</code> and uses it as an argument for the subroutine referenced by <code>$rf_foo</code>.
  +
  +
Note that the array reference variable starts with <code>$ra_</code>, the hash reference variable starts with <code>$rh_</code> and the function reference variable starts with <code>$rf_</code>. This is a naming convention PG uses (it is voluntary, not enforced) to help keep track of the kind of reference stored in a scalar variable. The convention is useful for macros, and is probably overkill for most short PG problems.

Revision as of 00:11, 4 March 2008

The PG Language is based on Perl. It helps to have a basic grasp of Perl before you start authoring PG problems. This article covers the minimum of Perl you need to know to work with PG.

Important changes from vanilla Perl

If you already know Perl, this is all you need to know:

Because backslashes have a special meaning within BEGIN_TEXT...END_TEXT blocks, backslashes are not used for their normal Perl meaning. Instead, use ~~ (double tilde):

Perl                           PG
print "Hello, $name\n";        print "Hello, $name~~n";
$aref = \@array;               $aref = ~~@array;
$href = \%hash;                $href = ~~%hash;
$sref = \$scalar;              $sref = ~~$scalar;

Perl Syntax Overview

Statements

Each statement in Perl must end with a semicolon. A statement can be spread over several lines. Perl ignores spacing and indentation for the most part, so use them to make the program more readable. (See the <<EOF construction for the one case where spacing can't be ignored.)

$a = 1;
Context("Numeric");

Comment lines

Comments start with # and continue to the end of the line.
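
As a small illustration of the two rules above, here is a sketch in plain Perl (run outside WeBWorK). The variable name $total and the use strict / use warnings lines are assumptions made for the example, not part of PG.

```perl
use strict;
use warnings;

# A whole-line comment: Perl ignores everything after the # sign.
my $total = 1
          + 2
          + 3;     # one statement spread over three lines

print "$total\n";  # prints 6
```

The statement ends only at the semicolon, so the line breaks and indentation inside it are purely for the reader.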

Data types

Variables

Scalar variable names start with a dollar sign (e.g. $scalar_variable). Scalar variables can contain an integer, a real number, a string or advanced types (pointers to objects).

Array or list variable names start with an @ sign (e.g. @list_variable). Lists contain a sequence of scalar variables indexed by integers starting with zero. They have an implied order.

Hash or associative array variable names start with a % sign (e.g. %hash_variable). Hashes also contain a set of scalar variables, but the indices can be any string, and they have no implied order. The hash is stored as key-value pairs.

Constants

A number: e.g. 3.1415926 or -543.

A string: e.g. 'How now brown cow' or "How now brown cow".

An array: e.g. ( 1, 1, 2, 3, 5, 8, 12 ).

A hash: e.g. ('ssn'=>'123-34-5676', 'name'=>'Jane Doe'). The => symbol behaves like a comma, but it also implicitly quotes the word to its left, so (ssn=>'123-34-5676', name=>'Jane Doe') works too.
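
The three kinds of variables and their constants can be tried out with a short sketch like the following. It is plain Perl rather than a PG problem, and the names $radius, @fib and %student are made up for illustration.

```perl
use strict;
use warnings;

# A scalar holds a single value: a number or a string.
my $radius = 2.5;
my $name   = 'Jane Doe';

# An array holds an ordered list, indexed from zero.
my @fib   = ( 1, 1, 2, 3, 5, 8 );
my $third = $fib[2];    # the third item, because indexing starts at 0

# A hash holds key-value pairs; => quotes the bare word on its left.
my %student = ( ssn => '123-34-5676', name => 'Jane Doe' );
my $ssn     = $student{ssn};

print "Third item: $third\n";
print "SSN: $ssn\n";
```

Note that a single element of an array or hash is itself a scalar, which is why it is written with a dollar sign ($fib[2], $student{ssn}).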

Quotes

Understanding how to create strings using quotes is important for writing WeBWorK PG problems. The two important concepts are interpolation of variables and temporarily redefining quote tokens.

There are two basic types of quoted string constants. Double quoted strings "interpolate" variables embedded in the string — they replace variable names by the contents of the variable. Single quoted strings do not do this.

Assuming that the value of $a_variable is 3 we have:

"Let a = $a_variable" becomes Let a = 3.
'Let a = $a_variable' remains Let a = $a_variable.
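
The same comparison, written as a runnable plain-Perl sketch (the variables $double and $single are names chosen for this example only):

```perl
use strict;
use warnings;

my $a_variable = 3;

my $double = "Let a = $a_variable";   # double quotes: interpolated
my $single = 'Let a = $a_variable';   # single quotes: taken literally

print "$double\n";   # prints: Let a = 3
print "$single\n";   # prints: Let a = $a_variable
```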

A perpetual problem is how to insert a quote inside a quoted string. A standard method is to escape the quote. Remember that the escape character in PG is ~~, not the backslash \ as in normal Perl.

'Don~~'t forget to escape single quotes within "single" quotes!'

Perl has a more elegant method for handling this. You can temporarily redefine the quote character using the operator qq for double quotes, or q for single quotes. Almost any non-whitespace character can serve as the quote character.

qq!Any character but the exclamation point can occur in this quoted string!
q~The variable $A will not be interpolated in this single quoted string~
qq{Parentheses, brackets and braces { must be balanced } when used as quote tokens.}
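
A sketch of how these quoting forms behave, assuming plain Perl run outside WeBWorK. Because it is plain Perl, the escaped quote uses a backslash; inside a PG problem you would write ~~ instead. All variable names are illustrative.

```perl
use strict;
use warnings;

my $A = 7;

# Plain Perl escapes with a backslash (PG uses ~~ in its place).
my $escaped = 'Don\'t forget';

# qq behaves like double quotes, so $A is interpolated.
my $dq = qq!The value of "A" is $A!;

# q behaves like single quotes, so $A is left alone.
my $sq = q~The variable $A isn't interpolated~;

# Bracketing delimiters may nest as long as they balance.
my $nested = qq{Braces { like these } are fine};

print "$escaped\n$dq\n$sq\n$nested\n";
```

Choosing a delimiter that does not occur in the text removes the need for escaping altogether, which is usually easier to read than a string full of escape characters.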

More on interpolation and on quoting text blocks can be found in the section on the BEGIN_TEXT...END_TEXT construction.

Functions

Function names that are not followed by parentheses must have & in front. (The words function, subroutine and macro are used interchangeably in this document.)

The function beginproblem takes no arguments, so we can write it as &beginproblem or as beginproblem(), but not as a bare beginproblem, which will usually cause a compiler warning message about "barewords" appearing in the problem.
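
The two accepted call forms can be sketched in plain Perl as follows. The subroutine begin_demo is a hypothetical stand-in for a no-argument macro such as beginproblem; it is not part of PG.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PG macro that takes no arguments.
sub begin_demo {
    return "problem header";
}

my $with_amp    = &begin_demo;     # call with a leading &
my $with_parens = begin_demo();    # call with empty parentheses

print "$with_amp\n$with_parens\n";
```

Both forms call the same subroutine and return the same value; which one to use is a matter of style.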

Mathematical operations

The Perl symbol for raising a value to a power is **. To take variable $a to the power 2, write $a**2, and not $a^2; in Perl, ^ is the bitwise exclusive-or operator. (When using MathObjects to specify a formula as a string you can use either ** or ^. Students can use ^ as well.)
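
A short plain-Perl sketch shows why the distinction matters in Perl code (the names $base, $power and $xor are chosen for this example):

```perl
use strict;
use warnings;

my $base = 3;

my $power = $base**2;     # exponentiation: 3 squared is 9
my $xor   = $base ^ 2;    # bitwise exclusive-or of binary 11 and 10 is 1

print "power = $power, xor = $xor\n";
```

Since $base ^ 2 is still valid Perl, a mistyped exponent produces a wrong number rather than an error message, so it is worth checking for.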

References — more complicated data types

References

To handle objects, which are more complicated structures than scalars, lists and hashes, we use references. A reference is a scalar which is not the object itself, but the address of the object. For example, $ml = new_match_list() can't really fit an entire matching list object into the scalar variable $ml; instead it just stores the address of where the object structure is stored. An object is a structure which contains both data and functions (called methods) which operate on that data. You can tell that $ml is a reference to an object and not just a number, because ref($ml) will return Match (the type of a matching list object). In general ref($var) returns the kind of object pointed to by a scalar variable, and returns an empty string if the variable holds just an ordinary number or string.

Calling an object's methods

$ml->choose(4) tells the match list object $ml to perform its choose method with 4 as its argument. The beauty of objects and methods over standard subroutines is that you don't have to give the choose subroutine a list of questions and answers to choose from: it already 'knows' what questions and answers it has available and will choose 4 of them. The arrow construction is the same as the period construction in Java, where you would write ml.choose(4). Java's typography is much neater, but unfortunately the period was already being used for string concatenation by Perl, so we're stuck with the arrow construction, which takes up more space.
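
To make the ideas of object reference, ref() and method call concrete, here is a minimal sketch in plain Perl. The class Demo::MatchList is hypothetical and written only for this example; it is not PG's matching list, and its choose method makes no attempt to mimic how the real one selects items.

```perl
use strict;
use warnings;

# Hypothetical class standing in for a match list; not the PG one.
package Demo::MatchList;

sub new {
    my ( $class, @questions ) = @_;
    my $self = { questions => [@questions], chosen => [] };
    return bless $self, $class;    # bless ties the hash to the class
}

# Keep the first $count stored questions and report how many were kept.
sub choose {
    my ( $self, $count ) = @_;
    my @all = @{ $self->{questions} };
    $self->{chosen} = [ @all[ 0 .. $count - 1 ] ];
    return scalar @{ $self->{chosen} };
}

package main;

my $ml = Demo::MatchList->new(qw(q1 q2 q3 q4 q5 q6));

my $kind   = ref($ml);          # the class name, 'Demo::MatchList'
my $plain  = ref(42);           # empty string: 42 is not a reference
my $picked = $ml->choose(4);    # the object supplies its own data

print "kind = $kind, picked = $picked\n";
```

The call $ml->choose(4) passes only the count; the questions travel with the object, which is the point made in the paragraph above.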

Pointers to arrays and hashes

You can use references to arrays and to hashes too if you want:

ref($ra_foo) returns ARRAY when $ra_foo is a reference to an array variable. You write $ra_foo->[1] to get the second item stored in the array.
$ra_foo->[1]->[0] can be shortened to $ra_foo->[1][0] to simplify addressing multidimensional arrays. You can also write @array = @{$ra_foo}, which copies the array pointed to by $ra_foo into the new array variable @array.
ref($rh_foo) returns HASH when $rh_foo is a reference to a hash variable. $rh_foo->{first_name} gets the value associated with the key 'first_name'. %hash = %{$rh_foo} copies the referenced hash into the new hash variable %hash.
ref($rf_foo) returns CODE when $rf_foo is a reference to a function or subroutine. The construction &{$rf_foo}(34) takes the value 34 and uses it as an argument for the subroutine referenced by $rf_foo.
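
The constructions in this list can be exercised with the following plain-Perl sketch. The data and the names $ra_grid, $rh_person and $rf_square are invented for the example; because it is plain Perl, references are taken with a backslash, where a PG problem would use ~~.

```perl
use strict;
use warnings;

my @grid      = ( [ 1, 2 ], [ 3, 4 ] );
my $ra_grid   = \@grid;                      # in PG: ~~@grid
my %person    = ( first_name => 'Jane' );
my $rh_person = \%person;                    # in PG: ~~%person
my $rf_square = sub { return $_[0]**2 };     # anonymous subroutine

# ref() reports the kind of thing each scalar points to.
my $kinds = join ',', ref($ra_grid), ref($rh_person), ref($rf_square);

my $long   = $ra_grid->[1]->[0];        # row 1, column 0
my $short  = $ra_grid->[1][0];          # same element, shorter form
my @copy   = @{$ra_grid};               # copy of the outer array
my %hcopy  = %{$rh_person};             # copy of the hash
my $first  = $rh_person->{first_name};
my $result = &{$rf_square}(34);         # call through the code reference

print "$kinds: $long $short $first $result\n";
```

The array copy is shallow: @copy is a new outer array, but its elements are still references to the same inner arrays as @grid.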

Note that the array reference variable starts with $ra_, the hash reference variable starts with $rh_ and the function reference variable starts with $rf_. This is a naming convention PG uses (it is voluntary, not enforced) to help keep track of the kind of reference stored in a scalar variable. The convention is useful for macros, and is probably overkill for most short PG problems.