Instructions for creating language template files

From WeBWorK_wiki

Creating POT files using POedit

1. First, make sure you have POedit installed. You can download the application from http://www.poedit.net/download.php. There are versions for Windows, Mac OS, and Linux.


2. Make sure you have the correct settings: we want POedit to be able to read the .pm files that WeBWorK's content generators use.

  • To do this, open up POedit, then go to File --> Preferences. This will open up a new window.
  • Click on the Parsers tab in this window, and then select 'Perl' from the list the tab displays, and click 'Edit'. This will open up yet another window.
  • IMPORTANT: In the text area titled "List of extensions separated by semicolons (e.g. *.cpp;*.h)", add the extension '*.pm' if it is not already there. As the title says, separate it from the other extensions with a semicolon.
  • Click OK on both windows.


3. Now we are ready to create a POT file.

  • Before we begin, go to File --> Preferences again. This time, click on the Editors tab, and in the 'Behavior' section, uncheck the checkbox titled "Automatically compile .mo file on save". This will prevent POedit from creating useless .mo files when we create the POT file. Click OK once you are done.
  • Go to File --> New catalog. This will open up a new window. Do not use File --> New catalog from POT file, which does something different.
  • On the 'Project Info' tab, you can fill out information about the POT file you are creating. The only important things here are to give it a name in the 'Project name and version' section and to make sure that both the 'Charset' and the 'Source code charset' sections are set to 'UTF-8' (this will be lower case for the 'Source code charset' section).
  • Click on the 'Paths' tab. Click on the second button from the left in the box, which should have a tooltip reading 'New item'. This will allow you to edit a line in the box below. Type '.' and hit the Enter key. This tells POedit to look for translatable strings in the same directory as the POT file when the catalog is populated. Alternatively, you can fill in the complete path. For example, use '/opt/webwork' as the base path and enter 'webwork2' and 'pg' as paths. This will search all of the files under the webwork2 and pg directories. Make sure that you have added the .pm extension as described above; otherwise the search will find no files with maketext strings.
  • Click on the 'Keywords' tab. Click on the 'New item' button in the box (again, second one from left). This will allow you to edit a line in the box below. Type 'maketext' and hit the Enter key. WeBWorK uses the 'maketext' function for most of its translations, so this tells POedit to look for strings passed to this function when the catalog is populated.
  • Once you are done with the steps above, click OK. This will open a file dialog, allowing you to choose a location to save your POT file in. Make sure that it is a directory which contains all of the files you want to translate (either directly or in sub-directories) and nothing else. Name your POT file, change the file extension to '.pot', and save.


4. If all goes well and everything is in the right place, POedit should generate a POT file containing all of the translatable strings in the .pm files from the directory the POT file is in and all of its sub-directories. You will be able to view the translatable strings in POedit in the main box.
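As a rough cross-check on what POedit collects, the scan can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name find_maketext_strings is made up for illustration, the regular expression only handles a double-quoted literal as the first argument of maketext, and it is not how POedit itself extracts strings.

```python
import re
from pathlib import Path

# Matches maketext("...") with a double-quoted literal as the first argument.
# Escaped characters (such as \") inside the literal are allowed.
MAKETEXT_RE = re.compile(r'maketext\(\s*"((?:[^"\\]|\\.)*)"')


def find_maketext_strings(root):
    """Return the sorted, de-duplicated maketext literals in .pm files under root."""
    found = set()
    for path in Path(root).rglob("*.pm"):  # only .pm files, including sub-directories
        text = path.read_text(encoding="utf-8")
        found.update(MAKETEXT_RE.findall(text))
    return sorted(found)
```

Calling find_maketext_strings(".") from the directory that holds the POT file should give a list that can be compared by eye with the strings shown in POedit's main box.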


Updating the POT file

Updating the POT file when you update your .pm source files is easy: make sure the updated .pm files are in the same directory as the POT file or in a sub-directory (and that they are the only kinds of files there, aside from the POT file), open the POT file in POedit, and hit the 'Update Catalog' button, the third button from the left at the top of the application.


This will make POedit go through all the files in the directory and sub-directories again and pull out the translatable strings. Any new strings will be added to the POT file, while any strings that no longer appear in the .pm files will be removed. Before applying the changes, POedit shows a warning listing the strings to be added and the obsolete strings to be removed.
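The add/remove rule described here amounts to a set difference between the old and new lists of strings. A minimal Python sketch, assuming the msgids have already been collected into two lists (the function name preview_update is hypothetical and this is not POedit's own code):

```python
def preview_update(old_msgids, new_msgids):
    """Return (added, obsolete) lists for a catalog update."""
    old, new = set(old_msgids), set(new_msgids)
    added = sorted(new - old)      # strings found in the sources but not yet in the POT file
    obsolete = sorted(old - new)   # strings in the POT file that no longer appear in the sources
    return added, obsolete
```

Anything that ends up in the second list, including strings that were added by hand, would be dropped by the update.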


Some caution is required. Strings that were added by hand do not appear in any maketext() call, so they will be considered obsolete and removed. Make a backup of your current .pot file before updating it, either from source files or from another .pot file.
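One way to make the backup is a timestamped copy next to the original. The following Python sketch uses only the standard library; the function name backup_pot and the ".bak" naming scheme are assumptions chosen for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_pot(pot_path):
    """Copy the POT file next to itself with a timestamp suffix; return the copy."""
    src = Path(pot_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dst)  # copy2 also preserves the modification time
    return dst
```

Because the copy does not end in .pot or .pm, it can be told apart from the working files, but it should still be moved out of the scanned directory if that directory is meant to contain nothing but sources.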

Updating from other POT files

Another option for updating POT files is to update them with strings from other POT files. To do this, go to Catalog --> Update from POT file, which will open a file dialog. Choose the POT file you wish to update from, and POedit will go through all of the strings in the chosen POT file, adding new ones to the POT file currently opened in POedit and deleting obsolete strings (strings which are not found in the new POT file). Note that you may use this method to update both POT and ordinary PO files, as they both use the same format.

Creating POT files using xgettext.pl

An alternative to using POedit for creating POT files is xgettext.pl, a Perl command line script that ships with the CPAN module Locale::Maketext::Extract. It has much the same functionality as POedit in terms of pulling strings from a source file or a directory of sources. Several important distinctions do exist, however:

  • xgettext.pl records, as comments on each entry, the names of the variables that are passed to the maketext function as parameters.
  • xgettext.pl will pull strings from commented-out lines, whereas POedit will not.
  • xgettext.pl cannot recognize maketext functions that are called inside other maketext functions. So if we had code like:
 $r->maketext("This is a [_1] sandwich", $r->maketext("ham"));

xgettext.pl will not recognize the "ham" string. This problem can be resolved by changing the source code so that the inner maketext call is stored as a variable before being passed into the outer maketext call, as seen below:

 my $var = $r->maketext("ham");
 $r->maketext("This is a [_1] sandwich", $var);
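To find the places that need this refactoring, a simple line-by-line check can flag candidate lines. The Python sketch below is a heuristic only: the function name has_nested_maketext is hypothetical, and the pattern will be confused by parentheses inside string literals or by calls that span several lines.

```python
import re

# Heuristic: a second "maketext(" shows up inside the first call's argument
# list before any other parenthesis has been seen.
NESTED_RE = re.compile(r"maketext\s*\([^()]*maketext\s*\(")


def has_nested_maketext(line):
    """Return True if the line appears to nest one maketext call in another."""
    return NESTED_RE.search(line) is not None
```

Running this over each line of a .pm file gives a list of lines to inspect by hand before extracting strings with xgettext.pl.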

For more information on the Locale::Maketext::Extract module, see http://search.cpan.org/~drtech/Locale-Maketext-Lexicon-0.91/lib/Locale/Maketext/Extract.pm.

For more information on the xgettext.pl script itself, see http://search.cpan.org/~drtech/Locale-Maketext-Lexicon-0.91/script/xgettext.pl.

Warning: Some of these comments need further investigation. POedit uses the xgettext program internally to scan for strings, so it may be that the flags POedit passes to this command just need to be adjusted.

Requirements

As stated, xgettext.pl is a Perl script contained in a CPAN module, so you will need a current Perl installation, which can be found at http://www.perl.org/get.html. Once Perl is installed, you will then need to install the CPAN module Locale::Maketext::Extract (the procedure for which differs between distributions).


Procedure

Note: While this procedure attempts to remain as non-platform-specific as possible, it is based on a Windows Perl distribution. There may be some small variations on other systems.

1. Open up your command line or terminal interface.

2. Navigate to a directory which contains the directory of sources you wish to translate.

3. Enter this command into the interface:

 xgettext.pl -D <Name of source directory> -u -d <Name of output file>

Note: When specifying the name of the output file, leave off the file extension.

4. The script will take a few seconds to run and pull your strings, depending on the size of your source directory. Note that there will be nothing to indicate that it is running: no "loading" message, for example. Once it is done, it will return you to the command prompt.

5. There should be a new file called <Name of output file>.po in the directory that you ran the script in. This contains all of the translatable strings from your sources in PO file format, very similar to the files produced by POedit. Change the file extension to .pot. You now have a new POT file created using xgettext.pl.
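Steps 3 and 5 can be scripted. The Python sketch below builds the argument list for the command shown in step 3 and renames the resulting file; the function names and the example names "lib" and "webwork" are hypothetical, and only the flags that appear in step 3 are used.

```python
from pathlib import Path


def build_command(source_dir, output_name):
    """Argument list for the xgettext.pl call shown in step 3."""
    return ["xgettext.pl", "-D", source_dir, "-u", "-d", output_name]


def rename_to_pot(output_name, directory="."):
    """Rename <output_name>.po to <output_name>.pot and return the new path."""
    po_file = Path(directory) / f"{output_name}.po"
    pot_file = po_file.with_suffix(".pot")
    po_file.rename(pot_file)
    return pot_file
```

Assuming xgettext.pl is on the PATH, the list from build_command("lib", "webwork") could be passed to subprocess.run, followed by rename_to_pot("webwork"). On some systems the script may have to be invoked through the perl interpreter instead.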

Other command line tools

msgcat, part of GNU gettext, can be used to safely concatenate .pot files.

POT File Formats, Practices and Conventions

Several things to keep in mind when working with POT files:


  • POT Headers

All POT files generated using POedit will contain a header beginning with:

   msgid ""
   msgstr ""

The header contains several more lines of data. These simply give POedit some extra information about the POT file being edited, and can for the most part be ignored. The first two lines, however, should be noted, as they show the format in which the strings and their translations are recorded in the POT file.

msgid -- This records the string that is actually passed to the maketext functions in the Content Generator files for translation (the English version, usually).

msgstr -- This records the appropriate translation of the string.

Example:

   msgid "Welcome to WeBWorK!"
   msgstr ""

When you first create a template file, all of the msgstr strings will be empty. The template files can then be passed along to translators, who can create PO files from them and proceed to perform the appropriate translations.
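The msgid/msgstr pairing is simple enough to inspect with a short script. The Python sketch below lists the entries whose msgstr is still empty; it assumes unindented entries with msgid and msgstr each on a single line, and the function name untranslated is made up for illustration. Multi-line strings would need a real PO parser.

```python
import re

# One msgid line immediately followed by one msgstr line.
ENTRY_RE = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)


def untranslated(po_text):
    """Return msgids (other than the header's empty msgid) with an empty msgstr."""
    return [
        msgid
        for msgid, msgstr in ENTRY_RE.findall(po_text)
        if msgid and not msgstr
    ]
```

On a freshly created POT file this should return every string in the catalog, since none of them has a translation yet.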


  • Strings with variable input

Some strings used for maketexts in WeBWorK contain entities like the following:

   "The set [_1] is assigned to [_2]."

These entities ([_1], [_2]) mark where the Content Generators print variable values, which are passed to the maketext function as parameters (so in this case, [_1] will be the name of the set in question, and [_2] will be the user in question).

Each entity must be labeled with a different number, incrementally increasing from 1. So the first entity will be [_1], the second will be [_2], the third will be [_3] and so forth.

These entities can be written in two ways: either in the "[_n]" format shown above, or as "%n", where n is the number label of the variable in question. The two formats are entirely interchangeable, in both the original and the translated strings.
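To make the substitution concrete, here is a small Python imitation of how the numbered entities are filled in. It is an illustration only and not the Locale::Maketext implementation, which supports more bracket notation than is covered here; the function names fill and placeholders_are_sequential are hypothetical.

```python
import re

# Accepts both placeholder spellings described above: [_n] and %n.
PLACEHOLDER_RE = re.compile(r"\[_(\d+)\]|%(\d+)")


def fill(template, *args):
    """Replace [_n] or %n in template with the n-th argument (1-based)."""
    def substitute(match):
        n = int(match.group(1) or match.group(2))
        return str(args[n - 1])
    return PLACEHOLDER_RE.sub(substitute, template)


def placeholders_are_sequential(template):
    """Check that the placeholder numbers used are exactly 1..k with no gaps."""
    numbers = {int(a or b) for a, b in PLACEHOLDER_RE.findall(template)}
    return numbers == set(range(1, len(numbers) + 1))
```

For example, fill("The set [_1] is assigned to [_2].", "HW1", "alice") gives "The set HW1 is assigned to alice.", and a translator is free to reorder the entities in the translated string because each one is looked up by number.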


  • Commenting

Comments can be added above the translations using the "#" character. For instance:

   # " This is the homework sets editor page where you can view and edit the homework sets that exist in this course and the problems that
   # they contain. The top of the page contains forms which allow you to filter which sets to display in the table, sort the sets in a
   # chosen order, edit homework sets, publish homework sets, import/export sets from/to an external file, score sets, or create/delete sets.
   # To use, please select the action you would like to perform, enter in the relevant information in the fields below, and hit the
   # \"Take Action!\" button at the bottom of the form.  The bottom of the page contains a table displaying the sets and several pieces of
   # relevant information."
   #: ContentGenerator/Instructor/ProblemSetList.pm:498
   #, fuzzy
   msgid "_HMWKSETS_EDITOR_DESCRIPTION"
   msgstr ""

Types of comments:


  • Normal -- Defined by a single "#" tag. Can be used to add any notes one wishes to the translation. These are shown in a side box at the bottom when the POT file is viewed using POedit.

NOTE: In some cases, such as the translation above, the msgid is an identifier for a string to be translated, not the actual string itself. This is usually done when there is an especially long string block to be translated. In this case, the standard is to put the actual string that needs to be translated in a normal comment, again as shown above.


  • Line Comments -- Defined by a "#:" tag. This contains information about where the string was found in the scanned source code; namely, the path, filename, and line number. These fields are automatically filled in when the POT file is created or updated. These fields can be viewed in POedit by right-clicking on the translation in the list.


  • Fuzzy markers -- Defined by a "#, fuzzy" tag, these mark strings which have yet to be translated. Strings which have been marked as fuzzy are brought to the top of the POT or PO files. When first creating a POT file, all strings should be marked as fuzzy for the benefit of the translators, who can remove these markers as they translate the strings. Unfortunately, there is currently no way to automatically mark all untranslated strings as fuzzy: one would have to go through and manually mark each one. A keyboard shortcut for marking fuzzy is to hit "Alt-U" on a selected translation in the POedit list.
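The three comment types can be told apart by their prefix alone. The Python sketch below classifies a single line and checks an entry for the fuzzy flag; the function names and category labels are assumptions for illustration, and any other "#" prefixes are simply lumped in with normal comments.

```python
def classify_comment(line):
    """Classify one PO file line by its comment prefix."""
    stripped = line.strip()
    if stripped.startswith("#:"):
        return "reference"   # source file and line number
    if stripped.startswith("#,"):
        return "flag"        # for example "#, fuzzy"
    if stripped.startswith("#"):
        return "comment"     # free-form note
    return "other"           # msgid, msgstr, blank lines


def is_fuzzy(entry_lines):
    """Return True if any flag line of the entry lists 'fuzzy'."""
    for line in entry_lines:
        if classify_comment(line) == "flag":
            flags = [f.strip() for f in line.strip()[2:].split(",")]
            if "fuzzy" in flags:
                return True
    return False
```

Applied to the example entry above, the "#:" line is reported as a reference, the "#, fuzzy" line as a flag, and is_fuzzy returns True for the entry as a whole.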

Other useful development files: