Configuring Platform Scripts for PBS/Torque Managers
The example scripts supplied with this tutorial work for a basic PBS/Torque configuration, and they can easily be modified to support customised PBS/Torque configurations. The basic PBS/Torque testbed platform we used to develop and test these scripts had the following configuration:
- All PBS/Torque services (pbs_server, pbs_sched, pbs_mom) and the GRIA services run on the same machine;
- The default PBS/Torque queue is used;
- System users, e.g. the GRIA user (tomcat), can submit and run simple jobs.
Submit Job: startJob.pl
This example startJob.pl script can submit GRIA jobs to PBS/Torque RM systems. Customisation of this script will require modifications to the following:
- SECTION A: Initialise Resource Manager global variables, such as the path for PBS
binaries, the PBS server name, etc. In particular make sure that the following
variables are set up correctly:
- RM_PATH=<PBS binary path>
- RM_SERVER=<PBS server name>
- SECTION B: Turn verbose debug flags on/off. This step is optional.
- SECTION C: This section generates a job description file (JDF), which is the
actual file submitted to PBS to run the job.
You should edit this part of the code if you want to change any of the default
PBS directives or change the way jobs are submitted.
The PBS JDF file has two main parts. The first part contains all the PBS directives required to run the job. Edit this part of the code only if you need to pass PBS directives other than the existing ones, or to parse RM directives passed with the -r arguments (see SECTION E below). The default directives used in that part of the script are:
#PBS -N J${SESSION_NAME}
#PBS -o job.out
#PBS -e job.err
#PBS -l cput=3600
${raString} # see SECTION E
...
The second part of the file describes how to invoke the application wrapper, how to create time-stamp files, etc. This part of the code should cover a wide range of PBS configurations.
- SECTION D: This section contains the actual PBS submit
command. You need to edit it only for customised PBS configurations
that use multiple queues, PBS servers, etc.
# compose submit command to the default queue
my $command_line="$RM_SUBMIT -q \@$RM_SERVER $JDF";
# execute the submit command and store submission job ID
my $sub = 0xffff & system "$command_line > $JOB_PID";
- SECTION E: This subroutine should convert any job constraints passed as
command line arguments for the RM into PBS directives.
The subroutine should return a text string of valid PBS directives, which is
appended to the PBS directives section of the JDF file,
i.e. ${raString}.
The current implementation of this subroutine returns an empty
string. However, if you intend to pass RM directives dynamically
using the -r command line arguments you should parse them
in this subroutine and return them as a PBS directive string, e.g.
...
#PBS -l cput=2300
#PBS -l 2
...
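As a minimal sketch of what the SECTION E subroutine could look like, the following code turns constraints into a PBS directive string. The subroutine name build_ra_string, the name=value format of the -r arguments, and the mapping of every constraint onto a "-l" directive are assumptions made for illustration; they are not taken from the supplied startJob.pl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the supplied script): convert RM
# constraints, assumed to arrive as "name=value" strings from the -r
# arguments, into PBS resource directives.
sub build_ra_string {
    my @constraints = @_;
    my @directives;
    foreach my $c (@constraints) {
        # Accept only simple name=value pairs and skip anything else.
        next unless $c =~ /^(\w+)=([\w:.]+)$/;
        push @directives, "#PBS -l $1=$2";
    }
    # One directive per line; an empty string when nothing was passed.
    return join("\n", @directives);
}

# Example: constraints as they might be collected from -r arguments.
my $raString = build_ra_string('cput=2300', 'nodes=2');
print "$raString\n";
```

Returning an empty string when no constraints are given keeps the behaviour of the current implementation, so the JDF is unchanged unless -r arguments are actually supplied.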
Check Job: checkJob.pl
This is a Perl script that checks the status of a job and reports it to GRIA users. For most PBS configurations the default rm_local/checkJob.pl script can be used without editing.
Kill Job: killJob.pl
This is an example killJob.pl script for terminating PBS jobs. The following parts of the code need editing:
- SECTION A: Initialise Resource Manager global variables, such as the path for PBS
binaries, the PBS server name, etc. In particular make sure that the following
variables are set up correctly:
- RM_PATH=<PBS binary path>
- RM_SERVER=<PBS server>
- SECTION B: Turn verbose debug flags on/off. This step is optional.
- SECTION C: The first part of this section reads the status of the PBS job. Depending
on your PBS configuration you may have to edit the code that extracts the job
status, e.g. in the output of a PBS qstat command the status of a job is always
the 6th field, etc.
my $qString = `${RM_QUEUE} | grep $concatPID`;
my @words = &quotewords('\s+', 0, $qString);
my $jStatus = $words[9];
Unless the output format of your qstat differs from the following example, you do not need to change this section:
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
74.siegerrebe    pm               tomcate          00:30 0  R dque
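If your qstat prints the status in a different position, the field extraction could be made configurable. The following is a minimal sketch only: the helper name job_status_field and its 0-based index parameter are hypothetical, and it uses Perl's built-in split on whitespace instead of the quotewords call used in the script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the supplied script): return the
# whitespace-separated field at position $index (0-based) of one line
# of qstat output.
sub job_status_field {
    my ($line, $index) = @_;
    # split ' ' splits on runs of whitespace and ignores leading whitespace.
    my @fields = split ' ', $line;
    return $fields[$index];
}

# Sample line copied from the qstat example above; index 5 is the 6th field.
my $line   = '74.siegerrebe pm tomcate 00:30 0 R dque';
my $status = job_status_field($line, 5);
print "status: $status\n";
```

Keeping the index as a parameter means that only one number has to change when the qstat output format differs between PBS installations.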