4.4.1. Overview

Applications and The Job Service

Relation to the Job Service

A functioning GRIA Job Service installation is composed of three main parts:

  1. The service software running under Tomcat/AXIS.
  2. Scripts for running or interacting with application codes on an execution platform, which may be the service host itself, a separate execution server, or a cluster.
  3. Some installed applications capable of running on the execution platform, each corresponding to a different Job Service end-point.

The Job Service software (1) supports various bookkeeping operations, such as assigning job IDs and workspaces. The core operations are those that start and manage the execution of an application: starting jobs, checking their status, and killing jobs. Each of these accesses the execution platform by running the appropriate script (2), which starts or otherwise interacts with the application on the execution platform (3). For security reasons, users are not usually allowed to upload their own applications, so the service operator must install all the applications.

The Job Service and platform scripts are designed to support a uniform model of application execution, shown in Figure 1:

Figure 1 - Application Model

The workspace for each job is set up by the Job Service when the job is initialised (this is one of the bookkeeping service operations that must precede the call to start the job). The workspace has a standard directory structure so the Job Service and platform scripts can create and find information stored in it, including a working sub-directory where the job will actually run.
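As an illustration of the idea of a standard workspace layout, the following minimal sketch creates a per-job directory with fixed sub-directories. The sub-directory names (`input`, `output`, `work`, `status`) and the function `init_workspace` are assumptions made for this example; the actual GRIA layout is not specified in this section.

```python
import tempfile
from pathlib import Path

# Hypothetical sub-directory names; the real GRIA layout may differ.
WORKSPACE_SUBDIRS = ("input", "output", "work", "status")


def init_workspace(root: Path, job_id: str) -> Path:
    """Create the workspace for one job and return its path."""
    workspace = root / job_id
    for name in WORKSPACE_SUBDIRS:
        # exist_ok makes the call safe to repeat for the same job
        (workspace / name).mkdir(parents=True, exist_ok=True)
    return workspace


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = init_workspace(Path(tmp), "job-0001")
        print(sorted(p.name for p in ws.iterdir()))
```

Because every job uses the same layout, the service and the scripts can locate the working sub-directory from the workspace path alone.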

When the user starts the job, the Job Service transfers input data files from Data Service URIs into the job's workspace. It then runs a platform script locally, which in turn submits the application to the execution platform (a cluster, for example), where it will run using the specified command line, possibly after some queuing delay. The platform script saves the job handle to the workspace. The application has to read the input data left in its workspace by the Job Service, and write any outputs to the workspace so that the Job Service can find them and transfer them to output Data Service URIs when the job has finished.
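The start step can be sketched roughly as follows. This is not the GRIA platform script: it assumes the execution platform is the local host, uses the process ID as the job handle, and stores it in a hypothetical file called `job.handle`. A script for a cluster would instead record whatever identifier the batch system returns.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence

HANDLE_FILE = "job.handle"  # hypothetical file name for the saved job handle


def start_job(workspace: Path, command: Sequence[str]) -> subprocess.Popen:
    """Launch the command in the working sub-directory and save its handle."""
    work_dir = workspace / "work"  # assumed name of the working sub-directory
    work_dir.mkdir(parents=True, exist_ok=True)
    proc = subprocess.Popen(list(command), cwd=str(work_dir))
    # Assumption: the local process ID plays the role of the job handle.
    (workspace / HANDLE_FILE).write_text(str(proc.pid))
    # The process object is returned only so that the demo can wait for it.
    return proc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        job = start_job(ws, [sys.executable, "-c", "open('result.txt', 'w').write('done')"])
        job.wait()
        print("handle:", (ws / HANDLE_FILE).read_text())
```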

When the user asks for the status of the job, the service runs a second platform script that reads status information (such as when the job started or finished), which the application must store in the workspace. This platform script may also run an associated monitoring application (using the specified command line) to gather application-specific status information (such as the number of iterations completed or convergence plots) from the workspace.
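A status check of this kind might look like the sketch below. It assumes, purely for illustration, that start and finish times are stored as small marker files named `started` and `finished` in a `status` sub-directory; these names and the returned dictionary are not taken from GRIA.

```python
import tempfile
from pathlib import Path


def read_status(workspace: Path) -> dict:
    """Derive a basic job state from marker files in the workspace."""
    status_dir = workspace / "status"      # hypothetical location
    started = status_dir / "started"       # hypothetical marker file
    finished = status_dir / "finished"     # hypothetical marker file
    info = {"state": "queued", "started": None, "finished": None}
    if started.exists():
        info["state"] = "running"
        info["started"] = started.read_text().strip()
    if finished.exists():
        info["state"] = "finished"
        info["finished"] = finished.read_text().strip()
    return info


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        (ws / "status").mkdir()
        (ws / "status" / "started").write_text("2024-01-01T10:00:00\n")
        print(read_status(ws))
```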

If the user asks for the job to be killed, the service uses a third platform script that reads the job handle from the workspace, and issues a command to kill this job on the execution platform. The Job Service will detect that the job has finished, and will transfer any output produced. Note that the user (or their client-side application) should always check the status of a job to find out if it crashed or was killed, as some incomplete output may appear in this case.
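The kill step can be sketched in the same spirit. The example assumes the job handle is a local process ID stored in a hypothetical `job.handle` file in the workspace; on a cluster, the script would pass the stored handle to the batch system's own cancel command instead.

```python
import os
import signal
import subprocess
import sys
import tempfile
from pathlib import Path

HANDLE_FILE = "job.handle"  # hypothetical file name for the saved job handle


def kill_job(workspace: Path) -> int:
    """Read the job handle from the workspace and terminate that process."""
    # Assumption: the handle is a local process ID written when the job started.
    pid = int((workspace / HANDLE_FILE).read_text().strip())
    os.kill(pid, signal.SIGTERM)
    return pid


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        # Stand-in for a long-running application.
        job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
        (ws / HANDLE_FILE).write_text(str(job.pid))
        kill_job(ws)
        print("exit code after kill:", job.wait())
```

The non-zero exit code is what a later status check would use to tell a killed job from one that completed normally.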

Why are Application Wrapper Scripts Required?

In practice, few legacy applications behave exactly according to the model shown in Figure 1. It is rarely possible to change the application itself, so GRIA instead uses wrapper scripts, which start and manage the application and do conform to the application model.
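To make the idea concrete, here is a minimal sketch of a wrapper that runs an unmodified command and records the status information that the model expects on its behalf. The file names (`started`, `finished`, `exit_code`), the directory names, and the function `run_wrapped` are all assumptions for this example rather than GRIA conventions.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path
from typing import Sequence


def run_wrapped(workspace: Path, command: Sequence[str]) -> int:
    """Run a legacy command and record start, finish and exit code for it."""
    status_dir = workspace / "status"  # hypothetical status location
    work_dir = workspace / "work"      # hypothetical working sub-directory
    status_dir.mkdir(parents=True, exist_ok=True)
    work_dir.mkdir(parents=True, exist_ok=True)

    stamp = "%Y-%m-%dT%H:%M:%S"
    (status_dir / "started").write_text(time.strftime(stamp))
    # The application itself is unchanged; it just runs in the working directory.
    result = subprocess.run(list(command), cwd=str(work_dir))
    (status_dir / "exit_code").write_text(str(result.returncode))
    (status_dir / "finished").write_text(time.strftime(stamp))
    return result.returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        code = run_wrapped(Path(tmp), [sys.executable, "-c", "import sys; sys.exit(3)"])
        print("application exit code:", code)
```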

Wrapper scripts can do more than make the underlying application behave as indicated in Figure 1. They can also implement application-specific features of the service.

One can also use (optional) wrapper scripts to look for application-specific status information in the working directory of the job. Without such scripts, the platform scripts can only provide basic job status information from the job submission system.
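As a rough example of application-specific status gathering, the sketch below scans a log file in the working directory for the highest iteration number reported so far. The log format (lines beginning with `iteration N`) and the file name `app.log` are invented for illustration; a real monitoring script depends entirely on what the application writes.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical log format: lines such as "iteration 12 residual=0.03".
ITERATION_LINE = re.compile(r"^iteration\s+(\d+)", re.IGNORECASE)


def iterations_completed(log_file: Path) -> int:
    """Return the highest iteration number found in the log, or 0 if none."""
    if not log_file.exists():
        return 0
    last = 0
    for line in log_file.read_text().splitlines():
        match = ITERATION_LINE.match(line)
        if match:
            last = max(last, int(match.group(1)))
    return last


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "app.log"
        log.write_text("starting\niteration 1 residual=0.5\niteration 2 residual=0.1\n")
        print("iterations completed:", iterations_completed(log))
```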

Finally, wrapper scripts also provide a configurable mechanism for dealing with any application-specific security risks, e.g. checking for malicious input that may exploit a feature of the application. Few legacy applications were designed as network-accessible services, and since they cannot be changed to remove security loopholes, wrapper scripts are essential for checking for attempts to exploit application vulnerabilities. At the extreme, one can configure the wrapper (and platform) scripts to run the application in a sandbox (e.g. chroot), with access only to a working sub-directory of the job workspace, as shown in Figure 2.
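One simple input check a wrapper could perform is to reject file names that would resolve outside the job's working directory. The sketch below shows this single check under the assumption that file names arrive as user-supplied strings; the function name `safe_path` is hypothetical, and a check like this complements a sandbox such as chroot rather than replacing it.

```python
import tempfile
from pathlib import Path


def safe_path(work_dir: Path, user_name: str) -> Path:
    """Resolve a user-supplied file name, refusing anything outside work_dir."""
    base = work_dir.resolve()
    # Joining with an absolute name discards base, so that case is caught below too.
    candidate = (base / user_name).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes working directory: {user_name!r}") from None
    return candidate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        print(safe_path(work, "data/input.txt"))
        for bad in ("../secrets.txt", "/etc/passwd"):
            try:
                safe_path(work, bad)
            except ValueError as err:
                print("rejected:", err)
```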

Figure 2 - Wrapper Scripts and Security