GRIA Workflow Application

This guide describes the GRIA Workflow Application software for deploying and running XScufl workflows as GRIA applications. XScufl workflows may be created using Taverna or g-Eclipse.

1. Introduction

The GRIA Workflow Application v0.2 software includes tools for deploying and running XScufl workflows as GRIA applications. XScufl workflows may be created using Taverna or g-Eclipse. A command line deployment tool generates GRIA application wrapper scripts and other requisite files from an XScufl workflow. Running the deployment tool generates a new application that can be deployed to the GRIA Job service in the usual way using the Job Service Administration web pages.

For convenience in the documentation, we refer to GRIA applications that have been created from workflows as workflow applications. Similarly, a job created from a workflow application is referred to as a workflow job.

Architecture

The diagram below illustrates that an XScufl workflow ("a Workflow application" in the diagram) can be deployed to the GRIA Job Service as a GRIA application. After deployment, clients can create jobs from a workflow application using the Job Service in exactly the same way as they would for other GRIA applications. Clients will not necessarily be aware that they are using a workflow application, though this may be apparent from log messages that are made available when monitoring job execution. Note that the diagram is conceptual and this section covers only the details relevant to deploying and using workflow applications. For full details on the relationship between applications and the Job Service, refer to the Job Service documentation in the Basic Application Services User Guide.


Figure: Architecture

GRIA applications consist of an application directory containing at least an application wrapper script, startJob.pl, and an application description file, ApplicationMetadata.xml. For each GRIA application, a directory should be created to store the application wrapper and application description files. The application wrapper is invoked indirectly from the Job Service when job execution is started. Usually, the application wrapper will invoke a command line executable that provides the functionality for the application.

For workflow applications, there is a single command line executable (GRIA Workflow Application) that is responsible for executing the Taverna workflow and mapping input and output data stagers to workflow inputs and outputs. This is illustrated in the diagram below.


Figure: Job Service and GRIA Workflow Application interactions

When a workflow job is started, the Job Service invokes (via a platform script) the startJob.pl application wrapper. In turn, the application wrapper invokes the GRIA Workflow Application (GWA). The GWA compiles the workflow, workflow.xml, and also reads the application description file, ApplicationMetadata.xml. The application description file is used to map input and output data stagers to workflow inputs and outputs. Finally, the GWA executes the workflow and makes workflow outputs available in the output data stagers for the job.

Note that ApplicationMetadata.xml and startJob.pl can be generated automatically from a workflow file using the deploy tool. Further details are provided in the tutorial.

Related Components

GRIA documentation and software downloads are available from the GRIA homepage. Taverna information and user documentation can be found at the Taverna project site. For further information about g-Eclipse, please refer to the g-Eclipse project site. The latest releases of the GRIA Workflow Plugins and their user documentation can also be obtained from the GRIA homepage.

2. Installation

Installation involves unpacking the release archive, setting environment variables, and configuring the software with details of the keystore used by your GRIA Job service.

Prerequisites


You should install the software on a file system that can be accessed by your GRIA Basic Application Services. This is necessary because the GRIA Workflow Application software will be executed indirectly by the GRIA Job service, when a workflow job is started.

The GRIA Workflow Application v0.2 software can be used with the GRIA Basic Application services package, versions 5.0.1, 5.1 and 5.2.

The following software components are required before installation. Note that all prerequisite software will have been installed previously, before the installation of the GRIA Basic Application Services.

N.B. the GRIA Workflow Application has been tested on Windows XP Service Pack 2 and SuSE 10.1, but may work on other Windows and Linux platforms.

Unpacking the software

Unpacking a release archive generates a new directory, gria-workflow-app-0.2, that hereafter will be referred to as GRIA_WORKFLOW_HOME. Follow the instructions below, according to your operating system.

Windows

On Windows systems, use WinZip to unpack the gria-workflow-app-0.2.zip distribution file to a convenient destination directory, then proceed to the Environment variables section below.

Linux

First, log in as root using the su command, entering the superuser password when prompted.

su root

Change directory to the location under which you wish the software to be installed. For example:

cd /usr/local

Next, unzip the .zip release archive with the unzip utility. Alternatively, you can use the tar command to unpack the .tgz release archive.

unzip gria-workflow-app-0.2.zip

Or:

tar xzf gria-workflow-app-0.2.tgz

Finally, change ownership of the newly created gria-workflow-app-0.2 directory so that it is owned by the user that runs tomcat. For example, if you run tomcat as the tomcat user, do the following.

chown -R tomcat gria-workflow-app-0.2

Environment variables

The PATH environment variable must be updated so that GRIA_WORKFLOW_HOME/bin is on the system path. Also, the GRIA_WORKFLOW_HOME environment variable must be set to the absolute path of the directory created from unpacking the software (i.e. gria-workflow-app-0.2).

Windows

Environment variables are managed from the Environment Variables dialog. This is accessed as follows. Choose Start, Settings, Control Panel, and then double-click on System. Select the Advanced tab and then the Environment Variables button to launch the dialog. Note that changes to environment variables take effect only when a new console window is opened or when software that is already running is restarted.

PATH

The PATH can be a series of directories separated by semi-colons (;). Look for "Path" or "PATH" in the User Variables and System Variables sections of the Environment Variables dialog. If you are not sure where to add the path, add it to the beginning of the existing "Path" value in the User Variables section.

For example, if the software was installed at c:\gria-workflow-app-0.2, the value of "Path" should be updated by pre-pending the following to any existing value.

c:\gria-workflow-app-0.2\bin; 

GRIA_WORKFLOW_HOME

Add a new variable called GRIA_WORKFLOW_HOME in the User Variables section of the Environment Variables dialog. Set its value to the location of the gria-workflow-app-0.2 directory. This is the directory that was created from unpacking the release archive. For example:

c:\gria-workflow-app-0.2

Linux

Refer to your system's documentation on how to update the PATH variable and add a new environment variable called GRIA_WORKFLOW_HOME. The following example assumes that Bash is being used, the user that runs tomcat is tomcat, and that the software has been installed under the system-wide location /usr/local.

If necessary, log in as the user that runs tomcat.

su tomcat

Using a suitable text editor, edit the ~/.bash_profile file, creating it if it does not already exist. For example:

vi ~/.bash_profile

Add the following entries at an appropriate place (usually at the bottom of the file).

export GRIA_WORKFLOW_HOME=/usr/local/gria-workflow-app-0.2
export PATH=$GRIA_WORKFLOW_HOME/bin:$PATH

Save the edited file. For the changes to take effect, you must restart your login session or source the ~/.bash_profile file, before restarting tomcat.

source ~/.bash_profile
/etc/init.d/tomcat restart
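
The two settings described above can be checked before going further. The following is a minimal sketch, assuming a POSIX-compatible shell. The check_gwa_env function is a hypothetical helper written for this guide and is not shipped with the GRIA Workflow Application. It takes the value of GRIA_WORKFLOW_HOME and a PATH string, and reports whether the bin directory appears on that path.

```shell
# Hypothetical helper, not part of the GRIA distribution.
# $1: value of GRIA_WORKFLOW_HOME, $2: PATH string to inspect
check_gwa_env() {
    gwa_home="$1"
    gwa_path="$2"
    if [ -z "$gwa_home" ]; then
        echo "GRIA_WORKFLOW_HOME is not set"
        return 1
    fi
    # Wrap the path in colons so that first and last entries match too.
    case ":$gwa_path:" in
        *":$gwa_home/bin:"*) ;;
        *) echo "$gwa_home/bin is not on PATH"; return 1 ;;
    esac
    echo "environment looks OK"
}
```

For example, running check_gwa_env "$GRIA_WORKFLOW_HOME" "$PATH" as the user that runs tomcat should print a confirmation if both entries in ~/.bash_profile have taken effect. The comparison is a plain string match, so a PATH entry written with a trailing slash would not be recognised.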

Permissions

On Linux systems, it is necessary to ensure that the files in the GRIA_WORKFLOW_HOME/bin directory are executable. Windows users can skip this section.

First ensure that you are logged in as the user that runs tomcat. For example:

su tomcat

Next, set execute permissions for the tomcat user on all files in the GRIA_WORKFLOW_HOME/bin directory.

chmod u+x "$GRIA_WORKFLOW_HOME"/bin/*
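
To confirm that the permissions were applied, the files that are still not executable can be listed. This is a minimal sketch, assuming a POSIX-compatible shell. The list_non_executable function is a hypothetical helper, not part of the distribution.

```shell
# Hypothetical helper, not part of the GRIA distribution.
# $1: directory to inspect. Prints each regular file in it that the
# current user cannot execute; prints nothing if all are executable.
list_non_executable() {
    for gwa_f in "$1"/*; do
        [ -f "$gwa_f" ] || continue
        [ -x "$gwa_f" ] || echo "$gwa_f"
    done
}
```

Running list_non_executable "$GRIA_WORKFLOW_HOME/bin" as the tomcat user should produce no output once the chmod command above has been applied.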

Configuration

Configuration involves specifying keystore security details in a configuration file. Before proceeding with this section, ensure that your GRIA Basic Application Services have been configured correctly using the administration web pages.

Use the administration web pages for the GRIA Basic Application Services to determine the location of the configuration directory. This is displayed at the top of the main page, as shown below.


Figure: configuration directory shown at the top of the main administration page

On Windows systems the configuration directory is usually c:\gria\basic-app-services\config. On Linux, the configuration directory is usually /etc/gria/basic-app-services.

To configure the GRIA Workflow Application, simply copy the crypto.properties file that can be found in the configuration directory to the GRIA_WORKFLOW_HOME/conf directory, overwriting the existing file.

For example, on a Linux system, first ensure you are logged in as the user that runs tomcat. Assuming that the configuration directory is /etc/gria/basic-app-services, copy the crypto.properties file as follows.

cp -f /etc/gria/basic-app-services/crypto.properties "$GRIA_WORKFLOW_HOME"/conf 
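
Because the copy overwrites the existing file, some administrators may prefer to keep the previous version. The following is a minimal sketch of that variation, assuming a POSIX-compatible shell. The install_crypto_properties function and the .bak suffix are illustrative choices, not part of the GRIA software.

```shell
# Hypothetical helper, not part of the GRIA distribution.
# $1: GRIA configuration directory, $2: GRIA_WORKFLOW_HOME
install_crypto_properties() {
    gwa_src="$1/crypto.properties"
    gwa_dst="$2/conf/crypto.properties"
    if [ ! -f "$gwa_src" ]; then
        echo "not found: $gwa_src" >&2
        return 1
    fi
    # Keep the file being replaced so that the change can be undone.
    if [ -f "$gwa_dst" ]; then
        cp -f "$gwa_dst" "$gwa_dst.bak" || return 1
    fi
    cp -f "$gwa_src" "$gwa_dst"
}
```

With the locations used in the example above, this would be called as install_crypto_properties /etc/gria/basic-app-services "$GRIA_WORKFLOW_HOME".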

On a Windows system, use Windows Explorer to copy the crypto.properties file in the configuration directory to the GRIA_WORKFLOW_HOME/conf directory, overwriting the existing file.

This completes installation and configuration of the software.

3. Tutorial

To test your installation and configuration, and to become familiar with the features of the software, please follow the tutorial below.

Prerequisites

This tutorial follows on from the tutorial provided in the GRIA Workflow Plugins for Taverna guide. You should complete the GRIA Workflow Plugins tutorial before proceeding with this tutorial. After completing the GRIA Workflow Plugins tutorial, ensure that you save the final workflow in a convenient location, if you intend to close Taverna.

This tutorial requires that you have Taverna 1.4 with the GRIA Workflow Plugins version 2.0.0 or later installed and configured. However, in order to use GRIA 5.2 services, you must use version 2.1.0 or later.

You will also need administrative access to a machine running the GRIA Basic Application Services, version 5.0.1 or later. The GRIA Workflow Application software should be installed on this machine according to the instructions provided in the installation section.

You should already have Taverna and the GRIA Workflow Plugins installed and configured, as this is a prerequisite to completing the GRIA Workflow Plugins tutorial. Administrative access to a GRIA Basic Application Services server and a working installation of the GRIA Workflow Application software are required so that you are able to deploy the tutorial workflow as a GRIA application.

Finally, you will need a suitable client to test if the workflow application works correctly after it has been deployed to the GRIA Job Service. Either the GRIA Client version 5.0.1 or later or Taverna 1.4 with the GRIA Workflow Plugins version 2.0.0 or later can be used (or version 2.1.0, if using a GRIA 5.2 Job Service).

Overview

In the GRIA Workflow Plugins tutorial an image processing workflow was created. The workflow has an upload processor to upload an input image, two job processors for performing the image processing, and a download processor to retrieve the processed image. The final workflow can be seen below.


Figure: Image processing workflow developed in the GRIA Workflow Plugins tutorial

In this tutorial, the image processing workflow will be modified to make it suitable for deployment as a workflow application. Before deploying the workflow, it is necessary to create a new input parameter and create a new output parameter. When the workflow is deployed as a GRIA application, the workflow input will be mapped to an input data stager and the workflow output will be mapped to an output data stager. After clients have created jobs from the workflow application, they will be able to upload input data to the input data stager, have the input data processed by the workflow by starting the job, and download the results from the output data stager.

The workflow that will be produced at the end of this tutorial is shown below.


Figure: Final tutorial workflow

Compose Workflow

Start Taverna

If Taverna (with the GRIA Workflow Plugins installed) is not already running, follow the system-specific instructions below to start the workbench.

Note that when the Taverna distribution was originally unpacked during installation, a new directory was created. This directory will hereafter be referred to as TAVERNA_HOME.

On Windows XP, start the Taverna workbench by double-clicking on the file:

TAVERNA_HOME/runme.bat

On Linux systems, first change directory to TAVERNA_HOME. Next, set execute permissions on runme.sh before executing it.

chmod u+x runme.sh
./runme.sh

Load the workflow

Load the final workflow that was created in the GRIA Workflow Plugins tutorial. Select the Load tool bar button in the Advanced model explorer before selecting the Load from a file option from the context menu.


Figure: Loading the workflow

This displays a file system browser dialog, called Open workflow, from which you can select the workflow to load.

Add a workflow input

Add a workflow input by right-clicking on the Workflow inputs node in the Advanced model explorer and selecting Create New Input... in the context menu. You will be prompted for a name for the new input. Call it imageIn.


Figure: Add a workflow input

Add a workflow output

Similarly, add a workflow output by right-clicking on the Workflow outputs node in the Advanced model explorer and selecting Create New Output... in the context menu. Again, you will be prompted for a name. Call it imageOut.


Figure: Add a workflow output

Add a string constant

Expand the Local Services node in the Available Services panel and right-click on the String Constant processor. Select Add to model with name... from the context menu. You will be prompted for a name for the new string constant processor. Use downloadedFile as the name.


Figure: Add a string constant processor

Create data links

Place data links between processors in the workflow until the workflow looks like the diagram below. Recall from the Taverna documentation that data links are added using context menus available from the Advanced model explorer window. As an example, consider making the data link from the imageIn workflow input to the Upload processor. Select the imageIn workflow input in the Advanced model explorer window. Right-click and use the context menu to make a data link to the localFile input port of the Upload processor. Add the other data links in a similar way.


Figure: Final tutorial workflow. Add data links until the workflow looks like this.

Edit string value

The download processor saves the swirl processor's swirled-image data stager contents as a new file on the file system. Specify the name of the file by right-clicking on the downloadedFile processor in the Advanced model explorer and selecting Edit string value... in the context menu. You will be prompted for a new value. Set the value as swirlResults.


Figure: Edit string value to specify a download file

Mark processors as critical

Expand the Processors node in the Advanced model explorer. Select the Critical checkbox alongside each processor in the workflow. This ensures that if there is a problem executing the processor at runtime, the workflow will report failure correctly. In contrast, if a non-critical processor fails, the workflow may report success.


Figure: Mark processors as critical

Configure processor lifecycle

Configure processors that create remote resources such that they terminate when the workflow finishes. This ensures that remote resources are cleaned up on workflow completion. The Upload, Paint and Swirl processors all create remote resources (data stagers and jobs). Configure each of these in turn. Select the processor in the Advanced model explorer window. Right-click and select the appropriate configure option in the context menu (either Configure Upload or Configure GRIA Job). In the dialog, find the Life cycle panel and ensure that the Terminate with workflow radio button is selected. Select the OK button so that any changes take effect.


Figure: Configure processors to terminate with the workflow

Save the workflow

Select the Save toolbar button in the Advanced model explorer. This displays a file system browser dialog, called Save workflow, from which you can select a convenient location and file name to save the workflow. For the purpose of the tutorial, we will assume that the workflow has been saved with the file name tutorial-workflow.xml.

Deploy workflow

There are two steps to deploying the workflow. First, the deploy tool is used to generate the application wrapper script and application description files. Second, the Job Service administration web page is used to deploy the application to GRIA.

Run the deploy tool

Open a console session on the machine that hosts the GRIA Basic Application Services and the GRIA Workflow Application.

On Windows systems, the command prompt is available under Start, All Programs or Programs, then Accessories. Select Command Prompt to start the console session.

On Linux systems, consult your system documentation on opening a console session, if necessary. You must also ensure that you are logged in as the user that runs tomcat. For example, assuming that tomcat runs as the tomcat user, use su to change to that user.

su tomcat

In the console, run the deploy tool without any arguments to see usage information.

deploy.pl

Note that Linux users can also use the following command, with the .pl file extension omitted.

deploy

The first line of the usage information shows the command format. This is similar to the text below.

deploy [OPTION]... WORKFLOW APPNAME [DESTINATION]

The command requires a Taverna workflow file, WORKFLOW, a name for the new application, APPNAME, and optionally can accept a destination directory, DESTINATION, in which to place generated files. If a destination directory is not provided, the files will be created in the current directory.

Choose a suitable destination directory for generated files. There is no need to create the directory, but its parent directory must already exist. We will use the deploy tool to create the destination directory as well as generate the deployment files.

For this tutorial, we will assume that a Windows server is being used, that the tutorial workflow was saved as c:\tutorial-workflow.xml, and that there is a directory, c:\gria\applications, under which the destination directory, gwa-tutorial-app, should be created. We will call the new application http://www.gria.org/workflow-application/tutorial-app. Adjust the paths to suit your system's requirements and your preferences, then run the deploy tool from the console with your own paths substituted. For example:

deploy.pl -c c:\tutorial-workflow.xml http://www.gria.org/workflow-application/tutorial-app c:\gria\applications\gwa-tutorial-app

Note that the -c switch is used so that the destination directory will be created.

The command should return to the console prompt without any error message being displayed. A destination directory containing three files should have been created at your chosen location. The three files are: startJob.pl, ApplicationMetadata.xml and workflow.xml.
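
On a Linux server, the presence of the three generated files can be confirmed from the console. The following is a minimal sketch, assuming a POSIX-compatible shell. The check_deploy_output function is a hypothetical helper, not part of the deploy tool.

```shell
# Hypothetical helper, not part of the GRIA distribution.
# $1: destination directory passed to the deploy tool.
# Prints one line per missing file; returns 0 only if all three exist.
check_deploy_output() {
    gwa_dest="$1"
    gwa_missing=0
    for gwa_file in startJob.pl ApplicationMetadata.xml workflow.xml; do
        if [ ! -f "$gwa_dest/$gwa_file" ]; then
            echo "missing: $gwa_file"
            gwa_missing=1
        fi
    done
    return "$gwa_missing"
}
```

The function only checks that the files exist. It does not validate their contents.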

Deploy the application

Deployment is straightforward and follows the same procedure as deploying any GRIA application. For full details of GRIA Job Service administration and managing applications, see the Job Service section of the GRIA Basic Application Services User Guide.

Browse to the Job Service Administration web page. This can be accessed by following the Job Service link from the main administration page for the GRIA Basic Application Services. This is usually at a location similar to the one shown below, but with the HOST and PORT substituted as appropriate for your system.

https://<HOST>:<PORT>/gria-basic-app-services

Scroll down to the Applications section and enter the absolute path to the destination directory that was created by the deploy tool, before selecting the Deploy new application button.


Figure: Deploy the workflow as an application

In the Application Properties section of the displayed page, select the Accept button to complete deployment of the new application.

Test the application

Any GRIA Client can be used to test the tutorial workflow application. For example, Taverna, with the GRIA Workflow Plugins installed, or the GRIA Client would be suitable. For this tutorial, the GRIA Client will be used for testing.

Start the client

On Linux systems, start the client by first changing directory to the client installation directory, and then running the gridcli command.

gridcli

On Windows systems start the client by double-clicking on the gridcli.bat file in the client installation directory.

Add the job service

Add your Job Service to the client according to the instructions in the Adding Services section of the Client User's Tutorial. The easiest way to do this is by drag and drop of the WSDL link for the Job Service, from a browser to the client. The WSDL link is available in the main administration web page for the GRIA Basic Application services.

Create a new job

Create a new job, according to the Creating a job section of the Client User's Tutorial. When prompted for the application type, select the new workflow application, as shown below.


Figure: Create a workflow job

Upload job input data

Upload an input image to the imageIn input data stager of the newly created job. Right-click on the imageIn node in the client. In the context menu, follow Data Functions and select Upload data. This will display a file system browser dialog called Open. Use any convenient image you have available on your file system. For this tutorial, the account-types.png image from the GRIA Client 5.0.1 release will be used.


Figure: Upload job input data

Run the job

Run the job by right-clicking on the job in the client. Select Start job in the context menu. When prompted for arguments, click Ok so that no arguments are provided.


Figure: Run the job

The Job Monitor dialog will be displayed and updated with monitoring information while the job is executing. After job execution, the end of the text in the dialog should be similar to the text shown below.


Figure: Job monitoring output

Download the output

Download the processed image by previewing the image in the client. Right-click on the imageOut output data stager in the client. In the context menu, select Data properties and then select the Preview tab in the dialog. Click Ok in the warning dialog to indicate that you are happy to download the data for preview. If you used the same input image, your output will look similar to that shown below.


Figure: Preview the output image

This completes the tutorial. You have successfully modified a workflow to make it suitable for deployment as a workflow application. You have used the deploy tool to generate deployment files that the GRIA Job service requires. You have deployed the new application using the Job Service administration pages. Finally, the GRIA Client has been used to test the new application.

Consult the Reference section for information that may not have been covered in the tutorial.

4. Reference

This section provides additional reference information on the GRIA Workflow Application.

Input and output mapping

GRIA Jobs have inputs and outputs in the form of files that are held in data stagers. They can also have command line arguments, but these are not used by the GRIA Workflow Application. When a job is implemented with a Taverna workflow, the system must determine both which workflow input or output corresponds to a particular data stager, and how the data should be transferred between the two.


Figure: Mapping between data stagers and workflow inputs and outputs

The system uses the ApplicationMetadata.xml file to determine the workflow input or output that corresponds to a particular data stager. Recall from the tutorial that this file is generated automatically by the deploy tool. In contrast, the workflow author must specify how data should be transferred between a staged input file and a workflow input, and between a workflow output and a staged output file.

Two strategies are supported for transferring data: Pass by reference and Pass by value. Explanations of these strategies are provided in sections below. First, we will consider how to declare that a workflow input or output should use a particular strategy. The MIME types feature in Taverna is used to provide some additional typing of workflow inputs and outputs.

In the Advanced model explorer window in Taverna, right-click on a workflow input or output and select Edit metadata in the context menu.


Figure: Editing workflow input and output metadata

A dialog for editing metadata will be displayed. Click on the MIME Types tab to view the MIME types that have been associated with the input or output. The strategy that you wish to be used for transferring data can be entered in the Enter new MIME type and hit return text box.


Figure: Specifying a MIME type for an input or output

Pass by reference is the default strategy and there is no need to add an entry if this is the behaviour you require. It is possible to be explicit, however, by entering a value of filename and hitting return.


Figure: Specifying pass by reference

Alternatively, if you wish pass by value to be used, enter value and hit return.


Figure: Specifying pass by value

Note that you should never specify both filename and value for the same workflow input or output.

Pass by reference

As mentioned above, pass by reference is the default strategy. In this case, the file name of the staged file is passed to a workflow input, or read from a workflow output, at runtime.

The system will provide a workflow input with the file name of a staged input file. In contrast, the workflow author must ensure that a workflow output is passed the correct file name for a file that should be staged as an output.

When using pass by reference, relative file names should always be used. In general, be very careful if using absolute paths in workflows that will be deployed as workflow applications.

Pass by reference is useful when data sets are being used that are not trivial in size.

The workflow developed, deployed and tested in the tutorial used pass by reference for its inputs and outputs.

Pass by value

To use pass by value for a workflow input or output, you must explicitly declare value as a MIME type, as described above. In this case, the system will pass the contents of an input data stager to a workflow input at runtime. Similarly, the contents of a workflow output will be used to create a file that will be staged in an output stager.

Pass by value is useful when very small data sets are being used.
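
The difference between the two strategies can be illustrated with a small shell function. This is a conceptual sketch only, assuming a POSIX-compatible shell. It is not how the GRIA Workflow Application is implemented, and the workflow_input_value function is hypothetical. Given a strategy (the MIME type entered in Taverna) and the relative name of a staged input file, it prints what the workflow input would receive.

```shell
# Conceptual illustration only, not part of the GRIA distribution.
# $1: strategy ("filename", "value", or empty for the default)
# $2: relative name of the staged input file
workflow_input_value() {
    gwa_strategy="$1"
    gwa_staged_file="$2"
    case "$gwa_strategy" in
        # Pass by value: the workflow input receives the file contents.
        value) cat "$gwa_staged_file" ;;
        # Pass by reference (the default): the input receives the file name.
        filename|"") printf '%s\n' "$gwa_staged_file" ;;
        *) echo "unknown strategy: $gwa_strategy" >&2; return 1 ;;
    esac
}
```

With a staged file named imageIn, the filename strategy yields the string imageIn, whereas the value strategy yields the bytes stored in that file. This is why pass by value suits only very small data sets.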

Example

A simple example of a workflow that uses pass by value for all of its inputs and outputs is included in this release. The workflow definition is available at:

GRIA_WORKFLOW_HOME/samples/pass-by-value.xml

The workflow has a local (Java) processor for concatenating two input strings to produce a single output string, as shown in the diagram below.


Figure: Pass by value example workflow

This workflow is ready to be deployed as a GRIA application using the deploy tool. Refer to the Deploy Workflow section of the tutorial for details of using the deploy tool and deploying GRIA applications.

Log files

Log files for plugins can be found under the directory:

GRIA_WORKFLOW_HOME/logs
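
When investigating a failed workflow job on Linux, it can help to find the most recently written file in this directory. The following is a minimal sketch, assuming a POSIX-compatible shell and log file names without embedded newlines. The names of the log files themselves are not specified in this guide, so the hypothetical latest_log helper selects by modification time only.

```shell
# Hypothetical helper, not part of the GRIA distribution.
# $1: log directory. Prints the name of its most recently modified entry.
latest_log() {
    # ls -t sorts by modification time, newest first.
    ls -t "$1" 2>/dev/null | head -n 1
}
```

For example, tail "$GRIA_WORKFLOW_HOME/logs/$(latest_log "$GRIA_WORKFLOW_HOME/logs")" would show the end of the newest file.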

5. Support

This section describes how to get support for installing and using the software.

Support requests and bug reports for GRIA Workflow Application should be directed to support@gria.org