GRIA Workflow Plugins for Taverna

This guide describes the GRIA Workflow Plugins for Taverna and how GRIA jobs and data transfer operations can be composed and executed in workflows.

1. Introduction

GRIA Workflow Plugins are components that extend the Taverna workbench and Freefluo workflow enactment engine. They enable GRIA job and data transfer operations to be composed into workflows using the Taverna workbench, and enable such workflows to be executed by the Freefluo workflow enactment engine.

Architecture

An overview of the workflow and GRID system architecture is provided in the figure below. Only the GRIA Workflow Plugins (GWP) are provided in this release.

Figure: System Architecture


The plugins enhance the Taverna workbench so that it can be used to compose and execute workflows of GRIA jobs and data transfers. Using Taverna with GWP, available data stagers and applications can be discovered from the GRIA basic application services, and jobs and data stagers can be created and managed in workflows.

There are three alternative interaction patterns involving Taverna with GWP and the GRIA services, depending on how service providers have configured their GRIA services and on whether your organisation has deployed and configured the GRIA client management package. These alternative scenarios are explained below.

Firstly, a service provider may have made its services free, allowing clients to create and manage jobs and data stagers without restriction. In this case, only the interactions between Taverna with GWP and the GRIA basic application services are relevant in the diagram.

GRIA basic application services may require that the client presents a reference to a valid service level agreement (SLA). In this case, the services are not free and SLAs should be agreed with service providers using the GRIA Client software.

In the second scenario, Taverna with GWP queries the GRIA basic application services to discover SLAs that are available to the client. The workflow author can select a particular SLA that should be used for creating and managing jobs and data stagers.

The final scenario applies if your organisation has deployed the GRIA client management package to help manage local users and business relationships with service providers. In this case, Taverna with GWP can query the client management services to obtain requisite security tokens that can be presented to service providers.

More details of these scenarios and the GRIA software packages are available in the GRIA user guides.

Transport and message-level communication between Taverna and the GRIA services are secured using HTTPS and WS-Security.

Related components

GRIA documentation and software downloads are available from the GRIA homepage. Taverna information and user documentation can be found at the Taverna project site. Latest releases of the Freefluo workflow enactment engine can be obtained from the Freefluo project site.

2. Installation

Installation involves deployment of the GRIA Workflow Plugins to the Taverna workbench. In addition, a small amount of configuration is required. Please follow the instructions in the following sections to install and configure the plugins.

Access to one or more service providers' GRIA services is required before workflows can be composed and executed. Please note that the Taverna workflow tools are compatible only with GRIA versions 5.0.1 and 5.1. If you need to install GRIA, please refer to the downloads section of the GRIA website for GRIA releases, documentation and installation instructions.

Prerequisites

Before installing the GRIA Workflow Plugins you should have completed the Client User's Tutorial for the GRIA Client. This is necessary both for familiarity with GRIA and because a keystore, containing your security credentials, is generated as part of completing the tutorial, and this is required when configuring the GRIA Workflow Plugins.

The following software components are required before installing the plugins.

N.B. the plugins have been tested on Windows XP Service Pack 2, Fedora Core 5 and Suse 10.1, but may work on other Windows and Linux platforms.

Taverna installation

Please use a fresh installation of Taverna when installing the plugins. The installation procedure overwrites some of Taverna's files, which is necessary because of library incompatibilities. Using a fresh installation prevents loss of existing work and settings. Note that other Taverna plugins may be affected when Taverna configuration files are overwritten.

Installation of the Taverna workbench on Windows XP is simply a case of unpacking the Taverna release archive to a convenient directory. For example, use WinZip to unzip the taverna-workbench-1.4.zip file.

On Linux, you can unzip the release archive with the unzip utility:

unzip taverna-workbench-1.4.zip

Unzipping the Taverna release archive generates a new directory, taverna-1.4, that hereafter will be referred to as TAVERNA_HOME.

If you are installing Taverna on a Linux platform, it is also necessary to install the GraphViz package from AT&T. Refer to the Linux section of the Taverna user manual for further details.

For SuSE 10.1, the GraphViz package is available on the installation CDs/DVD and can be installed easily using the YaST package manager.

Deployment

Unzip the gria-workflow-plugins-@VERSION@.zip distribution file to a convenient location. Unzipping the distribution will reveal the following files and directories.

gria-workflow-plugins-@VERSION@/docs/ - directory containing user documentation for the plugins
gria-workflow-plugins-@VERSION@/taverna-1.4/ - directory containing libraries and configuration files
gria-workflow-plugins-@VERSION@/release-notes.html - the release notes for this release of GRIA Workflow Plugins
gria-workflow-plugins-@VERSION@/LICENSE - the license that applies to this software

After unzipping the release distribution, simply copy the gria-workflow-plugins-@VERSION@/taverna-1.4/ directory, and paste it such that you overwrite TAVERNA_HOME (the taverna-1.4 directory, generated when you unzipped the Taverna release archive). Be sure to answer "yes to all" when prompted whether you'd like to overwrite files in the existing directory. Next, proceed to configure the plugins with your security settings, as described below.
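On Linux, the copy-and-overwrite step described above can also be done from a shell. The following is a minimal sketch, not part of the GRIA distribution: the function name deploy_plugins is hypothetical, and it assumes that both directories already exist.

```shell
# Hypothetical helper: copy the contents of the plugin distribution's
# taverna-1.4 directory over an existing Taverna installation,
# overwriting any files that already exist there.
deploy_plugins() {
    src="$1"    # the taverna-1.4 directory from the plugin distribution
    dest="$2"   # TAVERNA_HOME
    if [ ! -d "$src" ] || [ ! -d "$dest" ]; then
        echo "deploy_plugins: both arguments must be existing directories" >&2
        return 1
    fi
    # "src/." copies the directory contents rather than the directory itself
    cp -R "$src"/. "$dest"/
}
```

For example, with the version number of your download substituted for @VERSION@ and TAVERNA_HOME set as a shell variable: deploy_plugins gria-workflow-plugins-@VERSION@/taverna-1.4 "$TAVERNA_HOME"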

Configuration

Configuration consists of specifying the details of your security credentials in configuration files. Please note that you should have a keystore containing your security credentials after having completed the Client User's Tutorial for the GRIA Client.

Security credentials

Note that if your GRIA Client is set up correctly, you should just copy the GRIA_CLIENT_HOME/conf/crypto.properties file to TAVERNA_HOME/conf/crypto.properties.

Alternatively, you can edit the file supplied with this release, as described below.

Open the following file in a text editor.

TAVERNA_HOME/conf/crypto.properties

As detailed in the comments at the start of this file, four values must be provided by editing the lines that begin as follows:

org.apache.ws.security.crypto.merlin.file =
org.apache.ws.security.crypto.merlin.keystore.password =
org.apache.ws.security.crypto.merlin.keystore.alias =
org.apache.ws.security.crypto.merlin.alias.password =

Complete each of these lines in the file by editing the values after the = character. The values you must provide are respectively:

  • the location of your keystore
  • the password for your keystore
  • the alias for your keypair/certificate
  • the password for your private key
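After editing the file, a quick check that all four values have been filled in can save a failed start later. The sketch below is an illustration only: the function name check_crypto_properties is hypothetical, and it assumes each entry is written on one line in the "key = value" form used in the supplied file.

```shell
# Hypothetical helper: report each of the four required keys that is
# missing from the file or has nothing after the "=" character.
check_crypto_properties() {
    file="$1"
    missing=0
    for key in \
        org.apache.ws.security.crypto.merlin.file \
        org.apache.ws.security.crypto.merlin.keystore.password \
        org.apache.ws.security.crypto.merlin.keystore.alias \
        org.apache.ws.security.crypto.merlin.alias.password
    do
        # Escape the dots so that they match literally in the regular expression
        escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Require at least one non-space character after the "="
        if ! grep -Eq "^[[:space:]]*${escaped}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing or empty: $key"
            missing=1
        fi
    done
    return $missing
}
```

Running check_crypto_properties "$TAVERNA_HOME/conf/crypto.properties" prints nothing and returns 0 when all four entries have a value. It only checks that values are present; it does not verify that the keystore location, alias or passwords are correct.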

3. Tutorial

To test your installation and configuration and to become familiar with the features of the software please follow the tutorial below.

The tutorial assumes that you are using GRIA 5 services that are free (and do not require valid SLAs to be presented by the client). However, with some additional configuration of the Upload and Job processors in the tutorial workflow, you will be able to use GRIA 5 services that require SLAs. If you wish to use such services, refer to the Reference section for details of how to configure processors to use SLAs.

Overview

We'll build a workflow in Taverna that performs image processing. The first step in using the GRIA plugins will involve using Taverna to discover the applications that are available at a GRIA service provider. We'll then build the workflow for image processing. The first step in the workflow will be a processor for uploading an image file to a GRIA server. After adding the upload processor, we'll add two job processors for performing image processing computations. The final step in the workflow will be a download processor for retrieving the output image file from the remote GRIA server.

The final workflow will look like the diagram below.

Figure: Image processing workflow

Prerequisites


You should already be familiar with the GRIA client and have completed the Client User's Tutorial.

You should be familiar with constructing and executing workflows using Taverna. If not, please refer to the Taverna documentation.

In addition, the GRIA Workflow Plugins and prerequisite software must have been installed and configured according to the instructions in the Installation section.

Start Taverna

On Windows XP, start the Taverna workbench by double-clicking on the file:

TAVERNA_HOME/runme.bat

On Linux systems, first change directory to TAVERNA_HOME. Next, set execute permissions on runme.sh before executing it:

chmod u+x runme.sh
./runme.sh

Discover applications

Discover the applications that are available at your chosen GRIA service provider.

In the Available services panel tree, right-click on the root node, named Available Processors. In the context menu that's displayed, select Add applications from a GRIA 5.* service provider, as seen below.


Figure: Discover applications


You will be asked to enter the Address (URL) of the GRIA service provider. Enter this appropriately. Usually this will involve simply replacing the HOST:PORT placeholder in the text field with the name of the server. You may also need to change the URL prefix from https to http if the services aren't secured by SSL. Alternatively you can select URLs you have previously used.
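The substitution described above can be illustrated with a short shell sketch. The function name fill_service_address is hypothetical, and so are the template and server names in the usage example; the real template is whatever the dialog shows in its text field.

```shell
# Hypothetical helper: replace the HOST:PORT placeholder in an address
# template and, optionally, switch the URL prefix from https to http.
fill_service_address() {
    template="$1"          # address template containing the text HOST:PORT
    hostport="$2"          # server name and port, e.g. gria.example.org:8443
    scheme="${3:-https}"   # pass "http" if the services are not secured by SSL
    address=$(printf '%s\n' "$template" | sed "s|HOST:PORT|$hostport|")
    if [ "$scheme" = "http" ]; then
        address=$(printf '%s\n' "$address" | sed 's|^https://|http://|')
    fi
    printf '%s\n' "$address"
}
```

For example, fill_service_address 'https://HOST:PORT/path/to/service' 'gria.example.org:8443' prints https://gria.example.org:8443/path/to/service.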

After entering the address, the system will query the remote server to discover details of the applications that are available. At this point you may be warned that the certificate presented by the remote service is issued by an unknown certificate authority. This means that there is no way to check that it is the server it claims to be. In this case, and when you are using the software for evaluation and testing purposes only, you can select that you wish to trust the certificate temporarily for this session.

After discovering applications, the service provider's applications will be represented as nodes (Processors) in the tree in the Available services panel. These can be added as Processors in workflows.

Compose workflow

Add an upload processor

Add an upload processor to transfer the input image file from your machine to the remote server. Expand the node in the Available services panel tree that represents the GRIA service provider. Under the Data node, right-click on the Upload node and select Add to model in the context menu.


Figure: Add an upload processor

Add a paint job

Under the Applications node, right-click on the paint application node, named:

http://it-innovation.soton.ac.uk/grid/imagemagick/paint

and in the context menu select Add to model with name.... Call the new processor paint.

Figure: Add a paint job processor

Add a swirl job

As done above for paint, add a swirl processor to the workflow. The swirl application node is called:

http://it-innovation.soton.ac.uk/grid/imagemagick/swirl

but choose a more readable name such as swirl.

Add a download processor

Add a download processor to retrieve the results file from the remote server. The download processor is available under the Data node for the GRIA service provider.


Figure: Add a download processor

Create data links

Place data links between processors in the workflow until the workflow looks like the workflow diagram at the start of this section. Recall from the Taverna documentation that data links are added using context menus available from the Advanced model explorer panel. As an example, consider making the data link from paint to swirl. Select the painted-image port of the paint processor in the Advanced model explorer panel. Right-click and use the context menu to make a data link to the source-image input port of the swirl processor. Add the other data links in a similar way.

Figure: Add data links

Bind inputs

Specify file names for the input and output image files. The workflow is complete except for specifying the file to upload and what to call the results file. Provide these values by editing the localFile input ports. For each String, right-click on the node in the Advanced model explorer panel. Select Set default value from the context menu and provide a name for the file in the dialog that's displayed. Note that an absolute file name can be used. Alternatively, a relative file name may be used and in this case the path is relative to TAVERNA_HOME.
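The rule for absolute and relative file names can be pictured with a small shell function. This is only an illustration of the rule stated above for Linux-style paths, not the plugins' own code; the function name resolve_local_file and the example paths are hypothetical.

```shell
# Hypothetical helper: show which file a localFile value refers to,
# given the TAVERNA_HOME directory.
resolve_local_file() {
    taverna_home="$1"
    name="$2"
    case "$name" in
        /*) printf '%s\n' "$name" ;;                 # absolute: used as given
        *)  printf '%s\n' "$taverna_home/$name" ;;   # relative: under TAVERNA_HOME
    esac
}
```

For example, resolve_local_file /opt/taverna-1.4 images/input.jpg prints /opt/taverna-1.4/images/input.jpg, while an absolute name such as /tmp/input.jpg is printed unchanged.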


Figure: Specify filenames

Your workflow should now look similar to the workflow at the start of this section. It will have input data bound and therefore will be ready to execute.

Execute workflow

Run the workflow by selecting Run workflow from the Tools and workflow invocation menu.

Figure: Run the workflow

View results

If the workflow completes successfully, the workflow status panel should look similar to the figure below and the resulting image file should be available on your file system at the location you specified.


Figure: Successful workflow completion

Next steps

In this tutorial, the default GRIA Upload and Job processor settings have been used. To understand how the processors can be configured, please read the Reference section.

4. Reference

This section describes GRIA application and workflow discovery, the Processors provided in this release and how they are configured. Finally, the locations of log files are provided. These should be included in support queries and bug reports if you encounter problems with the software.

The GRIA Workflow Plugins comprise a number of Processors for use in Taverna. Discovery is performed to determine the GRIA applications and workflows that are hosted by a particular GRIA server. After performing discovery, appropriate Processors are made available for use in workflows. After adding a processor to a workflow, some configuration may be appropriate, though often the default processor configuration will be suitable. Each of these tasks is considered below.

Discovery

The GRIA applications and workflows hosted by a GRIA server are discovered as follows. In the Available services panel tree, right-click on the root node, named Available Processors. In the context menu that is displayed, select Add applications from a GRIA 5.* service provider. This can be seen in the figure below.

Figure: Application discovery

You'll be asked to enter the Address (URL) of the GRIA service. Enter this appropriately. Usually this will involve simply replacing the HOST:PORT placeholder in the text field with the name of the server. You may also need to change the URL prefix from https to http if the services aren't secured by SSL. The URLs of GRIA service providers that you have used are stored in a configuration file, so you do not need to enter the HOST:PORT information again; you can simply select a previously used service provider. The configuration file can also be edited manually. It can be found at:

TAVERNA_HOME/conf/recent.gria.service.provider

After discovery is complete, Processors will be made available in the Available services panel tree. After discovering GRIA applications, there will be processors available for performing data transfer operations and a processor for each application hosted by the server.


Processor configuration


After adding a processor to a workflow, Processor configuration is achieved by right-clicking on the processor in the Advanced model explorer panel and selecting the appropriate configuration option in the context menu. Note that you have to right-click directly on the label for the processor in the Advanced model explorer panel, otherwise the context menu may not be presented.

After the context menu has been displayed you can configure a job processor, for example, by selecting Configure GRIA job, as below.

Figure: Configuration of a job processor


Note that if the service required for an Upload or Job processor is currently unavailable (offline), you will be informed that an error occurred during configuration.

Upload processor

The upload processor is used to transfer a file from the client machine to a GRIA server. The uploaded file is said to be staged at the GRIA server.

Configuration options for the Upload processor include specifying a data stager (to upload the data to) as well as supplying appropriate billing and lifecycle information.

Use string or file

An Upload processor always has an input port named either localFile or stringData. The Use string or file panel should be used to configure which of these is used. If the use file radio button is selected, the localFile input port will be exposed. This should be passed a relative or absolute file name corresponding to the file to upload. If using relative paths, note that paths are relative to the TAVERNA_HOME directory. In contrast, if the use string radio button is selected, the stringData input port will be exposed. This should be provided with the actual data that you wish to upload (instead of just the filename).


Figure: Upload processor configuration

Data stager

There are two main options regarding the data stager to use for upload. An existing data stager can be used, or a new data stager can be created.

Select the Use existing radio button if you want to upload data to an existing data stager. When this is selected, there are two further options that concern how the data stager is determined. The ID radio button should be selected if you wish to specify the data stager statically, at the time that the workflow is authored. In this case, the associated '...' button should be used to discover and select an existing data stager, as seen below.

Figure: Selection of available data stagers

Alternatively, the data stager can be determined dynamically at runtime by exposing an input port. If this is selected, a new input port called existingDataStager will be exposed on the processor. This should be passed the ID of the data stager at runtime.

The second option for data stager use is Create new. Select this radio button if you want the uploaded file to be stored in a new data stager, created at the GRIA server. When this option is selected, the options for Billing, Life cycle and Authorisations become relevant. This is because in GRIA, whenever a data stager (or job) is allocated and the service is not free, this is done within the context of an SLA. For more information, see the section on Billing, Lifecycle and Authorisations below.

Note that when you select to use an Input port for an existing data stager or an SLA, a new input port is exposed on the processor. After this, it isn't possible to rename the input port. If you wish to rename the port, simply remove the port temporarily by selecting an alternative option and clicking on the OK button. Next reconfigure the processor and add the port again, with the new name.

Job processor

Job processors are used to execute a specific application at a GRIA server. Job processors can be linked together in a workflow by passing references to data stagers (data stager IDs) between them.

The number and names of input ports and output ports for job processors vary depending on the application that the processor represents. Different applications require different numbers and types of input files and produce different numbers of output files, and for a particular job processor this is reflected in the ports that are present. In addition, a command line port and SLA port can optionally be exposed. Details of optional port configuration are provided below.

Configuration involves specifying a command line for the application, and specifying how an SLA should be selected. The latter is discussed below in the section on Billing, Lifecycle and Authorisations.


Figure: Job processor configuration

Command line

The Static radio button should be selected if the command line for the application is known before workflow runtime. In this case, simply provide the command line string in the associated text box. Note that the command line string should not include the name of the executable application itself, just the options, switches and arguments that should be passed to the application.

If the command line for the application is to be determined during workflow execution, select the Input port radio button. This will expose on the job processor an input port on which to read the command line. The name for the new input port must be provided in the associated text box. After this, it isn't possible to rename the input port. If you wish to rename the port, simply remove the port temporarily by selecting an alternative option and clicking on the OK button. Next reconfigure the processor and add the port again, with the new name.

Job description

The Description text area can be used to specify user constraints on job execution. This is optional and the text area can be left empty if job constraints are not required. GRIA 5 supports a number of user-provided job constraints that can be used to help manage job execution. The constraints should be specified using an XML document that complies with the GRIA job constraints XML document schema. Further details of job constraints and an example job constraints XML file can be found in the Job Service section of the GRIA user guides.

Download processor

The download processor is the simplest of the available processor types. It is used to download to the client machine a file that has been staged at a GRIA server.

The processor has an input port dataStager which should be passed the ID of the source data stager.

There are two options for storing the contents of the downloaded data.

Firstly, an input port localFile can be used. This should be passed the relative or absolute file name at which to store the downloaded data. If you are using relative paths, note that they should be relative to the TAVERNA_HOME directory. Alternatively, an output port stringData can be used. In this case the downloaded data will not be stored on the file system. Instead, the data will be passed to the output port for use later in the workflow.

A configuration dialog is used to specify these options, as shown below. Select the Use file radio button to expose the localFile input port. Alternatively, select the Use string radio button if you wish the downloaded data to be made available on the stringData output port.

Figure: Download processor configuration

Billing, lifecycle and authorisations

Billing

The billing section allows configuration of how to obtain and present billing information to remote services. GRIA 5 services may be free, or alternatively may require that a valid SLA is presented when creating new Jobs and Data Stagers. Note that the system determines automatically whether a service is free by querying the remote service. If it is free, the other options will be disabled in the configuration dialog. If a service is not free, the user is responsible for providing the corresponding SLA information. This can be done in several ways. One possibility is to discover the endpoint reference (EPR) of an SLA statically, before the workflow is run. A dynamic alternative is to provide the EPR of the SLA during execution by passing it to an assigned input port.

Finally, the user can decide to use a predefined project account. For that, a connection to a client management service is needed in order to discover available private accounts. Normally the user only has to specify the HOST:PORT of the client management account service.

Figure: Discover client management service

After connecting to a client management account, existing private accounts are displayed for selection.

Figure: Selecting an account

The URLs of different client management services used so far are stored in

TAVERNA_HOME/conf/recent.gria.client.mgt.service

Lifecycle

The life cycle section applies to data stagers and jobs that are created during workflow execution. It allows the user to specify whether these remote resources should outlive the execution of the workflow, or whether they should instead be finished (i.e. destroyed) at the end of workflow execution.

Authorisation

This functionality is not used or available in the current release.

Log files


Log files for plugins can be found in the file:

TAVERNA_HOME/plugins/logs/plugins.log
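
When preparing a support query or bug report on Linux, the most recent part of this log can be extracted from a shell. The following is a minimal sketch; the function name collect_plugin_log and the default of 200 lines are arbitrary choices made for illustration.

```shell
# Hypothetical helper: print the last lines of the plugin log so that
# they can be attached to a support query or bug report.
collect_plugin_log() {
    taverna_home="$1"
    lines="${2:-200}"   # number of trailing lines to print (default 200)
    log="$taverna_home/plugins/logs/plugins.log"
    if [ ! -f "$log" ]; then
        echo "no log file at $log" >&2
        return 1
    fi
    tail -n "$lines" "$log"
}
```

For example, collect_plugin_log "$TAVERNA_HOME" > plugins-log-excerpt.txt writes the excerpt to a file in the current directory.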

5. Support

This section describes what to do if you require support installing or using the software.
Please direct support requests and bug reports appropriately according to the component to which the request relates and as detailed below.


Support requests and bug reports for GRIA Workflow Plugins and GRIA should be directed to support@gria.org

For Taverna workbench-related support requests, please refer to the Taverna documentation for support details. However, in general the taverna-users@lists.sourceforge.net mailing list is appropriate for user-level support.

A summary of the support e-mail addresses is provided below.

Software component       Support e-mail address
GRIA Workflow Plugins    support@gria.org
GRIA                     support@gria.org
Taverna                  taverna-users@lists.sourceforge.net