OR/14/022 Description of dynamic (run-time) approaches

From Earthwise
Barkwith A K A P, Pachocka M, Watson C, Hughes A G. 2014. Couplers for linking environmental models: Scoping study and potential next steps. British Geological Survey Internal Report, OR/14/022.

Atmospheric

Runtime coupling of environmental models is important for capturing the many feedbacks that exist between Earth systems. This section of the report details coupling software used in the atmospheric sciences. Where the software has been used within a project, the coupling component tends to be formed from two distinct parts: the coupler, which communicates with different model components, and the modelling framework, the architecture in which the coupler can operate. As atmospheric systems are tightly coupled with the Earth surface, many of the coupling frameworks encompass land and ocean modelling components.

Many of the approaches used to produce coupled systems share common data transfer methods. In general, an active component needs data from (get or pull), and provides data to (set or put), the coupler, while data driven components read data during runtime and then provide that data to the coupler. Set (put) is typically a non-blocking communication, meaning that the calling code does not wait for a set to complete before proceeding. Get (pull) is blocking, so the receiver may have to wait until a sender puts the requested data. Initialise, Run, Finalise (IRF) is used to describe the life-cycle of a model component within the modelling framework (Figure 1). Initialise establishes the internal state of a component (e.g. opening a file for reading, or creating a database connection), Run provides the implementation logic of the component, where input is transformed to output, and Finalise provides a final cleanup after model execution. Dynamic data exchange between model components usually occurs during the run phase. The Message Passing Interface (MPI) is another standardised method commonly employed in dynamic model coupling. MPI is a language-independent communications protocol used to program parallel computers, which supports point-to-point and collective communication.
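The IRF life-cycle and the put/get semantics described above can be illustrated with a minimal Python sketch. It is not taken from any of the frameworks reviewed here: the `Coupler`, `Component`, `Atmosphere` and `Land` classes, the `drive` function and the "rain" field are all hypothetical, and threads stand in for the separate processes that a real system would use.

```python
import queue
import threading


class Coupler:
    """Toy coupler holding one FIFO mailbox per field name (hypothetical design)."""

    def __init__(self):
        self._boxes = {}
        self._lock = threading.Lock()

    def _box(self, field):
        with self._lock:
            return self._boxes.setdefault(field, queue.Queue())

    def put(self, field, value):
        # Non-blocking: the sender deposits the value and carries on.
        self._box(field).put(value)

    def get(self, field, timeout=5.0):
        # Blocking: the receiver waits until a sender has put the field.
        return self._box(field).get(timeout=timeout)


class Component:
    """Base class exposing the Initialise, Run, Finalise life-cycle."""

    def __init__(self, coupler):
        self.coupler = coupler

    def initialise(self):
        pass

    def run(self, step):
        pass

    def finalise(self):
        pass


class Atmosphere(Component):
    def run(self, step):
        # Produce a made-up rainfall value and hand it to the coupler.
        self.coupler.put("rain", 1.0 + step)


class Land(Component):
    def initialise(self):
        self.storage = 0.0

    def run(self, step):
        # Wait for rainfall from the coupler, then update the internal state.
        self.storage += self.coupler.get("rain")


def drive(component, n_steps):
    component.initialise()
    for step in range(n_steps):
        component.run(step)
    component.finalise()


coupler = Coupler()
atmosphere, land = Atmosphere(coupler), Land(coupler)
threads = [threading.Thread(target=drive, args=(c, 3)) for c in (land, atmosphere)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(land.storage)
```

The land component is started first on purpose: its first get simply waits until the atmosphere has put a value, which is the blocking behaviour described in the text.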

Figure 1 A typical dynamic interaction between an ensemble component using the IRF method.

CESM-CPL7 (Framework and Coupler)

Overview

The Community Earth System Model (CESM) framework is used by researchers at the University Corporation for Atmospheric Research (UCAR) and the National Center for Atmospheric Research (NCAR) to couple land, sea, ice and atmospheric models using the CESM coupler CPL7 (Figure 2). The CESM replaces the previous Community Climate System Model (CCSM) modelling framework. CPL7 is designed to synchronise component time-stepping within the framework, manage component data communication, conservatively map data between component grids, and compute fluxes between components. While the processor configuration is relatively flexible and components can be run sequentially or concurrently, the sequencing of components in the driver (main CESM program) is fixed and independent of the processor layout. CESM components are called via the standard IRF method. The framework description used in this report is modified from Craig (2011)[1].

Figure 2 The basic CCSM framework with the CPL coupler timing controlled by the driver.

The CESM architecture is composed of a single executable with a high-level driver (Figure 2). The driver handles coupler sequencing, model concurrency, and communication of data between components. The driver directly calls the CPL7 coupler methods (for re-gridding, rearranging, merging, an atmosphere-ocean flux calculation, and diagnostics), which are run on a subset of processors essentially as a model component.

The standard CESM component model interfaces are based upon the ESMF design. Each component provides an IRF method with consistent arguments. As part of initialisation, an MPI communicator is passed from the driver to the component, and grid and decomposition information is passed from the component back to the driver. The driver and coupler acquire information about resolution, configurations, and processor layout at run-time from either a file or from communication with components.

In CESM, parts of the Model Coupling Toolkit (MCT) have been adopted at the driver-level, where they are used directly in the component IRF interfaces. In addition, MCT is also used for all data rearranging and re-gridding (interpolation) executed by the coupler.

The CESM driver manages the main clock in the system. That clock advances at the shortest coupling period and uses alarms to trigger component coupling and other events. In addition, the driver maintains a clock that is associated with each component. The standard implementation for grids in CESM has been that the atmosphere and land models are run on identical grids and the ocean and sea ice models are run on identical grids. An ocean model mask is used to derive a complementary mask for the land grid, such that for any given combination of atmosphere-land and ocean-ice grids there is a unique land mask. This approach for dealing with grids is still used the majority of the time in CESM; however, it is possible to separate the atmosphere and land grids.
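The clock-and-alarm idea can be sketched in a few lines of Python. This is only an illustration of the principle, not CESM driver code: the function name `run_driver`, the component labels and the coupling periods are hypothetical, and the sketch assumes every coupling period is a whole multiple of the shortest one.

```python
def run_driver(coupling_periods, run_length):
    """Advance a main clock and report which component alarms ring at each tick.

    coupling_periods maps a component name to its coupling period in seconds.
    Assumes each period is a whole multiple of the shortest period.
    """
    step = min(coupling_periods.values())  # main clock uses the shortest period
    events = []
    for now in range(step, run_length + 1, step):
        # An alarm rings when the clock reaches a multiple of the component period.
        ringing = [name for name, period in coupling_periods.items()
                   if now % period == 0]
        events.append((now, ringing))
    return events


# Hypothetical configuration: the ocean couples half as often as the others.
periods = {"atm": 1800, "lnd": 1800, "ice": 1800, "ocn": 3600}
events = run_driver(periods, 3600)
for now, ringing in events:
    print(now, ringing)
```

With these values the atmosphere, land and sea ice alarms ring at every tick, while the ocean alarm rings only at every second tick.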

Process

CESM consists of both data driven and active components. In general, an active component needs data from (get or pull), and provides data to (set or put), the coupler, while data driven components read data during runtime and then provide that data to the coupler. There are seven basic processor groups in the CESM framework: the atmosphere, land, ocean, sea ice, land ice, coupler, and global groups. Each of the seven processor groups can be distinct, but that is not a requirement of the system.

System initialisation is relatively straightforward. Firstly, the seven MPI communicators are computed in the driver. Then the atmosphere, land, ocean, sea ice, and land ice model initialisation methods are called on the appropriate processor sets; an MPI communicator is passed to each component, and grid and decomposition information is passed back to the driver. Once the driver has all the grid and decomposition information from the components, various re-arrangers and re-gridding routines are initialised that will move data between processors, decompositions, and grids as needed at the driver level. The driver derives all MPI communicators at initialisation and passes them to the component models for use. There are two issues related to whether the component models run concurrently. The first is whether unique chunks of work are running on distinct processor sets. The second is the sequencing of this work in the driver. CESM driver sequencing has been implemented to maximise the potential amount of concurrency of work between different components. However, the active atmosphere model cannot run concurrently with the land and sea-ice models.

Data exchange

Active data exchange within the CESM may only occur through the coupler. Typically, two-dimensional gridded datasets are passed. Exchanged data must conform to a specific unit convention. A list of time variant and time invariant data exchange items may be found in Kauffman et al. (2004)[2]. Exchanged items are passed to the coupler as a set of output fields, where fluxes may be calculated. The coupler then provides a set of input fields for the receiving system component to read at the following timestep. Input flux fields handled by the system components are understood to apply over a set interval, otherwise the conservation of fluxes is lost. For example, if the atmospheric component communicates once per hour, but takes four internal time steps, the hourly precipitation received by the atmospheric component needs to be averaged internally over the four time steps.
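The conservation argument can be checked with a short worked example. The sketch below is one reading of the requirement rather than CESM code: the function names and flux values are hypothetical. It shows that a flux which is time-averaged over a coupling interval, and then applied at each of four internal time steps, delivers the same total as the original step-by-step fluxes.

```python
import math


def time_average(fluxes):
    """Mean flux over the internal time steps of one coupling interval."""
    return sum(fluxes) / len(fluxes)


def accumulate(flux, dt, n_steps):
    """Total quantity delivered by a constant flux applied over n_steps of length dt."""
    total = 0.0
    for _ in range(n_steps):
        total += flux * dt
    return total


coupling_interval = 3600.0          # seconds, one exchange per hour
n_sub = 4                           # internal time steps per coupling interval
dt = coupling_interval / n_sub      # 900 s

# Hypothetical precipitation fluxes (kg m-2 s-1) at each internal time step.
sub_step_fluxes = [0.0, 2.0e-4, 4.0e-4, 2.0e-4]

sent = sum(f * dt for f in sub_step_fluxes)       # what the sender produced
mean_flux = time_average(sub_step_fluxes)         # what is exchanged
received = accumulate(mean_flux, dt, n_sub)       # what the receiver applies

print(sent, received)
```

If the receiver instead applied the exchanged flux over a different interval, for example over a single 900 s step, the two totals would no longer match and the flux would not be conserved.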

OASIS3-MCT_2.0 (Framework and Coupler)

Overview

The framework description for OASIS3-MCT_2.0 is modified from Valcke et al. (2013)[3]. In 1991, CERFACS started the development of a software interface to couple existing ocean and atmosphere numerical General Circulation Models. OASIS3-MCT_2.0 is interfaced with the MCT, developed by the Argonne National Laboratory in the USA. MCT implements fully parallel re-gridding and parallel distributed exchanges of the coupling fields based on pre-computed re-gridding weights and addresses. MCT has proven parallel performance and is also the underlying coupling software used in the CESM.

Low model component intrusiveness, portability and flexibility were key concepts when designing OASIS3-MCT_2.0. The software itself may be envisaged as a coupling library that needs to be linked to the component models, the main function of which is to interpolate and exchange the coupling fields between them to form a coupled system. OASIS3-MCT_2.0 supports coupling of 2D logically-rectangular fields, but 3D fields and 1D fields expressed on unstructured grids are also supported using a one-dimensional degeneration of the structures.

Process

The use of the MCT allows all transformations, including re-gridding, to be executed in parallel. All couplings are executed in parallel directly between the components via MPI. In addition to this, OASIS3-MCT_2.0 also supports file input and output (I/O) using the NetCDF file standard. To communicate with another model, or to perform I/O actions, a component needs to include specific calls to the OASIS3-MCT_2.0 coupling library. Information about the resolution, configurations, and processor layout may be gathered at run-time from either a file or from communication between components.

With OASIS3-MCT_2.0, time transformations are supported more generally with use of the coupling restart file. The coupling restart file allows the partial time transformation to be saved at the end of a run for exact restart at the start of the next run.

Data exchange

Using the OASIS3-MCT_2.0 coupling library, the user has the ability to use differing coupling algorithms. In the components, the set and get routines can be called at each model timestep, with the appropriate date argument giving the actual time at the beginning of the timestep. This time argument is automatically analysed by the coupling library and, depending on the coupling period and lag value chosen by the user, for each coupling field, different coupling algorithms can be reproduced without modifying the component model codes themselves.

The lag value tells the coupler to modify the time at which that data is sent (set) by the amount of lag. The lag can be positive or negative, but should never be larger than the coupling period of any field due to problems with restartability and dead-locking. When a component model calls set, the value of the lag is automatically added to the value of the date argument and the set is actually performed when the sum date+lag is a coupling time; in the target component, this set will match a get for which the date argument is the same coupling time. The lag only shifts the time data is sent and cannot be used to shift the time data is received.
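The lag rule can be written out as a small Python sketch. This only illustrates the arithmetic described above and is not the OASIS3-MCT interface: the function names `put_is_sent` and `coupling_exchanges` are hypothetical, and all times are in seconds from the start of the run.

```python
def put_is_sent(date, lag, coupling_period):
    """A set is actually performed only when date + lag is a coupling time."""
    return (date + lag) % coupling_period == 0


def coupling_exchanges(timestep, lag, coupling_period, run_length):
    """List (date of the set call, coupling time matched by the get) pairs.

    The set routine is called at the start of every model timestep; the
    library decides from date + lag whether anything is really sent.
    """
    return [(date, date + lag)
            for date in range(0, run_length, timestep)
            if put_is_sent(date, lag, coupling_period)]


# Hypothetical set-up: 30 minute timestep, hourly coupling, two hour run.
no_lag = coupling_exchanges(timestep=1800, lag=0, coupling_period=3600,
                            run_length=7200)
with_lag = coupling_exchanges(timestep=1800, lag=1800, coupling_period=3600,
                              run_length=7200)
print(no_lag)    # sets at dates 0 and 3600 match gets at 0 and 3600
print(with_lag)  # sets at dates 1800 and 5400 match gets at 3600 and 7200
```

Changing only the lag moves the exchange from the start of the coupling period to its final timestep, which is how different coupling algorithms can be obtained without editing the component code.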

The order of coupling operations in the system is determined solely by the order of calls to send (set) and receive (get) data in the models, in conjunction with the setting of the lag. Data that is received (get) is always blocking, while data that is sent (set) is non-blocking with respect to the model making that call. It is possible to deadlock the system if the relative orders of sets and gets in different models are not compatible. OASIS3-MCT provides the coupling layer with the ability to detect a potential deadlock before it happens and exit. It does this by tracking the order of get and set calls in the models.
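The way incompatible call orders lead to deadlock can be shown with a small simulation. The sketch below is not the detection algorithm used by OASIS3-MCT, which is not described in this report. It is a hypothetical check, `find_deadlock`, that replays each model's sequence of set and get calls, treating sets as non-blocking and gets as blocking.

```python
from collections import Counter


def find_deadlock(programs):
    """Replay the call order of each model and report any blocked gets.

    programs maps a model name to its ordered list of ("put" | "get", field)
    calls. Returns None if every model can finish, otherwise a dict mapping
    each stuck model to the field it is waiting for.
    """
    position = {model: 0 for model in programs}
    in_flight = Counter()            # fields that have been put but not yet got
    progressed = True
    while progressed:
        progressed = False
        for model, ops in programs.items():
            while position[model] < len(ops):
                kind, field = ops[position[model]]
                if kind == "put":
                    in_flight[field] += 1      # non-blocking send
                elif in_flight[field] > 0:
                    in_flight[field] -= 1      # matching data has arrived
                else:
                    break                      # blocking get must wait
                position[model] += 1
                progressed = True
    blocked = {model: ops[position[model]][1]
               for model, ops in programs.items()
               if position[model] < len(ops)}
    return blocked or None


# Each model sends before it receives: the orders are compatible.
compatible = {"atm": [("put", "heat"), ("get", "sst")],
              "ocn": [("put", "sst"), ("get", "heat")]}
# Each model waits to receive before it sends: neither can ever proceed.
incompatible = {"atm": [("get", "sst"), ("put", "heat")],
                "ocn": [("get", "heat"), ("put", "sst")]}

print(find_deadlock(compatible))
print(find_deadlock(incompatible))
```

In the second configuration both models block on their first get, so no set is ever reached, which is the situation the text describes.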

FLUME (Framework)

Overview

The UK Met Office's Flexible Unified Model Environment (FLUME) project created a coupling framework for the Unified Model (UM) system. The framework separates infrastructure and scientific code: scientific code is modularised and infrastructure code is generated during the project.

Components, such as an ocean model or a particular sea-ice model, and support systems, such as those providing for restart and diagnostic output, are composed to form a set of communicating processes which combine to create a weather or climate simulation. The coupled components communicate through the FLUME communications interface using the set-get method. The remainder of the framework description is modified from Ford and Riley (2003)[4].

Process

The sequencing and execution rates of components and couplers must be specified. Data from a number of components may have to be combined, with the appropriate coupler, in order to satisfy the requirements of the receiving component. In addition the definition of the coupling intervals between components is required. Couplers are called from the high-level framework driving code and therefore are similar in many aspects to the scientific components. The allocation of component implementation and coupler functionality to executable files, and their deployment on a set of available computing resources, must also be provided.

The layered framework approach for the coupling system is shown in Figure 3. The control layer invokes model components at a rate consistent with the coupling intervals defined in the composition environment. The control code implements the sequencing of the models both sequentially and concurrently depending on requirements.

In Figure 3, the intra-component communication, which is a consequence of models exploiting parallel implementation, is shown at the bottom of the layered architecture. This reflects the current implementation choice for Met Office models, where such communication takes place from within a component. Inside the top level call, each component and coupler performs the exchange before and after calls to the component implementation routines.

Data exchange

There are two options available for the inter-model communication mechanisms that implement coupling exchanges. Arbitrary placement of communications uses asynchronous set and get functions, which may be placed anywhere within a model. The alternative method is to layer the placement of communications. Under this method the model should be implemented as a subroutine and communication should only occur through an argument list. In this scenario, communication is through a higher layer function placed in the control (driving) layer.
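The layered placement option can be sketched as follows. This is an illustration only and not FLUME code: the function names, the relaxation formula and the numerical constants are all hypothetical. The point is that the science routines contain no communication calls, and all exchange happens in the control layer through argument lists.

```python
def atmosphere_step(sst):
    """Science code: returns a surface heat flux (W m-2) from the sea surface temperature.

    Hypothetical relaxation towards a 15 degree C air temperature.
    """
    return 10.0 * (15.0 - sst)


def ocean_step(sst, heat_flux, dt):
    """Science code: updates the sea surface temperature from the heat flux.

    Communicates only through its argument list and return value.
    """
    heat_capacity = 4.0e6  # J m-2 K-1, hypothetical mixed layer value
    return sst + heat_flux * dt / heat_capacity


def control_layer(n_steps, dt, sst=10.0):
    """Control (driving) layer: sequences the models and passes coupling data."""
    for _ in range(n_steps):
        heat_flux = atmosphere_step(sst)        # output of one model ...
        sst = ocean_step(sst, heat_flux, dt)    # ... becomes input to the next
    return sst


final_sst = control_layer(n_steps=3, dt=3600.0)
print(final_sst)
```

Under arbitrary placement, by contrast, `ocean_step` would itself call a get function for the heat flux somewhere inside its body.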

Figure 3 Layered architecture for FLUME.

FLUME defines five types of input and output data:

  • Initial input control data: this data is used to configure a model, i.e. set its ‘knobs’ and ‘switches’.
  • Initial input data: this data is used to provide initial conditions to prognostic fields (fields which are internally calculated by a model and whose state is maintained across timesteps) and to initialise any constant data.
  • Coupling input and output data: this input data is produced externally to the model and changes over timesteps; this output data provides external data to other models, which also changes over timesteps.
  • Diagnostic output data: this data is used by scientists to determine the behaviour of the model.
  • Restart dump (checkpoint) output data: this data is used to store the model's state at intermediate steps in a simulation so that, if an error occurs, the simulation can be restarted from the latest checkpoint rather than from its initial conditions.

Data required to start a model must be specified in a model's initialisation phase and, by association, the same data must also be specified in the dump phase. However, whether this data includes coupling data or not is a design choice. This document suggests (and makes the assumption that) coupling data is not specified as input or output in the init and dump phases respectively. There are two reasons for this: (1) it reduces the number of ‘get’ calls that need to be maintained; and (2) coupling get calls always return data (in the alternate case the first coupling get call after initialisation may need to ‘silently’ return without modifying the data).

OpenPALM (Coupler)

Overview

PALM is a coupler designed to dynamically combine different components into a high performance application. PALM was originally developed for operational oceanographic data assimilation in the framework of the French MERCATOR project. The PALM driver supports the dynamic launching of the coupled components, while its coupling library ensures the parallel data exchanges between the components. PALM also provides pre-defined algebra units. This PALM coupler description is modified from Valcke and Morel (2006)[5].

In 2003 the final version of the PALM coupler, PALM_MP, was released. PALM_MP allows independent programs to work together, dealing with different data and different parts of the algorithm. The use of MPI-2 for the passing of data makes this possible. In PALM_MP, components can be fully independent programs or, for optimisation reasons, subroutines of higher level entities called blocks. These recent developments allow the PALM coupler to operate on massively parallel architectures as well as integrate advanced interpolation methods. The latter are considered important as surface and volume interpolation models are needed to pass information between solvers at differing spatio-temporal scales.

A PALM application can be described as a set of computational units arranged in a coupling algorithm. The different units are controlled by conditional and iterative constructs and belong to algorithmic sequences called computational branches. A branch is structured like a program in a high level programming language: it allows the definition of sequential algorithms. Inside a branch, the coupled components are invoked as if they were subroutines of the branch program.

Process

PALM introduced the dynamic coupling approach, in which a coupled component can be launched, and can release resources upon termination, at any moment during the simulation. The originality of this coupler resides in the ability to describe complex coupling algorithms. Programs, parallel or not, can be executed in loops or under logical conditions. Computing resources, such as the required memory and the number of concurrent processors, are handled by the PALM coupler. A component of the coupled system is only initialised when needed, reducing memory and processor use when inactive. With a static coupler, all the coupled programs would have to start simultaneously at the beginning of the simulation, occupying memory and CPU resources from the beginning to the end of the application. The concept of dynamic coupling came from the observation that different data assimilation algorithms can be obtained with different execution sequences of the same basic units and operators. In PALM, a dynamic coupling algorithm is composed of basic pieces of code, the components themselves and assembled components in different execution sequences (branches). Simulations may be started or stopped dynamically during the run.

The user defines and provides the elementary units, thereby fixing the scale of the coupling. Each component is a piece of code that must be instrumented by the user with a PALM wrapper. Each unit can consume and/or produce data, which are called objects, via the implementation of the get-set primitives. All the objects that a component can request or provide must be described in the component code by comment lines following a pre-defined syntax, which contain the object metadata. Modularity is ensured by the end-point communication principle: i.e., there is no reference to the origin of the input or to the destination of the output in the component code.
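The end-point communication principle can be illustrated with a short Python sketch. It does not use the PALM library: the `Bus` class, the unit names and the object names are hypothetical. Each unit names only its own outputs and inputs, and the routing between them is held in a separate table, much as links are defined outside the component code in PALM.

```python
class Bus:
    """Toy exchange layer: routing is defined outside the units (hypothetical)."""

    def __init__(self, links):
        # links maps (producer unit, object name) to a list of
        # (consumer unit, object name) end points.
        self.links = links
        self.inbox = {}

    def put(self, unit, name, value):
        # The producer does not know who, if anyone, receives the object.
        for target in self.links.get((unit, name), []):
            self.inbox[target] = value

    def get(self, unit, name):
        # The consumer does not know where the object came from.
        return self.inbox.pop((unit, name))


def forecast_unit(bus):
    """Produces an object called 'state'; no destination is named."""
    bus.put("forecast", "state", [1.0, 2.0, 3.0])


def analysis_unit(bus):
    """Consumes an object called 'background'; no origin is named."""
    background = bus.get("analysis", "background")
    return [x + 0.5 for x in background]


# The coupling itself: the forecast 'state' feeds the analysis 'background'.
links = {("forecast", "state"): [("analysis", "background")]}
bus = Bus(links)
forecast_unit(bus)
result = analysis_unit(bus)
print(result)
```

Rewiring the application, for example to feed the analysis from a different producer, would only require a change to the `links` table and not to either unit.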

Data exchange

The execution of the coupled components is driven by a scheduler that allocates the computational resources according to the algorithm flow, the priorities and the limitations set by the user. At run time, the PALM driver ensures the execution and synchronisation of the different components, compiled by the user, following the sequence of actions defined in PrePALM.

The PrePALM package allows users to choose the elementary components to be coupled, which appear as individual boxes on the PrePALM GUI, and to define their execution sequences (branches). PrePALM analyses the different component codes and identifies the potential data input and output. To establish an exchange of information between components, the user links the output of one component to the input of another component; a pop-up appears on the link which allows the user to specify the different exchange parameters, such as the times of exchange. PrePALM also provides supervision tools such as a performance analyser and runtime monitoring.

Summary

Atmospheric modelling frameworks for the coupling of Earth system components provide an attractive option for integrated modelling within the BGS. Many contain a land surface component as part of a coupled atmosphere-land-ocean system. However, these frameworks have little flexibility in terms of linking components within the land surface, as is often required in the coupled environmental modelling research we undertake. The coupling technology for the majority of these models is based on the MCT (Model Coupling Toolkit), a set of open-source software tools for creating coupled models. MCT is fully parallel and can be used to couple message-passing parallel models to create a parallel coupled model. The passing of data is most commonly performed using the MPI (Message Passing Interface) standard, where data is moved from the address space of one process to that of another process through cooperative operations. Due to the complexity of atmospheric modelling frameworks, the ability to restart model composition runs from a saved point is highly desirable. As integrated environmental modelling within the BGS advances and becomes increasingly complex, this ability to restart model compositions will also benefit future modelling. If BGS were to further develop a model coupling system, the Met Office FLUME project would be of interest, as the process of development and background research is freely available.

Hydrological

While a large number of couplers exist, a few of them clearly emerge as the most prominent. We will take a closer look at these couplers, also mentioning those that have potential for linking different modelling frameworks. The report is primarily concerned with technologies that can be used to couple models from the same realm. However, web services that can be used to link hydrology and climate models, or to link models and databases, are also considered. The section on couplers is split into two parts: the first part describes couplers that can be deployed on lower level computing platforms such as desktops, and the second part describes those that are specifically designed for high performance computing (HPC).

Software suitable for desktop applications

OpenMI

The Open Modelling Interface (OpenMI) Standard was established by a consortium of 14 organisations from seven countries, in the course of the HarmonIT project co-funded through the European Commission’s Fifth Framework programme (Moore et al., 2010[6]). It was originally developed to address the Water Framework Directive's call for integrated water resources management at the catchment level (Moore and Tindall 2005[7]); however, its application was later extended to other domains of environmental management (OATC 2010a[8]). OpenMI is maintained and promoted by the OpenMI Association (OpenMI 2013[9]), and is supported by the FluidEarth initiative of HR Wallingford (FluidEarth 2013[10]), which provides tools for robust model integration, e.g. the FluidEarth2 Toolkit. OpenMI is equipped with a GUI (the OpenMI Configuration Editor), which facilitates creating and running compositions (Goodall et al., 2011[11]).

Components in OpenMI are called 'Linkable Components' (Lu 2011[12]) and their architectural design follows the initialise/run/finalise cycle (Lawrence et al., Manuscript[13]). They must be accompanied by metadata provided in the form of XML files (OATC 2010a[8]) and encoded using either VB.Net or C# (Lu 2011[12]). Models written in other languages (e.g. Fortran, C, C++, F#, Matlab) can be integrated in OpenMI after implementing appropriate wrappers (OATC 2010a[8]). A number of tools are available to assist users in developing their applications, including wrappers, which are provided in the form of code libraries (Software Development Kits or SDKs) (OATC 2010a[8]). A set of interfaces needs to be implemented to make a component OpenMI-compliant (OATC 2010a[8]), with the central one being 'ILinkableComponent' (OATC 2010b[14]).

The primary data structure is the 'ExchangeItem', which can be of two different types: 'InputExchangeItem' and 'OutputExchangeItem' (Saint and Murphy 2010[15]). Each ExchangeItem is described by a 'Quantity' and an 'Elementset' (Lu 2011[12]). A Quantity contains metadata of a variable, while an Elementset provides its spatial information (Lu 2011[12]). To enable linking of data expressed in different units, each Quantity is provided with a conversion formula to standard SI system units (OATC 2010b[14]). Elementsets contain references to the coordinate system used, which allows mapping between different systems (OATC 2010b[14]).

OpenMI was designed to exchange data on a time basis (i.e. time stamp or time span); however, the exchange of data between temporal and non-temporal components (e.g. databases, data analysis tools) is also possible (OATC 2010a[8]). The communication mechanism is based on a request-reply mechanism (the 'pull driven' approach) (Lu 2011[12], OATC 2010a[8]). A component only progresses if another component requests data from it via the 'GetValues' method (OATC 2010a[8]). A data request invokes the 'Update' function on the called component, which triggers the next time step computation. The produced output may have to be modified before being returned to the calling component, to provide for differing grids (regridding) or time steps (interpolation, extrapolation) (OATC 2010b[14]). Essentially, “components in OpenMI are connected in a chain and invoking the Update method on the last component in the chain triggers the entire stack of data exchange” (OATC 2010b[14]).
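The pull-driven chain can be sketched in Python as follows. This is an illustration of the request-reply idea only and not the OpenMI interface, which is defined for .NET: the class `LinkableComponent` here is a hypothetical stand-in, its `get_values` and `update` methods are merely modelled on the 'GetValues' and 'Update' names above, and the storage formula is invented. The sketch also omits the regridding and time interpolation that OpenMI can apply to returned values.

```python
class LinkableComponent:
    """Toy pull-driven component (hypothetical, not the OpenMI API)."""

    def __init__(self, name, time_step, provider=None):
        self.name = name
        self.time_step = time_step
        self.provider = provider      # upstream component, if any
        self.current_time = 0.0
        self.value = 0.0
        self.log = []                 # times at which this component computed

    def update(self):
        """Compute one time step, pulling input from the provider first."""
        target = self.current_time + self.time_step
        if self.provider is not None:
            inflow = self.provider.get_values(target)
        else:
            inflow = 1.0              # constant forcing at the head of the chain
        self.value = 0.5 * self.value + inflow
        self.current_time = target
        self.log.append(target)

    def get_values(self, requested_time):
        """Reply to a request, advancing only as far as the request requires."""
        while self.current_time < requested_time:
            self.update()
        return self.value


# Chain: rainfall -> runoff -> river; the river uses a longer time step.
rainfall = LinkableComponent("rainfall", time_step=1.0)
runoff = LinkableComponent("runoff", time_step=1.0, provider=rainfall)
river = LinkableComponent("river", time_step=2.0, provider=runoff)

river.update()   # a single call on the last component drives the whole chain
print(river.log, runoff.log, rainfall.log)
```

Nothing upstream runs until the river asks for data, and the upstream components take two steps each to reach the time requested by the river's single step.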

OpenMI is a very popular standard for linking hydrologic models. A significant number of prominent water resources models (e.g. MIKE SHE, MODFLOW, SWAT, ISIS, HEC-RAS) have been made OpenMI compliant (Graham et al., 2006[16], Gijsbers et al., 2010[17], ISIS 2013[18]), which indicates that it is widely adopted within the industry for integrated modelling.

OMS

The Object Modelling System (OMS) is open-source software for linking components by means of annotations (David et al., 2013[19], OMS 2013[20]). It was developed to support research within agricultural and natural resources management programmes administered by the US Department of Agriculture (USDA) (David et al., 2010[21]). OMS originates from the Modular Modelling System (MMS), one of the first coupling frameworks and a hybrid between a stand-alone model and a component-based modelling system (Lu 2011[12], David et al., 2013[19]). OMS employs new advances in software framework design and is described as lightweight and non-invasive. It supports implicit multi-threading, implicit scaling to cluster and cloud, domain specific languages, and interoperability with other frameworks (David et al., 2013[19]). Web services are enabled through specific annotations on the components (David et al., 2013[19]). Simulations are described using a mini-language, a Domain Specific Language (DSL) (David et al., 2010[21]); the simulation file lists all model components, defines connectivity, and provides parameter definitions (David et al., 2013[19]). A number of pre-defined simulation types are available, including: Shuffled Complex Evolution global search algorithm (for model calibration), Fourier Amplitude Sensitivity Test, Dynamically Dimensioned Search parameter estimation, and Ensemble Streamflow Prediction (David et al., 2013[19]). Models can be executed on a number of different platforms, e.g. PC, cluster, or cloud (David et al., 2010[21]).

OMS is based on Java; however, it is interoperable with C, C++ and Fortran. Therefore, models written in these languages do not need to be changed (David et al., 2010[21]). The integration of components in OMS3 is achieved through the use of metadata annotations, encoded as declarations within XML files (Lu 2011[12]), ‘which specify and describe points of interest amongst data fields and class methods of the model’ (David et al., 2013[19]). The initialise/run/finalise cycle is maintained merely by tagging methods with the corresponding annotations, e.g. the compute method is tagged with '@Execute' (David et al., 2013[19]). Data exchange is described using '@In' and '@Out' annotations (David et al., 2013[19]). Components can be hierarchical and composed of progressively finer components (David et al., 2013[19]). The annotation approach facilitates capturing modelling metadata (e.g. units, ranges) and automatic generation of a component's documentation (David et al., 2013[19]).

In case of incompatible data types, units, resolution, or time step, the data can be transformed using a service provider interface (SPI) (David et al., 2013[19]).

Execution is multithreaded by design; no explicit definition of execution order is needed as it is defined by the flow of data (David et al., 2013[19]). Components are executed in parallel if all their input data is available (David et al., 2010[21]).
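The idea that execution order follows from the flow of data can be sketched in Python. This is not OMS code, which is Java-based and uses '@In', '@Out' and '@Execute' annotations: here the declared inputs and outputs are plain dictionary entries, the component names and formulas are hypothetical, and each group of ready components is run one after another rather than on separate threads.

```python
def run_dataflow(components):
    """Execute components in an order derived purely from their declared data.

    components maps a name to a dict with "in" (input field names), and
    "execute" (a function taking the inputs as keyword arguments and
    returning a dict of output fields).
    """
    available = {}               # fields produced so far
    pending = dict(components)
    waves = []                   # groups of components that could run in parallel
    while pending:
        ready = sorted(name for name, comp in pending.items()
                       if all(field in available for field in comp["in"]))
        if not ready:
            raise RuntimeError("some components have inputs that nothing provides")
        produced = {}
        for name in ready:
            comp = pending.pop(name)
            inputs = {field: available[field] for field in comp["in"]}
            produced.update(comp["execute"](**inputs))
        available.update(produced)
        waves.append(ready)
    return available, waves


# Hypothetical components; note that they are listed in no particular order.
components = {
    "runoff": {"in": ["precip", "et"],
               "execute": lambda precip, et: {"runoff": max(precip - et, 0.0)}},
    "snow": {"in": ["precip"],
             "execute": lambda precip: {"snow": 0.1 * precip}},
    "et": {"in": ["pet"],
           "execute": lambda pet: {"et": 0.5 * pet}},
    "climate": {"in": [],
                "execute": lambda: {"precip": 5.0, "pet": 2.0}},
}

fields, waves = run_dataflow(components)
print(waves)
print(fields["runoff"])
```

The 'et' and 'snow' components fall into the same group because both depend only on the climate output, so a multithreaded scheduler would be free to run them at the same time.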

There have been several hydrologic applications of OMS3 to date. The National Water and Climate Centre of the USDA Natural Resources Conservation Service (NRCS) used OMS3 to develop a modelling system for short term stream flow forecasting. The system is based on distributed physical process models and the Ensemble Streamflow Prediction (ESP) methodology. It provides capabilities for displaying selected ESP output traces, performing frequency analysis on the peaks/volumes, or weighting output traces based on climate signals (e.g. El Nino, La Nina, and Pacific Decadal Oscillation) (David et al., 2013[19]). Another example of an OMS application is the Agro-Ecosystem-Watershed model (AgES-W), a fully distributed model that simulates the hydrology of a large watershed. It consists of more than 80 Java-based components derived from a number of models, namely J2K-S, SWAT, RZWQM2, and WEPP, which are integrated using OMS (David et al., 2013[19]). OMS is also used in Northern and Central Africa for groundwater modelling studies using isotope tracing (David et al., 2013[19]).

In recent years USDA-NRCS has initiated the Cloud Services Innovation Platform (CSIP). CSIP employs OMS3 and various databases to support environmental modelling within the cloud environment. CSIP development is still ongoing but it already runs watershed scale models (David et al., 2013[19]).

TIME

The Invisible Modelling Environment (TIME) is a metadata-based framework developed within the Catchment Modelling Toolkit project in the Cooperative Research Centre for Catchment Hydrology (CRCCH) (Rahman et al., 2003[22]). CRCCH is currently a part of the eWater Cooperative Research Centre (CRC) — an organisation responsible for implementation of the Australian Government's National Hydrological Modelling Strategy (eWater CRC 2013).

The TIME architecture is based on a number of interacting layers, with each layer consisting of a number of components and a framework supporting the specific layer's function (Rahman et al., 2003[22]). The central layer is the Kernel, which contains definitions of metadata tags, the parent classes for models and data, and mechanisms for performing IO operations (Rahman et al., 2003[22]). The other layers include: the Model layer, which consists of all the modelling components; the Tools layer, which includes components for data and model processing and parameter optimisation; and the Visualisation and User Interface layer, which contains tools for data visualisation and user interaction (Rahman et al., 2003[22]).

Components can be encoded in one of the several .NET languages, e.g.: Visual Basic, Fortran 95, C#, C++, Visual J#; modelling systems can be composed of components written in different languages (Rahman et al., 2003[22]). All models are implemented as child classes inheriting from the Kernel's parent classes. Fields for inputs, outputs, parameters, and state variables are defined and documented using metadata tags (Rahman et al., 2003[22]).
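The pattern described above, a model implemented as a child class whose fields are tagged as inputs, outputs, or parameters, can be illustrated with a minimal sketch. TIME itself targets .NET languages, so the Python below is only an analogy; the names `Model`, `tagged`, and `RunoffCoefficientModel` are hypothetical and are not part of TIME.

```python
from dataclasses import dataclass, field, fields


def tagged(role, units, default=0.0):
    """Declare a model field carrying metadata tags (hypothetical helper)."""
    return field(default=default, metadata={"role": role, "units": units})


@dataclass
class Model:
    """Stand-in for a kernel parent class; not the actual TIME class."""

    def run_time_step(self):
        raise NotImplementedError

    def describe(self, role):
        """Return the names of the fields tagged with the given role."""
        return [f.name for f in fields(self) if f.metadata.get("role") == role]


@dataclass
class RunoffCoefficientModel(Model):
    """Toy child model: runoff is a fixed fraction of rainfall."""

    rainfall: float = tagged("input", "mm/day")
    coefficient: float = tagged("parameter", "-", default=0.3)
    runoff: float = tagged("output", "mm/day")

    def run_time_step(self):
        self.runoff = self.coefficient * self.rainfall


model = RunoffCoefficientModel(rainfall=10.0)
model.run_time_step()
print(model.describe("input"), model.runoff)
```

Because the tags travel with the class definition, generic tools can discover a model's inputs and outputs without any model-specific code, which is the benefit the metadata approach aims for.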

TIME supports a number of data types, e.g.: rasters, time series, points, lines, polygons, node link networks (e.g.: river systems), cross sections, arrayed data (Rahman et al., 2003[22]). Most data types are represented by two classes: a class storing the data values, and a class storing its spatial/temporal context (Rahman et al., 2003[22]). Alongside generic processing tools that act on all data types (e.g.: adding two objects together, statistics, and rule-based processing), a number of data type specific tools are available, e.g.: terrain analysis of rasters (Rahman et al., 2003[22]). Unit conversions are provided by the Unit component (Rahman et al., 2003[22]).

TIME was used to design a large number of integrated catchment modelling tools, mostly within the Catchment Modelling Toolkit project (Argent et al., 2009[23]). A prominent example of a TIME application is a decision support system (DSS) called E2 (Argent et al., 2009[23]). E2 offers a tailored approach to conceptualisation of catchment dynamics, providing for flexible representation of different processes through easily exchangeable model components (Argent et al., 2009[23]). A catchment in E2 is represented by sub-catchments, each of which can contain one or more Functional Units (FU). An FU is a portion of the sub-catchment that displays distinct characteristics and is therefore modelled using different models or parameterisation than the other parts of the sub-catchment (Argent et al., 2009[23]). TIME features a sophisticated calibration tool, which provides a number of unique capabilities, e.g.: parameters varying in proportion between FUs can be scaled during the calibration to maintain the proportions (Argent et al., 2009[23]). E2 software, a part of the Catchment Modelling Toolkit, has been used to construct over 20 water and environmental management DSSs (Argent et al., 2009[23]). An advanced version of the catchment hydrology and water quality DSS, built upon E2, was released in 2008 under the name ‘WaterCAST’ (Argent et al., 2009[23]).

Kepler

Kepler is an open-source desktop application for creating scientific workflows, which emerged from Ptolemy II (Kepler 2013[24]). Ptolemy II is a framework allowing for a number of different modes of execution, which was developed at the University of California at Berkeley and originally targeted at bioinformatics, computational chemistry, ecoinformatics, and geoinformatics (Kepler 2013a[25], Kepler 2013b[26]). Ptolemy II and Kepler are characterised by separation of workflow components from the workflow orchestration, which enables direct reusability of components (Kepler 2013b[26]). Workflows can be executed either from the GUI or from a command line (Kepler 2013[24]). Each component is represented graphically in the GUI by an icon reflecting its function (Kepler 2013a[25]). Kepler features a library of more than 530 ready-made components (Kepler 2013b[26]), which facilitate a number of tasks, among others: remote data access, processing, analysis and visualization; transformations for syntactically incompatible components; GIS processing; execution of command line applications; statistical analysis using R or Matlab; web services invocation; cluster and grid computing, execution and monitoring (Goodall et al., 2011[11], Kepler 2013b[26], Kepler 2013[24]). Kepler is maintained for Windows, OSX, and Linux operating systems (Kepler 2013[24]).

A Kepler workflow is composed of components, called actors, each performing a different function. A director is a special type of actor that controls (directs) the execution of a workflow. Workflows can have a number of sub-workflows (also called composite actors), each comprising a collection of actors that performs a complex embedded task and each controlled by its own director (Kepler 2013a[25]). Kepler is developed in Java; however, components written in other languages can be adopted by using wrappers (Kepler 2013b[26]).

Workflows pipe the output of one component to an input of another component. Library actors facilitate data transformations for syntactically incompatible components. Data is exchanged via ports; there are three types of ports: input, output, and input/output. Ports are configured to specify the type of data that they accept and to indicate if they are 'singular' or 'multiple'. A singular port can only be connected to one actor, whereas a multiple port can be connected to many actors. In the latter case, data can be sent to a number of different places in the workflow, e.g.: a different actor for further processing and a display actor to visualise the data at a specific reference point (Kepler 2013a[25]).

Workflow execution can be synchronous or parallel, depending on the type of director used. A small set of directors come pre-packaged with Kepler, including: Synchronous DataFlow (SDF), Process Networks (PN), Dynamic Dataflow (DDF), Continuous Time (CT), and Discrete Events (DE) (Kepler 2013a[25], Kepler 2013b[26]). The SDF director is used to oversee simple, sequential workflows, in which the data consumption and production rate is constant and declared (Kepler 2013b[26]). The PN director is used for workflows that are driven by data availability: an actor is executed once it has collected all the required inputs. Being loosely coupled, workflows of this kind are good candidates for parallel and distributed computing. The DE director oversees workflows in which events occur at discrete times and is well suited to modelling time-oriented systems. The CT director is designed to oversee workflows that predict how systems evolve as a function of time. Rates of change in such systems are described by differential equations, and each workflow execution is simply one time step of a numerical integration. Like the SDF director, the DDF director executes a workflow in a single thread. However, data production and consumption rates can change as the workflow executes. It is a good choice for workflows that use Boolean switches, if-then-else statements, branching, or that require data-dependent iterations (Kepler 2013b[26]).
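The data-availability rule described for the PN director (an actor runs once all of its inputs have arrived) can be sketched in a few lines. This is a single-threaded illustration of the firing rule only, not Kepler's Java implementation; the `Actor` class and the `direct` function are hypothetical names introduced for the example.

```python
from collections import deque


class Actor:
    """A workflow component with named input ports and one output."""

    def __init__(self, name, func, inputs):
        self.name = name
        self.func = func
        self.queues = {port: deque() for port in inputs}
        self.targets = []  # (actor, port) pairs fed by this actor's output

    def connect(self, actor, port):
        self.targets.append((actor, port))

    def ready(self):
        # True when at least one token is waiting on every input port
        return all(self.queues.values())

    def fire(self):
        args = {port: queue.popleft() for port, queue in self.queues.items()}
        result = self.func(**args)
        for actor, port in self.targets:
            actor.queues[port].append(result)
        return result


def direct(actors):
    """Fire any actor whose inputs are all available until none is ready."""
    log = []
    progress = True
    while progress:
        progress = False
        for actor in actors:
            if actor.queues and actor.ready():
                log.append((actor.name, actor.fire()))
                progress = True
    return log


scale = Actor("scale", lambda x: x * 2, ["x"])
offset = Actor("offset", lambda x: x + 1, ["x"])
total = Actor("total", lambda a, b: a + b, ["a", "b"])
scale.connect(total, "a")
offset.connect(total, "b")

for actor in (scale, offset):  # the same input value is sent to two actors
    actor.queues["x"].append(5)

log = direct([total, scale, offset])
```

Although `total` is listed first, it only fires after both upstream actors have delivered their outputs, so the execution order follows the data rather than the order in which the actors were declared.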

There do not seem to be any hydrological applications of Kepler in the open literature. However, Kepler has been suggested as a means of performing web services orchestration of water resources models (Goodall et al., 2011[11]), and as a replacement for OpenMI in a two-way coupled system, developed by Goodall et al., (2013)[27], which links a hydrological model with a climate model.

Taverna

Taverna is open-source software, composed of a set of tools written in Java, which facilitates discovery, design, and execution of scientific workflows (Taverna 2013[28]). It automates multi-step and repetitive tasks involving invocation of several applications, largely web services-based (Deelman et al., 2009[29]), by defining the flow of data and performing format conversions (Taverna 2013[28]). Taverna has been developed within the myGrid project and funded through OMII-UK, an organisation supporting development of open source software for the UK research community (Taverna 2013[28]). The rationale behind Taverna's development was to provide scientists who have only a basic understanding of programming with a straightforward environment for assembling and executing workflows (Sroka et al., 2010[30]). Scientific collaboration and reuse of workflows are encouraged through partnership with the myExperiment portal, a social networking and workflow sharing environment for scientists, where existing workflows can be discovered and downloaded (Taverna 2013[28], De Roure and Goble 2009[31]).

A range of different types of services are supported within Taverna, e.g.: WSDL, RESTful, BioMart, BioMoby and SoapLab web services; R scripts on an R server (Rshell scripts); local Java services (Beanshell scripts); data import from Excel or csv spreadsheets (Taverna 2013[28]). Users can access over 3500 ready-made applications and analysis tools; BioCatalogue, accessible through the Taverna website, provides details of the services that are currently available (Taverna 2013[28]). External tools, scripts, or Java libraries can be easily incorporated as plug-ins or via ssh calls (Taverna 2013[28]).

Tools for workflow validation (debugging) during composition, and for detecting changes to service interfaces and service off-line times, are included in the suite (Taverna 2013[28]). Execution can be monitored and paused, and workflows can be debugged at run time (Taverna 2013[28]). Workflows are run from within the desktop application, called Workbench, which provides a graphical user interface for the selection of the services (Taverna 2013[28], De Roure and Goble 2009[31]); a Command Line Tool for the execution of workflows from a terminal is also provided (Taverna 2013[28]). Workflow execution is data-driven and parallel; the number of concurrent threads is configurable (Sroka et al., 2010[30], Taverna 2013[28]). A trace of a workflow is recorded, providing information on the executed services, inputs, and outputs (Taverna 2013[28]). Taverna supports remote deployment of workflows, e.g.: on a grid or on a cloud, and editing and running workflows on the Web (Taverna 2013[28]).
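The combination of parallel execution with a configurable thread count and a recorded trace can be sketched with the Python standard library. This is an illustration of the idea only and does not use or reproduce Taverna; `run_workflow` and `to_upper` are hypothetical names, and `to_upper` stands in for a remote service call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(items, service, max_threads=4):
    """Invoke a service once per input item, up to max_threads at a time."""
    trace = []
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() returns results in input order even if calls finish out of order
        for item, output in zip(items, pool.map(service, items)):
            trace.append(
                {"service": service.__name__, "input": item, "output": output}
            )
    return trace


def to_upper(sequence):
    """Placeholder for a remote service call (assumed for this sketch)."""
    return sequence.upper()


trace = run_workflow(["acgt", "ttga"], to_upper, max_threads=2)
```

Each trace entry records which service ran, what it received, and what it returned, which is the kind of provenance information the text describes.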

Although Taverna was originally designed for bioinformatics, it is domain independent and can be applied in a number of different disciplines (Taverna 2013[28]). Currently, more than 350 organisations around the world employ Taverna and its use has spanned a large number of different fields, e.g.: bioinformatics, astronomy, chemistry, engineering, geoinformatics, biodiversity, social sciences, data mining, education, arts (Taverna 2013[28]). An example of a hydrology-related application of Taverna is the development of the Environmental Virtual Observatory (EVO) (Taverna 2013[28]), an environmental monitoring and decision making system based on web services (EVO 2013[32]).

FrAMES

Framework for Aquatic Modelling of the Earth System (FrAMES), developed at the University of New Hampshire, is software used for simulating biogeochemical processes as water is routed through an aquatic system to a coastal zone. It allows assessing contaminant removal and attenuation from its source to the river's outlet, and permits studying process kinetics, role of different stream orders, impact of water withdrawals, spatial distribution of contaminant inputs, and factors controlling contaminant removal (Wollheim 2006[33]). The modelling system is composed of gridded terrestrial and aquatic components, and can be applied at both local and global scales using gridded river networks of varying resolutions depending on the application (Wollheim 2006[33]). FrAMES runs on Linux/Unix operating systems and requires very little knowledge of coding for its implementation (Wollheim 2006[33]).

Building on FrAMES, the Next Generation Framework for Aquatic Modelling of the Earth System (NextFrAMES) is being developed. It uses eXtensible Markup Language (XML) for describing a model structure (Fekete et al., 2009[34], Lu 2011[12]) and a declarative language to integrate components (Lu 2011[12]). It is characterised by a high level of abstraction; most of the services are hidden behind the platform to offer a more straightforward model development environment (Fekete et al., 2009[34]).

FRAMES

The Framework for Risk Analysis of Multi-Media Environmental Systems (FRAMES) is a piece of software developed and used by the US Environmental Protection Agency. It is composed of 17 modules (called 3MRA) collectively simulating release, fate and transport, and exposure and risk to humans and the environment associated with contaminants originating from landfills, waste piles etc. (Jagers 2010[35]). As the results are based on ten thousand simulations, the modules use a highly simplified representation of processes to shorten the total run time (Jagers 2010[35]). The communication method is one-way and file-based; it is planned to replace this with two-way in-memory communication based on OpenMI (Jagers 2010[35]).

Software developed for HP computing

CSDMS

The Community Surface Dynamics Modeling System (CSDMS) is an international initiative, funded by the US National Science Foundation (NSF), which promotes sharing, reusing, and integrating Earth-surface models (Peckham et al., 2013[36]). CSDMS implements the Common Component Architecture (CCA) standard for model coupling, which is adopted by many US federal agencies. CCA development started in 1998 to address the demand for technology standards in high-performance scientific computing (Peckham et al., 2013[36]). CCA is distinguished by its capacity to support language interoperability, parallel computing and multiple operating systems (Peckham et al., 2013[36]). Three fundamental tools underpin CSDMS, namely: Babel, Ccaffeine, and Bocca (Peckham et al., 2013[36]). CSDMS is equipped with a GUI, called Ccafe-GUI, in which components are represented as boxes that can be moved from a palette into a workspace. Connections between components are made automatically by matching ‘uses ports’ to ‘provides ports’ (Peckham and Goodall 2013[37]). Results of simulations can be visualised and analysed during and after the model run using a powerful visualisation tool (VisIt) (Peckham and Hutton 2009[38]), which features, among others, the ability to make movies from time-varying databases (Peckham et al., 2013[36]). A light-weight desktop application is provided, called CSDMS Modelling Tool (CMT), which runs on a PC but communicates with the CSDMS supercomputer to perform simulations (Peckham and Goodall 2013[37], CSDMS 2013[39]).

CCA components must be split into initialise, update, and finalise sections. CSDMS provides a tool called Bocca that helps with creating, editing and managing CCA-compliant components (Peckham et al., 2013[36]). Models can be written in a number of different languages, i.e.: C, C++, Fortran (77, 90, 95, and 2003), Java, and Python. Communication between such disparate pieces of code is achieved through the language interoperability tool called Babel, which automatically generates the ‘glue code’ enabling models to exchange data (Peckham et al., 2013[36]). For Babel to do its work, it only needs the descriptions of the component's interface, written either in XML (eXtensible Markup Language) or SIDL (Scientific Interface Definition Language), including information on the data types and the return values of the methods (Peckham et al., 2013[36]).

Data transformations between components are enabled through the use of utility components, which provide services such as: spatial regridding, time interpolation, unit conversion, variable name matching, or writing outputs to standard or NetCDF file formats (Peckham et al., 2013[36]).

To allow communication between components, they have to be wrapped with two interfaces. The first-level interface, called the Basic Model Interface (BMI), must be implemented by a model developer and provide a set of basic functions, namely: initialise, update, and finalise. These functions allow communication with the underlying wrapped model and enable the model to ‘fit into a second-level wrapper’ (Peckham and Goodall 2013[37]). A model that has the BMI interface is converted to a CSDMS component by providing it with the second-level interface, called the Common Model Interface (CMI), using the CSDMS automated tools. CMI allows CSDMS components to communicate and exchange data (Peckham et al., 2013[36]). The runtime environment is provided through the third fundamental CSDMS tool, called Ccaffeine, which enables ‘component instantiation and destruction, connecting and disconnecting ports, handling of input parameters, and control of Message Passing Interface (MPI) communicators’ (Peckham et al., 2013[36]).
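The initialise/update/finalise pattern described above can be sketched with a toy model and a small driver loop. The sketch only loosely follows the idea of a first-level interface: the method signatures are simplified assumptions, and `LinearReservoir` and `run_coupled` are hypothetical names that are not part of CSDMS.

```python
class LinearReservoir:
    """Toy model exposing initialise/update/finalise control functions."""

    def initialize(self, storage=0.0, k=0.5):
        self.values = {"inflow": 0.0, "storage": storage, "outflow": 0.0}
        self.k = k  # fraction of storage released per time step
        self.time = 0

    def update(self):
        """Advance the model by one time step."""
        outflow = self.k * self.values["storage"]
        self.values["storage"] += self.values["inflow"] - outflow
        self.values["outflow"] = outflow
        self.time += 1

    def finalize(self):
        self.values = None

    def get_value(self, name):
        return self.values[name]

    def set_value(self, name, value):
        self.values[name] = value


def run_coupled(upstream, downstream, steps):
    """Drive two models in sequence, passing outflow to inflow each step."""
    upstream.initialize(storage=8.0)
    downstream.initialize()
    for _ in range(steps):
        upstream.update()
        downstream.set_value("inflow", upstream.get_value("outflow"))
        downstream.update()
    result = downstream.get_value("storage")
    upstream.finalize()
    downstream.finalize()
    return result


final_storage = run_coupled(LinearReservoir(), LinearReservoir(), steps=2)
```

Because the driver only calls the control and accessor functions, either reservoir could be replaced by any other model exposing the same functions, which is the property a second-level wrapper relies on.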

CSDMS maintains a large database of contributed models from a variety of Earth’s surface dynamics disciplines, e.g.: hydrology, sediment transport, landscape evolution, geodynamics, glaciology, coastal and marine, and stratigraphy. The current number of hydrological models in the repository exceeds 50 (CSDMS 2013).

CSDMS has been used in a number of hydrologic studies. Ashton et al., (2013)[40] coupled the hydrological transport model HydroTrend with the Coastline Evolution Model (CEM) to study how fluctuations in sediment input due to climate change may affect delta morphology and evolution. An ongoing PhD study employs CSDMS to improve representation of the physiographic distribution of snow water equivalent and the timing and volume of simulated stream flows (CSDMS 2013[39]). Examples of other applications include: studying the consequences of past and future climate changes on water resources, water storage, and the expansion of the desert in the eastern watersheds of Jordan; or investigating the effects of terrain and vegetation structure on soil moisture, hydrological flow, and snowmelt (CSDMS 2013[39]).

BFG

Bespoke Framework Generator (BFG) is software developed at the Centre for Novel Computing (CNC) in the School of Computer Science at the University of Manchester. The rationale for its development was the creation of a framework that imposes a minimal number of requirements on a component's architecture and thus allows for straightforward and flexible model integration (Henderson 2006[41]). BFG only needs metadata, in the form of XML files, in order to generate the required wrapper code, which then can be used with a coupling system of the user's choice (Henderson 2006[41], Warren et al., 2008[42]). A component must comply with a small set of rules, i.e.: it must be a subroutine or a function, and use 'put' to provide data and 'get' to receive data (Warren et al., 2008[42]). XML files must be entered manually by a user; it is planned that in the future they will be generated automatically from a GUI (Henderson 2006[41]).

The process of model integration is characterised by ‘separation of concerns’, which can be summarised by the terms: Define, Compose, and Deploy (DCD) (Warren et al., 2008[42]). These terms correspond to three XML files containing interface, composition, and deployment information (Henderson 2006[41]). The interface metadata describes which fields a component requires and which it provides, and includes information about the module's time step (Warren et al., 2008[42]). Composition metadata describes how fields are connected between different models. Fields can be connected using either 'inplace I/O' or 'argpass I/O'. In the case of the former, the output fields are connected with the corresponding input fields using “point-to-point notation”. In the case of the latter, the connections between fields are made by grouping together the subroutines that use a particular field (Henderson 2006[41]). Deployment metadata defines scheduling information, that is: the number of executables and MPI processes, the number of threads, and the sequence in which model functions are to be called (Henderson 2006[41]). Using an XSLT processor, the metadata is converted to source code capable of controlling and coupling the models (Henderson 2006[41]).
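The idea of generating coupling code from composition metadata can be illustrated with a short sketch. The XML element and attribute names below are invented for the example and do not reproduce BFG's schema, and Python string generation stands in for the XSLT processing that BFG actually uses; `generate_wrapper` is a hypothetical function.

```python
import xml.etree.ElementTree as ET

# Hypothetical composition metadata: which output field feeds which input
COMPOSITION = """
<composition>
  <connection from="atmosphere.rainfall" to="land.precip"/>
  <connection from="land.runoff" to="ocean.river_inflow"/>
</composition>
"""


def generate_wrapper(xml_text):
    """Turn connection metadata into lines of illustrative put/get glue code."""
    lines = []
    for conn in ET.fromstring(xml_text).iter("connection"):
        src_model, src_field = conn.get("from").split(".")
        dst_model, dst_field = conn.get("to").split(".")
        lines.append(f"put({src_model!r}, {src_field!r})")
        lines.append(f"get({dst_model!r}, {dst_field!r})")
    return lines


wrapper = generate_wrapper(COMPOSITION)
print("\n".join(wrapper))
```

The point of the design is that changing how models are connected means editing the metadata and regenerating the wrapper, while the model source code stays untouched.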

BFG supports complex control representation, e.g.: inner loops or convergence based loops (Henderson 2006[41]). On the other hand, it allows for control to be handled within the source code of the models. Such models are referred to as having “minimal compliance” and they must only provide one entry point subroutine that BFG can call to start the model (Henderson 2006[41]).

BFG can generate wrapper code for: "models with Fortran entry points running in sequence on a single machine in a single executable communicating through shared buffers; models with Fortran entry points running concurrently, generated as a single executable communicating through MPI; models with Fortran entry points running concurrently, with a configurable number of executables, communicating through MPI; models with Fortran entry points running concurrently using a TDT sockets implementation; models with Fortran entry points running concurrently using a TDT SSH implementation; models with Fortran 90 entry points running concurrently using OASIS3" (BFG 2013[43]).

At the moment BFG has no built-in capability for carrying out unit, spatial, and temporal transformations. When BFG is used with OASIS, these transformations are carried out by OASIS itself (Henderson 2006[41]).

A prominent example of BFG-facilitated model integration is the Flexible Unified Model Environment (FLUME), the UK Met Office Earth System Modelling system (Henderson 2006[41]). Another example is the GENIE Earth System Modelling Framework, in which the IGCM atmosphere and GOLDSTEIN ocean models are coupled using OASIS4 and BFG (Henderson 2006[41]). It is also worth mentioning the Community Integrated Assessment System (CIAS), a system used for studying relationships between the economy and climate change and composed of models distributed across different institutions (Warren et al., 2008[42]). Owing to BFG, models in CIAS can be easily exchanged, allowing for different policy variants and the modelling uncertainty to be readily assessed (Warren et al., 2008[42]).

ESMF

The Earth System Modelling Framework (ESMF) is software for building complex Earth system modelling applications and is typically used to couple models of large physical domains (ESMF 2013[44]). ESMF originates in the Common Modelling Infrastructure Group (CMIWG), which comprised major US weather and climate modelling organisations. It was developed in response to the NASA Earth Science Technology Office (ESTO) Cooperative Agreement Notice, entitled ‘Increasing Interoperability and Performance of Grand Challenge Applications in the Earth, Space, Life and Microgravity Sciences’, which called for its creation (ESMF 2013[44]). ESMF implements methods that allow separate components to operate as a single executable, multiple executables or web services (Valcke et al., 2012[45]). It supports parallel computing on Unix, Linux, and Windows HPC platforms (Lu 2011[12], Jagers 2010[35]).

ESMF is based on two types of components: 'Gridded Components' (ESMF_GridComp) and 'Coupler Components' (ESMF_CplComp) (ESMF 2013[44]). Gridded Components represent the physical domain being modelled while Coupler Components enable data transformation and transfer (ESMF 2013[44]). A Coupler Component's operations include: time advancement, data redistribution, spectral and grid transformations, time averaging, and unit conversions (ESMF 2013[44]). Coupler Components need to be written in Fortran on a case-by-case basis using ESMF classes (ESMF 2013[44]). Gridded Components need to be split into one or more initialise, run, and finalise sections callable as subroutines (Goodall et al., 2013[27], ESMF 2013[44]). ESMF allows for nested components, with "progressively more specialised processes or refined grids" (ESMF 2013[44]).

The user is required to write wrapper code that connects the component's native data structures to ESMF data structures (ESMF 2013[44]). There are two ways to do it: either using the 'ESMF_Array' class to represent the data structures in an index space, or using the 'ESMF_Field' class to represent them in a physical space (ESMF 2013[44]). In the latter case interpolation weights can be calculated using coordinate information stored in the 'ESMF_Grid' class; bilinear and higher order interpolation calculations in up to three dimensions are supported (ESMF 2013[44]). The user is also required to write a 'SetServices' routine, which associates the ESMF initialise/run/finalise methods with their corresponding user code methods (ESMF 2013[44]).
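As an illustration of what bilinear interpolation weights are, the sketch below computes them for a single target point inside one rectangular grid cell. It is a textbook two-dimensional formulation written for this example, not ESMF code, and `bilinear_weights` is a hypothetical function name.

```python
def bilinear_weights(x, y, x0, x1, y0, y1):
    """Weights of the four cell corners for a target point (x, y).

    Corners are ordered (x0, y0), (x1, y0), (x0, y1), (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)  # fractional position along x
    ty = (y - y0) / (y1 - y0)  # fractional position along y
    return [(1 - tx) * (1 - ty), tx * (1 - ty), (1 - tx) * ty, tx * ty]


weights = bilinear_weights(0.25, 0.5, 0.0, 1.0, 0.0, 1.0)
corner_values = [10.0, 20.0, 30.0, 40.0]
interpolated = sum(w * v for w, v in zip(weights, corner_values))
print(weights, interpolated)
```

The weights depend only on the grid geometry, so they can be computed once from the coordinate information and reused at every coupling step.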

Data is passed using container classes called 'States' (Goodall et al., 2013[27]); each Gridded component has an import State, containing its inputs, and an export State, containing its outputs (ESMF 2013[44]). States can hold different data classes, including Arrays, ArrayBundles, Fields, or FieldBundles (ESMF 2013). ‘Arrays store multidimensional data associated with an index space. Fields include data Arrays along with an associated physical grid and a decomposition that specifies how data points in the physical grid are distributed across computing resources. ArrayBundles and FieldBundles are groupings of Arrays and Fields, respectively’ (Goodall et al., 2013[27]).
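The relationship between a registration routine, the initialise/run/finalise methods, and the import and export States can be sketched as follows. This is a conceptual Python analogy under stated assumptions: the class `GriddedComponent`, its `register` and `call` methods, and the `land_*` functions are hypothetical and do not correspond to the ESMF Fortran API; plain dictionaries stand in for States.

```python
class GriddedComponent:
    """Holds the user methods registered for each life-cycle phase."""

    def __init__(self, name, set_services):
        self.name = name
        self.methods = {}
        set_services(self)  # the user routine registers its own code

    def register(self, phase, func):
        self.methods[phase] = func

    def call(self, phase, import_state, export_state):
        self.methods[phase](import_state, export_state)


def land_init(import_state, export_state):
    export_state["runoff"] = 0.0


def land_run(import_state, export_state):
    # Read inputs from the import State, write outputs to the export State
    export_state["runoff"] = 0.4 * import_state["precipitation"]


def land_final(import_state, export_state):
    export_state.clear()


def land_set_services(component):
    """Associate each framework phase with the matching user method."""
    component.register("initialize", land_init)
    component.register("run", land_run)
    component.register("finalize", land_final)


land = GriddedComponent("land", land_set_services)
imports, exports = {"precipitation": 5.0}, {}
land.call("initialize", imports, exports)
land.call("run", imports, exports)
runoff = exports["runoff"]
land.call("finalize", imports, exports)
```

The framework never needs to know the names of the user's routines in advance; it only calls whatever was registered for each phase.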

Although ESMF is primarily aimed at high performance climate/weather/atmospheric computations, its developers seek cooperation with hydrological modellers and have been looking into ways to achieve cross-domain integration between ESMF and water resources modelling systems (Deluca et al., 2008[46]).

OASIS

The Ocean Atmosphere Sea Ice Soil coupler (OASIS) is software used for coupling models representing different components of the Earth system (OASIS 2013[47]). It was developed at the European Centre for Research and Advanced Training in Scientific Computation (CERFACS) in the framework of the EU FP5 Programme for Integrated Earth System Modelling (PRISM) (Valcke et al., 2006[5]). The main purpose of PRISM was the development of infrastructure for European climate research, and it involved 17 European climate research centres and a number of computer software companies (DKRZ 2013[48]). OASIS is characterised by low intrusiveness; "components remain almost unchanged with respect to their standalone mode" (Valcke et al., 2012[45]). In a coupled system components act as separate executables, while the main function of the coupler is to interpolate and exchange data between the components (Caubel et al., 2005[49]). OASIS is based on Fortran and C (Valcke and Morel 2006[50]). Currently three versions of the coupler exist: OASIS3, OASIS4, and OASIS3-MCT (Caubel et al., 2005[49], OASIS 2013[47]). Since OASIS3 only supports 2D coupling fields, a fully parallel OASIS4 was developed, which supports a higher number of coupling fields and targets high resolution climate simulations (Caubel et al., 2005[49]). OASIS3-MCT is the OASIS coupler interfaced with the Model Coupling Toolkit (MCT). This version provides capabilities for parallel execution of data transformations and exchanges (OASIS 2013[47]).

To implement data exchange at run time, the components are linked to the OASIS coupling interface library (PSMILe), which provides the calls for requesting and passing data. The characteristics of the exchanges are defined outside of the model code, in an external user-written configuration file (Valcke et al., 2012[45]).
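The separation between model code, which only issues put and get calls, and an external configuration that decides what is exchanged and how often, can be sketched as below. The JSON layout, the field names, and the `Coupler` class are assumptions made for this illustration; they do not reproduce the OASIS configuration file format or the PSMILe interface.

```python
import json

# Hypothetical external configuration: one exchange every 3600 s of model time
CONFIG = json.loads("""
{"exchanges": [
  {"source": "ocean.sst", "target": "atmosphere.sea_temp", "period_s": 3600}
]}
""")


class Coupler:
    """In-memory stand-in for a coupler driven by external configuration."""

    def __init__(self, config):
        self.routes = {e["source"]: e for e in config["exchanges"]}
        self.mailbox = {}

    def put(self, field, time_s, value):
        """Forward a field only if it is configured and due at this time."""
        route = self.routes.get(field)
        if route and time_s % route["period_s"] == 0:
            self.mailbox[route["target"]] = value

    def get(self, field):
        return self.mailbox.get(field)


coupler = Coupler(CONFIG)
coupler.put("ocean.sst", 3600, 290.5)  # due: forwarded to the atmosphere
coupler.put("ocean.sst", 5400, 291.0)  # not a coupling time: ignored
received = coupler.get("atmosphere.sea_temp")
```

Changing the coupling period or rerouting a field then only requires editing the configuration, which is what keeps the component code close to its standalone form.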

Due to its flexibility and low intrusiveness, OASIS has been very popular and is currently used by about 35 different climate modelling groups in Europe, Australia, Asia and North America (Valcke et al., 2012[45]). An example of a hydrology-related application of OASIS is the study of the impacts of climate change on the water cycle in the Mediterranean using a coupled system composed of the REgional atmosphere MOdel (REMO), the Max-Planck-Institute for Meteorology Ocean Model (MPI-OM) and the Hydrological Discharge Model (HD model) (Arellano 2011[51]).

Web services

Applications operating as web services are based on components that are independent, distributed, loosely-coupled and exchange data over a computer network. In the hydrological domain web services are used in a number of ways, e.g.: to integrate hydrologic data from heterogeneous sources; to link modelling frameworks with databases; to connect models, databases, and analysis tools into water resources decision support systems; or to join modelling systems from different domains (e.g.: hydrology and climate).

There are a number of examples of successful use of service-oriented technology for environmental data integration. One such example is Hydrologic Information System (HIS), created by the Consortium of Universities for the Advancement of Hydrological Science Inc. (CUAHSI) — an organisation of more than 100 US universities aimed at developing infrastructure and services for the advancement of the hydrologic sciences (Peckham and Goodall 2013). HIS is composed of hydrologic databases and servers connected through web services (Peckham and Goodall 2013[52]). It employs WaterOneFlow web service interface and Water Markup Language (WaterML) for data transmission to enable integration of hydrologic data from heterogeneous data sources into one ‘virtual database’ (Goodall et al., 2011[11]).

Research efforts also focus on ways to integrate data and modelling systems. HydroDesktop is open source GIS-enabled software developed by CUAHSI HIS, which allows accessing HIS services from a personal computer. It not only provides capabilities for data querying, downloading, visualisation, editing, graphing, analysis, and exporting to different formats but also supports integrated model development and use of the retrieved data in simulations (HydroDesktop 2013[53]). HydroModeler is a HydroDesktop plug-in, based on the OpenMI Configuration Editor, which provides functionality for building and executing model compositions from within HydroDesktop (HydroDesktop 2013[53]). Another example of data and modelling systems integration stems from the partnership between CSDMS and HIS. As a result of this cooperation a novel system was developed, which allows accessing HIS data through web services calls from within the CSDMS modelling environment (Peckham and Goodall 2013[52]). This functionality was achieved by incorporating an additional component, called DataHIS, within a CSDMS model composition. It is planned that CSDMS web services will be developed further, provided that other environmental databases employ standardised interfaces for data retrieval and integration. It is envisioned that in the future CSDMS components could become web services themselves, potentially available to client applications such as HydroDesktop and HydroModeler (Peckham and Goodall 2013[52]).

Building water resources modelling systems using web services is certainly more challenging than using them for data integration. However, it offers the advantage of keeping models independent, thus allowing for continuous maintenance and development (Goodall et al., 2011[11]). Goodall et al., (2011)[11] proposed an interface design for exposing models as web services and presented a prototype of a service-oriented water resources decision support system. The interface was designed by combining ideas from two standards: the OGC Web Processing Service, and the Open Modelling Interface (Goodall et al., 2011[11]). The OpenMI ExchangeItem object was used as a starting point for developing a data exchange standard. However, more work is needed to standardise the vocabulary of variables, unit names and geographical referencing systems, possibly adopting the NetCDF Climate and Forecast Metadata Conventions (Goodall et al., 2011[11]). For web services integration, the OpenMI Configuration Editor was selected, as it already includes conventions specific to water resources modelling. However, since OpenMI does not support web services, a web service component was created that enables incorporation of this functionality within OpenMI (Goodall et al., 2011[11]). To demonstrate the successful implementation of the system, a model simulating rainfall/runoff was assembled (Goodall et al., 2011[11]).

Another technology that could potentially be harnessed for building decision support systems is cloud computing. The Environmental Virtual Observatory (EVO) pilot project, sponsored by the UK’s Natural Environment Research Council (NERC), employs cloud computing to integrate datasets, models and tools for cost-effective, efficient and transparent environmental monitoring and decision making (EVO 2013[32]). EVO works with other international partners (e.g.: CUAHSI, NeON) to develop consistent standards for exchanging data and models (EVO 2013[32]). The project activities include developing cyber infrastructure, cloud-enabled environmental models, and a number of exemplar web-based services concerning soil and water management at both local and national scales (EVO 2013[32]). Exemplars developed within the course of the two year pilot project focus on a range of environmental problems, which directly affect the well-being of people in the UK, e.g.: studying national-scale nutrient fate using linked hydrogeological and biochemical models, developing a system to assess the effects of different land management practices on reducing diffuse pollution from agriculture, advancing modelling capabilities for drought and flood predictions to address and mitigate the effects of climate change, or establishing technologies for studying biodiversity and ecosystem service sustainability (EVO 2013[32]). EVO aims to provide different groups of users, from scientists to local stakeholders, with free and easy access to expert knowledge by combining assets from various sources with novel tools for data analysis and visualisation (Gurney et al., 2011[54]). The system is designed to promote feedback, ownership, community involvement, and better communication between technical and non-technical users (EVO 2013[32]).

An example of a community tool established within EVO is The Local Landscape Visualisation Tool, developed by engaging stakeholders in three catchments in the UK: the Afon Dyfi, the River Tarland, and the River Eden (Wilkinson et al., 2013[55]). The tool is accessed via a web portal and communicates flood risk in the local impacted communities. It is based on a number of services, i.e.: catchment datasets, hydrological models, and visualisation tools. Users can access real time data concerning river levels, rainfall, weather, and water quality, which is additionally supported by webcam images, or can use cloud-based models to explore how different land management strategies might affect the risk of flooding (Wilkinson et al., 2013[55]).

Finally, web services can be used to link different modelling frameworks. Hydrologic studies have traditionally not considered bi-directional interactions between the atmosphere and water bodies. However, as the scale of the models increases, the assumption that there is no feedback between the land surface and the atmosphere may no longer hold, and bi-directional coupling becomes important (Goodall et al., 2013[27]). To date, coupling of hydrological and climate models has been hindered by differences between the two technologies: climate models run on high performance computers (HPC), while hydrologic models run on personal computers (Goodall et al., 2013[27], Saint and Murphy 2010[15]). Additionally, there is a lack of established techniques for transferring data between the differing spatial scales of climate and hydrologic models (Goodall et al., 2013[27]). The Hydrological Modelling for Assessing Climate Change Impacts at different Scales project (HYACINTS) coupled the climate model HIRHAM and the physically based, distributed hydrological model MIKE SHE for the whole of Denmark by migrating both models to the OpenMI standard (HYACINTS 2013[56]). A method based on statistical downscaling and bias correction was developed to enable data transfer across different grids (HYACINTS 2013[56]). While the project achieved integration of models from different domains, this required migrating them to the same standard. Goodall et al. (2013)[27] proposed a novel approach to loosely couple climate and hydrologic models using web services, which enabled integration of different modelling frameworks. The researchers did not address the problem of scaling data between climate and hydrologic models; they aimed only to develop a technically feasible strategy for coupling such models. In the proposed approach, web services are used to pass data between a hydrologic model running on a desktop computer and a climate/weather model running in an HPC environment (Goodall et al., 2013[27]).
The prototype developed in the study was a two-way coupled system composed of the Community Atmosphere Model (CAM) and the Soil and Water Assessment Tool (SWAT) (Goodall et al., 2013[27]). CAM, implemented with ESMF, was made available as a web service. SWAT was provided as an OpenMI-compliant model, and the CAM model was wrapped with an OpenMI interface (Goodall et al., 2013[27]). Execution was controlled by OpenMI's Configuration Editor (Saint and Murphy 2010[15]). The study showed that coupling two disparate modelling systems is feasible while still maintaining the models' original structure and purpose (Goodall et al., 2013[27]). It provided a technical solution for coupling models running on different computing platforms, e.g. PC and HPC, different HPCs, or cloud (Goodall et al., 2013[27]). Bridging the gap between OpenMI and ESMF was possible because of features that both standards provide: ESMF supports web services, and OpenMI supports a wrapper for accessing external services (Goodall et al., 2013[27]). Both frameworks are widely used within their respective communities, and their integration is an important milestone in modelling coupled hydrology-climate systems (Saint and Murphy 2010[15]).
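
The pattern described above, in which a remote model is wrapped so that a local coupler can drive it alongside a local model, can be sketched as follows. This is a minimal illustration under stated assumptions, not the CAM, SWAT, ESMF or OpenMI code: every class, method and message field is hypothetical, the physics is a toy water balance, and the "web service" is an in-process stub so that the example runs without a network.

```python
class AtmosphereServiceStub:
    """In-process stand-in for a climate model exposed as a web service.

    Assumption: in a real deployment these requests would travel over
    HTTP to a model running on an HPC system.
    """

    def __init__(self, humidity=10.0):
        self.humidity = humidity

    def handle(self, request):
        if request["op"] == "set_evaporation":
            # Feedback from the land surface returns water to the air.
            self.humidity += request["value"]
            return {"status": "ok"}
        if request["op"] == "run_step":
            # Toy rule: a fixed fraction of humidity falls as rain.
            precipitation = 0.1 * self.humidity
            self.humidity -= precipitation
            return {"precipitation": precipitation}
        raise ValueError("unknown operation: " + request["op"])


class AtmosphereWrapper:
    """Presents the remote service to the coupler as a local component."""

    def __init__(self, service):
        self._service = service

    def update(self):
        return self._service.handle({"op": "run_step"})["precipitation"]

    def set_evaporation(self, value):
        self._service.handle({"op": "set_evaporation", "value": value})


class HydrologyModel:
    """Toy local hydrologic model with a single water store."""

    def __init__(self):
        self.storage = 0.0

    def update(self, precipitation):
        self.storage += precipitation
        # Toy rule: half of the stored water evaporates each step.
        evaporation = 0.5 * self.storage
        self.storage -= evaporation
        return evaporation


def run_coupled(steps):
    """Two-way coupling: rain goes down, evaporation goes back up."""
    atmosphere = AtmosphereWrapper(AtmosphereServiceStub())
    hydrology = HydrologyModel()
    history = []
    for _ in range(steps):
        precipitation = atmosphere.update()
        evaporation = hydrology.update(precipitation)
        atmosphere.set_evaporation(evaporation)
        history.append((precipitation, evaporation))
    return history


if __name__ == "__main__":
    for step, (rain, evap) in enumerate(run_coupled(3), start=1):
        print(step, round(rain, 4), round(evap, 4))
```

The coupling loop only ever calls the wrapper, so it does not need to know whether the atmosphere component runs locally, on an HPC system, or in the cloud. Replacing the stub with a client that sends the same requests over the network would leave `run_coupled` unchanged.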

References

  1. CRAIG, T, 2011. CPL7 User’s Guide. http://www.cesm.ucar.edu/models/cesm1.0/cpl7/cpl7_doc/ug.pdf (accessed 14.02.14).
  2. KAUFFMAN, B G, JACOB, R, CRAIG, T, and LARGE, W G. 2004: The CCSM Coupler version 6.0: User's guide, source code reference, and scientific description. http://www.ccsm.ucar.edu/models/ccsm3.0/cpl6/users guide/users guide.html (accessed 13.12.13).
  3. VALCKE, S, CRAIG, T, and COQUART, L. 2013. OASIS3-MCT User Guide, OASIS3-MCT 2.0, Technical Report, TR/CMGC/13/17, CERFACS/CNRS SUC URA No 1875, Toulouse, France.
  4. FORD, R W, and RILEY, G D. 2003. Towards the Flexible Composition and Deployment of Coupled Models. World Scientific: 2003: 189–195.
  5. VALCKE, S and MOREL, T. 2006. OASIS and PALM, CERFACS couplers. Technical Report TR/CMGC/06/38.
  6. MOORE, R, GIJSBERS, P, FORTUNE, D, GREGERSEN, J, BLIND, M, GROOSS, J and VANECEK, S. 2010. OpenMI Document Series: Scope for the OpenMI (Version 2.0). In: MOORE, R. (ed.).
  7. MOORE, R V and TINDALL, C I. 2005. An overview of the open modelling interface and environment (the OpenMI). Environmental Science & Policy, 8 279–286.
  8. OATC 2010a. OpenMI Document Series: The OpenMI 'in a Nutshell' for the OpenMI (Version 2.0). The OpenMI Association Technical Committee. In: MOORE, R. (ed.).
  9. OPENMI 2013. The OpenMI Association Website [Online]. [cited 14 November 2013]. Available: http://www.openmi.org/.
  10. FLUIDEARTH 2013. FluidEarth HR Wallingford Website [Online]. [cited 14 November 2013]. Available: http://fluidearth.net/default.aspx.
  11. GOODALL, J L., ROBINSON, B F. and CASTRONOVA, A M. 2011. Modeling water resource systems using a service-oriented computing paradigm. Environmental Modelling and Software 26, 573–582.
  12. LU, B. 2011. Development of A Hydrologic Community Modeling System Using A Workflow Engine. PhD thesis, Drexel University.
  13. LAWRENCE, B N, BALAJI, V, CARTER, M, DELUCA, C, EASTERBROOK, S, FORD, R., HUGHES, A & HARDING, R. Manuscript. Bridging Communities: Technical Concerns for Integrating Environmental Models.
  14. OATC 2010b. OpenMI Document Series: OpenMI Standard 2 Specification for the OpenMI (Version 2.0). The OpenMI Association Technical Committee. In: MOORE, R. (ed.).
  15. SAINT, K & MURPHY, S. End-to-End Workflows for Coupled Climate and Hydrological Modeling. International Congress on Environmental Modelling and Software, Modelling for Environment's Sake, Fifth Biennial Meeting 2010 Ottawa, Canada.
  16. GRAHAM, D N, CHMAKOV, S, SAPOZHNIKOV, A & GREGERSEN, J B. OpenMI coupling of Modflow and Mike 11. In: GOURBESVILLE, P, CUNGE, J, GUINOT, V & LIONG, S Y, eds. 7th International Conference on Hydroinformatics 2006 Nice, France. Gregory, J M, 2003. The CF metadata standard. Technical Report 8, CLIVAR
  17. GIJSBERS, P, HUMMEL, S, VANECEK, S, GROOS, J, HARPER, A, KNAPEN, R, GREGERSEN, J, SCHADE, P, ANTONELLO, A & DONCHYTS, G. From OpenMI 1.4 to 2.0. International Congress on Environmental Modelling and Software Modelling for Environment's Sake, Fifth Biennial Meeting 2010 Ottawa, Canada.
  18. ISIS 2013. ISIS User Community Website [Online]. [cited 14 November 2013]. Available: http://www.isisuser.com/.
  19. DAVID, O, ASCOUGH II, J C, LLOYD, W, GREEN, T R, ROJAS, K W, LEAVESLEY, G H and AHUJA, L R. 2013. A software engineering perspective on environmental modeling framework design: The Object Modeling System. Environmental Modelling and Software, 39 201–213.
  20. OMS 2013. Object Modelling System Website [Online]. [cited 14 November 2013]. Available: http://www.javaforge.com/project/oms.
  21. DAVID, O, CARLSON, J R, LEAVESLEY, G H, ASCOUGH II, J C, GETER, F W, ROJAS, K W and AHUJA, L R. 2010. Object Modeling System v3.0 Developer and User Handbook.
  22. RAHMAN, J M, SEATON, S P, PERRAUD, J-M., HOTHAM, H, VERRELLI, D I and COLEMAN, J R. It's TIME for a New Environmental Modelling Framework. Proceedings of MODSIM International Congress on Modelling and Simulation 2003 Townsville, Australia. Modelling and Simulation Society of Australia and New Zealand Inc., 1727–1732.
  23. ARGENT, R M, PERRAUD, J M, RAHMAN, J M., GRAYSON, R B and PODGER, G M. 2009. A new approach to water quality modelling and environmental decision support systems. Environmental Modelling & Software 24, 809–818.
  24. KEPLER 2013. The Kepler project [Online]. [cited 14 November 2013]. Available: https://kepler-project.org/.
  25. KEPLER. 2013a. Getting started with Kepler Manual [Online]. [cited 14 November 2013]. Available: https://kepler-project.org/.
  26. KEPLER. 2013b. Kepler User Manual [Online]. [cited 14 November 2013]. Available: https://kepler-project.org/.
  27. GOODALL, J L, SAINT, K D, ERCAN, M B, BRILEY, L J, MURPHY, S, YOU, H, DELUCA, C and ROOD, R B. 2013. Coupling climate and hydrological models: Interoperability through Web Services. Environmental Modelling and Software, 46 250–259.
  28. TAVERNA. 2013. Taverna Workflow Management System Website [Online]. Available: http://www.taverna.org.uk/ [cited 2/12/2013].
  29. DEELMAN, E, GANNON, D, SHIELDS, M and TAYLOR, I. 2009. Workflows and e-Science: An overview of workflow system features and capabilities. Future Generation Computer Systems 25, 528–540.
  30. SROKA, J, HIDDERS, J, MISSIER, P and GOBLE, C. 2010. A formal semantics for the Taverna 2 workflow model. Journal of Computer and System Sciences, 76, 490–508.
  31. DE ROURE, D & GOBLE, C. 2009. Software Design for Empowering Scientists. IEEE Software 26, 88–95.
  32. EVO 2013. Environmental Virtual Observatory Website [Online]. [cited 14 November 2013]. Available: http://www.evo-uk.org/.
  33. WOLLHEIM, W. FrAMES — Framework for Aquatic Modeling of the Earth System. Denitrification Modeling Across Terrestrial, Freshwater and Marine Systems Workshop 2006 Millbrook, New York.
  34. FEKETE, B M, WOLLHEIM, W M, WISSER, D and VÖRÖSMARTY, C J. 2009. Next generation framework for aquatic modeling of the Earth System. Geosci. Model Dev. Discuss. 2 279–307.
  35. JAGERS, H R A. Linking Data, Models and Tools: An Overview. International Congress on Environmental Modelling and Software Modelling for Environment's Sake, Fifth Biennial Meeting 2010 Ottawa, Canada.
  36. PECKHAM, S D, HUTTON, E W H & NORRIS, B. 2013. A component-based approach to integrated modeling in the geosciences: The design of CSDMS. Computers and Geosciences, 53, 3–12.
  37. PECKHAM, S D and GOODALL, J L. 2013. Driving plug-and-play models with data from web services: A demonstration of interoperability between CSDMS and CUAHSI-HIS. Computers and Geosciences, 53, 154–161.
  38. PECKHAM, S D and HUTTON, E. Componentizing, standardizing and visualizing: How CSDMS is building a new system for integrated modeling from open-source tools and standards. American Geophysical Union Fall Meeting 2009.
  39. CSDMS 2013. The Community Surface Dynamics Modeling System Website [Online]. [cited 14 November 2013]. Available: http://csdms.colorado.edu/wiki/Main_Page.
  40. ASHTON, A D, HUTTON, E W H, KETTNER, A J, XING, F, KALLUMADIKAL, J, NIENHUIS, J and GIOSAN, L. 2013. Progress in coupling models of coastline and fluvial dynamics. Computers and Geosciences, 53 21–29.
  41. HENDERSON, I. 2006. GENIE BFG. University of Bristol Geography Source Website. [Online]. Last revised 26 August 2008. [cited 14 November 2013]. Available: https://source.ggy.bris.ac.uk/wiki/GENIE_BFG.
  42. WARREN, R, DE LA NAVA SANTOS, S, ARNELL, N W, BANE, M, BARKER, T, BARTON, C, FORD, R, FÜSSEL, H M, HANKIN, R K S, KLEIN, R, LINSTEAD, C, KOHLER, J, MITCHELL, T D, OSBORN, T J, PAN, H, RAPER, S C B, RILEY, G, SCHELLNHÜBER, H J, WINNE, S and ANDERSON, D. 2008. Development and illustrative outputs of the Community Integrated Assessment System (CIAS), a multi-institutional modular integrated assessment approach for modelling climate change. Environmental Modelling and Software 23, 1215–1216.
  43. BFG 2013. Bespoke Framework Generation Website. [Online]. Centre for Novel Computing University of Manchester. [cited 14 November 2013]. Available: http://cnc.cs.man.ac.uk/projects/bfg.php.
  44. ESMF 2013. Earth System Modeling Framework Website [Online]. [cited 14 November 2013]. Available: http://www.earthsystemmodeling.org/.
  45. VALCKE, S, BALAJI, V, CRAIG, A, DELUCA, C., DUNLAP, R, FORD, R W, JACOB, R, LARSON, J, O'KUINGHTTONS, R, RILEY, G D & VERTENSTEIN, M. 2012. Coupling technologies for Earth System Modelling. Geoscientific Model Development, 5, 1589–1596.
  46. DELUCA, C, OEHMKE, R, NECKELS, D, THEURICH, G, O'KUINGHTTONS, R, DE FAINCHTEIN, R, MURPHY, S and DUNLAP, R. Enhancements for Hydrological Modeling in ESMF. American Geophysical Union Fall Meeting 2008.
  47. OASIS 2013. OASIS Coupler Website [Online]. [cited 14 November 2013]. Available: https://verc.enes.org/oasis.
  48. DKRZ 2013. PRISM Program for Integrated Earth System Modelling Website [Online]. [cited 14 November 2013]. Available: http://www.dkrz.de/daten-en/wdcc/projects_cooperations/past-projects/prism.
  49. CAUBEL, A, DECLAT, D, FOUJOLS, M-A, LATOUR, J, REDLER, R, RITZDORF, H, SCHOENEMEYER, T, VALCKE, S and VOGELSANG, R. 2005. The PRISM couplers: OASIS3 and OASIS4. Geophysical Research Abstracts [Online], 7.
  50. VALCKE, S and MOREL, T. 2006. OASIS and PALM, CERFACS couplers. Technical Report TR/CMGC/06/38.
  51. ARELLANO, A E. 2011. The Water Cycle in the Mediterranean Region and the Impacts of Climate Change. PhD thesis, Max-Planck-Institute for Meteorology.
  52. PECKHAM, S D and GOODALL, J L. 2013. Driving plug-and-play models with data from web services: A demonstration of interoperability between CSDMS and CUAHSI-HIS. Computers and Geosciences, 53, 154–161.
  53. HYDRODESKTOP 2013. HydroDesktop CUAHSI Open Source Hydrologic Data Tools Website [Online]. Last revised 13 March 2012. [cited 14 November 2013]. Available: http://his.cuahsi.org/hdhelp/welcome.html.
  54. GURNEY, R, EMMETT, B, MCDONALD, A, BLAIR, G, BUYTAERT, W, FREER, J E, HAYGARTH, P, REES, G, TETZLAFF, D and EVO SCIENCE TEAM. The Environmental Virtual Observatory: A New Vision for Catchment Science. American Geophysical Union Fall Meeting 2011.
  55. WILKINSON, M, BEVEN, K, BREWER, P, EL-KHATIB, Y, GEMMELL, A, HAYGARTH, P, MACKAY, E, MACKLIN, M, MARSHALL, K, QUINN, P, STUTTER, M., THOMAS, N & VITOLO, C. The Environmental Virtual Observatory (EVO) local exemplar: A cloud based local landscape learning visualisation tool for communicating flood risk to catchment stakeholders. EGU General Assembly 2013 Vienna Austria.
  56. HYACINTS 2013. Hydrological Modelling for Assessing Climate Change Impacts at different Scales Project Website [Online]. Last revised 26 June 2009. [cited 14 November 2013]. Available: http://hyacints.dk/main_uk/main.html.