OR/14/022 Introduction

Barkwith A K A P, Pachocka M, Watson C, Hughes A G. 2014. Couplers for linking environmental models: Scoping study and potential next steps. British Geological Survey Internal Report, OR/14/022.

Background

To make Integrated Modelling (IM) work, a way of passing data between models is required, and standards are necessary to facilitate this. Two sets of standards are required: data standards for one-way, static transfer of data, and model runtime standards for dynamic coupling. For the former, BGS environmental modellers use some basic formats, such as DXF and CSV. However, it would be useful to identify internationally accepted standards that could be used for data exchange, particularly for gridded data. For the exchange of data during model runtime, the current standard and implementation used at BGS is OpenMI. OpenMI was designed to address the problem posed by the Water Framework Directive: simulating catchment processes in a holistic manner. The main aim of OpenMI as currently implemented is therefore flexibility, and it may not be appropriate in a high performance computing environment. Given that BGS’ requirements may change, it is necessary to identify and understand other standards, or even other approaches, for linking models at runtime.

This report therefore focuses on the data standards for static and runtime coupling of numerical models used in the hydrological and atmospheric sciences. Workflow engines are included, but approaches from other disciplines, such as risk in the insurance industry and human health, are not.

The need for couplers

The need for interdisciplinary environmental modelling has become clear over the last decade as the evidence for climate change has grown stronger. Such modelling provides the means to study the complex dynamics of the Earth system and thus helps to find ways of mitigating the impacts of environmental change. In 2000 the Water Framework Directive was enacted, which recognised the need to implement integrated management strategies to address the growing and conflicting demands for water resources in a catchment. This problem is best addressed by adopting sound modelling approaches. Integrated modelling requires sharing and coupling models that simulate different parts of the Earth system. The approach used to link such models is called ‘a coupler’. While a large number of different couplers are currently in use by scientists, their basic functions remain the same, namely coordinating the execution of the coupled models and managing data transfer between them (Valcke et al., 2012[1]).

The technologies used for coupling models vary in their level of ‘intrusiveness’, which can be defined as the amount of work required to make a component ‘couplable’ (Lawrence et al., Manuscript[2]). The coupling technologies can be divided into monolithic, component-based, communication-based, and scheduled approaches (Dunlap et al., 2013[3]). The monolithic approach requires combining code from multiple models into one code (Dunlap et al., 2013[3]). The component-based approach introduces the concept of standard interfaces. In this approach each model, called a ‘component’, has an interface to communicate with other models, has a structure that complies with predefined criteria, and performs a distinct function (Dunlap et al., 2013[3], Lu 2011[4]). In the communication-based and scheduled approaches, models are independent (Dunlap et al., 2013[3], Lu 2011[4]). The communication-based approach requires embedding library calls within the model's code for sending and receiving data (Dunlap et al., 2013[3]). In the scheduled approach the output from one model is used as an input to the next one, so the models do not affect each other during execution (Dunlap et al., 2013[3]).
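The component-based approach described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen for illustration only: the `Component` interface, the two toy models (`RechargeModel`, `GroundwaterStore`), the `run_coupled` function and the variable names are assumptions, not the API of OpenMI or of any other coupler. It shows only the two basic coupler functions: coordinating execution and transferring data between components at each time step.

```python
from typing import Dict, Protocol


class Component(Protocol):
    """Hypothetical standard interface that every couplable model implements."""

    def update(self, dt: float) -> None: ...
    def get_value(self, name: str) -> float: ...
    def set_value(self, name: str, value: float) -> None: ...


class RechargeModel:
    """Toy model (assumption): supplies a constant recharge rate."""

    def __init__(self, rate: float) -> None:
        self._values: Dict[str, float] = {"recharge": rate}

    def update(self, dt: float) -> None:
        pass  # the rate is constant, so there is nothing to advance

    def get_value(self, name: str) -> float:
        return self._values[name]

    def set_value(self, name: str, value: float) -> None:
        self._values[name] = value


class GroundwaterStore:
    """Toy model (assumption): head rises with recharge and drains linearly."""

    def __init__(self, head: float, drain_coeff: float) -> None:
        self._values: Dict[str, float] = {"head": head, "recharge": 0.0}
        self._drain_coeff = drain_coeff

    def update(self, dt: float) -> None:
        inflow = self._values["recharge"]
        outflow = self._drain_coeff * self._values["head"]
        self._values["head"] += (inflow - outflow) * dt

    def get_value(self, name: str) -> float:
        return self._values[name]

    def set_value(self, name: str, value: float) -> None:
        self._values[name] = value


def run_coupled(source: Component, target: Component,
                variable: str, n_steps: int, dt: float) -> None:
    """Minimal coupler: advance both models and pass one variable each step."""
    for _ in range(n_steps):
        source.update(dt)
        target.set_value(variable, source.get_value(variable))  # data transfer
        target.update(dt)


if __name__ == "__main__":
    store = GroundwaterStore(head=4.0, drain_coeff=0.5)
    run_coupled(RechargeModel(rate=3.0), store, "recharge", n_steps=2, dt=1.0)
    print(store.get_value("head"))
```

Because the coupler only relies on the shared interface, either toy model could be swapped for another component that implements the same three methods; the cost is that each model must be restructured to expose them, which is the ‘intrusiveness’ referred to above.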

The coupling technologies can be formally divided into coupling libraries, coupling frameworks, and workflows (Lawrence et al., Manuscript[2], Dunlap et al., 2013[3]). Libraries provide concrete solution fragments (Lawrence et al., Manuscript[2]); they minimise the amount of code changes required to make a model couplable, typically allowing it to act as an independent executable and merely to exchange data at appropriate locations and times (Dunlap et al., 2013[3]). Frameworks use standard interfaces for communication with the components, which must comply with the interfaces' calling conventions (Dunlap et al., 2013[3]). Consequently, the components must be structured in accordance with a predefined architectural design (Dunlap et al., 2013[3]). Workflow engines are non-intrusive tools that allow components to remain independent, solely coordinating the exchange of data (Lawrence et al., Manuscript[2]). There are significant overlaps between the technologies and they are often used in tandem (Lawrence et al., Manuscript[2]). Based on the level of integration between the components, the coupling can be defined as either ‘tight’ or ‘loose’ (Goodall et al., 2011[5]). In summary, while all couplers have the same basic functions, they differ in the level of component standardisation, the way the components are called and exchange data, and the degree to which they are integrated.
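The scheduled, workflow-style end of this spectrum can be sketched in a few lines of Python. The two model functions, their coefficients and the `run_workflow` helper are all hypothetical, chosen for illustration and not taken from any workflow engine. Each model runs to completion as an independent step and its output becomes the input of the next, so there is no feedback between models during execution.

```python
from typing import Callable, List, Sequence

# A workflow step takes a time series and returns a new time series.
Step = Callable[[List[float]], List[float]]


def rainfall_to_runoff(rainfall: List[float]) -> List[float]:
    """Hypothetical first model: a fixed fraction of rainfall becomes runoff."""
    return [0.5 * r for r in rainfall]


def runoff_to_cumulative_flow(runoff: List[float]) -> List[float]:
    """Hypothetical second model: cumulative runoff as a crude flow proxy."""
    total = 0.0
    flow = []
    for r in runoff:
        total += r
        flow.append(total)
    return flow


def run_workflow(steps: Sequence[Step], data: List[float]) -> List[float]:
    """Run each model in turn, handing its complete output to the next one."""
    for step in steps:
        data = step(data)
    return data


if __name__ == "__main__":
    result = run_workflow([rainfall_to_runoff, runoff_to_cumulative_flow],
                          [2.0, 4.0, 0.0])
    print(result)
```

Neither model needs to know about the other or to be restructured, which is why this style is described as non-intrusive; the trade-off is that a downstream model cannot influence an upstream one within a run.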

A large number of coupling technologies have been developed to date, which may appear to be a redundant effort. However, this is not the case, as different approaches address different and often conflicting demands, such as generality, flexibility, ease of use, accuracy, and performance (Jagers 2010[6]).

Coupler use cases and requirements gathered from BGS staff

In 2010 the BGS produced the DREAM scoping study report (Giles et al., 2010[7]). As part of the research for that report, a wide range of BGS scientists responsible for answering questions raised by clients were asked what they required from a model linkage solution. A selection of their responses is given below:

“As a geologist focussed on the urban environment I want an environmental modelling platform to act as an effective communication tool, perhaps through visual representations of processes, so that others, including non-geologists, can better understand the model.”
“As a geoscience standards and property team member I want to be able to calculate the financial implications of varying sub-surface project options, for example 'where is the cheapest place to dig this tunnel?', so that our customers (& potential customers) understand the significance and benefits of sub-surface knowledge.”
“As a geophysicist I want an environmental modelling platform to handle high volumes of data traffic on a regular and ongoing basis, so that I can process real time data from the field or sensors, automatically model it and I & customers can view the results and identify trends.”
“As a flood analyst, I want to predict possible flood scenarios for the village over the next 24 hours using various inputs such as rainfall, groundwater, water table levels, so that decision makers can be given the info necessary to decide whether the village should be evacuated.”

At the time these use cases were captured, the imagined solution was referred to as an environmental modelling platform, and opinions varied greatly on how much functionality would be delivered through the new platform and which existing components would be re-used. Despite significant differences in opinion, it was possible to identify a common set of desirable attributes that any solution should exhibit.

Commonly desirable model coupling technology attributes

There is an almost bewildering choice of methodologies, technologies and tools available to integrated environmental modelling (IEM) practitioners. However, there are some concepts which we regard as desirable.

The IEM technologies used by the BGS should incorporate the following attributes:

  • Ability to link models in a modular way. Rather than developing a single piece of code (model) that incorporates data manipulation and scientific logic, we should encourage developers to separate out these functions so that they can be used in more than one scenario.
  • Visual workflow builders open up the world of linked model development to users with little to no programming experience. However, care should be taken to ensure that any assessment of the performance of a linked model solution fully considers the impact of the technological implementation as well as the scientific logic; this becomes difficult when the user does not fully understand how a technology works behind the scenes.
  • It should be simple to capture the metadata required to describe scientific models, the data they require and any data outputs generated, in order to support model discovery and provide guidance on how to use the model(s).
  • Coupling technologies which exhibit a low degree of invasiveness tend to have less of a negative impact on the performance of existing models. Extensive alterations can lead to code divergence and may adversely affect the original model design or purpose. In addition, alterations made for one technology can limit model re-use in alternative technologies.
  • Technologies with significant community support give potential users confidence that help is at hand should it be needed. The BGS should pay particular attention to the technologies favoured by communities who specialise in those areas of science we wish to integrate with.
  • Finally, a ‘stable’ or clearly versioned technology provides the user with a degree of certainty that does not exist in rapidly changing environments. Models and linked models can be assessed for their scientific value without the added confusion of a transient informatics platform. Although the technology should be stable, it is also desirable that there is an active, albeit separate, development path which helps to improve the technology in response to community needs.
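As a rough illustration of the metadata attribute in the list above, the sketch below records a model's name, version, required inputs and generated outputs, and uses that record to check which variables two models could exchange. The `ModelMetadata` class, the `linkable_variables` function and all model and variable names are hypothetical; this is a sketch under those assumptions, not an existing metadata standard.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ModelMetadata:
    """Hypothetical minimal description of a model for discovery and linking."""

    name: str
    version: str                 # a clearly versioned model supports stable reuse
    inputs: FrozenSet[str]       # variables the model requires
    outputs: FrozenSet[str]      # variables the model generates


def linkable_variables(upstream: ModelMetadata,
                       downstream: ModelMetadata) -> List[str]:
    """Return the variables that upstream produces and downstream consumes."""
    return sorted(upstream.outputs & downstream.inputs)


if __name__ == "__main__":
    recharge_model = ModelMetadata(
        name="example-recharge", version="1.0",
        inputs=frozenset({"rainfall"}),
        outputs=frozenset({"recharge", "runoff"}),
    )
    groundwater_model = ModelMetadata(
        name="example-groundwater", version="2.1",
        inputs=frozenset({"recharge", "abstraction"}),
        outputs=frozenset({"head"}),
    )
    print(linkable_variables(recharge_model, groundwater_model))
```

Keeping such a description separate from the model code means it can be captured without altering the model itself, which is consistent with the preference for low invasiveness.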

Structure of these articles

The following articles describe in detail the dynamic (run-time) coupling approaches used in the atmospheric and hydrological sciences, followed by a summary of data standards for one-way, static transfer of data. Article OR/14/022 Comparison of approaches compares the different approaches, and the findings are summarised in OR/14/022 Summary and recommendations, which also provides recommendations for the next stage of work.

References

  1. VALCKE, S, BALAJI, V, CRAIG, A, DELUCA, C, DUNLAP, R, FORD, R W, JACOB, R, LARSON, J, O'KUINGHTTONS, R, RILEY, G D and VERTENSTEIN, M. 2012. Coupling technologies for Earth System Modelling. Geoscientific Model Development, 5, 1589–1596.
  2. LAWRENCE, B N, BALAJI, V, CARTER, M, DELUCA, C, EASTERBROOK, S, FORD, R, HUGHES, A and HARDING, R. Manuscript. Bridging Communities: Technical Concerns for Integrating Environmental Models.
  3. DUNLAP, R, RUGABER, S and LEO, M. 2013. A feature model of coupling technologies for Earth System Models. Computers and Geosciences, 53, 13–20.
  4. LU, B. 2011. Development of A Hydrologic Community Modeling System Using A Workflow Engine. PhD thesis, Drexel University.
  5. GOODALL, J L, ROBINSON, B F and CASTRONOVA, A M. 2011. Modeling water resource systems using a service-oriented computing paradigm. Environmental Modelling and Software, 26, 573–582.
  6. JAGERS, H R A. 2010. Linking Data, Models and Tools: An Overview. International Congress on Environmental Modelling and Software, Modelling for Environment's Sake, Fifth Biennial Meeting, Ottawa, Canada.
  7. GILES, J R A, et al. 2010. Data, and research for applications and models (DREAM): scoping study report.