OR/14/042 Appendix 3 – Summary of breakout groups

Royse, K R, and Hughes, A G (editors). 2014. Meeting Report: NERC Integrated Environmental Modelling Workshop (Held at the British Geological Survey, Keyworth, 4–5 February). British Geological Survey Internal Report, OR/14/042.

Q1: In the future most models will at some point in their life-cycle need to be linked to other models. What needs to be achieved in order for this to happen?

Group 1

  • The idea that platforms are heading towards a system was introduced
  • Qu. - What is your definition of a platform?
  • Basis/framework starting point
  • An environment in which one can explore data/using models and other tools
  • Common systems designed to link model components
  • Lots of models, wires for linking up visualising results, platform upon which to play with models
  • A modelling platform is a structure that allows model components to hang together and communicate, the sum of which is a recognisable tool to tackle a big problem
  • A system that facilitates the integration of model components and environmental datasets into a framework that can deliver the outputs required by users including an objective assessment of the uncertainties associated with the output
  • Hardware and software infrastructure with defined set of standards for interaction between code and datasets. Could also be populated with models and data required
  • Infrastructure – servers, web access (cloud), accessibility
  • Software – operating systems, security, interoperability, connectivity, data
  • Definitions – (ontology & semantics), workflow
  • User interface – inputs, outputs, adaptability
  • A (computer-based) infrastructure for users to engage with models, data and tools for the processing and visualisation of results… and/or… a toolbox

How do we get to the next stage in order to achieve the vision?

  • Platform needs to solve a problem
  • Without a problem it is difficult to construct one
  • Do we need several platforms?
  • Are Met Office models platforms?
  • Not necessarily one all-singing, all-dancing platform — the principle is not affected by the scale of the problem. Current infrastructure is not there for linking/running models in a plug-and-play style
  • Not only need a framework but also a set of tools/visualisation
  • Platforms open up modelling to the wider community (social sci/economist/non-specialist)
  • There is a list of simple questions that will require a lot of work to answer
  • We are not the only people looking at this — Medical
  • Generic platform with different views
  • Provision of metadata must exist to provide info to non-specialist users. This would have to be tailored to different communities (Households/policy makers).
  • Different layers of metadata
  • Metadata is essential if impacts are unexpected
  • Can then construct a model-chain
  • Platform needs to generate its own metadata
  • Version data/library is important
  • Liability and traceability are important (litigation drives things in the US)
  • We need to be more rigorous with linked components
  • Artificial Intelligence for metadata
  • Quality and Uncertainty must form a part of the platform
  • Scaling is important as different properties may emerge at different scales. These may be unexpected where multiple systems are interacting
  • Platform should have the tools to explore impacts
  • Platform as a means of bridging scales
  • Scale bridging needs to be done in an intelligent way
  • Built in checks are important to tell you when something is not scaled properly
  • NERC or EPSRC needs to start a research program into integrated modelling
  • Just because two components are the same — should they be linked?
  • Probably not
  • We need to have a conceptual framework behind the platform (a set of rules and a checklist)
  • CSDMS has a tool where you pick a component and that guides you through the process
  • OpenMI does not impose any constraints on the modelling you do; however, real-world links need to be checked by the user (see the sketch after this list)
  • Requirement for a research program where a working group should establish some of the biggest/best questions to be answered
  • Is a platform liked by/useful for scientists?
  • Scientists benefit from a platform by developing the conceptual model
  • Scientists can free up time to spend on other research
  • We are at the start of a large learning curve — IEM is difficult
  • Platforms help bridge the gap and bring down the required skill levels
  • e.g. Google Earth is easy — the early GIS on which it is based are difficult
  • If barriers are in place for scientists to make their models platform compliant, they won’t
  • Computing infrastructure has matured to a point where we can make better progress
  • We do however need to make things slick so that it is used
  • Culture barrier in academia to develop something original — are platforms original?
  • Find it hard to publish platform output
  • Find it difficult to get funding — seen as risky
  • Not really platform specific
  • How do you QA model output from a platform?
  • Role for research centres
  • NERC Training up users
  • Why are you building a platform/community?
  • Need to feel like you are part of something
  • Scientists want their output used
  • We have to do the boring things to get things working
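
As a concrete illustration of the 'plug-and-play' linking discussed above, the sketch below shows two toy model components exchanging values through a small shared interface, with a simple driver playing the role of the platform. It is a minimal sketch only: the interface, component names and numbers are hypothetical and do not represent the OpenMI or CSDMS APIs, and the scientific validity of any link still has to be checked by the user.

<syntaxhighlight lang="python">
# Minimal sketch of plug-and-play linking of model components via a shared
# interface. All names and numbers are hypothetical; this is NOT the OpenMI
# or CSDMS API, only an illustration of the idea.
from typing import Protocol


class Component(Protocol):
    def update(self, t: float) -> None: ...                    # advance the model to time t
    def get_value(self, name: str) -> float: ...                # expose an output quantity
    def set_value(self, name: str, value: float) -> None: ...   # accept an input quantity


class RainfallModel:
    """Toy upstream component producing a rainfall rate."""
    def __init__(self) -> None:
        self.rain = 0.0

    def update(self, t: float) -> None:
        self.rain = max(0.0, 5.0 - t)   # arbitrary synthetic forcing

    def get_value(self, name: str) -> float:
        return self.rain

    def set_value(self, name: str, value: float) -> None:
        pass   # no inputs


class RunoffModel:
    """Toy downstream component turning rainfall into runoff."""
    def __init__(self) -> None:
        self.rain_in = 0.0
        self.runoff = 0.0

    def update(self, t: float) -> None:
        self.runoff = 0.6 * self.rain_in   # crude fixed runoff coefficient

    def get_value(self, name: str) -> float:
        return self.runoff

    def set_value(self, name: str, value: float) -> None:
        self.rain_in = value


def run_chain(upstream: Component, downstream: Component, steps: int) -> None:
    """Driver: the 'platform' only moves data across the agreed interface."""
    for t in range(steps):
        upstream.update(float(t))
        downstream.set_value("rainfall", upstream.get_value("rainfall"))
        downstream.update(float(t))
        print(t, downstream.get_value("runoff"))


run_chain(RainfallModel(), RunoffModel(), steps=5)
</syntaxhighlight>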

Group 2

  • Qu - What is your priority in getting a platform together?
  • Dog wagging tail, i.e. questions drive system not a system developed for its own sake
  • Needs to be a science problem
  • Forums for linking scientists — a forum with model scientists
  • Need to define users and what they need — should this be left to NERC?
  • Links to needing a strong science-policy interface
  • Should they be specially trained?
  • Policy makers don’t have the time to pose the best questions
  • Scientists don’t know the pressure on policy makers to answer questions
  • Feeds in to defining your question
  • Need to set an output that is attainable and need a question that brings people together
  • Assumption we are making is that the questions need IEM to solve
  • If the question is not defined — then difficult to answer
  • Flexibility is key — use of software/hardware can be changed easily to answer different questions using the same platform.
  • Must supply as much information as possible to the user
  • Allows the user to trust the model
  • Security is important
  • Needs to be efficient — changing who is in control
  • Learning from other users
  • How do you motivate scientists to document it properly — strengths and weaknesses
  • Have to change the reward system for scientists to do this?
  • Version control software (got to trust system)
  • Education is required
  • Can cause damage if this is not done
  • IPR — are all models going to be open source
  • Could just make the output available and not necessarily the input/model data
  • Access to the code makes you feel better
  • Commercial side may not allow you to publish/release data/code
  • A universal way of providing feedback for a model
  • A way of versioning so that mistakes in the code don’t make their way back into the code-proper. Maybe have a trunk version?
  • Academic — Commercial interface is a difficult area in which to operate.
  • If one part of the model chain is commercial, does that impose IPR issues on the rest of the platform output?
  • Encryption on links — e.g. the insurance industry
  • The industry is moving towards open source — the data that drives the models will however remain hidden
  • Problems with data access/licensing (strategic UK datasets especially)
  • But there are some areas where access is good (e.g. WRF)
  • Variability in data access is a problem
  • Can we scope what resources will be needed
  • High end — Airbus has whole teams
  • Low end — couple of people
  • This QA is undertaken by making code open source
  • There is a spectrum of users/developers that code with good-will

Group 3
The previous groups' findings were introduced [listed above] and the group was asked the question — what do you think?

  • Scale of question can drive what hardware you need (e.g. laptop to solve dredging problem and HPC to solve UK flooding problem)
  • Who are the modellers and who are the scientists — mostly researchers are both
  • Danger of underestimating the answer to the question
  • You can end up with a large modelling platform to answer a small problem which wastes resources
  • A small number of platforms would be a better approach
  • What we can do now with technology has not been clearly described — policy makers don’t know what is possible
  • Strengths of this group — Atmosphere/Earth surface modellers in one group
  • Two drivers
  • Scientific Problems that can’t be solved (need big spending on resources and less on people)
  • End user problems (needs people and less on resources)
  • Both important but separate and require different approaches
  • Need to be clearer about what the driver is guiding
  • We now have a set of tools and if we combine them we may have a platform
  • Should the NERC definition of a platform (ship/plane/etc.) be used here?
  • Large piece of infrastructure that only works on the bigger scale
  • People resources are the bottleneck that is holding things up
  • We could end up with some new platforms if we brought stuff we already have together
  • Should we be working more closely with other organisations, or use competition to drive innovation?
  • What turns a workflow into a platform?
  • Who is the user?
  • Science — need a good question
  • Corporate — need a good partner
  • Governance
  • 'Platforms' rather than a single 'Platform' is important
  • Modular structure is important — how do we get missing model components?
  • Is IEM value for money — in terms of people or resources
  • Needs to be future proofed in terms of modelling and data structures

Q2: Encourage the development of modelling platforms. What needs to be done to enable this to happen?

There are a number of challenges to achieving this goal:

  • A significant amount of work is required!
  • Consideration should be given as to whether all models need to be made linkable
  • For example, some should be used ‘stand alone’

But — a modular approach is a good approach

  • Start with new framework
  • Conceptual & predictive — may need different frameworks

The next steps are as follows:

  • Categorise models
  • Focus on questions to solve
  • Perhaps choose a generic framework/questions (flooding)
  • Standards need to be developed and/or existing ones used for the following:
  • describing models, i.e. metadata
  • data in/out (flat files as well as runtime linking)
  • computing environment
  • Motivation for adopting standards — this needs to be considered
  • Become market driver — can also hold back.
  • Open standard! (community contribute) — these need to be clearly documented
  • CF standard — example
  • Categorise standards
  • Level of integration
  • Standards at each level — taxonomy of standards — processes in 1 model can provide the input parameter of another
  • Models to exchange processes — time consuming
  • Metadata standards are including assumptions
  • Think about how & what models are exchanging (map/flow diagram)
  • Financial — bar code idea (useful?)
  • Work on different areas & harmonise
  • How to find a common denominator
  • Realistic standards that a number of communities can meet
  • Functional difference — how much — very different problems
  • Standardisation versus variety
  • Common group? — time stepping
  • Break off ‘big’ categories
  • Mesh/grid independent
  • Spatially resolved — geographically represented in space
  • e.g. climate system — but what else?
  • Decision support — env/economic/social
  • Where feedback — more complex
  • Understand where existing models are failing
  • Demonstrate better model
  • e.g. Somerset levels
  • Linking more systematic & robust
  • Repeatability
  • Validation
  • Interoperability
  • Extensibility
  • Model Metadata
  • Allows taking out tight coupling — for example
  • Model description (assumptions, emulators etc.)
  • Is model right for right reasons?
  • Effort to describe & populate metadata — time
  • Some standards available e.g. ISO 19115
  • Drivers for metadata
  • Legal
  • Funding
  • Citation
  • Community models
  • High level s/w for metadata capture
  • Software engineering skills required
  • Traceability of incorporated models
  • Most models simple in essence
  • History
  • Can data pick up metadata as it goes through chain?
  • S/w should maintain an audit trail (a sketch follows this list)
  • Useful to have
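
To illustrate the point above that data could pick up metadata as it goes through the chain, the sketch below wraps a toy model step so that the software itself appends a lineage record at each stage and so maintains the audit trail. It is a minimal sketch under stated assumptions: the models and field names are hypothetical and only loosely inspired by the lineage concepts in standards such as ISO 19115.

<syntaxhighlight lang="python">
# Minimal sketch of automatic metadata/audit-trail capture as data passes
# through a model chain. Models and field names are hypothetical and only
# loosely inspired by lineage concepts in standards such as ISO 19115.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TrackedData:
    values: list[float]
    lineage: list[dict] = field(default_factory=list)   # accumulated audit trail

    def record(self, model: str, version: str, assumptions: str) -> None:
        self.lineage.append({
            "model": model,
            "version": version,
            "assumptions": assumptions,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })


def rainfall_to_runoff(data: TrackedData) -> TrackedData:
    """Toy model step: the wrapper records metadata so the user does not have to."""
    out = TrackedData(values=[0.6 * v for v in data.values],
                      lineage=list(data.lineage))          # carry the trail forward
    out.record(model="toy_runoff", version="0.1",
               assumptions="fixed runoff coefficient of 0.6")
    return out


source = TrackedData(values=[2.0, 5.0, 1.5])
source.record(model="toy_rain_ingest", version="0.1", assumptions="no gap filling")
result = rainfall_to_runoff(source)
for entry in result.lineage:                                # the audit trail travels with the data
    print(entry["model"], entry["version"], entry["assumptions"])
</syntaxhighlight>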

Summary:

  • Resource
  • Incentivise
  • Taxonomy of Standards
  • Metadata
  • Automatic capture
  • Skills

Group 1, comments on previous discussion:

  • Overlap with frameworks
  • Platform/interface distinction

  • Parameters/plumbing — horses for courses
  • Parameters — in metadata
  • Clear ontology
  • What in/out
  • What operation performed
  • Coding expert knowledge in to get meaningful result
  • Incentivisation
  • Funders to require compatibility
  • Establish critical mass — NERC/EPSRC e.g.
  • Encourage linkages between platforms
  • Description of model
  • Look at other systems — e.g. human body
  • Similar discussion in medical model — link
  • Need RCUK level input
  • Skills
  • S/W expertise to assist metadata
  • How to support legacy models of E.O. (Earth Observation)

  • Metadata
  • What do you need for standards to communicate
  • IPR issues
  • Open data to solve?
  • Free financial models
  • But input data may have different IPR for linked models
  • Conflict with commercial exploitation
  • Issue of commercial use of data/models
  • Political problem — address
  • Description — taxonomy of standards

  • Make standards work at a particular level — don't get too far ahead of the taxonomy
  • Parameter types (A.B.C…) — what dataset/models can I link to
  • ESMF — e.g.
  • What is the standard for?
  • Investment in current standard — period of stability

Summary from Group 1 & 2 on Question 1:

  • Need for minimum standard — model metadata — (plus understanding model)
  • ‘Taxonomy’ of standards — need to adopt/meet that which best ‘fits’
  • Link model standards
  • Focus on interface
  • Multiple agencies using same ‘framework’
  • Incentivisation (RCUK)
  • Critical mass
  • IPR
  • Skills
  • Gain momentum

Standards:

  • Benchmarking — model performance (e.g. water & energy balance); see the sketch after this list
  • Capturing scientific reality/accuracy of final model — getting it right
  • Danger of ‘plug & play’ — users can make inappropriate links
  • Avoid ‘bolting together’ — interactive process
  • Include users' experience of using version 'x'
  • Do you always need to fully couple — is the effort worthwhile?
  • IPR — value still in being expert on your model
  • Analysis of pros/cons of open access
  • ‘Idiots’ guide to standards
  • Horizon 2020 — funding for standards
  • Exemplars — could examine benefits & time for linking – advantage in dynamic linking?
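
As an illustration of benchmarking model performance, the sketch below computes the Nash–Sutcliffe efficiency, a widely used skill score for comparing simulated and observed series (1.0 is a perfect fit). The data values are invented for illustration; they do not come from any model discussed at the workshop.

<syntaxhighlight lang="python">
# Minimal sketch of a benchmarking metric: Nash-Sutcliffe efficiency (NSE).
# The observed/simulated values below are invented for illustration.

def nash_sutcliffe(observed: list[float], simulated: list[float]) -> float:
    """NSE = 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


observed_flow = [1.2, 2.5, 3.1, 2.0, 1.4]    # e.g. observed river flow
simulated_flow = [1.0, 2.7, 2.9, 2.2, 1.3]   # e.g. model output for the same period
print(f"NSE = {nash_sutcliffe(observed_flow, simulated_flow):.2f}")
</syntaxhighlight>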

Q3: Assessing and quantifying uncertainty. What needs to be done to enable this to happen?

Assessing & Quantifying Uncertainty in IEM:
  • Ensembles & models (ISIMIP)
  • Climate & impact models
  • Range of uncertainties for predictions
  • One component of uncertainty

  • Structural also parameters
  • Users need different measures
  • Can be changed parameters or input data ranges
  • Hydromodels — flood forecasting/drought
  • Scenario uncertainty (boundary conditions)
  • Need to track uncertainty
  • Comparison with observation (real-time or historic)
  • Must occur at each model
  • Especially at interfaces
  • Extreme events

Different Models will be responsive to mean or extremes: Smoothing determines lack of

  • Coupled models scales
  • Bayesian
  • Model assumptions
  • Uncertainty measures
  • Metrics relevant to end-users
  • 2 types of structural uncertainty
  • Spatial configuration
  • Time-series variability
  • Kinds of structural uncertainty
  • States that particular models can reach depending on input parameters
  • Changes to equations
  • Presenting uncertainty: should we present it without devaluing values?

Coupling of many models makes understanding the 'modelling chain' a largely redundant concept

  • Will ensembles help
  • Incorporating model use risks massive uncertainty
  • Remediated by collaboration
  • Open-source models facilitate checking, but also misuse
  • Uncertainty as a flag for maturity
  • Metadata must include boundary conditions & fundamental limitations of models
  • Codification of other people's model limitations to ensure the model used is fit for purpose
  • Are there problems that can be solved better if more models are included?
  • People will assess their own uncertainty
  • Uncertainty understanding required by model developers rather than end-users
  • Uncertainty used to assess usefulness of models

Data assimilation for models not communicating to end-users

  • Statistical assessment of models required
  • Model outputs need to be tested
  • Can we have scientific (e.g. geophysical) interface to allow us to test components?
  • Lack of ‘geophysical zippers’ as scale determines how this can be done

Bayesian uncertainty requires understanding of prior weighting/rating (the 'arbitrariness')
Coupling: uncertainty in how you couple models, as it will otherwise get too complex

  • Coupling different time??
  • Bias corrections

Observation System simulation experiments

  • How do observations bias results?
  • Observation biases underplayed
  • Difficulties measuring extremes
  • Time steps can multiply uncertainty

Principles

  1. User context dependent: End-users will determine what they need
  2. Effect of uncertainty calculations on interactions
  3. Effect of interactions on uncertainty
  4. Exponential increase in work
  5. Assumptions & metrics of uncertainty
  6. Not uncertainty that’s improved, but confidence
  7. To what extent do people write extra applications to manage uncertainty around core equations?
  8. Tracking parameter uncertainties through each element of the modelling process (see the sketch after this list)
  9. Sensitivity – Is uncertainty of parameter important to problem being tested (e.g. relevant scales)
  10. Limits where model results cannot be exceeded
  11. Need component uncertainty to understand linked system
  12. Scenario uncertainty/intrinsic chaos/how well does model represent processes?
  13. Emulators to assess uncertainty
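
As a minimal illustration of point 8 above (tracking parameter uncertainties through each element of the modelling process), the sketch below propagates assumed parameter distributions through a toy two-component chain by Monte Carlo sampling, so that the spread of the linked-system output reflects both components' uncertainty. The components, parameters and distributions are invented for illustration only.

<syntaxhighlight lang="python">
# Minimal Monte Carlo sketch of propagating parameter uncertainty through a
# two-component model chain. All components, parameters and distributions are
# invented for illustration.
import random
import statistics

random.seed(1)


def rainfall_model(intensity_factor: float) -> float:
    return 10.0 * intensity_factor         # toy rainfall in mm


def runoff_model(rain_mm: float, runoff_coeff: float) -> float:
    return rain_mm * runoff_coeff          # toy runoff in mm


samples = []
for _ in range(10_000):
    # Sample the uncertain parameter of each component from an assumed distribution.
    intensity_factor = random.gauss(1.0, 0.2)    # upstream parameter uncertainty
    runoff_coeff = random.uniform(0.4, 0.8)      # downstream parameter uncertainty
    samples.append(runoff_model(rainfall_model(intensity_factor), runoff_coeff))

# The spread of the linked-system output reflects both components' uncertainty.
ranked = sorted(samples)
print(f"mean runoff   : {statistics.mean(samples):.2f} mm")
print(f"std deviation : {statistics.stdev(samples):.2f} mm")
print(f"5th-95th pct  : {ranked[500]:.2f} to {ranked[9500]:.2f} mm")
</syntaxhighlight>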

Issues to address:

  1. Integral assumptions can be documented in metadata
  2. Document uncertainties in results separately — dynamic
  3. Understanding of statistics in deterministic models — mismatch between the statistics and the modeller
  4. Explore workflow engines — identify best fit-for-purpose
  5. Knowledge of tools from other sectors
  6. Do uncertainties compound? Feedback
  7. Communication to end-users
  8. Some uncertainties cannot be meaningfully calculated

Confidence — has uncertainty been minimised in the sense that the best estimate has been made, rather than the variability of output parameters?

  • End-users want to know that underpinning data is included, not ignored, or that the range is within the same limits
  • Clear vocabulary of uncertainties
  • Process of assessing uncertainty more important than value
  • Known unknowns
  • Statistical distribution rather than statements
  • Can interfaces cope with additional uncertainty information
  • Computational environment creating uncertainty through calculation changes or precision
  • Metrics of how well modelled
  • Observational evidence — to understand system & deal with extremes
  • Mismatch between observers & modellers
  • Need to look at uncertainties between components across interfaces, within interfaces & whole system
  • Training to understand uncertainties as important as training in the IT
  • Make clear that quantifying models does not increase their value
  • Need to understand users' perception of risk