OR/14/042 Appendix 3 – Summary of breakout groups
Royse, K R, and Hughes, A G (editors). 2014. Meeting Report: NERC Integrated Environmental Modelling Workshop (Held at the British Geological Survey, Keyworth, 4–5 February). British Geological Survey Internal Report, OR/14/042.
Q1: In the future most models will at some point in their life-cycle need to be linked to other models. What needs to be achieved in order for this to happen?
Group 1
- The idea that platforms are heading towards a system was introduced
- Question: What is your definition of a platform?
- Basis/framework starting point
- An environment in which one can explore data/using models and other tools
- Common systems designed to link model components
- Lots of models, wires for linking them up, visualising results; a platform upon which to play with models
- A modelling platform is a structure that allows model components to hang together and communicate, the sum of which is a recognisable tool to tackle a big problem
- A system that facilitates the integration of model components and environmental datasets into a framework that can deliver the outputs required by users including an objective assessment of the uncertainties associated with the output
- Hardware and software infrastructure with a defined set of standards for interaction between code and datasets. Could also be populated with the models and data required
- Infrastructure – servers, web access (cloud), accessibility
- Software – operating systems, security, interoperability, connectivity, data
- Definitions – (ontology and semantics), workflow
- User interface – inputs, outputs, adaptability
- A (computer-based) infrastructure for users to engage with models, data and tools for the processing and visualisation of results… and/or… a toolbox
How do we get to the next stage in order to achieve the vision?
- Platform needs to solve a problem
- Without a problem, it is difficult to construct one
- Do we need several platforms?
- Are Met Office models platforms?
- Not necessarily one all-singing, all-dancing platform — the principle is not affected by the scale of the problem. Current infrastructure is not there for linking/running models in a plug-and-play style
- Not only need a framework but also a set of tools/visualisation
- Platforms open up modelling to the wider community (social sci/economist/non-specialist)
- There is a list of simple questions that will require a lot of work to answer
- We are not the only people looking at this — e.g. the medical community
- Generic platform with different views
- Provision of metadata must exist to provide info to non-specialist users. This would have to be tailored to different communities (Households/policy makers).
- Different layers of metadata
- Metadata is essential if impacts are unexpected
- Can then construct a model-chain
- Platform needs to generate its own metadata
- Version data/library is important
- Liability and traceability are important (litigation drives things in the US)
- We need to be more rigorous with linked components
- Artificial Intelligence for metadata
- Quality and Uncertainty must form a part of the platform
- Scaling is important as different properties may emerge at different scales. These may be unexpected where multiple systems are interacting
- Platform should have the tools to explore impacts
- Platform as a means of bridging scales
- Scale bridging needs to be done in an intelligent way
- Built-in checks are important to tell you when something is not scaled properly (see the interface-check sketch at the end of this group’s notes)
- NERC or EPSRC needs to start a research programme into integrated modelling
- Just because two components are compatible — should they be linked?
- Probably not
- We need to have a conceptual framework behind the platform (a set of rules and a checklist)
- CSDMS has a tool where you pick a component and that guides you through the process
- OpenMI does not impose any constraints on the modelling you do; however, real-world links need to be checked by the user
- Requirement for a research program where a working group should establish some of the biggest/best questions to be answered
- Is a platform liked by/useful for scientists?
- Scientist benefits from platform by developing the conceptual model
- Scientists can free up time to spend on other research
- We are at the start of a large learning curve — IEM is difficult
- Platforms help bridge the gap and bring down the required skill levels
- e.g. Google Earth is easy — the underlying GIS on which it is based is difficult
- If barriers are in place for scientists to make their models platform compliant, they won’t
- Computing infrastructure has matured to a point where we can make better progress
- We do however need to make things slick so that it is used
- Culture barrier in academia to develop something original — are platforms original?
- Find it hard to publish platform output
- Find it difficult to get funding — seen as risky
- Not really platform specific
- How do you QA model output from a platform?
- Role for research centres
- NERC Training up users
- Why are you building a platform/community?
- Need to feel like you are part of something
- Scientists want their output used
- We have to do the boring things to get things working
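Several of the points above (built-in scale checks, and OpenMI-style links that the user must verify) amount to automated checks at component interfaces. The following is purely an illustrative sketch: the names `ExchangeItem` and `check_link` are invented, not taken from OpenMI or any existing platform, but they show the kind of metadata-driven check the group discussed.

```python
from dataclasses import dataclass

@dataclass
class ExchangeItem:
    """What a model component offers or accepts at an interface."""
    quantity: str             # e.g. "groundwater_recharge"
    units: str                # e.g. "mm/day"
    grid_resolution_m: float  # horizontal resolution in metres
    timestep_s: float         # timestep in seconds

def check_link(source: ExchangeItem, target: ExchangeItem) -> list:
    """Return warnings for a proposed source-to-target link."""
    warnings = []
    if source.quantity != target.quantity:
        warnings.append(f"quantity mismatch: {source.quantity} vs {target.quantity}")
    if source.units != target.units:
        warnings.append(f"unit mismatch: {source.units} vs {target.units}")
    if source.grid_resolution_m != target.grid_resolution_m:
        warnings.append("grids differ: regridding (scale bridging) needed")
    if source.timestep_s > target.timestep_s:
        warnings.append("source timestep coarser than target: interpolation needed")
    return warnings

# Example: a recharge model feeding a groundwater model on a finer grid
recharge_out = ExchangeItem("groundwater_recharge", "mm/day", 1000.0, 86400.0)
gw_in = ExchangeItem("groundwater_recharge", "mm/day", 250.0, 3600.0)
for w in check_link(recharge_out, gw_in):
    print("WARNING:", w)
```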
Group 2
- Question: What is your priority in getting a platform together?
- The dog should wag the tail, i.e. questions drive the system; a system should not be developed for its own sake
- Needs to be a science problem
- Forums for linking scientists — a forum with model scientists
- Need to define users and what they need — should this be left to NERC?
- Links to needing a strong science-policy interface
- Should they be specially trained?
- Policy makers don’t have the time to pose the best questions
- Scientists don’t know the pressure on policy makers to answer questions
- Feeds in to defining your question
- Need to set an output that is attainable and need a question that brings people together
- Assumption we are making is that the questions need IEM to solve
- If the question is not defined — then difficult to answer
- Flexibility is key — use of software/hardware can be changed easily to answer different questions using the same platform.
- Must supply as much information as possible to the user
- Allows the user to trust the model
- Security is important
- Needs to be efficient — changing who is in control
- Learning from other users
- How do you motivate scientists to document it properly — strengths and weaknesses
- Have to change the reward system for scientists to do this?
- Version control software (need to trust the system)
- Education is required
- Can cause damage if this is not done
- IPR — are all models going to be open source?
- Could just make the output available and not necessarily the input/model data
- Access to the code makes you feel better
- The commercial side may not allow you to publish/release data/code
- A universal way of providing feedback for a model
- A way of versioning so that mistakes in the code don’t make their way back into the code-proper. Maybe have a trunk version? (See the version-pinning sketch at the end of this group’s notes)
- Academic — Commercial interface is a difficult area in which to operate.
- If one part of the model chain is commercial, does that impose IPR issues on the rest of the platform output?
- Encryption on links — e.g. the insurance industry
- The industry is moving towards open source — the data that drives the models will however remain hidden
- Problems with data access/licensing (strategic UK datasets especially)
- But there are some areas where access is good (e.g. WRF)
- Variability in data access is a problem
- Can we scope what resources will be needed?
- High end — Airbus has whole teams
- Low end — couple of people
- This QA is undertaken by making code open source
- There is a spectrum of users/developers that code with goodwill
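One way to realise the ‘trunk version’ idea raised above is to pin reviewed component versions in a run manifest, so that untested changes cannot silently re-enter a published model chain. This is a hypothetical sketch; the component names, versions, and functions are all invented.

```python
# The reviewed "trunk" releases of each model component (invented values)
APPROVED_VERSIONS = {
    "rainfall_runoff": "2.1.0",
    "groundwater": "1.4.2",
}

def check_manifest(manifest: dict) -> None:
    """Raise if any component in a run manifest is not an approved release."""
    for component, version in manifest.items():
        approved = APPROVED_VERSIONS.get(component)
        if approved is None:
            raise ValueError(f"unknown component: {component}")
        if version != approved:
            raise ValueError(
                f"{component} {version} is not the approved trunk "
                f"release ({approved}); merge and review first"
            )

check_manifest({"rainfall_runoff": "2.1.0", "groundwater": "1.4.2"})  # passes
```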
Group 3
The previous groups’ findings were introduced [listed above] and the group was asked the question — what do you think?
- Scale of question can drive what hardware you need (e.g. laptop to solve dredging problem and HPC to solve UK flooding problem)
- Who are the modellers and who are the scientists — mostly researchers are both
- Danger of underestimating the answer to the question
- You can end up with a large modelling platform to answer a small problem which wastes resources
- A small number of platforms would be a better approach
- What we can do now with technology has not been clearly described — policy makers don’t know what is possible
- Strengths of this group — Atmosphere/Earth surface modellers in one group
- Two drivers
- Scientific Problems that can’t be solved (need big spending on resources and less on people)
- End user problems (needs people and less on resources)
- Both important but separate and require different approaches
- Need to be clearer about what the driver is guiding
- We now have a set of tools and if we combine them we may have a platform
- Could the NERC definition of platform (ship/plane/etc.) be used here?
- Large piece of infrastructure that only works on the bigger scale
- People resources are the bottleneck that is holding things up
- We could end up with some new platforms if we brought stuff we already have together
- Should we be working more closely with other organisations, or use the competition to drive innovation?
- What turns a workflow into a platform?
- Who is the user?
- Science — need a good question
- Corporate — need a good partner
- Governance
- ‘Platforms’, not ‘Platform’, is important
- Modular structure is important — how do we get missing model components?
- Is IEM value for money — in terms of people or resources?
- Needs to be future proofed in terms of modelling and data structures
Q2: Encourage the development of modelling platforms. What needs to be done to enable this to happen?
There are a number of challenges to achieving this goal:
- A significant amount of work is required!
- Consideration should be given as to whether all models need to be made linkable
- For example, some should be used ‘stand alone’
But a modular approach is a good approach:
- Start with new framework
- Conceptual & predictive — may need different frameworks
The next steps are as follows:
- Categorise models
- Focus on questions to solve
- Perhaps choose a generic framework/questions (flooding)
- Standards need to be developed and/or existing ones used for the following:
- describing models, i.e. metadata
- data in/out (flat files as well as runtime linking)
- computing environment
- Motivation for adopting standards — this needs to be considered
- Standards can become a market driver — but can also hold things back.
- Open standards! (community contributions) — these need to be clearly documented
- The CF (Climate and Forecast) standard is an example (see the netCDF sketch at the end of this list)
- Categorise standards
- Level of integration
- Standards at each level — a taxonomy of standards — processes in one model can provide the input parameters of another
- Models to exchange processes — time consuming
- Metadata standards should include assumptions
- Think about how & what models are exchanging (map/flow diagram)
- Financial — bar code idea (useful?)
- Work on different areas & harmonise
- How to find the common denominator
- Realistic standards that a number of communities can meet
- Functional differences — how much? — very different problems
- Standardisation versus variety
- Common ground? — time stepping
- Break off ‘big’ categories
- Mesh/grid independent
- Spatially resolved — geographically represented in space
- e.g. climate system — but what else?
- Decision support — env/economic/social
- Where there are feedbacks — more complex
- Understand where existing models are failing
- Demonstrate better model
- e.g. Somerset levels
- Linking more systematic & robust
- Repeatability
- Validation
- Interoperability
- Extensibility
- Model Metadata
- Allows taking out tight coupling — for example
- Model description (assumptions, emulators etc.)
- Is model right for right reasons?
- Effort to describe & populate metadata — takes time
- Some standards available e.g. ISO 19115
- Drivers for metadata
- Legal
- Funding
- Citation
- Community models
- High-level software for metadata capture
- Software engineering skills required
- Traceability of incorporated models
- Most models simple in essence
- History
- Can data pick up metadata as it goes through the chain?
- Software should maintain an audit trail
- Useful to have
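To make the CF standard mentioned above concrete: CF metadata attaches agreed `standard_name` and `units` attributes to each variable, so that a receiving model or platform can identify what it is being given. A minimal sketch using the netCDF4 Python library; the file name, variable choice, and values are illustrative only.

```python
from netCDF4 import Dataset

# Create a small CF-annotated file (illustrative; values invented)
ds = Dataset("rainfall.nc", "w")
ds.Conventions = "CF-1.8"                 # declare the convention used
ds.createDimension("time", None)
pr = ds.createVariable("pr", "f4", ("time",))
pr.standard_name = "precipitation_flux"   # agreed CF standard name
pr.units = "kg m-2 s-1"                   # agreed CF canonical units
pr[:] = [1.2e-5, 3.4e-5, 0.0]
ds.close()
```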
Summary:
- Resource
- Incentivise
- Taxonomy of Standards
- Metadata
- Automatic capture (see the audit-trail sketch below)
- Skills
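A sketch of the ‘automatic capture’ and audit-trail points above: a thin wrapper that appends a provenance record each time data passes through a component of the chain. Entirely illustrative; the wrapper, component names, and record fields are invented.

```python
from datetime import datetime, timezone

def run_with_audit(name, func, data, history):
    """Run one component and append a provenance record to the audit
    trail carried alongside the data."""
    result = func(data)
    history.append({
        "component": name,
        "when": datetime.now(timezone.utc).isoformat(),
        "input": repr(data),
    })
    return result, history

history = []
value, history = run_with_audit("recharge_model", lambda r: 0.3 * r, 800.0, history)
value, history = run_with_audit("gw_model", lambda r: r / 1000.0, value, history)
for record in history:
    print(record)
```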
Group 1, comments on previous discussion:
- Overlap with frameworks
- Platform/interface distinction
- Parameters/plumbing — horses for courses
- Parameters — in metadata
- Clear ontology
- What in/out
- What operation performed
- Coding expert knowledge in to get meaningful results
- Incentivisation
- Funders to require compatibility
- Establish critical mass — e.g. NERC/EPSRC
- Encourage linkages between platforms
- Description of model
- Look at other systems — e.g. human body
- Similar discussion in medical model — link
- Need RCUK level input
- Skills
- S/W expertise to assist metadata
- How to support legacy models of E.O.
- Metadata
- What do you need for standards to communicate
- IPR issues
- Open data to solve?
- Free financial models
- But input data may have different IPR for linked models
- Conflict with commercial exploitation
- Issue of commercial use of data/models
- Political problem — address
- Description — taxonomy of standards
- Make standards work at a particular level — don’t get too far ahead of the taxonomy
- Parameter types (A, B, C, …) — what datasets/models can I link to? (see the registry sketch at the end of this list)
- e.g. ESMF
- What is the standard for?
- Investment in current standard — period of stability
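The ‘parameter types (A, B, C, …)’ point above suggests a registry supporting ‘what datasets/models can I link to?’ queries. A hypothetical sketch; the parameter types and component names are invented.

```python
# A registry mapping parameter types to the components that can
# produce or consume them (all entries invented for illustration)
REGISTRY = {
    "A: river_flow": {"producers": ["hydrology_model"],
                      "consumers": ["flood_model", "water_quality_model"]},
    "B: recharge":   {"producers": ["soil_model"],
                      "consumers": ["groundwater_model"]},
}

def linkable_consumers(parameter_type):
    """Which components can consume a given parameter type?"""
    entry = REGISTRY.get(parameter_type, {})
    return entry.get("consumers", [])

print(linkable_consumers("A: river_flow"))  # ['flood_model', 'water_quality_model']
```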
Summary from Group 1 & 2 on Question 1:
- Need for minimum standard — model metadata — (plus understanding model)
- ‘Taxonomy’ of standards — need to adopt/meet that which best ‘fits’
- Link model standards
- Focus on interface
- Multiple agencies using same ‘framework’
- Incentivisation (RCUK)
- Critical mass
- IPR
- Skills
- Gain momentum
Standards:
- Benchmarking — model performance (e.g. water & energy balance; see the water-balance sketch at the end of this list)
- Capturing scientific reality/accuracy of final model — getting it right
- Danger of ‘plug & play’ — users can make inappropriate links
- Avoid ‘bolting together’ — interactive process
- Include users’ experience of using version ‘x’
- Do you always need to fully couple — is the effort worthwhile?
- IPR — value still in being expert on your model
- Analysis of pros/cons of open access
- ‘Idiot’s guide’ to standards
- Horizon 2020 — funding for standards
- Exemplars — could examine benefits & time for linking – advantage in dynamic linking?
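As an illustration of benchmarking against a water balance: a benchmark might require that precipitation minus evaporation, runoff, and storage change closes to within some tolerance. The function, numbers, and tolerance below are invented for illustration.

```python
def water_balance_residual(precip, evap, runoff, storage_change):
    """Residual of the catchment water balance P - E - Q - dS,
    all terms in the same units (here mm/yr)."""
    return precip - evap - runoff - storage_change

# A benchmark might require closure to within 1% of precipitation
residual = water_balance_residual(precip=900.0, evap=450.0,
                                  runoff=430.0, storage_change=15.0)
assert abs(residual) < 0.01 * 900.0, "water balance does not close to within 1%"
print(f"residual: {residual:.1f} mm/yr")
```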
Q3: Assessing and quantifying uncertainty. What needs to be done to enable this to happen?
Assessing & Quantifying Uncertainty in IEM:
Ensembles of models (e.g. ISIMIP), linking climate and impact models, give a range of uncertainties for predictions, but this is only one component of uncertainty:
- Structural uncertainty as well as parameter uncertainty
- Users need different measures
- Can be changes to parameters or to input-data ranges
- Hydrological models — flood forecasting/drought
- Scenario uncertainty (boundary conditions)
- Need to track uncertainty
- Comparison with observation (real-time or historic)
- Must occur at each model
- Especially at interfaces
- Extreme events
Different models will be responsive to the mean or to extremes; smoothing determines a lack of responsiveness to extremes
- Coupled model scales
- Bayesian
- Model assumptions
- Uncertainty measures
- Metrics relevant to end-users
- 2 types of structural uncertainty
- Spatial configuration
- Time-series variability
- Kinds of structural uncertainty
- States that particular models can reach depending on input parameters
- Changes to equations
- Presenting uncertainty — should we present it without devaluing the values?
Coupling of many models makes understanding the ‘modelling chain’ a largely redundant concept
- Will ensembles help?
- Incorporating model use risks massive uncertainty
- Remediated by collaboration
- Open-source models facilitate checking, but also misuse
- Uncertainty as a flag for maturity
- Metadata must include boundary conditions & fundamental limitations of models
- Codification of other people’s model limitations to ensure the models used are fit for purpose
- Are there problems that can be solved better if more models are included?
- People will assess their own uncertainty
- Uncertainty understanding required by model developers rather than end-users
- Uncertainty used to assess usefulness of models
Data assimilation for models is not communicated to end-users
- Statistical assessment of models required
- Model outputs need to be tested
- Can we have scientific (e.g. geophysical) interface to allow us to test components?
- Lack of ‘geophysical zippers’ as scale determines how this can be done
Bayesian uncertainty requires understanding of prior weighting/rating — the ‘arbitrariness’ (a minimal sketch follows)
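A minimal sketch of the Bayesian point: ensemble members are weighted by how well they match an observation, starting from an equal (and, as noted, somewhat arbitrary) prior. All numbers here are invented for illustration.

```python
import numpy as np

predictions = np.array([2.1, 2.9, 3.4, 5.0])  # one prediction per ensemble member
observation, obs_sigma = 3.0, 0.5             # observed value and its uncertainty

# Gaussian likelihood of the observation under each member's prediction
likelihood = np.exp(-0.5 * ((predictions - observation) / obs_sigma) ** 2)
prior = np.full(predictions.size, 1.0 / predictions.size)  # the "arbitrary" equal prior
posterior = prior * likelihood
posterior /= posterior.sum()

print(posterior)  # members close to the observation receive most of the weight
```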
Coupling: there is uncertainty in how you couple models, as it will otherwise get too complex
- Coupling at different timescales??
- Bias corrections
Observing System Simulation Experiments (OSSEs)
- How do observations bias results?
- Observation biases underplayed
- Difficulties measuring extremes
- Time steps can multiply uncertainty
Principles
- User context dependent: End-users will determine what they need
- Effect of uncertainty calculations on interactions
- Effect of interactions on uncertainty
- Exponential increase in work
- Assumptions & metrics of uncertainty
- Not uncertainty that’s improved, but confidence
- To what extent do people write extra applications to manage uncertainty around core equations?
- Tracking parameter uncertainties through each element of the modelling process (see the Monte Carlo sketch after this list)
- Sensitivity – is the uncertainty of a parameter important to the problem being tested (e.g. relevant scales)?
- Limits that model results cannot exceed
- Need component uncertainty to understand linked system
- Scenario uncertainty/intrinsic chaos/how well does model represent processes?
- Emulators to assess uncertainty
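A minimal sketch of tracking parameter uncertainty through a linked chain by Monte Carlo sampling, as referenced in the list above. The two components and all distributions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # ensemble size

# Sampled (uncertain) input and parameters for a two-component chain
rainfall = rng.normal(800.0, 80.0, n)        # mm/yr, uncertain driving data
recharge_coeff = rng.uniform(0.25, 0.35, n)  # component-1 parameter
storativity = rng.normal(0.10, 0.01, n)      # component-2 parameter

recharge = recharge_coeff * rainfall           # component 1: rainfall -> recharge
head_change = recharge / 1000.0 / storativity  # component 2: recharge -> head (m)

lo, hi = np.percentile(head_change, [2.5, 97.5])
print(f"head change: mean {head_change.mean():.2f} m, "
      f"95% interval [{lo:.2f}, {hi:.2f}] m")
```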
Issues to address:
- Integral assumptions can be documented in metadata
- Document uncertainties in results separately — dynamic
- Understanding of statistics in deterministic models — mismatch between statisticians and modellers
- Explore workflow engines — identify best fit-for-purpose
- Knowledge of tools from other sectors
- Do uncertainties compound? Feedback
- Communication to end-users
- Some uncertainties cannot be meaningfully calculated
Confidence — has uncertainty been minimised, in the sense that the best estimate has been made, rather than just reporting the variability of output parameters?
- End-users want to know that the underpinning data is included, not ignored, or that the range is within the same limits
- Clear vocabulary of uncertainties
- Process of assessing uncertainty more important than value
- Known unknowns
- Statistical distribution rather than statements
- Can interfaces cope with additional uncertainty information?
- Computational environment creating uncertainty through calculation changes or precision
- Metrics of how well modelled
- Observational evidence — to understand system & deal with extremes
- Mismatch between observers & modellers
- Need to look at uncertainties between components across interfaces, within interfaces & whole system
- Training to understand uncertainties as important as training in the IT
- Make clear that quantifying a model’s uncertainty does not increase its value
- Need to understand users’ perception of risk