Data interoperability and the supporting digital ecosystem remain a major challenge for Digital Twins. Global environmental data must be made readily available on an easy-to-access, user-friendly platform. Moreover, the plentiful observational and modeling environmental data is currently collected in a disparate patchwork of national and international projects. International digital standards need to be defined and applied to harmonize these data sets and make them applicable to global Digital Twin applications.
To address this challenge, Martin Visbeck (GEOMAR) and Ute Brönner (SINTEF) organized two sessions on “Interoperable Data Systems for Digital Twins of the Ocean” during the Ocean Best Practices Workshop VI on October 17th and 18th.
A range of international speakers from varying sectors (government, IT private sector, research) presented their work on different aspects (technical, governance, application) of data interoperability for Digital Twins.
A summary of the individual presentations can be found below.
“GIS, a Foundational Building Block of a Digital Twin of the Ocean” by Marten Hogeweg (Esri)
Esri is the global leader in Geographic Information System (GIS) technology. It aims to use GIS technology to create and integrate Digital Twins. Esri has created Digital Twins for a range of applications: facilities management, smart cities, port management, etc. The Digital Twins have so far focused mostly on terrestrial applications, and the company is now looking into creating a Digital Twin of the Ocean. The marine exploration industry has been identified as a potential client, for example for the monitoring of underwater pipelines. This would require marine sonar and rata data, which could also be readily used in a Digital Twin of the Ocean.
Esri thinks of Digital Twins as a Geodesign Process: you assess the current situation through observations, decide which process models and suitability models to use, apply different ‘what if’ scenarios and evaluate the resultant impacts. Policy decisions can be made accordingly. The goal is multi-objective optimization, resulting in as many positive impacts as possible.
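The evaluation step of such a ‘what if’ comparison can be illustrated with a minimal sketch. It assumes each scenario has already been scored on two objectives (higher is better) and keeps only the scenarios that no other scenario beats on every objective, i.e. the Pareto front. The scenario names and scores are invented for illustration and are not taken from the presentation.

```python
from typing import Dict, List, Tuple

# Hypothetical scores per scenario: (ecological benefit, economic benefit).
# Higher is better for both objectives; all numbers are made up.
Scores = Tuple[float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scenarios: Dict[str, Scores]) -> List[str]:
    """Return the names of the scenarios that no other scenario dominates."""
    return sorted(
        name
        for name, score in scenarios.items()
        if not any(dominates(other, score) for other in scenarios.values())
    )


what_if = {
    "no_action": (0.2, 0.5),
    "protected_area": (0.9, 0.4),
    "seasonal_closure": (0.6, 0.7),
    "unrestricted_use": (0.1, 0.4),
}
front = pareto_front(what_if)
print(front)
```

The remaining scenarios represent different trade-offs between the objectives; choosing among them is the policy decision.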
Data interoperability begins with the multi-dimensionality of Digital Twins. Whilst 3D models are surely necessary for some data sets (e.g. thermohaline circulation), sometimes ‘only’ 1D data is needed (e.g. temperature time-series). Thus, the data fed into Digital Twins must be stored in a manner that easily allows the extraction of different dimensions.
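As a minimal sketch of what extracting different dimensions means, the example below stores a synthetic temperature field indexed by (time, depth, lat, lon) and pulls out either a 1D time series or a 2D horizontal slice. The storage layout, function names and values are assumptions made for illustration; a real system would more likely rely on a labelled array library and a chunked file format.

```python
from typing import Dict, List, Tuple

# Hypothetical store: one value per (time, depth, lat, lon) index tuple.
Grid = Dict[Tuple[int, int, int, int], float]


def make_demo_grid(n_time: int, n_depth: int, n_lat: int, n_lon: int) -> Grid:
    """Build a small synthetic temperature field (values are made up)."""
    return {
        (t, z, y, x): 10.0 + t - 0.5 * z + 0.1 * y + 0.01 * x
        for t in range(n_time)
        for z in range(n_depth)
        for y in range(n_lat)
        for x in range(n_lon)
    }


def time_series(grid: Grid, depth: int, lat: int, lon: int) -> List[float]:
    """Extract the 1D time series at one fixed position."""
    points = sorted(
        (key[0], value) for key, value in grid.items() if key[1:] == (depth, lat, lon)
    )
    return [value for _, value in points]


def horizontal_slice(grid: Grid, time: int, depth: int) -> Dict[Tuple[int, int], float]:
    """Extract the 2D (lat, lon) field at one time step and depth level."""
    return {key[2:]: value for key, value in grid.items() if key[:2] == (time, depth)}


grid = make_demo_grid(n_time=3, n_depth=2, n_lat=2, n_lon=2)
series = time_series(grid, depth=0, lat=0, lon=0)
surface = horizontal_slice(grid, time=0, depth=0)
```

Both extractions read from the same store, which is the property the paragraph above asks for.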
Moreover, Digital Twins cover multiple time horizons (current, short-term simulations and future, long-term simulations) and data sources: observational, in-situ, remote, modeling, etc. This is a lot of data, which raises the question of how all of it will be made available. Currently, environmental data exists in individual catalogues distributed globally, e.g. at national governments and in the ocean observing systems of research institutes. To access this data, a user must first know that these catalogues exist. Over the years, the development of federated catalogues has attempted to improve access to individual data sets: catalogues can cross-register amongst themselves, so that users are made aware of other catalogues. Whilst this connects individual catalogues, finding the data still requires considerable manual effort from the end user. Esri thus promotes federated GIS: data is made readily available in a joint catalogue environment, e.g. data spaces, data mesh, data fabric etc.
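The idea of a joint catalogue environment can be sketched as a single search entry point over several independently maintained catalogues. This is a minimal illustration only: the class and method names, catalogue names and records are hypothetical and do not describe any Esri product or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DatasetRecord:
    catalogue: str  # name of the catalogue that holds the record
    title: str
    keywords: Tuple[str, ...]


class FederatedCatalogue:
    """Joint search entry point over several independently maintained catalogues."""

    def __init__(self) -> None:
        self._catalogues: Dict[str, List[DatasetRecord]] = {}

    def register(self, name: str, records: List[Tuple[str, Tuple[str, ...]]]) -> None:
        """Register a catalogue given as (title, keywords) pairs."""
        self._catalogues[name] = [DatasetRecord(name, title, kw) for title, kw in records]

    def search(self, keyword: str) -> List[DatasetRecord]:
        """Search every registered catalogue with a single query."""
        wanted = keyword.lower()
        return [
            record
            for records in self._catalogues.values()
            for record in records
            if wanted in (k.lower() for k in record.keywords)
        ]


federation = FederatedCatalogue()
federation.register("national_agency", [("Coastal tide gauges", ("sea level", "in-situ"))])
federation.register(
    "research_institute",
    [
        ("Glider temperature profiles", ("temperature", "in-situ")),
        ("Satellite sea surface temperature", ("temperature", "remote")),
    ],
)
hits = federation.search("temperature")
```

The user issues one query and does not need to know which catalogue holds which data set; each hit still records its source catalogue.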
ArcGIS, Esri's GIS platform, supports the data interoperability needed for Digital Twins; it supports open data, open source, open science and open standards. Thus, ArcGIS supports a DITTO through:
(1) Observing System & Data Spaces
(2) Interaction & Provisioning
(3) Data Analytics & Prediction Engine
(4) Outreach, Share & Collaboration.
“BioDT: Digital Twins for Nature & Interoperability” by Jeroen Broekhuijsen (TNO)
BioDT is an EU project that started in June 2022. The goal of BioDT is to support the research infrastructure for biodiversity by producing a first prototype digital twin. The twin intends to drive both science and use cases, and to connect different EU twins & initiatives (ECMWF, European Open Science Cloud etc.). Four research infrastructures supporting biodiversity research are part of the BioDT Consortium (DISSCO, GBIF, LifeWatch & eLTER). Different European universities connected to these research infrastructures are doing the actual biological/ecological work.
Traditional digital twins used in the industrial sector have 4 main steps: product design, value chain, production process and asset management. This design process may be applied to environmental digital twins, for example in the forestry sector:
- Product design: design strategies to manage forests
- Value Chain: use forests to produce wood
- Production process: harvesting wood for use
- Asset management: manage forests
However, this is a very anthropocentric design process, as the activities are all human-driven. Rather than managing our interaction with nature, natural digital twins must take a science-driven approach.
In the science-centered design process of Digital Twins, nature is captured through monitoring, observational or citizen science approaches. This data then enters research infrastructures and gives scientists new ideas about how we can understand nature. This knowledge is then translated into models (e.g. species models, climate models). Next, we must assess whether what we captured digitally about nature accurately describes what we observe. In a further step, we can include how humans impact nature, and use the models to predict how anthropogenic intervention affects nature. Contrary to the traditional design process of industrial Digital Twins, this approach is nature-first.
Thus, the steps of the BioDT Twin Prototype are:
- Capturing Nature
- Understanding & modeling nature
- Affecting nature
When it comes to interoperability across the EU, the EOSC has established a framework describing different aspects (legal, semantic, organizational and technical) and levels (technical, syntactic, semantic, conceptual, experiential) of interoperability.
Building on this, there are many different options for interoperability in BioDT: data interoperability, workflow interoperability, application interoperability etc. We need this interoperability to find, exchange, query and assess the quality of this data. Whilst best practice examples of interoperability for Digital Twins do not exist yet, we can use success stories from the energy or manufacturing sector as guidelines.
“An Information Management Framework for Environmental (IMFe) Digital Twins” by Justin Buck (NOC)
The NOC developed a digital twin in the scope of an Information Management Framework (IMFe) roadmap project between October 2021 and March 2022. Specific goals included:
- Establishing a shared vision;
- Developing the conceptual framework;
- Agreeing on, and implementing, digital commons;
- Delivering demonstrators (pilot DTs) with tangible benefits;
- Developing DT components.
The roadmap was created with a two-way approach: a top-down approach, starting from the theory, and a bottom-up approach, using the existing landscape (governance, models, use cases). These were combined to create the IMFe Model.
At the bottom-up scale, the project examined existing digital twins across scales (local, national, regional and global), across disciplines (ocean, terrestrial, etc.) and across capabilities (predictive modeling, machine learning etc.). From this, three use cases were chosen to represent the range of IMFe requirements: 3DT, a digital twin run by the Met Office that delivers a dispersion model for air pollution in cities; the Land Insight digital twin run by the UK Centre for Ecology & Hydrology, which models carbon and water in soil moisture, for flood monitoring and IPCC Assessments; and the Antarctic Digital Twin by the British Antarctic Survey, which optimizes ship navigation to minimize fuel consumption.
The top-down approach examined the existing theory and standards to develop a conceptual framework for the management, governance, security & support of the IMFe. Aspects considered included asset commons for data, models, methods and workflows, and underlying cloud native services. This was based on the theory developed for the Centre for Digital Built Britain (CDBB) Information Management Framework, and adapted for the environmental framework.
The Outcome: an IMFe Roadmap. The Roadmap is readily available here.
A new project, running from October 2022 to November 2023, is piloting the IMFe. It is designing and building an IMFe for the use case of Haig Fras Seabed Imagery. A key element of this pilot project is stakeholder engagement and communities of practice; by January, the project must recommend existing communities of practice. The planned outcomes are an IMFe Framework, IMFe Services, as well as a demonstrator Haig Fras Digital Twin. This use case was chosen because ocean protection is a policy priority and the protection of 30% of the ocean by 2030 is a commitment of the UK G7 Presidency. Seabed Imagery is a non-destructive method that can be used to assess and monitor Marine Protected Areas; a Haig Fras Digital Twin can create a continuous, timely assessment of seabeds.
The expected project outcomes are the continuation of the stakeholder community and outreach to relevant communities of practice. This includes updates to & publication of the theoretical IMFe framework (in about 5 months' time), a first version of an asset register to enable a collaborative and coordinated development of environmental digital twins, and an exemplar Haig Fras marine imagery digital twin.
On November 24th, the NOC is organizing a workshop to improve the coordination between the development of digital twins of the ocean, observational networks and digital infrastructure.
“South Africa’s Marine Information Management System (MIMS)” by Marjolaine Krug (Ministry of Forestry, Fisheries & the Environment, Cape Town)
South Africa has established the Oceans and Coastal Information Management System (OCIMS) to improve ocean governance and protection, and to promote the growth of the blue economy. In the scope of this, Decision Support Tools were created through consultations between developers and stakeholders, for example for fisheries & aquaculture or marine spatial planning. These Decision Support Tools are developed with Technical Advisory Groups. Currently, OCIMS serves mainly government institutes, but also industry and NGOs. For technical support, there is a dedicated data center and IT infrastructure called the Marine Information Management System (MIMS). MIMS was implemented by the South African Environmental Observation Network (SAEON) in collaboration with the Department of Forestry, Fisheries & the Environment (DFFE).
MIMS is an Open Archival Information System that stores and publishes marine datasets. The IT infrastructure belongs to and primarily serves the DFFE, but as the need and interest from other stakeholders grows, the IT infrastructure will be expanded. Over the next 5 years, the DFFE aims to establish a complete failover system. MIMS stores various kinds of data: geographic data, biological metadata and simple metadata. The data adheres to African as well as international data standards.
MIMS is built on a combination of the TRUST & FAIR principles: transparency, responsibility, accessibility etc. The long-term preservation of data is ensured through Data Management Policies, for example adhering to international standards to maximize interoperability, limiting data embargoes and persuading data holders to share data.
MIMS: Regional & International Links
MIMS does not work in isolation; there is a national node for the IODE of the IOC of UNESCO, and the program just received the IODE Associate Data Unit (ADU) status. Moreover, MIMS acts as the Repository for the Southern African Data Centre for Oceanography (SADCO) and hosts the IODE AfrOBIS – an Ocean Biodiversity Information System (OBIS) that coordinates marine biological data management activities for the sub-Saharan African region. At a regional level, MIMS works with the Benguela Current Commission (BCC) and the Western Indian Ocean Marine Science Association (WIOMSA).
In summary, MIMS serves as an exemplary open archive information system, following international standards of best practice of FAIR & TRUST data management principles. This includes common (meta)data structures, sharing protocols, use of standardized classifications & vocabularies as well as open data formats and standard interfaces.
“Planned Blue-Cloud 2026 activities towards Digital Twins of the Oceans” by Dick Schaap (MARIS)
An EMODnet study found that in Europe, 1.4 billion euros a year are spent on marine data acquisition (remote sensing & in-situ). This means that there is a highly engaged marine data management landscape (EuroGOOS, Copernicus, PANGAEA etc.), operating at ca. 95% capacity. A remarkable infrastructure underlies this landscape: there is a range of data collectors, then data aggregators (3 main power blocks in Europe: SeaDataNet, EMODnet, Copernicus), then the intermediate users (people turning data into knowledge), and finally the end users.
Blue-Cloud focuses on the data part. The project started as a federated project to combine data sets, analytical resources and computing resources: “to promote the sharing of data, processes and research findings in the marine domain by delivering a collaborative web-based environment that enables open science”.
There are 3 levels in the overarching concept of Blue-Cloud: assimilating increasingly more data from the range of repositories, applying common standards (OGC, ISO) for (meta)data interoperability, and developing value-added services and applications, turning data into real information and making it ready for other users.
Blue-Cloud works together with many leading e-infrastructures and blue data infrastructures.
There are 3 key products and services of Blue-Cloud:
- Blue-Cloud Data Discovery & Access Service: federating key European data management infrastructures;
- Blue-Cloud Virtual Research Environment: promoting Collaboration, Sharing, Reuse, & Reproducibility;
- Blue-Cloud Virtual Labs (5 in total), serving as Demonstrators: Marine Environmental Indicators, Aquaculture Monitor, Plankton Genomics, Fish a matter of scales, Zoo & Phytoplankton EOV products.
The Blue-Cloud Data Discovery & Access Service provides a common interface for the main European data management infrastructures, so users can easily find the data they need in a “one-stop-shop” approach, rather than having to access a number of different data catalogues. This is also beneficial for the producers of blue data: they reach a wider range of potential users, they are informed about data requests and they align their data with international standards. The Discovery & Access Service applies a two-step approach: first, it identifies interesting data collections using a few criteria. Next, it applies more criteria to select specific data sets.
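The two-step approach can be sketched as a coarse filter on collections followed by a finer filter on the data sets inside them. The data model, field names and example records below are hypothetical and chosen only to illustrate the pattern; they do not reflect the actual Blue-Cloud interface.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataSet:
    name: str
    parameter: str
    year: int
    max_depth_m: float


@dataclass(frozen=True)
class Collection:
    infrastructure: str  # hypothetical source infrastructure
    theme: str
    data_sets: Tuple[DataSet, ...]


def find_collections(collections: List[Collection], theme: str) -> List[Collection]:
    """Step 1: identify interesting collections using a single coarse criterion."""
    return [c for c in collections if c.theme == theme]


def select_data_sets(
    collections: List[Collection], parameter: str, min_year: int, min_depth_m: float
) -> List[DataSet]:
    """Step 2: apply more criteria to pick specific data sets from those collections."""
    return [
        d
        for c in collections
        for d in c.data_sets
        if d.parameter == parameter and d.year >= min_year and d.max_depth_m >= min_depth_m
    ]


catalogue = [
    Collection("infrastructure_a", "physics", (
        DataSet("ctd_cruise_1", "temperature", 2015, 2000.0),
        DataSet("ctd_cruise_2", "salinity", 2020, 1500.0),
        DataSet("mooring_1", "temperature", 2021, 300.0),
    )),
    Collection("infrastructure_b", "biology", (
        DataSet("plankton_tow_1", "abundance", 2019, 200.0),
    )),
]
step_one = find_collections(catalogue, theme="physics")
step_two = select_data_sets(step_one, parameter="temperature", min_year=2014, min_depth_m=1000.0)
```

Splitting the query this way keeps the first step cheap, since only collection-level metadata is inspected before the per-data-set criteria are applied.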
Due to the success of the first project round of Blue-Cloud, there is a successor project, Blue-Cloud 2026, starting on January 1st 2023. The Blue-Cloud 2026 mission is “to achieve a further evolution of the Blue-Cloud Infrastructure into a Federated European Ecosystem to deliver FAIR and Open data, analytical services and instruments for deepening research of oceans (…) This will provide a core data service for the Digital Twin of the Ocean (DTO)”. For this follow-up project, the consortium expanded from 20 to 40 partners, and includes more blue data infrastructures and e-infrastructures.
There are specific aims planned by Blue-Cloud 2026 to increase data interoperability: adding semantic brokerage for harmonizing terminologies, adding data subsetting functionality to facilitate the querying and extracting of data sets for specific criteria, and building semi-automatic workbenches for compiling data sets & elaborating these into validated and aggregated data collections of selected data types.
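Two of these aims, harmonizing terminologies and subsetting data by specific criteria, can be illustrated with a minimal sketch. The term mapping, record layout and function names are assumptions made for this example and are not taken from the Blue-Cloud 2026 plans.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from source-specific parameter names to one common term.
COMMON_TERMS: Dict[str, str] = {
    "temp": "sea_water_temperature",
    "sst": "sea_water_temperature",
    "psal": "sea_water_salinity",
    "sal": "sea_water_salinity",
}

Record = Dict[str, object]


def harmonize(record: Record) -> Record:
    """Return a copy whose parameter uses the common term (unknown terms are kept)."""
    source_term = str(record["parameter"]).lower()
    return {**record, "parameter": COMMON_TERMS.get(source_term, source_term)}


def subset(
    records: List[Record],
    parameter: str,
    lat_range: Tuple[float, float],
    lon_range: Tuple[float, float],
) -> List[Record]:
    """Keep harmonized records for one parameter inside a bounding box."""
    return [
        r
        for r in map(harmonize, records)
        if r["parameter"] == parameter
        and lat_range[0] <= float(r["lat"]) <= lat_range[1]
        and lon_range[0] <= float(r["lon"]) <= lon_range[1]
    ]


records: List[Record] = [
    {"source": "repo_a", "parameter": "TEMP", "lat": 54.0, "lon": 8.0, "value": 11.2},
    {"source": "repo_b", "parameter": "sst", "lat": 55.5, "lon": 7.0, "value": 10.8},
    {"source": "repo_b", "parameter": "psal", "lat": 54.5, "lon": 7.5, "value": 34.1},
    {"source": "repo_c", "parameter": "temp", "lat": 30.0, "lon": -20.0, "value": 19.5},
]
north_sea_temperature = subset(
    records, "sea_water_temperature", lat_range=(50.0, 60.0), lon_range=(0.0, 10.0)
)
```

Harmonizing before filtering is what lets one query match records that different repositories label differently.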