27/04/2026 News

Europe is building a new Data Space to safeguard the privacy and traceability of environmental data

Communication Technician

Diego de la Vega

Scientist, historian and science communicator. I am passionate about science, mainly in its social and historical dimensions.

At a crucial moment for Europe to become a competitive and climate-neutral economy by 2050, businesses, governments, and scientists need to be able to share environmental data securely within a European sovereignty framework that enables collaboration. The European SAGE project has addressed this challenge and, after a year of research, has defined the basic architecture of the Data Space for the Green Deal. The data space is a platform expected to be operational by 2028. It will facilitate secure, reliable, and interoperable access to thousands of datasets related to the challenges facing Europe and the planet in terms of biodiversity, climate change adaptation, the circular economy, and pollution.

The architecture acts as an instruction manual that establishes how the different pieces of the data space should be assembled. Its backbone is sovereignty. This means that whoever shares information through this platform will maintain control at all times over who accesses the data and for what purpose it is used. This is especially critical when sharing restricted information, such as personal data protected by the European Union's General Data Protection Regulation (GDPR), data whose public disclosure could be sensitive, such as the location of protected species, or even private data subject to the internal rules of companies or governments. The Data Space for the Green Deal seeks to foster collaboration among entities that would otherwise not share this information, or that, if they do so using the cloud or email, risk having their data made public without knowing who is using it and for what purposes.

The architecture acts as an instruction manual that establishes how the different pieces of the Green Deal Data Space should be assembled, where the backbone is data sovereignty.

The main person responsible for this advance has been Joan Masó, a researcher at CREAF in the GRUMETS group and an expert in data interoperability and international standards. His work is part of the SAGE project, an ambitious European collaborative effort that brings together more than 40 partners from 12 countries. Masó was inspired by the results of the European project All Data 4 Green Deal (AD4GD), which he coordinated for three years and which developed a set of tools potentially capable of transforming environmental data into information that is “findable, accessible, interoperable, and reusable” (FAIR).

Data addressing 10 environmental challenges

With the core architecture now agreed upon, the SAGE project has begun developing the code that will bring its data space prototype to life through intensive programming sessions, or code sprints. Everything will be tested to ensure it functions correctly in ten pilot cases where the Green Deal Data Space is expected to have a significant impact. These cases focus on monitoring pollinator populations, controlling CO₂ and other greenhouse gas emissions from construction, tracking human exposure to harmful environmental agents, compiling national forest inventories, increasing the reuse of soil and clay obtained as byproducts from excavation projects, and assessing the balance between the economy and the services provided by the environment.

Researchers typing at a computer during one of the code sprints.

Scientific team of the SAGE project during one of the development code sprints in Barcelona. Image: B. Giralt.

A prime example is the textile industry. Thanks to this platform, it is expected that exchanging information on the life cycle of garments and their environmental impact will be faster and more secure through Digital Product Passports and Extended Producer Responsibility schemes. The data space approach improves the traceability of this information and drives its digitization. Furthermore, it helps European and national authorities monitor compliance with existing sustainability legislation and fosters more profitable and efficient circular business models.

A puzzle of security, transparency, and interoperability

The architecture of the Green Deal Data Space defined by SAGE is designed in a modular way, so that it can be easily adapted to other contexts. Furthermore, open-source solutions are used whenever possible to promote system transparency.

It's important to understand that a data space isn't a repository, but rather a highway along which information flows. Imagine a city council wants to share information about the presence of protected species in its municipality with an environmental consultancy to conduct a study. Instead of having to upload the data to an external platform, both entities can connect directly through a technology called data space connectors, the central component of the architecture. The connector works like a traffic light, allowing data to flow based on rules established by the provider (in this case, the city council) without any intermediaries. Both parties must sign a digital contract, and compliance with its rules is verified automatically. If everything matches, it's a green light: the information is shared. All operations are recorded, ensuring traceability. This is how the provider knows at all times who is using the data and how.
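The traffic-light behaviour described above can be illustrated with a small Python sketch. It is not the SAGE implementation or the API of any real connector: the class names (`UsagePolicy`, `ContractRequest`, `Connector`), the dataset name, and the consumer and purpose labels are all hypothetical, chosen only to show a provider-defined policy being checked automatically and every decision being logged.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsagePolicy:
    """Rules set by the data provider (hypothetical structure)."""
    allowed_consumers: frozenset
    allowed_purposes: frozenset


@dataclass(frozen=True)
class ContractRequest:
    """What the consumer declares when asking for access (hypothetical)."""
    consumer: str
    purpose: str


@dataclass
class Connector:
    """Toy provider-side connector: checks the policy and logs each decision."""
    provider: str
    policy: UsagePolicy
    audit_log: list = field(default_factory=list)

    def request_access(self, dataset: str, request: ContractRequest) -> bool:
        # Green light only if both the consumer and the purpose match the policy.
        granted = (
            request.consumer in self.policy.allowed_consumers
            and request.purpose in self.policy.allowed_purposes
        )
        # Every request is recorded, whether granted or refused.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "dataset": dataset,
            "consumer": request.consumer,
            "purpose": request.purpose,
            "granted": granted,
        })
        return granted


# Usage: the city council shares only with one consultancy, for one purpose.
policy = UsagePolicy(frozenset({"eco-consultancy"}), frozenset({"impact-study"}))
council = Connector("city-council", policy)

ok = council.request_access(
    "protected-species", ContractRequest("eco-consultancy", "impact-study"))
denied = council.request_access(
    "protected-species", ContractRequest("eco-consultancy", "marketing"))
```

In this sketch the first request is granted and the second is refused, and both appear in `council.audit_log`, which is what lets the provider see who asked for the data and why. Real connectors also handle identity, negotiation, and the data transfer itself, all of which are left out here.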


The Green Deal Data Space does not function as a repository or a computing infrastructure, but as a facilitator of secure access to existing data and metadata, contributing to the mobilization of environmental information and thereby boosting the European economy while protecting data security and sovereignty.

Joan Masó, technical manager of the GDDS architecture and CREAF researcher.

It is also important to ensure that data is exchanged consistently. This is what experts call interoperability. There are several levels of interoperability, and one of the most ambitious is semantic: ensuring that information has a clear meaning and does not lead to misinterpretations. Researchers explain that it would be like discerning whether, in a conversation, the word "plant" refers to nature, a factory, or the sole of the foot. This is achieved through dictionaries that compile the most important scientific terms and define the language that the data space understands. Interoperability also applies to metadata, that is, the contextual information that explains where the data comes from, when it was created, or in what format it is stored. Using standards provides the interoperability that allows thousands of databases from diverse subject areas and origins to be connected precisely.
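The "plant" example can be made concrete with a minimal Python sketch. It assumes a toy dictionary in which each meaning has its own identifier, and metadata records that point to those identifiers instead of to the ambiguous word. The `ex:` identifiers, the `DatasetMetadata` fields, and the catalogue entries are all invented for illustration and do not come from the SAGE or AD4GD vocabularies.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary: one identifier per meaning of "plant".
VOCABULARY = {
    "ex:organism/plant": "A living organism of the kingdom Plantae.",
    "ex:facility/plant": "An industrial facility, such as a factory.",
    "ex:anatomy/plant": "The sole of the foot.",
}


@dataclass(frozen=True)
class DatasetMetadata:
    """Contextual information about a dataset (hypothetical fields)."""
    title: str
    source: str
    created: str       # ISO 8601 date
    data_format: str
    concepts: tuple    # identifiers taken from VOCABULARY


def find_datasets(catalog, concept_id):
    """Return titles of datasets annotated with the given concept."""
    if concept_id not in VOCABULARY:
        raise KeyError(f"Unknown concept: {concept_id}")
    return [d.title for d in catalog if concept_id in d.concepts]


# Illustrative catalogue: both titles contain the word "plant".
catalog = [
    DatasetMetadata("Host plant survey for pollinators", "example-agency",
                    "2025-03-01", "CSV", ("ex:organism/plant",)),
    DatasetMetadata("Cement plant emissions register", "example-ministry",
                    "2024-11-15", "GeoJSON", ("ex:facility/plant",)),
]

botany = find_datasets(catalog, "ex:organism/plant")
factories = find_datasets(catalog, "ex:facility/plant")
```

A plain text search for "plant" would return both datasets, whereas searching by concept identifier returns only the one that matches the intended meaning.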

The legacy of AD4GD

One of the main contributions of the AD4GD project to the SAGE architecture has been precisely the improvement of semantic interoperability. For three years, AD4GD developed a protocol that links environmental data with its meaning. This allows scientists, governments, and businesses to more easily find the databases they need, be more certain that they contain the information they are actually looking for, and, in turn, interpret the data correctly. This makes it easier and faster to advance new research or make more precise decisions on the ground. Artificial intelligence models can likewise find these databases more easily and accurately and use them for training or as prompt inputs to produce results.

Thanks to this strategy, for example, it is now faster and easier to assess the connectivity of terrestrial habitats, a key indicator in European policies for ecosystem restoration and conservation. Connectivity measures how easily mammals such as the roe deer, wolf, and European polecat move across the territory. It is calculated by combining land use and land cover maps, the location of built-up areas and protected areas, information on threatened species, and observations made by experts or through citizen science, for example, using camera traps. AD4GD found that, for the Catalonia region, the calculation was faster and required fewer resources with an appropriate artificial intelligence algorithm than with traditional approaches based on geographic information system tools.

A group of people looking at something near a pond, outdoors.

Research team from the AD4GD project testing the functionality of one of its applications in a lake in Berlin. Image: Diego de la Vega.

In a second pilot case, Berlin's hundreds of urban lakes were ranked for the first time according to water quality and quantity to prioritize improvement interventions. This system integrated administrative data with satellite imagery and sampling data collected by experts and through citizen science. A third pilot project improved the accuracy of air quality maps generated by satellite sensors (Copernicus program) by integrating citizen observations from low-cost, on-site sensors. For Berlin and Catalonia, two platforms were designed in collaboration with potential users to provide access to all these tools: Splasboard and Bioconnect, respectively.

During its three-year run under the coordination of Joan Masó, AD4GD also led the EuroGEO Action Group for the European Green Deal Data Space, where a network of European projects with similar objectives was built. These projects produced, among other things, a policy brief with recommendations for the European Commission to implement this data space. CREAF will now continue to lead this group through the SAGE project.