Without coordinates, samples cannot be located on a map and, therefore, cannot be linked to the environmental conditions of the place of origin, such as the climate or surrounding vegetation.
Species collections are the biological memory of the planet, an irreplaceable legacy that documents ecological, evolutionary and geographical changes over time. It is estimated that collections around the world, from museums and botanical gardens to research centers, hold billions of specimens. However, many of these specimens are still not georeferenced; that is, no geographic coordinates have been assigned to the description of the place where they were found.
Until recently, only a fraction of these specimens had been digitized, and even fewer georeferenced. In 2022, a study led by CREAF warned that, of the 180 million digitized specimens available in GBIF (the Global Biodiversity Information Facility), only 38% had coordinates and only 18% had an assigned uncertainty, that is, a measure of how far the real position of a specimen may lie from the recorded one. This need gave rise to GeoPick, a web application developed entirely by researcher Arnald Marcer and programming expert technician Agustí Escobar, both from CREAF, which helps add quality georeferencing to natural history collections.
A technological gap that needed to be filled
In natural history collections there are specimens that were documented many years ago, some as far back as the 16th century, when technology was far less advanced than it is today. In museums, botanical gardens, zoos, research centers, and universities, there are hundreds of specimens documented ambiguously. In some cases, individuals are described as “found 3 km south of…”, “near the river…”, “on the north slope of a mountain”.
Putting such old collections with such vague descriptions on the map is a slow and complicated process. The landscape may have changed a lot since then and, even though guides and techniques exist to help, pinning down the original location remains difficult.
Therefore, the global scientific community dedicated to the study of living beings, biodiversity and taxonomy needed an easy-to-use tool that followed good georeferencing practices and, at the same time, complied with international biodiversity data exchange standards.
Until now, each institution, museum or collection filled in these fields in its own way, which made it very difficult to integrate the data and use them for studies. For this reason, standardization is very important so that the millions of biodiversity records can be interpreted, compared and reused around the world. The international standards for biodiversity data are collected in Darwin Core, a standard agreed upon by the global scientific community that defines which data fields should be filled in and in what format when documenting biodiversity. This framework has been led by two world leaders in georeferencing, John R. Wieczorek, from the University of California, and Arthur D. Chapman, from Biodiversity Information Services in Melbourne. Both scientists have also collaborated on GeoPick and are among its authors. Therefore, good practices in georeferencing, together with data standardization, ensure that the natural history collections stored in GBIF are of high quality.
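To make the idea of standardized fields concrete, here is a minimal Python sketch that checks whether a record carries the core georeference fields. The term names (decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters) are Darwin Core terms. The sample record, its values and the helper function are hypothetical, invented for illustration, and are not part of GeoPick or GBIF.

```python
# Darwin Core terms commonly used to describe a georeferenced locality.
GEOREFERENCE_TERMS = (
    "decimalLatitude",
    "decimalLongitude",
    "geodeticDatum",
    "coordinateUncertaintyInMeters",
)


def missing_georeference_terms(record):
    """Return the georeference terms that are absent or empty in a record."""
    # A value of 0 is legitimate (e.g. latitude on the equator), so only
    # None and the empty string count as missing.
    return [term for term in GEOREFERENCE_TERMS if record.get(term) in (None, "")]


# Hypothetical record: it has coordinates but no uncertainty yet.
record = {
    "scientificName": "Quercus ilex",
    "locality": "Montseny",
    "decimalLatitude": 41.77,
    "decimalLongitude": 2.40,
    "geodeticDatum": "EPSG:4326",
}

print(missing_georeference_terms(record))
```

A record like this one would count among those that have coordinates but no assigned uncertainty.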
Born at CREAF for the global community
CREAF researchers already had experience in georeferencing and biodiversity data management projects thanks to their collaboration with the Barcelona Museum of Natural Sciences. In this context, Ali-bey was born, a tool for georeferencing the museum's collections that inspired the creation of GeoPick.
Thus, GeoPick allows you to georeference any group of organisms on the map and add the uncertainty associated with the interpretation of a locality. It can also be useful for other types of collections, such as geological or paleontological ones. The tool uses cartographic resources available online, including OpenStreetMap, and, when digitizing or searching for a location, draws a radius that represents the minimum uncertainty that includes the toponym or place mentioned. This radius allows you to delimit environmental variables (such as average temperature or vegetation) and take into account the variability of a territory when the locality is not exact. This approach is especially useful in cases where the description of a point is ambiguous, such as a naturalist who, in an old notebook, simply indicates that he found an organism “in Montseny”.
The success story in Africa: more than 35,000 georeferenced places from 38 different countries
To expand the user community, international collaboration has been essential. In the case of Africa, one of the collaborators, John R. Wieczorek, organized training courses for four African georeferencing centers located in Gabon, Ghana, Malawi and Rwanda, within the Tropical African Plants Thematic Collection Network program, funded by the National Science Foundation (USA).
The trainings were aimed at students, curators, researchers and museum directors, all of whom are linked to herbaria and botanical collections. “One of the key elements of the project has been to form local teams so that, in turn, they can train new users. This has allowed the impact of the tool to be multiplied. For example, teams in Malawi and Rwanda have been able to provide training to new georeferencers, exponentially expanding the scope and capacity of the program”, explains Arnald Marcer. These continued efforts have made GeoPick an essential tool for documenting African biodiversity in a way that is accurate, transparent and fully compatible with global standards.
How does GeoPick work?
GeoPick is an open source tool available on the GBIF website. To date, more than 10,000 people from 158 different countries have used the application and it has become an important aid for georeferencing collections and biodiversity data on a planetary scale.
To use the platform, you must follow these steps:
When a location is digitized, GeoPick calculates the smallest circle that contains the entire geometry and takes its radius as the radius of uncertainty, which indicates how far the actual position of the sample can vary.
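The smallest enclosing circle can be illustrated with a short Python sketch. This is not GeoPick's actual implementation. It is a brute-force version that assumes the geometry is given as a small list of vertices in a projected coordinate system with units in meters. All function names are hypothetical. It relies on the fact that the smallest enclosing circle is always defined either by two points (as a diameter) or by three points (as their circumcircle).

```python
from itertools import combinations
from math import hypot


def circle_from_two(a, b):
    """Circle whose diameter is the segment a-b, as (cx, cy, radius)."""
    cx, cy = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    return cx, cy, hypot(a[0] - cx, a[1] - cy)


def circle_from_three(a, b, c):
    """Circumcircle of a, b, c, or None if the points are collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return ux, uy, hypot(a[0] - ux, a[1] - uy)


def contains_all(circle, points):
    """True if every point lies inside the circle, with a small tolerance."""
    cx, cy, r = circle
    limit = r * (1 + 1e-9) + 1e-9
    return all(hypot(x - cx, y - cy) <= limit for x, y in points)


def smallest_enclosing_circle(points):
    """Brute-force smallest enclosing circle; fine for a handful of vertices."""
    points = list(points)
    if not points:
        raise ValueError("at least one point is required")
    if len(points) == 1:
        return points[0][0], points[0][1], 0.0
    candidates = [circle_from_two(a, b) for a, b in combinations(points, 2)]
    for triple in combinations(points, 3):
        circle = circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if contains_all(c, points)]
    return min(valid, key=lambda c: c[2])


# Hypothetical locality drawn as a 400 m by 300 m rectangle.
footprint = [(0, 0), (400, 0), (400, 300), (0, 300)]
center_x, center_y, uncertainty_m = smallest_enclosing_circle(footprint)
print(center_x, center_y, uncertainty_m)
```

For this rectangle, the circle is centered on the middle of the shape and its radius is half the diagonal, 250 m. The brute-force search grows quickly with the number of vertices, so a production tool would more likely use a linear-time method such as Welzl's algorithm and would account for the curvature of the Earth.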
Each institution digitizes its collections and, in most cases, publishes them in GBIF, where the data is validated and becomes useful for research. Georeferencing is carried out by museums, botanical gardens or research centers. Once the data arrives at GBIF, it can be downloaded with all the spatial information and easily represented on a map.
Therefore, the combination of the digitization carried out by all these institutions and the standardization provided by GeoPick guarantees accurate, transparent and reusable biodiversity data.
Pioneering projects
GeoPick's global expansion and its success stories demonstrate that this tool is much more than a technical resource, as it democratizes georeferencing and allows institutions around the world to accelerate the digitization of their natural heritage with precision and speed.
GeoPick was officially launched in August 2023 at the TDWG Annual Conference in Hobart, Australia, and shortly thereafter at the TaxonWorks Together conference. In addition, GeoPick is being used as a testbed for how to integrate artificial intelligence into georeferencing at Massey University in New Zealand.