This page provides information about STARDIT + GBIF and was created for the 2021 GBIF Ebbe Nielsen Challenge.

Why has Standardised Data on Initiatives (STARDIT) + GBIF been developed?

Biodiversity cannot be effectively monitored, preserved and restored without working across disciplines, languages and databases. There is currently no standardised way to share information about initiatives across disciplines, including fields such as health, environment, basic science, manufacturing, media and international development. Responding effectively to complex global problems such as biodiversity loss requires reliable data sharing between disciplines.

STARDIT (Standardised Data on Initiatives) has been co-created on the understanding that such problems require evidence-informed collaborative methods, multidisciplinary research and interventions in which people who are affected are involved in every stage. STARDIT is being created to help everyone in the world find and understand information about collective human actions, which are referred to as ‘initiatives’.

STARDIT + GBIF has been developed to help align and standardise GBIF data with many other kinds of data from other disciplines, and provide a way for people to find and update data.

STARDIT is the first open access data-sharing system in the world being developed to standardise the way that information about initiatives is reported across diverse fields and disciplines, including information about which tasks were done by which stakeholders (for example, biodiversity data collection by Indigenous peoples). STARDIT also offers a way to add updates throughout the lifetime of an initiative, from planning to evaluation, and allows reporting of data (including data about any impacts) in many languages.

How can STARDIT support the work of GBIF?

The Global Biodiversity Information Facility (GBIF) enables biodiversity data to be available via an accessible and searchable portal. However, there are a number of limitations to the kinds of data which can be submitted to GBIF, including detailed information about who was involved in the research or data collection (for example, in citizen science projects), and any impacts from the data which has been collected (such as how it has informed policy or practice). STARDIT is designed to be interoperable between multiple databases and can help link these kinds of data.

STARDIT is free to use and data can be accessed or submitted by anyone. The authors of the data can be verified (to improve trust), and the data checked for quality, offering a potentially important source of high-quality standardised information on initiatives trying to solve complex multidisciplinary global problems such as biodiversity loss.

For example, many industries use self-regulatory processes to govern their practices, including the Forest Stewardship Council (FSC), Marine Stewardship Council (MSC), Certified B Corporations, and ‘eco’ tourism. STARDIT could be used to improve public awareness of, and access to, the data already reported under such self-regulatory standards. Increased transparency could, for example, support people to make informed decisions when investing or buying products, automate analysis of data to facilitate such decisions, and improve accountability overall. By working in this way, data can also be shared more easily with other disciplines, for example, those exploring the links between biodiversity and human health, as highlighted by the recent COVID-19 pandemic.

STARDIT + GBIF has already been used to create a report about the ‘Wild DNA’ project.

Data categories which are aligned with GBIF data include: data about who was involved in research design, monitoring and management processes; data about funding for monitoring or management (for example, funding for biodiversity monitoring); data about how information will be stored and shared (including what will be redacted and data security); data about who decides what data (such as location data) will be redacted and how this decision is made; information about how data will be analysed (including relevant code and algorithms) and how learning from data will be shared; information about relevant data privacy legislation and regulation; and information about how findable, accessible, interoperable and reusable (FAIR) the data is. Data is stored in a machine-readable format using structured Wikidata, based on the widely used Resource Description Framework (RDF) developed by the World Wide Web Consortium (W3C). STARDIT + GBIF has been designed to work with the GBIF API. Further technical details can be found in the STARDIT Beta version pre-print.

A visualisation of STARDIT and GBIF data

What are the benefits?

Among its main benefits, STARDIT offers those carrying out research and interventions access to standardised information which enables well-founded comparisons of the effectiveness of different methods, including data on impacts of citizen science.

The STARDIT + GBIF tool is envisioned as a way to align GBIF data collection with other international data standards and to facilitate the production and curation of high-quality, structured, open access data. Using the Wikimedia Foundation’s Wikidata project as the underlying architecture, STARDIT also allows data to be added and translated into multiple human languages using Wikidata items, as well as being interoperable with multiple other data formats.

STARDIT can help get essential, verified and high-quality biodiversity data to the people who need it, without it needing to go through lengthy (and often expensive) peer review. The data also remains structured and machine-readable. Automatic integration with the peer-reviewed Wiki Journals and Wikipedia can also help disseminate biodiversity data as widely as possible, supporting translation and evidence-informed decision making when working to protect and promote biodiversity.

In this way, STARDIT could be used to share information which makes research more reproducible. Improving access to the information required to critically appraise research and evidence could in turn improve trust in processes such as the scientific method and facilitate an appraisal of different knowledge systems, including Indigenous knowledge systems. Such data sharing could also improve the translation of trusted, quality research and data, by empowering people to both access and appraise relevant data. For example, improved access to more standardised information (in multiple languages) about data and outcomes could help to facilitate more informed collaborations between researchers and those monitoring and protecting critically endangered species, particularly where there is no common language.

STARDIT is designed to be future-proofed so future data standards can also be added, with an open and transparent governance process (currently hosted by the charity Science for All, in partnership with the Wiki Journals).

Photos from the ‘Wild DNA’ project, which created the first STARDIT + GBIF report

What are the limitations?

The current version of STARDIT is a Beta version and has been co-created on a very limited budget. It is a working prototype, built in Wikidata by volunteers and a limited number of paid engineers. Accordingly, the user interface has a number of usability and accessibility issues. A human-centred design process would improve both the usability of report creation and the access to and analysis of the data. In the future, learning and development opportunities will need to be co-created to help include more people in report creation, for example, to support Indigenous peoples to be involved.

Who is involved in STARDIT?

STARDIT is designed and run using a ‘participatory action research’ paradigm, which aims to involve all stakeholders (the public, experts and others with a ‘stake’ in the work) in every aspect of the initiative. The participatory action research process is currently being hosted pro bono by the charity Science for All, informed by their work in citizen science to monitor and restore biodiversity during the ‘Wild DNA’ and ‘Campfires and Science’ citizen science projects. STARDIT provides a standardised way of reporting many kinds of data, including who was involved in which tasks and how data was collected (for example, how environmental DNA samples were collected, prepared and analysed). It uses the Wikimedia Foundation’s ‘Wikidata’ project, and involves over 40 different experts from multiple disciplines and academic institutions.

STARDIT + GBIF and citizen science

Below is a poster about STARDIT from the USA’s Citizen Science Association 2021 conference.

Operating instructions

  1. A STARDIT report is created by completing a simple online form (please note you will need to create a Wikispore account to save your report)
  2. Create or search for a report by typing in the name of your report, and hit ‘create or edit’
  3. Data that can then be added includes information about an initiative including the title (description), the aims, methods, who was involved, how it was funded and any impacts or outcomes
  4. Complete all the data fields and then hit ‘save page’ at the bottom of the page
  5. Once submitted, Editors will check the data and the STARDIT report will be entered into the database.
  6. The STARDIT report is then findable and editable by anyone.

Technical specifications

STARDIT data is stored in a machine-readable format using structured Wikidata, based on the widely used Resource Description Framework (RDF) developed by the World Wide Web Consortium (W3C).
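
As a rough illustration of what a machine-readable RDF representation can look like, the sketch below writes two statements about a report as N-Triples lines. The item ID `Q00000000` and the dates are placeholders invented for this example, and the function name is hypothetical. `P580` and `P582` are the start and end date encodings listed in the data categories on this page, and the URI prefixes are the ones Wikidata uses for entities and direct claims. This is not the STARDIT implementation, only a sketch under those assumptions.

```python
# Minimal sketch: serialise dated statements as RDF (N-Triples syntax).
WD_ENTITY = "http://www.wikidata.org/entity/"
WD_DIRECT = "http://www.wikidata.org/prop/direct/"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def date_triple(item_id, property_id, iso_timestamp):
    """Return one N-Triples line: subject item, property, typed date literal."""
    return (
        f"<{WD_ENTITY}{item_id}> "
        f"<{WD_DIRECT}{property_id}> "
        f'"{iso_timestamp}"^^<{XSD_DATETIME}> .'
    )


REPORT_ITEM = "Q00000000"  # hypothetical placeholder, not a real report item

triples = [
    date_triple(REPORT_ITEM, "P580", "2020-01-01T00:00:00Z"),  # start date
    date_triple(REPORT_ITEM, "P582", "2020-12-31T00:00:00Z"),  # end date
]

for line in triples:
    print(line)
```

Each line is a subject, a property and a value, which is the shape that lets other databases read the same statement without knowing anything about the form it was entered through.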

STARDIT reports can be used to describe data submitted to GBIF (providing additional metadata), include machine-readable links to GBIF data, and include any other relevant data.

Data can be added both in structured (standardised) form and as free text. Once the form is completed, a permanent stable version of that form is created. Each STARDIT report is assigned a unique Wikidata item number and all previous versions are navigable in a transparent history.

Like a Wikipedia page or Wikidata entry, a STARDIT report can be updated at any point by anyone. This allows STARDIT reports to be updated over time, unlike peer-reviewed journal articles.

Subsequent updates will log any changes and preserve a record of the updates. Information includes who (or what) created or updated the report (for example, a person with an ORCID or a machine learning algorithm), and who (or what) checked the report.
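
The idea of an update log that never overwrites earlier versions can be sketched as an append-only history. Everything in this example (class names, field names and the sample values) is hypothetical and chosen for illustration. In STARDIT itself the history is provided by the underlying wiki platform, so this is only a model of the behaviour described above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Revision:
    """One saved version of a report."""
    content: dict
    edited_by: str             # a person (e.g. an ORCID) or an algorithm
    checked_by: Optional[str]  # who (or what) checked this version, if anyone
    saved_at: datetime


@dataclass
class ReportHistory:
    """Append-only list of revisions; earlier versions are never overwritten."""
    revisions: list = field(default_factory=list)

    def save(self, content, edited_by, checked_by=None):
        # Copy the content so later edits cannot alter a stored revision.
        revision = Revision(dict(content), edited_by, checked_by,
                            datetime.now(timezone.utc))
        self.revisions.append(revision)
        return revision

    def current(self):
        return self.revisions[-1]


history = ReportHistory()
history.save({"title": "Example initiative"}, edited_by="hypothetical-orcid")
history.save({"title": "Example initiative", "impacts": "Example impact"},
             edited_by="hypothetical-algorithm",
             checked_by="hypothetical-editor")
```

After the second save, the first revision is still available unchanged, and each revision records who (or what) made and checked it.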

STARDIT + GBIF has been designed to work with the current GBIF API V1 (and future versions).
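
A machine-readable link to GBIF data could, for instance, be a GBIF API v1 URL built from a dataset key. The sketch below only constructs such URLs and makes no network requests. The dataset key is a placeholder and the function names are hypothetical. How STARDIT reports actually store these links is not specified here, so treat this as an assumption about one possible approach.

```python
from urllib.parse import urlencode

GBIF_API_V1 = "https://api.gbif.org/v1"


def dataset_url(dataset_key):
    """URL of the GBIF API v1 record describing one dataset."""
    return f"{GBIF_API_V1}/dataset/{dataset_key}"


def occurrence_search_url(dataset_key, limit=20):
    """URL of an occurrence search restricted to one dataset."""
    query = urlencode({"datasetKey": dataset_key, "limit": limit})
    return f"{GBIF_API_V1}/occurrence/search?{query}"


# Placeholder key for illustration only; real GBIF dataset keys are UUIDs.
EXAMPLE_KEY = "00000000-0000-0000-0000-000000000000"

print(dataset_url(EXAMPLE_KEY))
print(occurrence_search_url(EXAMPLE_KEY, limit=5))
```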

STARDIT + GBIF is designed to work with multiple data standards, including Darwin Core Terms, Biological Collection Access Service and Ecological Metadata Language.

Further technical details can be found in the STARDIT Beta version pre-print.

STARDIT + GBIF Data categories

| STARDIT Data field | Wikidata encoding | MICRO (Compulsory) | GBIF Metadata Elements |
| --- | --- | --- | --- |
| Initiative name | Len | Compulsory | Project |
| Geographic location or scope | P937 | Compulsory | Geographic Coverage, Coverage |
| Purpose of the initiative ‘stated as’ free text | P3712 P3712 Q P6001 | Compulsory | |
| Start date of initiative | P580 | Compulsory | Temporal Coverage, beginDate |
| End date of initiative | P582 | | Temporal Coverage, endDate |
| Organisations or other initiatives involved | P664 | Compulsory | People and Organisations |
| Ethics approval (org) | P793 Q98550700 P1027 | | |
| Ethics approval (date) | P793 Q98550700 P585 | | |
| Ethics approval (ID) | P793 Q98550700 P1932 | | |
| Funding sources (org) | P8324 | Compulsory | Funding |
| Funding sources (dept or scheme or grant ID) | P793 Q P1932 | | |
| Relevant URLs | P856 | | onlineUrl |
| Keywords, metatags, mesh terms, raid terms | P921 | | |
| Date of report | P793 Q37260 P518 Q10870555 P793 Q37260 P585 or P793 Q37260 P580 and P793 Q37260 P582 | Compulsory | |
| Methods of the initiative (what is planned to be done, or is being reported as done) | P4510 | Compulsory | Methods, methodStep |
| Link to a public domain methodology document | P4510 Q P973 | | methodStep, sampling, samplingDescription |
| Theoretical or conceptual models or relevant ‘values’ of people | P4510 | | |
| Name of report author (person or algorithm) | P50 | Compulsory | |
| ORCID | author item P496 | | |
| Public domain profile / institutional page | author item P856 | | |
| Key contact email at initiative for confirming report content | P793 Q37260 P968 | Compulsory | electronicMailAddress |
| Who has checked the quality of the data in this report? | P4032 | | |
| Who was involved (named individual, organisation) Who was involved (group of anonymous individuals acting in role) | P767 P767 Q P1114 | Compulsory | |
| Specific tasks of this person or group | P767 Q P2868 | Compulsory | Role |
| Methods of involvement of participants | P767 Q P2283 | | |
| What was the outcome or output of the involvement? | P1542 | Compulsory | |
| Were any publications produced as part of this? | P921 | Compulsory | |
| Methods of appraising and analysing involvement (assessing rigour, deciding outcome measures, data collection and analysis) | | | |
| Facilitators of involvement (what helps the contributors in achieving the project’s outcomes?) ‘stated as’ free text | P1552 Q101097118 P5102 P1552 Q101097118 P6001 | | |
| Barriers of involvement (what inhibits the contributors from achieving the project’s outcomes?) ‘stated as’ free text | P1552 Q16515105 P5102 P1552 Q16515105 P6001 | | |
| What was the outcome or output of the involvement of these people? What changed as a result of involving people? Were there any impacts? | P767 Q P1542 | | |
| At which stage of the initiative were these people involved? | P767 Q P585 or P767 Q P580 and P767 Q P582 | | |
| What was the estimated financial cost for involving each person or group? How much time did it take to involve each person or group? Were there any other non-financial costs in involving each person or group? | P767 Q P2130 P767 Q P2047 P767 Q P1542 | | |
| Financial relationship or other interest this person has to this project | P767 Q P1932 P1542 Q99429881 P6001 | Compulsory | |
| Conflicting or competing interests ‘stated as’ free text | P1552 Q99429881 P1932 P1552 Q99429881 P6001 | Compulsory | |
| What was the estimated financial cost for the overall initiative? How much time did it take? | P2130 P2047 | | |
| Findable: How is information about this data disseminated | P1056 Q42848 P1552 Q100451967 P1056 Q42848 P7228 | Compulsory | |
| Accessible: How is it stored and hosted | P1056 Q42848 P4945 | Compulsory | |
| Interoperable: What analyses were | P4510 | | |
| Interoperable: What format is it in | P1056 Q42848 P2701 | | |
| Reusable: Access restriction status | P1056 Q42848 P7228 Q66739888 | Compulsory | |
| Reusable: License | P1056 Q42848 P275 | | |
| Who owns it | P1056 Q42848 P1552 Q2587068 P1056 Q42848 P127 | Compulsory | Intellectual Property Rights, intellectualRights |
| Where is it stored | P1056 Q42848 P276 | Compulsory | |
| Access restriction status | P1056 Q42848 P7228 | Compulsory | Intellectual Property Rights, intellectualRights |
| How to access (email) How to access (url) | P1056 Q42848 P968 P1056 Q42848 P2699 | | Dataset, URL |
| Data steward/curator | P1056 Q42848 P1640 | | |
| Has anything changed or happened as a result of this initiative that isn’t captured in previous answers? | P1542 | Compulsory | |
| What new knowledge has been generated? (if appropriate, include effect size, relevant statistics and level of evidence) | P1542 Q133500 | Compulsory | |
| What part of the initiative was the learning about What topic was learned | P1542 Q133500 P518 P1542 Q133500 P921 | | |
| Describe how the learning or knowledge generated from this initiative has or will be used | P1542 Q133500 P1542 | | |
| How has or how will this be measured? | P1542 Q P459 | | |
| Who is involved in measuring this? | P1542 Q P767 | | |
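
To show how the data categories above might be used programmatically, the sketch below encodes a small subset of them and checks a draft report for missing compulsory fields, then relabels the supplied values with the corresponding GBIF metadata elements. The dictionary layout, function names and sample values are assumptions made for this example, not part of STARDIT.

```python
# Subset of the data categories:
# field name -> (Wikidata encoding, compulsory?, GBIF metadata element)
CATEGORIES = {
    "Geographic location or scope": ("P937", True, "Geographic Coverage, Coverage"),
    "Start date of initiative": ("P580", True, "Temporal Coverage, beginDate"),
    "End date of initiative": ("P582", False, "Temporal Coverage, endDate"),
    "Funding sources (org)": ("P8324", True, "Funding"),
    "Relevant URLs": ("P856", False, "onlineUrl"),
}


def missing_compulsory(report):
    """Names of compulsory fields that are absent or empty in the report."""
    return [
        name
        for name, (_encoding, compulsory, _gbif) in CATEGORIES.items()
        if compulsory and not report.get(name)
    ]


def gbif_elements(report):
    """Relabel the supplied values with their GBIF metadata element names."""
    return {
        CATEGORIES[name][2]: value
        for name, value in report.items()
        if name in CATEGORIES and value
    }


# Hypothetical draft report with invented values.
draft = {
    "Start date of initiative": "2020-01-01",
    "Relevant URLs": "https://example.org",
}

print(missing_compulsory(draft))
print(gbif_elements(draft))
```

Keeping the mapping in one table-like structure means the same definitions can drive both form validation and export to GBIF-style metadata.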