This page has been created to provide information related to the 2021 GBIF Ebbe Nielsen Challenge.
This page provides information about STARDIT + GBIF.
The links below will take you to different sections of this page:
- Why has Standardised Data on Initiatives (STARDIT) + GBIF been developed?
- How can STARDIT support the work of GBIF?
- What are the benefits?
- What are the limitations?
- Who is involved in STARDIT?
- STARDIT + GBIF and citizen science
- Operating instructions
- Technical specifications
Why has Standardised Data on Initiatives (STARDIT) + GBIF been developed?
Biodiversity cannot be effectively monitored, preserved and restored without working across disciplines, languages and databases. There is currently no standardised way to share information about initiatives across disciplines, including fields such as health, environment, basic science, manufacturing, media and international development. Responding effectively to complex global problems such as biodiversity loss requires reliable data sharing between disciplines.
STARDIT (Standardised Data on Initiatives) has been co-created on the understanding that such problems require evidence-informed collaborative methods, multidisciplinary research and interventions in which people who are affected are involved in every stage. STARDIT is being created to help everyone in the world find and understand information about collective human actions, which are referred to as ‘initiatives’.
STARDIT + GBIF has been developed to help align and standardise GBIF data with many other kinds of data from other disciplines, and provide a way for people to find and update data.
STARDIT is the first open access data-sharing system in the world being developed to standardise the way that information about initiatives is reported across diverse fields and disciplines, including information about which tasks were done by which stakeholders (for example, biodiversity data collection by Indigenous peoples). STARDIT also offers a way to add updates throughout the lifetime of an initiative, from planning to evaluation, and allows reporting of data (including data about any impacts) in many languages.
How can STARDIT support the work of GBIF?
The Global Biodiversity Information Facility (GBIF) makes biodiversity data available via an accessible and searchable portal. However, there are limits on the kinds of data which can be submitted to GBIF. These include detailed information about who was involved in the research or data collection (for example, in citizen science projects), and any impacts from the data which has been collected (such as how it has informed policy or practice). STARDIT is designed to be interoperable between multiple databases and can help link these kinds of data.
STARDIT is free to use and data can be accessed or submitted by anyone. The authors of the data can be verified (to improve trust), and the data checked for quality, offering a potentially important source of high-quality standardised information on initiatives trying to solve complex multidisciplinary global problems such as biodiversity loss.
For example, many industries use self-regulatory processes to govern industry practices, including the Forest Stewardship Council (FSC), Marine Stewardship Council (MSC), Certified B Corporations, and ‘eco’ tourism. STARDIT could be used to improve public awareness of, and access to, the data already reported by such self-regulatory standards. Increased transparency could, for example, support people to make informed decisions when investing or buying products, enable automated analysis of data to facilitate such decisions, and improve accountability overall. By working in this way, data can also be shared more easily with other disciplines, for example, those exploring the links between biodiversity and human health, as highlighted by the recent COVID-19 pandemic.
STARDIT + GBIF has already been used to create a report about the ‘Wild DNA’ project.
Data categories which are aligned with GBIF data include:
- data about who was involved in research design, monitoring and management processes
- data about funding for monitoring or management (for example, funding for biodiversity monitoring)
- data about how information will be stored and shared (including what will be redacted and data security)
- data about who decides what data will be redacted (such as location data) and how this decision is made
- information about how data will be analysed (including relevant code and algorithms) and how learning from data will be shared
- information about relevant data privacy legislation and regulation
- information about how findable, accessible, interoperable and reusable (FAIR) the data is.
Data is stored in a machine-readable format using structured Wikidata, based on the widely used Resource Description Framework (RDF) developed by the World Wide Web Consortium (W3C). STARDIT + GBIF has been designed to work with the GBIF API. Further technical details can be found in the STARDIT Beta version pre-print: https://doi.org/10.31219/osf.io/w5xj6
What are the benefits?
Among its main benefits, STARDIT offers those carrying out research and interventions access to standardised information which enables well-founded comparisons of the effectiveness of different methods, including data on impacts of citizen science.
The STARDIT + GBIF tool is envisioned as a way to align GBIF data collection with other international data standards and to facilitate the production and curation of high-quality structured open access data. Using the Wikimedia Foundation’s Wikidata project as the underlying architecture, STARDIT also allows data to be added and translated into multiple human languages using Wikidata items, as well as being interoperable with multiple other data formats.
STARDIT can help get essential, verified and high-quality biodiversity data to the people who need it, without it needing to go through lengthy (and often expensive) peer review. The data also remains structured and machine-readable. Integration with the peer-reviewed Wiki Journals and Wikipedia can also help ensure biodiversity data is disseminated automatically and as widely as possible, supporting translation and improving evidence-informed decision making when working to protect and promote biodiversity.
In this way, STARDIT could be used to share information which makes research more reproducible, improving access to the information required to critically appraise research and evidence. This could in turn improve trust in processes such as the scientific method and facilitate an appraisal of different knowledge systems, including Indigenous knowledge systems. Such data sharing could also improve the translation of trusted, quality research and data, by empowering people to both access and appraise relevant data. For example, improved access to more standardised information (in multiple languages) about data and outcomes could help to facilitate more informed collaborations between researchers and those monitoring and protecting critically endangered species, particularly where there is no common language.
STARDIT is designed to be future-proofed so future data standards can also be added, with an open and transparent governance process (currently hosted by the charity Science for All, in partnership with the Wiki Journals).
What are the limitations?
The current version of STARDIT is a Beta version and has been co-created on a very limited budget. It is a working prototype, built in Wikidata by volunteers and a limited number of paid engineers. Accordingly, the user interface has a number of usability and accessibility issues. A human-centred design process would improve both the usability of report creation and the access to and analysis of the data. In the future, learning and development opportunities will need to be co-created to help include more people in report creation, for example, to support Indigenous peoples to be involved.
Who is involved in STARDIT?
STARDIT is designed and run using a ‘participatory action research’ paradigm, which aims to involve all stakeholders (the public, experts and others with a ‘stake’ in the work) in every aspect of the initiative. The participatory action research process is currently being hosted pro bono by the charity Science for All, informed by their work in citizen science to monitor and restore biodiversity during the ‘Wild DNA’ and ‘Campfires and Science’ citizen science projects. STARDIT provides a standardised way of reporting many kinds of data, including who was involved in which tasks and how data was collected (for example, how environmental DNA samples were collected, prepared and analysed). STARDIT uses the Wikimedia Foundation’s ‘Wikidata’ project, and involves over 40 different experts from multiple disciplines and academic institutions.
STARDIT + GBIF and citizen science
Below is a poster about STARDIT from the USA’s Citizen Science Association 2021 conference.
Operating instructions
- A STARDIT report is created by completing a simple online form (please note you will need to create a Wikispore account to save your report)
- Create or search for a report by typing in the name of your report, and hit ‘create or edit’
- Data that can then be added includes information about an initiative including the title (description), the aims, methods, who was involved, how it was funded and any impacts or outcomes
- Complete all the data fields and then hit ‘save page’ at the bottom of the page
- Once submitted, editors will check the data and the STARDIT report will be entered into the database.
- The STARDIT report is then findable and editable by anyone.
Technical specifications
STARDIT data is stored in a machine-readable format using structured Wikidata, based on the widely used Resource Description Framework (RDF) developed by the World Wide Web Consortium (W3C).
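To give a sense of what RDF-based structured data can look like, the following minimal sketch writes two statements about an initiative as N-Triples lines. It assumes the public Wikidata URI prefixes; a STARDIT deployment may use a different Wikibase host. The properties P580 (start date) and P582 (end date) come from the data categories table on this page, while the item ID `Q0` and the dates are hypothetical placeholders.

```python
# Sketch only: the item ID and dates used below are hypothetical placeholders.
WD_ENTITY = "http://www.wikidata.org/entity/"
WD_DIRECT = "http://www.wikidata.org/prop/direct/"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def date_triple(item_id: str, property_id: str, iso_date: str) -> str:
    """Return one N-Triples line stating a date-valued property of an item."""
    subject = f"<{WD_ENTITY}{item_id}>"
    predicate = f"<{WD_DIRECT}{property_id}>"
    obj = f'"{iso_date}T00:00:00Z"^^<{XSD_DATETIME}>'
    return f"{subject} {predicate} {obj} ."


def initiative_dates(item_id: str, start: str, end: str) -> list[str]:
    """Encode the start (P580) and end (P582) dates of an initiative."""
    return [
        date_triple(item_id, "P580", start),
        date_triple(item_id, "P582", end),
    ]


if __name__ == "__main__":
    for line in initiative_dates("Q0", "2021-01-01", "2021-12-31"):
        print(line)
```

Each printed line is a subject, predicate and object, which is the basic unit that makes the data machine-readable.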
STARDIT reports can be used to describe data submitted to GBIF (providing additional metadata), to include machine-readable links to GBIF data, and to include any relevant data.
Data can be added both in structured (standardised) form and as free text. Once the form is completed, a unique Wikidata item number is assigned to the STARDIT report, along with a permanent stable version of that form, and all previous versions are navigable in a transparent history.
Like a Wikipedia page or Wikidata entry, the STARDIT report can be updated at any point by anyone. This allows STARDIT reports to be updated over time, unlike peer-reviewed journal articles.
Subsequent updates will log any changes and preserve a record of the updates. Information includes who (or what) created or updated the report (for example, a person with an ORCID or a machine learning algorithm), and who (or what) checked the report.
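The change log described above can be pictured as an append-only list of revisions. The sketch below is illustrative only: the class and field names (`Revision`, `Report`, `editor`, `checked_by`) are hypothetical and do not describe how Wikidata or STARDIT store history internally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Revision:
    """One saved version of a report (all field names are hypothetical)."""
    editor: str                       # a person (e.g. an ORCID) or an algorithm
    content: dict                     # report data as saved in this version
    checked_by: Optional[str] = None  # who (or what) checked this version


@dataclass
class Report:
    """A report whose earlier versions are never overwritten."""
    title: str
    history: list = field(default_factory=list)

    def update(self, editor: str, content: dict,
               checked_by: Optional[str] = None) -> None:
        # Append a new revision; copy the dict so later changes made by the
        # caller cannot alter the stored version.
        self.history.append(Revision(editor, dict(content), checked_by))

    @property
    def current(self) -> Optional[Revision]:
        # The latest revision, or None if nothing has been saved yet.
        return self.history[-1] if self.history else None


if __name__ == "__main__":
    report = Report("Example initiative")
    report.update("person-a", {"aims": "draft"})
    report.update("algorithm-b", {"aims": "revised"}, checked_by="editor-1")
    print(len(report.history), report.current.editor)
```

Because updates only ever append, every earlier version stays available alongside a record of who or what made and checked each change.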
STARDIT + GBIF has been designed to work with the current GBIF API V1 (and future versions).
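As a rough illustration of a machine-readable link to GBIF data, the sketch below builds a GBIF API v1 dataset URL and stores it in a simple record next to a report title. `https://api.gbif.org/v1` is the public GBIF API root; the dataset key in the usage example and the record's field names are hypothetical, and no network request is made.

```python
from urllib.parse import quote

GBIF_API_ROOT = "https://api.gbif.org/v1"


def gbif_dataset_url(dataset_key: str) -> str:
    """Return the GBIF API v1 URL for a dataset's metadata record."""
    # Dataset keys are UUID strings; quote() guards against stray characters.
    return f"{GBIF_API_ROOT}/dataset/{quote(dataset_key, safe='')}"


def link_report_to_dataset(report_title: str, dataset_key: str) -> dict:
    """Build a simple record linking a report to a GBIF dataset.

    The dictionary keys are hypothetical and only illustrate the idea.
    """
    return {
        "report_title": report_title,
        "gbif_dataset_key": dataset_key,
        "gbif_dataset_url": gbif_dataset_url(dataset_key),
    }


if __name__ == "__main__":
    # Placeholder key, not a real GBIF dataset.
    placeholder_key = "00000000-0000-0000-0000-000000000000"
    print(link_report_to_dataset("Example initiative", placeholder_key))
```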
Further technical details can be found in the STARDIT Beta version pre-print: https://doi.org/10.31219/osf.io/w5xj6
STARDIT + GBIF Data categories
|STARDIT Data field||Wikidata encoding||MICRO (Compulsory)||GBIF Metadata Elements|
|Geographic location or scope||P937||Compulsory||Geographic Coverage, Coverage|
|Purpose of the initiative ‘stated as’ free text||P3712 P3712 Q P6001||Compulsory|| |
|Start date of initiative||P580||Compulsory||Temporal Coverage, beginDate|
|End date of initiative||P582|| ||Temporal Coverage, endDate|
|Organisations or other initiatives involved||P664||Compulsory||People and Organisations|
|Ethics approval (org)||P793 Q98550700 P1027|| || |
|Ethics approval (date)||P793 Q98550700 P585|| || |
|Ethics approval (ID)||P793 Q98550700 P1932|| || |
|Funding sources (org)||P8324||Compulsory||Funding|
|Funding sources (dept or scheme or grant ID)||P793 Q P1932|| || |
|keywords, metatags, mesh terms, raid terms||P921|| || |
|Date of report||P793 Q37260 P518 Q10870555 P793 Q37260 P585 or P793 Q37260 P580 and P793 Q37260 P582||Compulsory|| |
|Methods of the initiative (what is planned to be done, or is being reported as done)||P4510||Compulsory||Methods, methodStep|
|Link to a public domain methodology document||P4510 Q P973|| ||methodStep, sampling, samplingDescription|
|theoretical or conceptual models or relevant ‘values’ of people||P4510|| || |
|Name of report author (person or algorithm)||P50||Compulsory|| |
|ORCID||author item P496|| || |
|Public domain profile / institutional page||author item P856|| || |
|Key contact email at initiative for confirming report content||P793 Q37260 P968||Compulsory||electronicMailAddress|
|Who has checked the quality of the data in this report?||P4032|| || |
|Who was involved (named individual, organisation) Who was involved (group of anonymous individuals acting in role)||P767 P767 Q P1114||Compulsory|| |
|Specific tasks of this person or group||P767 Q P2868||Compulsory||Role|
|Methods of involvement of participants||P767 Q P2283|| || |
|What was the outcome or output of the involvement?||P1542||Compulsory|| |
|Were any publications produced as part of this?||P921||Compulsory|| |
|Methods of appraising and analysing involvement (assessing rigour, deciding outcome measures, data collection and analysis)|| || || |
|Facilitators of involvement (what helps the contributors in achieving the project’s outcomes?) ‘stated as’ free text||P1552 Q101097118 P5102 P1552 Q101097118 P6001|| || |
|Barriers of involvement (what inhibits the contributors from achieving the project’s outcomes?) ‘stated as’ free text||P1552 Q16515105 P5102 P1552 Q16515105 P6001|| || |
|What was the outcome or output of the involvement of these people? What changed as a result of involving people? Were there any impacts?||P767 Q P1542|| || |
|At which stage of the initiative were these people involved?||P767 Q P585 or P767 Q P580 and P767 Q P582|| || |
|What was the estimated financial cost for involving each person or group? How much time did it take to involve each person or group? Were there any other non-financial costs in involving each person or group?||P767 Q P2130 P767 Q P2047 P767 Q P1542|| || |
|Financial relationship or other interest this person has to this project||P767 Q P1932 P1542 Q99429881 P6001||Compulsory|| |
|Conflicting or competing interests ‘stated as’ free text||P1552 Q99429881 P1932 P1552 Q99429881 P6001||Compulsory|| |
|What was the estimated financial cost for the overall initiative? How much time did it take?||P2130 P2047|| || |
|Findable: How is information about this data disseminated||P1056 Q42848 P1552 Q100451967 P1056 Q42848 P7228||Compulsory|| |
|Accessible: How is it stored and hosted||P1056 Q42848 P4945||Compulsory|| |
|Interoperable: What analyses were||P4510|| || |
|Interoperable: What format is it in||P1056 Q42848 P2701|| || |
|Reusable: Access restriction status||P1056 Q42848 P7228 Q66739888||Compulsory|| |
|Reusable: License||P1056 Q42848 P275|| || |
|Who owns it||P1056 Q42848 P1552 Q2587068 P1056 Q42848 P127||Compulsory||Intellectual Property Rights, intellectualRights|
|Where is it stored||P1056 Q42848 P276||Compulsory|| |
|Access restriction status||P1056 Q42848 P7228||Compulsory||Intellectual Property Rights, intellectualRights|
|How to access (email) How to access (url)||P1056 Q42848 P968 P1056 Q42848 P2699|| ||Dataset, URL|
|Data steward/curator||P1056 Q42848 P1640|| || |
|Has anything changed or happened as a result of this initiative that isn’t captured in previous answers?||P1542||Compulsory|| |
|What new knowledge has been generated? (if appropriate, include effect size, relevant statistics and level of evidence)||P1542 Q133500||Compulsory|| |
|What part of the initiative was the learning about? What topic was learned?||P1542 Q133500 P518 P1542 Q133500 P921|| || |
|Describe how the learning or knowledge generated from this initiative has or will be used||P1542 Q133500 P1542|| || |
|How has or how will this be measured?||P1542 Q P459|| || |
|Who is involved in measuring this?||P1542 Q P767|| || |
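The ‘Compulsory’ column above lends itself to an automated completeness check. The following sketch lists a subset of the compulsory fields, with the Wikidata properties given in the table, and reports which are missing from a draft. Representing a draft as a dictionary keyed by property ID is an assumption made for illustration, not a description of how STARDIT stores reports, and the draft values are hypothetical.

```python
# Subset of compulsory fields, copied from the data categories table.
COMPULSORY_FIELDS = {
    "P937": "Geographic location or scope",
    "P580": "Start date of initiative",
    "P664": "Organisations or other initiatives involved",
    "P8324": "Funding sources (org)",
    "P4510": "Methods of the initiative",
    "P50": "Name of report author (person or algorithm)",
}


def missing_compulsory(draft: dict) -> list[str]:
    """Return labels of compulsory fields that are absent or empty."""
    return [
        label
        for prop, label in COMPULSORY_FIELDS.items()
        if not draft.get(prop)
    ]


if __name__ == "__main__":
    # Hypothetical draft with only three compulsory fields filled in.
    draft = {"P937": "Example region", "P580": "2021-01-01", "P50": "A. Author"}
    for label in missing_compulsory(draft):
        print("Missing:", label)
```

A check like this could run before a report is saved, so that editors only need to review reports which already contain the minimum required data.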