Difference between revisions of "Existing name checking mechanisms"
Revision as of 12:15, 7 September 2023
(Testing and documentation in progress)
Terminology
In the context of TETTRIs, we distinguish three types of "Target aggregators", i.e. online databases of organism names that offer services for the use cases defined in the TETTRIs project:
Primary target aggregators: These are aggregators that offer comprehensive taxonomic coverage for a certain group of organisms, either globally or for Europe. As a rule, they offer more in-depth information than the other types of aggregators. They may contain data from literature, from smaller-scale aggregators, or sometimes even original data. Here, TETTRIs focuses on sites directly driven by the respective taxonomic community: for Fungi, Index Fungorum / Species Fungorum (IF); for vascular plants and bryophytes, World Flora Online Plant List (WFO) and, for Europe, Euro+Med PlantBase. IF also contributes to CoL and PESI, Euro+Med contributes to PESI (and will use WFO Name IDs), and WFO Plant List will probably contribute to CoL in the future, too. The question of identifying appropriate community datasets for animal groups and for Algae is being discussed.
Species Fungorum and Euro+Med do not provide their own name matching services. However, since the CoL services essentially also cover ChecklistBank, versions of these datasets deposited there in a standard format can be accessed directly by selecting them as target datasets.
Secondary target aggregators: These are aggregators that offer comprehensive taxonomic coverage of names and taxa irrespective of the taxonomic group, with either global (Catalogue of Life) or European (PESI/EU-nomen) geographic coverage. Their records are mostly contributed by primary aggregators representing a certain taxonomic group. In contrast to "Lookup" aggregators, they provide a single classification of each name usage: as a synonym, as an accepted taxon name, or as a name that is unplaced for some reason.
Secondary target aggregators are the principal components of the TETTRIs taxonomic backbone. TETTRIs focuses on Catalogue of Life and PESI/EU-nomen as secondary target aggregators.
"Lookup" target aggregators: These are services that provide access to a multitude of stored datasets, which may be up to date or rather out of date. They are of interest for name discovery and for identifying where a designation comes from. However, the two services TETTRIs focuses on here, GBIF Checklist Bank and Global Names, both offer a service to match names against specific datasets, which essentially renders them (indirect) secondary target aggregators.
Checklist Bank (GBIF & Catalogue of Life): https://www.checklistbank.org/tools/name-match
Type: Lookup
Scope: All taxa or specific groups, also geographically restricted, depending on the target dataset chosen
Software updated: May 23 (frontend), June 23 (backend) (checked August 15, 2023)
Codebase/Documentation: https://api.checklistbank.org/
Data updated: depending on target dataset
Limitation: Direct input of a list is limited to 6000 names; file upload is not limited.
Other: Login with GBIF account is recommended (self-registration at https://www.gbif.org/user/profile)
Local ID input returned: NO (checked August 16, 2023)
Local Name input returned: YES
Aggregator name ID returned: YES - in download only
Interactive mode for partial matches: NO
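Since the service returns the submitted name but not the local ID, one possible workaround is to join the results back to the local records on the name string. The following is only a minimal sketch under stated assumptions: the column names "inputName" and "matchedId" are hypothetical placeholders, not the actual column names of the Checklist Bank download.

```python
from collections import defaultdict


def reattach_local_ids(local_records, match_results, name_key="inputName"):
    """Join match results back to local IDs via the submitted name string.

    local_records: iterable of (local_id, name) pairs from the local database.
    match_results: iterable of dicts, each holding the submitted name under
                   `name_key` (a hypothetical column name).
    Returns a list of (local_id, result) pairs; one name may map to several IDs.
    """
    ids_by_name = defaultdict(list)
    for local_id, name in local_records:
        ids_by_name[name.strip()].append(local_id)

    joined = []
    for result in match_results:
        # Names that cannot be found locally are silently skipped.
        for local_id in ids_by_name.get(result[name_key].strip(), []):
            joined.append((local_id, result))
    return joined
```

The join only works if the name strings come back unchanged; if the service normalises the input (for example whitespace or author abbreviations), the same normalisation would have to be applied to the local names first.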
Catalogue of Life
Type: Secondary
Scope: potentially all taxa but incomplete for some zoological groups (genera only) and for Algae
Software: Editions are integrated into Checklist Bank, see above.
World Flora Online WFO Plant List: https://list.worldfloraonline.org/matching.php
Type: Primary
Scope: Plants
Software updated: ongoing Sept. 2023 (not stated on website)
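Codebase/Documentation: GraphQL API (https://list.worldfloraonline.org/gql_index.php), Name Matching REST API (https://list.worldfloraonline.org/matching_rest.php), Reconciliation API (https://list.worldfloraonline.org/reconcile_index.php)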
Data updated: July 2023 (semiannual edition)
Limitation: None found (tested with 144,000 records)
Other: Service can be installed as local copy
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes - WFO-ID
Interactive mode for partial matches: Yes
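For scripted use, a request to the Name Matching REST API could be prepared roughly as follows. This sketch only builds the request URL and does not contact the service. The endpoint is the one listed under Codebase/Documentation; the query parameter name "input_string" is an assumption that should be checked against the API documentation before use.

```python
from urllib.parse import urlencode

# Endpoint of the WFO Plant List Name Matching REST API.
WFO_MATCHING_REST = "https://list.worldfloraonline.org/matching_rest.php"


def wfo_match_url(name, param="input_string"):
    """Return a GET URL for matching one name.

    `param` is an assumed parameter name, not confirmed by this page.
    """
    # urlencode percent-encodes the name and turns spaces into "+".
    return f"{WFO_MATCHING_REST}?{urlencode({param: name})}"
```

For example, `wfo_match_url("Quercus robur L.")` yields a URL ending in `?input_string=Quercus+robur+L.`.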
PESI: https://www.eu-nomen.eu/portal/taxamatch.php
Scope:
Software updated:
Codebase/Documentation
Data updated:
Limitation:
Other:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
LifeWatch
Scope:
Software updated:
Codebase/Documentation
Data updated:
Limitation:
Other:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
Global Names
Contains options to restrict matching to certain source datasets, among them:
a 2021 AlgaeBase set (but matching doesn't work)
a (supposedly) up-to-date Index Fungorum search. The output (HTML, JSON, CSV, TSV) contains a UUID in the "id" field, which is not the "Index Fungorum UUID" but probably a Global Names UUID. The field "RecordID", however, is the "Index Fungorum Registration Identifier", which in a URL resolves to the name page: http://www.indexfungorum.org/Names/NamesRecord.asp?RecordID=229900
Scope: defined by stored datasets
Software updated:
Codebase/Documentation
Data updated:
Limitation: 5000 names, at least in interactive mode
Other: Offers a kind of query language that seems to be very flexible
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
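The link between the "RecordID" field and the Index Fungorum name page noted above can be sketched as a small helper. The URL pattern is taken from the example above; the function name and the check that a RecordID is purely numeric are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base URL taken from the example name page above.
IF_NAME_PAGE = "http://www.indexfungorum.org/Names/NamesRecord.asp"


def index_fungorum_url(record_id):
    """Build the Index Fungorum name-page URL for a RecordID value."""
    record_id = str(record_id).strip()
    # Assumption: registration identifiers consist of digits only.
    if not record_id.isdigit():
        raise ValueError(f"RecordID should be numeric, got {record_id!r}")
    return f"{IF_NAME_PAGE}?{urlencode({'RecordID': record_id})}"
```

Calling `index_fungorum_url(229900)` reproduces the URL given above.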
Kew (IPNI): http://namematch.science.kew.org/
Scope: Vascular plants (POWO; IPNI is offered but not working)
Software updated: ?
Codebase/Documentation: ?
Data updated: current
Limitation: None found (tested with 144,000 records)
Other: OpenRefine Interface?
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES: IPNI-LSID
Interactive mode for partial matches: NO
TNRS Taxonomic Name Resolution Service: https://tnrs.biendata.org/
Scope: Plants (WFO) and vascular plants (WCVP); potentially more datasets could be included
Software updated: v. 5.0 Feb. 24, 2021
Codebase/Documentation: https://github.com/ojalaquellueva/TNRSapi
Data updated: 2023
Limitation: Pasted input is limited to 5000 names; API processing is unlimited (in batches of 5000)
Other: R package available
Local ID input returned: No
Local Name input returned: YES
Aggregator name ID returned: NO
Interactive mode for partial matches: YES
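The batch limit noted under Limitation means that longer name lists have to be split before submission. A minimal sketch of such a split is shown below; it covers only the chunking step and makes no assumption about how each batch is then sent to the API. The function name is hypothetical.

```python
def batches(names, size=5000):
    """Yield consecutive slices of `names`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield names[start:start + size]
```

The same helper could be reused with a different `size` for other services that limit the number of names per request.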
Tropicos: https://legacy.tropicos.org/NameMatching.aspx
Scope: Plants
Software updated:
Codebase/Documentation
Data updated:
Limitation:
Other:
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes: Tropicos-ID
Interactive mode for partial matches: No
GBIF Taxonomic Backbone: https://www.gbif.org/tools/species-lookup
Scope: All taxa
Software updated:
Codebase/Documentation
Data updated:
Limitation:
Other:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches: