Metadata
Metadata Documentation
Metadata can be used to describe single items such as objects (physical or digital), but can also be used to describe groups of items. This documentation refers only to metadata relating to groups of items, as the reBiND project focusses on biodiversity data collections. This text aims to give an overview of the metadata standards that are relevant for the management of biodiversity data. Object-level data and metadata are covered by the ABCD standard.
Definitions and Functions
metadata
- structured data describing information resources
- The National Information Standards Organization (NISO) defines metadata as "structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource."
- The World Wide Web Consortium (W3C) defines metadata as "machine understandable information for the web."
- The Federal Geographic Data Committee (FGDC) defines metadata as describing, "the content, quality, condition, and other characteristics of data."
- Put simply, metadata are data about data. They provide context for research findings, ideally in a machine-readable format. Once published, metadata can enable discovery of data via electronic interfaces and enable correct use and attribution of your findings.
(from: http://marinemetadata.org/guides/mdataintro/mdatadefined)
metadata from and for research data
- external metadata: basis for unambiguous citation, comparable to classical catalogue data (libraries).
- ID
- technical data (Technische Daten)
- description of content (Beschreibung des Inhalts)
- people and rights (Personen und Rechte)
- networking (Vernetzung)
- life cycle (Lebenszyklus)
- internal metadata: metadata on subject-specific level, necessary for subject-specific understanding.
- technical and method-specific: makes the data record comprehensible from a technical and a content point of view, e.g. data file name, file format, file size, hash value, information about software
- subject-specific, here biodiversity-specific: is biodiversity-specific metadata available for subject-specific retrieval? Is ABCD sufficient for all biodiversity primary data?
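The technical metadata listed above (file name, file format, file size, hash value) can largely be derived from the file itself. The following is a minimal sketch using only Python's standard library. The function name `technical_metadata`, the dictionary keys, and the choice of SHA-256 as hash algorithm are assumptions made for illustration and are not prescribed by any of the standards discussed here.

```python
import hashlib
import mimetypes
from pathlib import Path


def technical_metadata(path):
    """Collect basic technical metadata for one data file (illustrative only)."""
    p = Path(path)
    data = p.read_bytes()
    # guess_type returns (type, encoding); type is None for unknown extensions
    mime_type, _ = mimetypes.guess_type(p.name)
    return {
        "file_name": p.name,
        "file_format": mime_type or "application/octet-stream",
        "file_size_bytes": len(data),
        # SHA-256 is an assumed choice; the source only says "hash value"
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

Reading the whole file into memory keeps the sketch short; for large data files the hash would more sensibly be computed in chunks.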
metadata for long term storage
- structural metadata: relation of one object to other objects in an archive (standard, e.g. METS)
- administrative metadata: administration of archived objects, evidence of origin and use, access control, provenance information
- preservation metadata: history of an object, e.g. provenance, measures for long-term accessibility, authenticity, rights information regarding applicable processes (standards: PREMIS, LMER)
recommendations from the German Research Foundation (DFG)
"The data are described by metadata. On the one hand, the metadata (at least according to Dublin Core) record the bibliographic facts. These are the name of the researcher who collected the data, the name of the data set, the place and year of publication, and technical data (format etc.). The content-related metadata describe the primary data comprehensively. They contain information on the conditions under which the data were collected or measured. Here the author also describes the research question under which the data were generated. All information required for reusing the data in the context of other research questions should be available here. The criteria of Information Life Cycle Management should be taken into account."
(from: Deutsche Forschungsgemeinschaft, Ausschuss für Wissenschaftliche Bibliotheken und Informationssysteme, Unterausschuss für Informationsmanagement, Empfehlungen zur gesicherten Aufbewahrung und Bereitstellung digitaler Forschungsprimärdaten, Januar 2009)
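As an illustration of the minimum description recommended above, the bibliographic facts can be expressed with Dublin Core elements. The sketch below uses Python's `xml.etree.ElementTree` and the Dublin Core element namespace. The mapping of the DFG items to `creator`, `title`, `publisher`, `date`, `format` and `description` is an assumption, as are the wrapper element `metadata`, the function name and all example values.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat XML record from (Dublin Core element name, value) pairs."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# All values below are hypothetical examples.
record = dublin_core_record([
    ("creator", "Doe, Jane"),                 # researcher who collected the data
    ("title", "Example vegetation survey"),   # name of the data set
    ("publisher", "Example Data Centre"),
    ("date", "2009"),                         # year of publication
    ("format", "text/csv"),                   # technical data
    ("description", "Conditions of collection and the research question."),
])
xml_text = ET.tostring(record, encoding="unicode")
```

A list of pairs is used instead of a dictionary because Dublin Core elements are repeatable, e.g. several `creator` entries for one data set.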
Metadata Standards
catalogue standards
- PICA - Project of Integrated Catalogue Automation
- MARC – Machine Readable Cataloging (http://www.loc.gov/standards/marcxml/)
- Dublin Core (http://dublincore.org/documents/dces/)
standards to mix vocabularies
- Dublin Core Application Profiles: allow terms from different vocabularies to be mixed and matched (http://dublincore.org/documents/profile-guidelines/)
schemas for external metadata
- STD-DOI: 25 descriptive input fields, oriented on the ISO standard 690-2 for the citation of electronic resources, with fields from DC and the International DOI Foundation. Required: identifier, creators, titles, publisher, publication year. Optional: subjects, contributors, dates, language, resourceType, alternateIdentifiers, relatedIdentifiers, formats, version, rights, descriptions (http://www.iso.org/iso/catalogue_detail.htm?csnumber=25921)
- DataCite: draft, based on STD-DOI
- Altman & King: based mainly on Dublin Core (http://www.dlib.org/dlib/march07/altman/03altman.html)
- OECD publisher: based on Altman and King, 27 elements
- DANS (Data Archiving and Networked Services): 15 DC terms elements, which roughly correspond to a refinement of the core elements.
- ANDS (Australian National Data Service): separate metadata schema with four groups (collection, service, party, and activity) in different relations to each other
(from: Konzeptstudie Forschungsdaten Chemie, www.fiz-chemie.de/fileadmin/user_upload/PDF_DE/Konzeptstudie_Forschungsdaten_Chemie.pdf)
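The split into required and optional fields described for STD-DOI above lends itself to a simple completeness check before a record is submitted. The following is a minimal sketch that assumes a record is held as a plain Python dictionary. The key spellings (for example `publicationYear`) and the function name `missing_required` are illustrative assumptions and not taken from a schema file.

```python
# Required fields as listed above; the exact key spellings are assumptions.
REQUIRED_FIELDS = ("identifier", "creators", "titles", "publisher", "publicationYear")


def missing_required(record):
    """Return the required field names that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical record with one required field left out.
example = {
    "identifier": "10.1234/example",
    "creators": ["Doe, Jane"],
    "titles": ["Example vegetation survey"],
    "publisher": "Example Data Centre",
}
problems = missing_required(example)
```

Here `problems` contains only `"publicationYear"`, so the record would be rejected until that field is filled in.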
further relevant metadata standards
- DIF - Directory Interchange Format (http://gcmd.nasa.gov/User/difguide/difman.html)
- 1988 formally approved and adopted, NASA Master Directory (NMD)
- 1990 NMD renamed to Global Change Master Directory (GCMD), the GCMD serves as NASA's FGDC Clearinghouse node for geospatial metadata.
- 2004 the ISO 19115/TC211 geospatial metadata standard was adopted
- The DIF does not compete with other metadata standards. It is simply the "container" for the metadata elements.
- Eight fields are required in the DIF
- ISO 19115 - Geographic Information Metadata (http://www.gdi-de.org/thema2009/uebersetzungiso)
- ISO 19115 defines how to describe geographical information and associated services, including contents, spatial-temporal extent, data quality, access and rights of use.
- The standard defines more than 400 metadata elements
- 20 core elements.
- INSPIRE - Infrastructure for Spatial Information in the European Community
- INSPIRE Metadata Regulation document (http://inspire.jrc.ec.europa.eu/index.cfm/pageid/101)
- INSPIRE Metadata Implementing Rules document (http://inspire.jrc.ec.europa.eu/index.cfm/pageid/101)
- EML - Ecological Metadata Language (http://knb.ecoinformatics.org/software/eml/eml-2.1.0/index.html)
- EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data.
- features: modular, detailed structure, compatible, strong typing - meeting the criteria of XML Schema, distinct content model and syntactic implementation
- was designed with the following standards in mind: the Dublin Core Metadata Initiative, the Content Standard for Digital Geospatial Metadata (CSDGM) from the US Geological Survey's Federal Geographic Data Committee (FGDC), the Biological Profile of the CSDGM (from the National Biological Information Infrastructure), the International Standards Organization's Geographic Information Standard (ISO 19115), the ISO 8601 Date and Time Standard, the OpenGIS Consortium's Geography Markup Language (GML), the Scientific, Technical, and Medical Markup Language (STMML), and the Extensible Scientific Interchange Language (XSIL).
metadata standards for long term storage
- METS (Metadata Encoding & Transmission Standard): information about digitised objects, XML format, representation of the inner object structure, metadata container (http://www.loc.gov/standards/mets/mets-schemadocs.html)
- PREMIS (PREservation Metadata: Implementation Strategies): entities Intellectual Entity, Object, Event, Rights, Agent; exact description by semantic units (http://www.loc.gov/standards/premis/)
- LMER (Deutsche Bibliothek, 2003) based on Preservation Metadata Schema of the National Library New Zealand, exchange format in cooperative archive systems, technical information, history of an object, object, file, process, modification, LMER data mapping to PREMIS data (http://www.d-nb.de/standards/lmer/lmer.htm)
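To make the "metadata container" and "inner object structure" roles of METS more concrete, the sketch below assembles a skeleton METS document with Python's `xml.etree.ElementTree`. It shows a descriptive metadata section, a file section and a structural map that ties them together. The function names, IDs and example values are hypothetical, and the output is not validated against the METS schema, so it should be read as an outline rather than a conformant record.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(file_id, href, label):
    """Build a skeleton METS tree for one file (not schema-validated)."""
    root = ET.Element(m("mets"))
    # Descriptive metadata section: wraps a record in another schema
    dmd = ET.SubElement(root, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "DC"})
    ET.SubElement(wrap, m("xmlData"))  # a Dublin Core record would go here
    # File section: inventory of the files that make up the object
    file_sec = ET.SubElement(root, m("fileSec"))
    group = ET.SubElement(file_sec, m("fileGrp"))
    file_el = ET.SubElement(group, m("file"), {"ID": file_id})
    ET.SubElement(file_el, m("FLocat"),
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href})
    # Structural map: the inner structure, pointing at files by ID
    struct_map = ET.SubElement(root, m("structMap"))
    div = ET.SubElement(struct_map, m("div"), {"LABEL": label, "DMDID": "DMD1"})
    ET.SubElement(div, m("fptr"), {"FILEID": file_id})
    return root


# Hypothetical example values.
mets_root = build_mets("FILE1", "http://example.org/data.csv", "Example collection")
mets_xml = ET.tostring(mets_root, encoding="unicode")
```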
Metadata Management Software
- Metacat: metadata catalogue and repository for science data (ecology, environmental research), XML syntax, open source (http://knb.ecoinformatics.org/software/metacat/01-intro.html)
- Morpho: software for metadata input, stores EML-conformant files, information about people, locations, research methods, data attributes (http://knb.ecoinformatics.org/morphoportal.jsp)
- MERMAid (Metadata Enterprise Resource Management Aid): tool for development, validation, management, publication of metadata (https://www.dataone.org/node/204)
- MATT (Metadata Authoring Tool): runs within a web browser, gives instructions for composing metadata, data are converted to XML (https://www.dataone.org/node/188)
- CatMDEdit: metadata editor tool with a focus on the description of geographic information resources, conforms to DC and ISO 19115 (http://catmdedit.sourceforge.net/)
- Archivematica: digital preservation system, free, open source, data processing from ingest to access according to ISO-OAIS model (http://archivematica.org/wiki/index.php?title=Main_Page)
(from: Konzeptstudie Forschungsdaten Chemie)
Interfaces
- external interfaces: DOI registry with metadata mapping (https://mds.datacite.org/static/apidoc)
- interface provision: OAI-PMH, metadata mapping for OAI-PMH export
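OAI-PMH requests are plain HTTP requests in which a `verb` argument selects the operation and a `metadataPrefix` argument selects the export format (`oai_dc` for unqualified Dublin Core). The sketch below only builds such a request URL and sends nothing. The base URL is a hypothetical placeholder and the helper name `oai_pmh_url` is an illustrative assumption.

```python
from urllib.parse import urlencode


def oai_pmh_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical repository endpoint; replace with a real base URL.
url = oai_pmh_url(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
```

A harvester would fetch this URL and parse the returned XML. Offering a further export format means adding another metadata mapping under a different `metadataPrefix`.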
Creation of Metadata
Metadata can be produced automatically, manually (intellectually), or by a computer-aided combination of both.
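One way to picture the combination of automatic and manual production is a merge step in which automatically extracted values form the baseline and manually entered values are added on top. The sketch below assumes both sources are plain dictionaries. The rule that non-empty manual entries take precedence is an assumption made for illustration, as are the function name and the example keys.

```python
def combine_metadata(automatic, manual):
    """Merge two metadata dicts; non-empty manual values override automatic ones."""
    combined = dict(automatic)
    for key, value in manual.items():
        if value not in (None, ""):  # skip fields the curator left blank
            combined[key] = value
    return combined


# Hypothetical values: one dict extracted by software, one typed in by a curator.
automatic = {"file_name": "occurrences.csv", "file_format": "text/csv", "title": "occurrences"}
manual = {"title": "Example vegetation survey", "creator": "Doe, Jane", "file_format": ""}
combined = combine_metadata(automatic, manual)
```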