Difference between revisions of "Metadata"

From reBiND Documentation
===standards to mix vocabularies===
* Dublin Core Application Profiles (DCAP): allow terms from different vocabularies to be mixed and matched (http://dublincore.org/documents/profile-guidelines/)
A DCAP defines metadata records, provides semantic interoperability, and is generic for designing metadata records; its terms are based on RDF.

DCMI Abstract Model (DCAM): application profiles promote the sharing and linking of data within and between communities

Dublin Core Description Set Profile (DCSP): language for application profiles

A DCAP specifies and describes the metadata used in a particular application. It consists of:
* functional requirements
* Domain Model: types of metadata and their relationships
* Description Set Profile and Usage Guidelines
* Syntax Guidelines and Data Formats

The DCMI Singapore Framework (DCMI-SF) illustrates how the standards fit together.
  
 
===schemas for external metadata===
 
 
*interface provision: OAI-PMH, metadata mapping for OAI-PMH export
 
  
==Creation of Metadata==
Metadata can be produced automatically or manually/intellectually. A computer-aided combination of both is also possible.
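The combination of automatic and manual metadata creation can be sketched as follows. This is only an illustration, not the reBiND workflow: the function names, the field names (<code>file_name</code>, <code>sha256</code>, ...) and the sample values are assumptions. Technical fields are derived from the file itself, while descriptive fields are entered by a person and merged in.

```python
import hashlib
import os
import tempfile


def technical_metadata(path):
    """Derive technical metadata from a file without human input."""
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    return {
        "file_name": os.path.basename(path),
        "file_format": os.path.splitext(path)[1].lstrip(".").lower(),
        "file_size": os.path.getsize(path),
        "sha256": digest,
    }


def build_record(path, manual_fields):
    """Combine automatically derived fields with manually entered ones."""
    record = technical_metadata(path)
    # Manual (intellectual) entries win if a key appears in both sources.
    record.update(manual_fields)
    return record


def demo():
    """Write a small temporary file and describe it (hypothetical sample data)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "specimens.csv")
        with open(path, "wb") as handle:
            handle.write(b"id,taxon\n1,Bellis perennis\n")
        manual = {"title": "Example collection", "creator": "Jane Doe"}
        return build_record(path, manual)
```

Letting manual entries override automatic ones is a design choice of this sketch; a real workflow might instead flag conflicts for review.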

==Mapping Metadata Standards==

ISO 19115
*<span style="color:red">red</span> = core element

DIF
*<span style="color:red">red</span> = required
*<span style="color:blue">blue</span> = highly recommended
*<span style="color:green">green</span> = recommended

Dublin Core
*<span style="color:red">red</span> = DC elements

{| class="wikitable sortable"
| '''Category'''|| '''ISO 19115 Metadata'''|| '''DIF Standard'''|| '''Dublin Core Metadata''' || '''ABCD Metadata''' || '''Notes'''
 +
|-
 +
| Identifier|| <span style="color:red">fileIdentifier</span> (unique identifier for this metadata file)|| <span style="color:red">Entry_ID</span> (unique document identifier of the metadata record; may be the same as Data_Set_ID)|| (Resource) <span style="color:red">Identifier</span>  || ''none'' ||
 +
|-
 +
| || <span style="color:red">language</span>(languages used for documenting metadata, languageCode ISO 639)|| not in DIF|| ''none'' || ''none'' ||
 +
|-
 +
|-
 +
| || <span style="color:red">characterSet </span>(full name of the character coding standard, ISO 10646-2)|| not in DIF|| ''none'' || ''none'' ||
 +
|-
 +
|-
 +
| Technische Daten || presentationForm (Mode in which the data is represented)|| <span style="color:blue">Data_Set_Citation:</span>Data_Presentation_Form (The mode in which the data are represented, e.g. atlas, image, profile, text, etc.) || (Resource)<span style="color:red">Type</span> (nature or genre of the resource, recommended: controlled vocabulary such as the DCMI Type Vocabulary) || ''none'' ||
 +
|-
 +
|-
 +
| Personen und Rechte || <span style="color:red">datasetPointofContact</span> (Point of Contact)|| <span style="color:blue">Personnel</span> (defines the point of contact for more information about the data set or the metadata, may be repeated): Role (Investigator, Technical Contact, DIF Author; may be repeated), First_Name, Middle_Name, Last_Name, Email, Fax, Phone, Contact_Address || <span style="color:red">Contributor </span> || ContentMetadata/RevisionData/Contributor (source for Dublin Core standard element Contributor) ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">geographicDescription</span> (documented in ISO 19112 - Location), SI_LocationInstance, geographicIdentifier, spatialResolution (optional)|| <span style="color:blue">Location: Location_Category</span> (keyword, continent, ocean, geographic region, solid earth, space, vertical location), Location_Type (keyword), Location_Subregion_1, _2, _3 (keywords), Detailed_Location (text) || <span style="color:red">Coverage </span> (The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.)|| ContentMetadata/ Description/Representation/'''Coverage''' (source for the Dublin Core standard element Coverage) || ''coverage: free text form describing geographic, taxonomic and other aspects of terminology and descriptions''
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">geographicBox</span> (geographic areal domain of the dataset)|| <span style="color:blue">Spatial_Coverage</span> ||  || ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">EX_GeographicBoundingBox</span> (Geographic area of the entire dataset)||  ||  ||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">WestBoundLongitude</span> (Western-most coordinate of the limit)<span style="color:red">eastBoundLongitude, southBoundLatitude, northBoundLatitude </span>  (referenced to WGS 84) || <span style="color:blue">Temporal_Coverage</span> (start and stop dates during which the data was collected, may be repeated): Start_Date (may not be repeated within Temp.Cov.),  Stop_Date (not valid without start date) || <span style="color:red">Coverage </span> (The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.)|| ContentMetadata/ Description/Representation/'''Coverage''' (source for the Dublin Core standard element Coverage) || ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:orange">not in ISO</span>  || <span style="color:blue">Paleo_Temporal_Coverage </span> (length of time represented by the data collected, data spans time frames earlier than yyyy-mm-dd = 0001-01-01): Paleo_Start_Date, Paleo_Stop_Date, Chronostratigraphic_Unit (eon, era, period, epoch, stage) || <span style="color:red">Coverage </span> (The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.)|| ContentMetadata/ Description/Representation/'''Coverage''' (source for the Dublin Core standard element Coverage) || ||
 +
|-
 +
|-
 +
| Personen und Rechte || originator (party who created the resource), (CI_RoleCode) || <span style="color:blue">Data_Set_Citation </span> (allows the author to properly cite the data set producer): Dataset_Creator (The name of the organization(s) or individual(s) with primary intellectual responsibility for the data set's development.)|| <span style="color:red">Creator </span> (An entity primarily responsible for making the resource.)|| ContentMetadata/ Description/Representation/'''Creator''' (source for the Dublin Core standard element Creator) ||
 +
|-
 +
|-
 +
| Personen und Rechte || principalInvestigator (key party responsible for gathering information and conducting research), (CI_RoleCode) || <span style="color:blue">Personnel: Role: Investigator  </span>  (The person who headed the investigation or experiment that resulted in the acquisition of the data described (i.e., Principal Investigator, Experiment Team Leader)): Last_Name, First_Name, Middle_Name|| <span style="color:red">Creator </span> (An entity primarily responsible for making the resource.)|| ContentMetadata/ Description/Representation/'''Creator''' (source for the Dublin Core standard element Creator) ||
 +
|-
 +
|-
 +
| Lebenszyklus || editionDate (date of the edition) || <span style="color:blue">Data_Set_Citation:  </span>  Dataset_Release_Date|| <span style="color:red">Date </span> (DateIssued)|| ContentMetadata/Version/'''DateIssued''' (source for Dublin Core standard element DateIssued) ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">abstract</span> (brief narrative summary of the content), purpose (summary of the intentions with which the dataset was developed) || <span style="color:red">Summary </span>  (brief description of the data set along with the purpose of the data): Abstract (brief description of the data set), Purpose (purpose of the data set)  || <span style="color:red">Description </span> (Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource)|| ContentMetadata/Description/Representation/Details (source for Dublin Core standard element Description) ||
 +
|-
 +
|-
 +
| Technische Daten || MD_DigitalTransferOptions (Means and media by which dataset is obtained): <span style="color:red">name</span>  (Name of the media on which the dataset can be received), transferSize (Estimated size of the transferred dataset), <span style="color:red">distributionFormat </span> (provides information about the format in which the dataset may be obtained), fees (Fees and terms for retrieving the dataset) || <span style="color:blue">Distribution</span>  (media options, size, data format, and fees involved in distributing the data set): Distribution_Media, Distribution_Size, Distribution_Format, Fees || <span style="color:red">Format  </span> (The file format, physical medium, or dimensions of the resource.)||  ||
 +
|-
 +
|-
 +
| Technische Daten || <span style="color:red">language</span>  (languages used within the dataset, languageCode ISO 639) || <span style="color:blue">Data_Set_Language</span>  (language used in the preparation, storage, and description of the data) || <span style="color:red">Language  </span> (A language of the resource.Recommended best practice is to use a controlled vocabulary such as RFC 4646 [RFC4646].)|| ContentMetadata/Description/Representation/@language || ||
 +
|-
 +
|-
 +
| Personen und Rechte ||MD_Distribution (distributor and options): distributorContact, onLine (Information about online sources), distributorContact (Party from whom the dataset may be obtained)  || <span style="color:red">Data_Center</span>  (data center, organization, or institution responsible for distributing the data) Data_Center_Name, Data_Center_URL, Data Center Contact Last_Name, - First_Name, - Middle_Name || || ||
 +
|-
 +
|-
 +
| ||CI Citation: title (name by which the cited resource is known), alternateTitle (short name or other language name by which the cited information is known. Example: “DCW” as an alternative title for “Digital Chart of the World”)  || <span style="color:blue">Data_Set_Citation</span>  (allows the author to properly cite the data set producer. two functions: a) to indicate how this data set should be cited in professional scientific literature, and b) if this data set is a compilation of other data sets, to document and credit the data sets that were used in producing this compilation, citation for the data set itself, not articles related to the research results, can be repeated, subfields cannot be repeated): Dataset_Title, || || ||
 +
|-
 +
|-
 +
|Personen und Rechte ||publisher (party who published the resource)|| <span style="color:blue">Data_Set_Citation:</span>  Dataset_Publisher (The name of the individual or organization that made the data set available for release),  ||<span style="color:red">Publisher</span>  (An entity responsible for making the resource available) || ||
 +
|-
 +
|-
 +
|Vernetzung ||? sourceCitation (recommended reference to be used for the source data)|| <span style="color:green">Reference</span>  (describes key bibliographic citations pertaining to the data set): Author, Publication_Date, Title, Series, Edition, Volume, Issue, Report_Number, Publication_Place, Publisher, Pages, ISBN, DOI, Online_Resource, Other_Reference_Details  ||<span style="color:red">Relation </span>  (A related resource) || Reference (published reference): TitleCitation, CitationDetail, URI (in ABCD per unit/scientific name)||
 +
|-
 +
|-
 +
|Vernetzung ||linkage (information about on-line sources from which the dataset, specification, or community profile name and extended metadata elements can be obtained)|| <span style="color:blue">Data_Set_Citation: </span> Online_Resource||<span style="color:red">Relation </span>  (A related resource) || ||
 +
|-
 +
|-
 +
|Vernetzung ||funcCode ( Function performed by the resource,  Cl_Online Function <<CodeList>>, download, information, offlineAccess, order, search,)|| <span style="color:blue">Related_URL </span> (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): <span style="color:blue">URL_Content_Type (type</span> , subtype), <span style="color:blue">URL,</span>  Description||<span style="color:red">Relation </span>  (A related resource) || ||
 +
|-
 +
|-
 +
|Identifier||?? funcCode ( Function performed by the resource,  Cl_Online Function <<CodeList>>)|| <span style="color:blue">Related_URL </span> (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): <span style="color:blue">URL_Content_Type (type</span> , subtype), <span style="color:blue">URL,</span>  Description||<span style="color:red">?? Resource Identifier  </span>  (An unambiguous reference to the resource within a given context.) || ||
 +
|-
 +
|-
 +
| || <span style="color:orange">look up in ISO</span>|| Data_Set_Citation: Online_Resource||<span style="color:red">Resource Identifier  </span>  (An unambiguous reference to the resource within a given context.) || ||
 +
|-
 +
|-
 +
|Personen und Rechte|| useConstraints (constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations or warnings on using the resource or metadata) (MD_LegalConstraints)|| <span style="color:blue">Use_Constraints</span> (describes how the data may or may not be used after access is granted, to assure the protection of privacy or intellectual property) ||<span style="color:red">Rights Management </span>(Information about rights held in and over the resource) || ContentMetadata/<b>IPR</b> statements: IPRDeclarations, Copyrights, Licences, TermsofUseStatements, Disclaimers, Acknowledgements, Citations||
 +
|-
 +
|-
 +
|Personen und Rechte|| accessConstraints (access constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations on obtaining the resource or metadata) (MD_LegalConstraints)|| <span style="color:blue">Access_Constraints </span>(information about any constraints for accessing the data set) ||<span style="color:red">Rights Management </span>(Information about rights held in and over the resource) ||  ||
 +
|-
 +
|-
 +
|Personen und Rechte|| otherConstraints (other restrictions and legal prerequisites for accessing and using the resource or metadata ) (MD_legalConstraints)|| <span style="color:orange">not in DIF </span> || ||  ||
 +
|-
 +
|-
 +
|Vernetzung|| CI_OnlineResource (information about on-line sources from which the dataset, specification, or community profile name and extended metadata elements can be obtained; inherited from the parent object)|| <span style="color:blue">Related_URL </span> (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): <span style="color:blue">URL_Content_Type (type</span> , subtype), <span style="color:blue">URL,</span>  Description || <span style="color:red">Source</span> (A related resource from which the described resource is derived)|| ContentMetadata/Description/Representation/<b>URI</b> (URI pointing to an online source, related to the current project, which may or may not serve an updated version of the description data) ||
 +
|-
 +
|-
 +
|  || DS_Platform (vehicle or other support base holding a sensor)|| <span style="color:blue">Platform</span> (or Source_Name - platform used to acquire the data, 11 categories of platforms): Source_Name (repeatable), Short_Name, Long_Name (from controlled platform keywords when using the GCMD metadata authoring tools) ||<span style="color:red">Source</span> (A related resource from which the described resource is derived) ||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || keyword (common-use word(s) or phrase(s) used to describe the subject)|| <span style="color:green">Keyword </span>(ancillary keyword, provide any words or phrases needed to further describe the data set) ||Subject and Keywords||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:red">category</span> (Keywords describing dataset)|| <span style="color:red">Parameters:</span> Category (default: EARTH SCIENCE), Topic, Variable:Level 1-3, Detailed_Variable ||<span style="color:red">Subject</span> and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes, recommended controlled vocabulary)||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || DS_Sensor (Device or piece of equipment which detects and records information)|| <span style="color:blue">Instrument</span> (name of the instrument used to acquire the data, may be repeated: Earth Remote Sensing Instruments, In Situ/Laboratory Instruments, Solar/Space Observing Instruments): Sensor_Name – short_name, long_name ||<span style="color:red">Subject</span> and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes, recommended controlled vocabulary)||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:orange">not in ISO</span> || <span style="color:blue">Project </span> (name of the scientific program, field campaign, or project from which the data were collected): short name, long name ||<span style="color:red">Subject</span> and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes, recommended controlled vocabulary)||  ||
 +
|-
 +
|-
 +
| Beschreibung des Inhalts || <span style="color:orange">not repeated in ISO</span> || <span style="color:blue">Entry_Title</span> (should be descriptive enough so that when a user is presented with a list of titles the general content of the data set can be determined) ||<span style="color:red">Title</span> (name given to the resource)||ContentMetadata/Description/representation/<b>Title</b> (source for the Dublin Core standard element Title)  ||
 +
|-
 +
|-
 +
| Personen und Rechte || ? citedResponsibleParty (name and position information for an individual or organization that is responsible for the resource) || <span style="color:green">Originating Center</span> (data center or data producer who originally generated the dataset) || || ContentMetadata/<b>Owners</b>: Organisation, Person, Roles, Adresses, TelephoneNumbers, EmailAdresses, URIs, LogoURI, ( Entities having legal possession of the data collection content. Here defined for the entire data collection, not for individual units. If an owner statement is present on the unit level, it should override this dataset-level statement.)  ||
 +
|-
 +
|    ||    ||    ||Date (Date.Created; date of creation of the resource) || ContentMetadata/RevisionData/<b>DateCreated</b> (source for Dublin Core standard element DateCreated)    ||
 +
|-
 +
|-
 +
|    ||    ||    ||Date (Date.Modified) || ContentMetadata/RevisionData/<b>DateModified</b> (source for Dublin Core standard element DateModified)    ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  ContentMetadata/<b>Scope</b>: GeoecologicalTerms, TaxonomicTerms, IconURI  ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  ContentMetadata/<b>Version</b>(number and date of current version): Major, Minor, Modifier  ||
 +
|-
 +
|-
 +
|    ||    ||    || || Datasets/Dataset/ContentContacts/    ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  Datasets/Dataset/DatasetGUID  ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  Datasets/Dataset/OtherProviders  ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  Datasets/Dataset/TechnicalContact/  ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  Datasets/Dataset/Units/Unit/SourceID  ||
 +
|-
 +
|-
 +
|    ||    ||    || ||  Datasets/Dataset/Units/Unit/SourceInstitutionID  ||
 +
|-
 +
|-
 +
|  Identifier  ||    ||  <span style="color:red"> Entry ID </span> (unique document identifier of the metadata record) = Parent DIF  || ||    ||
 +
|-
 +
|-
 +
|    ||  MD_topicCategoryCode (high-level geographic data thematic classification to assist in the grouping and search of available geographic data sets. Can be used to group keywords as well. Listed examples are not exhaustive. NOTE It is understood there are overlaps between general categories and the user is encouraged to select the one most appropriate.)  || <span style="color:red"> ISO Topic category</span> (identify the keywords in the ISO 19115 - Geographic Information Metadata ) (Farming, Biota, Boundaries, Climatology/Meteorology/Atmosphere, Economy, Elevation, Environment, Geoscientific Information, Health, Imagery/Base Maps/Earth Cover, Intelligence/Military, Inland Waters, Location, Oceans, Planning Cadastre, Society, Structure, Transportation, Utilities/Communications)  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">metadataStandardName</span> (name of the metadata standard (including profile name) used)  ||<span style="color:red">Metadata_Name</span> (current DIF standard name)|| ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">metadataStandardVersion</span> (version of the metadata standard (version of the profile) used)    || <span style="color:red">Metadata_Version</span> (current DIF Metadata standard)  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:blue">Data_Set_Progress</span>  (production status of the data set regarding its completeness): planned, in work, complete  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:blue">Data_resolution</span> (resolution of the data, which is the difference between two adjacent geographic, vertical, or temporal values)    || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:blue">Quality</span> (information about the quality of the data or any quality assurance procedures followed in producing the data)  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:blue">DIF revision history </span>  (list of changes made to the DIF over time)  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:green">Multimedia_Sample</span>  (provide information that will enable the display of a sample image, movie or sound clip within the DIF): File, URL, Format, Caption, Description,  || ||    ||
 +
|-
 +
|-
 +
|    ||    ||<span style="color:green"> Parent_DIF </span> (allows the capability to relate generalized aggregated metadata records (parents) to metadata records with highly specific information (children))  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:green">IDN_Node</span>(The Internal Directory Name (IDN) Node field is used internally to identify association, responsibility and/or ownership of the dataset, service or supplemental information, not displayed to the user)  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">dateStamp </span>(date that the metadata was created)  || <span style="color:green"> DIF_Creation_Date </span> (date the metadata record was created)  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:green">Last_DIF_Revision_Date  </span>(date the metadata record was last revised)  || ||    ||
 +
|-
 +
|-
 +
|    ||    ||  <span style="color:green">Future_DIF_Revision_Date  </span>(allows for the specification of a future date at which the DIF should be reviewed for accuracy of scientific or technical content)  || ||    ||
 +
|-
 +
|-
 +
|    ||    || <span style="color:green"> Private </span> (restricts the data set description from being publicly available): True or False (default; makes the description publicly available)  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">locale  </span> (provides information about an alternatively used localised character string for a linguistic extension; locale: the combination of language, country and character set in which the dataset is available)    ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">Role name: spatialRepresentationInfo  </span>  (digital representation of spatial information in the dataset)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">Role name: referenceSystemInfo  </span> (description of the spatial and temporal reference systems used in the dataset)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red"> Role name: metadataExtensionInfo </span>  (basic information about the resources to which the metadata applies)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red"> Role name: contentInfo </span> (provides information about the feature catalogue and describes the coverage and image data characteristics)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">Role name: distributionInfo  </span> (provides information about the distributor of and options for obtaining the resource(s))  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">dataQualityInfo  </span> (provides overall assessment of quality of a resource(s))  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">Role name: portrayalCatalogueInfo  </span>  (provides information about the catalogue of rules defined for the portrayal of a resource(s))  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red"> Role name: metadataConstraints </span>  (provides restrictions on the access and use of metadata)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red">Role name: applicationSchemaInfo  </span>  (provides information about the conceptual schema of a dataset)  ||  || ||    ||
 +
|-
 +
|-
 +
|    || <span style="color:red"> Role name: metadataMaintenance </span>  (provides information about the frequency of metadata updates, and the scope of those updates)  ||  || ||    ||
 +
|-
 +
|-
 +
|}
Sources:
http://gcmd.gsfc.nasa.gov/Aboutus/standards/difiso.html<br>
http://gcmd.gsfc.nasa.gov/Aboutus/standards/dublin_to_dif.html<br>
http://rs.tdwg.org/dwc/terms/history/dwctoabcd/index.htm
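
A crosswalk like the one in the table can be applied programmatically. The following is a minimal sketch, assuming a DIF record is available as a flat dictionary. The function name, the lower-case Dublin Core keys and the reduced set of six field pairs (taken from rows of the table) are illustrative assumptions, not part of any of the standards.

```python
# (DIF field -> Dublin Core element) pairs taken from rows of the mapping table.
# Only a small, illustrative subset is listed here.
DIF_TO_DC = {
    "Entry_ID": "identifier",
    "Entry_Title": "title",
    "Summary": "description",
    "Dataset_Publisher": "publisher",
    "Data_Set_Language": "language",
    "Keyword": "subject",
}


def dif_to_dublin_core(dif_record):
    """Translate a flat DIF record into Dublin Core elements.

    Returns the Dublin Core record (each element holds a list of values,
    because Dublin Core elements are repeatable) and the sorted list of
    DIF fields for which the crosswalk has no counterpart.
    """
    dc_record = {}
    unmapped = []
    for field, value in dif_record.items():
        element = DIF_TO_DC.get(field)
        if element is None:
            unmapped.append(field)
            continue
        dc_record.setdefault(element, []).append(value)
    return dc_record, sorted(unmapped)
```

Reporting the unmapped fields makes the information loss of the crosswalk visible: DIF fields such as Quality have no Dublin Core counterpart in the table.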

Latest revision as of 02:18, 19 November 2014

Metadata Documentation

Metadata can describe single items such as objects (physical or digital), but also groups of items. This documentation covers metadata for groups of items only, as the reBiND project focuses on biodiversity data collections. It gives an overview of the metadata standards that are relevant for the management of biodiversity data. Object-level data and metadata are covered by the ABCD standard.


Definitions and Functions

metadata

  • structured data describing information resources
  • The National Information Standards Organization (NISO) defines metadata as "structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource."
  • The World Wide Web Consortium (W3C) defines metadata as "machine understandable information for the web."
  • The Federal Geographic Data Committee (FGDC) defines metadata as describing, "the content, quality, condition, and other characteristics of data."
  • Put simply, metadata are data about data. They provide context for research findings, ideally in a machine-readable format. Once published, metadata can enable discovery of data via electronic interfaces and enable correct use and attribution of your findings.

(from: http://marinemetadata.org/guides/mdataintro/mdatadefined)

metadata from and for research data

  • external metadata: basis for unambiguous citation, comparable to classical catalogue data (libraries).
    • ID
    • technical data (Technische Daten)
    • description of content (Beschreibung des Inhalts)
    • people and rights (Personen und Rechte)
    • networking (Vernetzung)
    • life cycle (Lebenszyklus)
  • internal metadata: metadata on subject-specific level, necessary for subject-specific understanding.
    • technical and method-specific: makes the data record comprehensible from a technical and content point of view; data file name, file format, file size, (hash value), information about software
    • subject-specific, here biodiversity-specific: are biodiversity-specific metadata available for subject-specific retrieval? Is ABCD sufficient for all biodiversity primary data?

metadata for long term storage

  • structural metadata: relation of one object to other objects in an archive (standard, e.g. METS)
  • administrative metadata: administration of archived objects, evidence of origin and use, access control, provenance information
  • preservation metadata: history of an object, e.g. provenance, measures for long-term accessibility, authenticity, rights information regarding applicable processes (standards: PREMIS, LMER)

recommendations from the German Research Foundation (DFG)

"The data are described by metadata. The metadata (at least according to Dublin Core) record, first, the bibliographic facts: the name of the researcher who collected the data, the name of the data set, the place and year of publication, and technical data (format etc.). The content-related metadata describe the primary data comprehensively. They contain information on the conditions under which the data were collected or measured. Here the author also describes the research question under which the data were produced. All information required for reusing the data in other research questions should be available here. The criteria of Information Life Cycle Management should be taken into account."

(from: Deutsche Forschungsgemeinschaft, Ausschuss für Wissenschaftliche Bibliotheken und Informationssysteme, Unterausschuss für Informationsmanagement, Empfehlungen zur gesicherten Aufbewahrung und Bereitstellung digitaler Forschungsprimärdaten, Januar 2009)


Metadata Standards

catalogue standards

standards to mix vocabularies

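  • Dublin Core Application Profiles (DCAP): allow terms from different vocabularies to be mixed and matched (http://dublincore.org/documents/profile-guidelines/)

A DCAP defines metadata records, provides semantic interoperability, and is generic for designing metadata records; its terms are based on RDF.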

DCMI Abstract Model (DCAM): application profiles promote the sharing and linking of data within and between communities

Dublin Core Description Set Profile (DCSP): language for application profiles

A DCAP specifies and describes the metadata used in a particular application. It consists of:

  • functional requirements
  • Domain Model: types of metadata and their relationships
  • Description Set Profile and Usage Guidelines
  • Syntax Guidelines and Data Formats

The DCMI Singapore Framework (DCMI-SF) illustrates how the standards fit together.

schemas for external metadata

  • STD-DOI: 25 descriptive input fields, based on ISO 690-2 for the citation of electronic resources, with fields from DC and the International DOI Foundation. Required: identifier, creators, titles, publisher, publication year. Optional: subjects, contributors, dates, language, resourceType, alternateIdentifiers, relatedIdentifiers, formats, version, rights, descriptions (http://www.iso.org/iso/catalogue_detail.htm?csnumber=25921)
  • DataCite: draft, based on STD-DOI
  • Altman & King: based mainly on Dublin Core (http://www.dlib.org/dlib/march07/altman/03altman.html)
  • OECD publisher: based on Altman and King, 27 elements
  • DANS (Data Archiving and Networked Services): 15 DC terms elements, which roughly correspond to a refinement of the core elements.
  • ANDS (Australian National Data Service): separate metadata schema with four groups (collection, service, party, and activity) linked by different relations

(from: Konzeptstudie Forschungsdaten Chemie, www.fiz-chemie.de/fileadmin/user_upload/PDF_DE/Konzeptstudie_Forschungsdaten_Chemie.pdf)
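
The split into required and optional fields described for STD-DOI lends itself to a simple completeness check. The following is a minimal sketch, not part of any of the schemas: the field names are taken from the list above, the camelCase spelling `publicationYear` is an assumption, and the example record is invented.

```python
# Field names follow the STD-DOI entry above; the spelling of
# "publicationYear" is an assumption made for this sketch.
REQUIRED_FIELDS = ("identifier", "creators", "titles", "publisher", "publicationYear")
OPTIONAL_FIELDS = (
    "subjects", "contributors", "dates", "language", "resourceType",
    "alternateIdentifiers", "relatedIdentifiers", "formats", "version",
    "rights", "descriptions",
)


def check_record(record):
    """Return (missing required fields, field names unknown to the schema)."""
    missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    unknown = sorted(name for name in record if name not in known)
    return missing, unknown


# Invented record: the DOI and all values are placeholders.
example = {
    "identifier": "10.1234/example",
    "creators": ["Doe, Jane"],
    "titles": ["Example data set"],
    "language": "en",
    "colour": "red",  # not an STD-DOI field, reported as unknown
}
missing, unknown = check_record(example)
```

A check of this kind only tests presence; it says nothing about whether the values themselves are well formed.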

further relevant metadata standards

  • DIF - Directory Interchange Format (http://gcmd.nasa.gov/User/difguide/difman.html)
    • 1988 formally approved and adopted, NASA Master Directory (NMD)
    • 1990 NMD renamed to Global Change Master Directory (GCMD), the GCMD serves as NASA's FGDC Clearinghouse node for geospatial metadata.
    • 2004 the ISO 19115/TC211 geospatial metadata standard was adopted
    • The DIF does not compete with other metadata standards. It is simply the "container" for the metadata elements.
    • Eight fields are required in the DIF
  • ISO 19115 - Geographic Information Metadata (http://www.gdi-de.org/thema2009/uebersetzungiso)
    • ISO 19115 defines how to describe geographic information and associated services, including content, spatial-temporal references, data quality, access, and usage rights.
    • The standard defines more than 400 metadata elements
    • 20 core elements.
  • INSPIRE - Infrastructure for Spatial Information in the European Community
  • EML - Ecological Metadata Language (http://knb.ecoinformatics.org/software/eml/eml-2.1.0/index.html)
    • EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data.
    • features: modular, detailed structure, compatible, strong typing - meeting the criteria of XML Schema, distinct content model and syntactic implementation
    • was designed with the following standards in mind: Dublin Core Metadata Initiative, the Content Standard for Digital Geospatial Metadata (CSDGM from the US Geological Survey's Federal Geographic Data Committee (FGDC)), the Biological Profile of the CSDGM (from the National Biological Information Infrastructure), the International Standards Organization's Geographic Information Standard (ISO 19115), the ISO 8601 Date and Time Standard, the OpenGIS Consortium's Geography Markup Language (GML), the Scientific, Technical, and Medical Markup Language (STMML), and the Extensible Scientific Interchange Language (XSIL).

metadata standards for long term storage

  • METS (Metadata Encoding & Transmission Standard): information about digitised objects, XML format, representation of inner object structure, metadata container (http://www.loc.gov/standards/mets/mets-schemadocs.html)
  • PREMIS (PREservation Metadata: Implementation Strategies) Entities: Intellectual, Object, Event, Rights, Agent, exact description by semantic units (http://www.loc.gov/standards/premis/)
  • LMER (Deutsche Bibliothek, 2003): based on the Preservation Metadata Schema of the National Library of New Zealand, exchange format in cooperative archive systems, technical information, history of an object, object, file, process, modification, LMER data mapping to PREMIS data (http://www.d-nb.de/standards/lmer/lmer.htm)

Metadata Management Software

(from: Konzeptstudie Forschungsdaten Chemie)

Interfaces

Creation of Metadata

Metadata can be produced automatically or manually (intellectually). A computer-aided combination of both is also possible.
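
As an illustration of the combined approach, technical metadata can be read from a file automatically and then completed by hand. The sketch below is an assumption-laden example, not a reBiND component: the function names and the keys of the returned dictionary are hypothetical, and only the Python standard library is used.

```python
import mimetypes
import os
from datetime import datetime, timezone


def extract_technical_metadata(path):
    """Derive basic technical metadata from the file itself (automatic part)."""
    stat = os.stat(path)
    mime_type, _encoding = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "identifier": os.path.basename(path),
        # Fall back to a generic type when the extension is not recognised.
        "format": mime_type or "application/octet-stream",
        "extent_bytes": stat.st_size,
        "modified": modified.date().isoformat(),
    }


def merge_metadata(automatic, manual):
    """Combine both sources; manually entered values take precedence."""
    merged = dict(automatic)
    merged.update(manual)
    return merged
```

Content-related fields such as title, creator or description cannot be derived from the file system and would be passed in as the `manual` dictionary.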

Mapping Meta Data Standards

ISO 19115

  • red = core element

DIF

  • red = required
  • blue = highly recommended
  • green = recommended

Dublin core

  • red = DC elements
| Category | ISO 19115 Metadata Standard | DIF | Dublin Core Metadata | ABCD Metadata | Notes |
| Identifier | fileIdentifier (unique identifier for this metadata file) | Entry_ID (the unique document identifier of the metadata record; may be the same as Data_Set_ID) | (Resource) Identifier | none | |
| | language (languages used for documenting metadata, languageCode ISO 639) | not in DIF | none | none | |
| | characterSet (full name of the character coding standard, ISO 10646-2) | not in DIF | none | none | |
| Technical data | presentationForm (mode in which the data is represented) | Data_Set_Citation: Data_Presentation_Form (the mode in which the data are represented, e.g. atlas, image, profile, text, etc.) | (Resource) Type (nature or genre of the resource; recommended: controlled vocabulary such as the DCMI Type Vocabulary) | none | |
| Persons and rights | datasetPointOfContact (point of contact) | Personnel (defines the point of contact for more information about the data set or the metadata, may be repeated): Role (Investigator, Technical Contact, DIF Author; may be repeated): First_Name, Middle_Name, Last_Name, Email, Fax, Phone, Contact_Address | Contributor | ContentMetadata/RevisionData/Contributor (source for Dublin Core standard element Contributor) | |
| Content description | geographicDescription (documented in ISO 19112 - Location), SI_LocationInstance, geographicIdentifier, spatialResolution (optional) | Location: Location_Category (keyword, continent, ocean, geographic region, solid earth, space, vertical location), Location_Type (keyword), Location_Subregion_1, _2, _3 (keywords), Detailed_Location (text) | Coverage (the spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant) | ContentMetadata/Description/Representation/Coverage (source for the Dublin Core standard element Coverage) | coverage: free text describing geographic, taxonomic and other aspects of terminology and descriptions |
| Content description | geographicBox (geographic areal domain of the dataset) | Spatial_Coverage | | | |
| Content description | EX_GeographicBoundingBox (geographic area of the entire dataset) | | | | |
| Content description | westBoundLongitude (western-most coordinate of the limit), eastBoundLongitude, southBoundLatitude, northBoundLatitude (referenced to WGS 84) | Temporal_Coverage (start and stop dates during which the data was collected, may be repeated): Start_Date (may not be repeated within Temporal_Coverage), Stop_Date (not valid without start date) | Coverage (the spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant) | ContentMetadata/Description/Representation/Coverage (source for the Dublin Core standard element Coverage) | |
| Content description | not in ISO | Paleo_Temporal_Coverage (length of time represented by the data collected; data spans time frames earlier than yyyy-mm-dd = 0001-01-01): Paleo_Start_Date, Paleo_Stop_Date, Chronostratigraphic_Unit (eon, era, period, epoch, stage) | Coverage (the spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant) | ContentMetadata/Description/Representation/Coverage (source for the Dublin Core standard element Coverage) | |
| Persons and rights | originator (party who created the resource), (CI_RoleCode) | Data_Set_Citation (allows the author to properly cite the data set producer): Dataset_Creator (the name of the organization(s) or individual(s) with primary intellectual responsibility for the data set's development) | Creator (an entity primarily responsible for making the resource) | ContentMetadata/Description/Representation/Creator (source for the Dublin Core standard element Creator) | |
| Persons and rights | principalInvestigator (key party responsible for gathering information and conducting research), (CI_RoleCode) | Personnel: Role: Investigator (the person who headed the investigation or experiment that resulted in the acquisition of the data described, i.e. Principal Investigator, Experiment Team Leader): Last_Name, First_Name, Middle_Name | Creator (an entity primarily responsible for making the resource) | ContentMetadata/Description/Representation/Creator (source for the Dublin Core standard element Creator) | |
| Life cycle | editionDate (date of the edition) | Data_Set_Citation: Dataset_Release_Date | Date (DateIssued) | ContentMetadata/Version/DateIssued (source for Dublin Core standard element DateIssued) | |
| Content description | abstract (brief narrative summary of the content), purpose (summary of the intentions with which ds) | Summary (brief description of the data set along with the purpose of the data): Abstract (brief description of the data set), Purpose (purpose of the data set) | Description (may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource) | ContentMetadata/Description/Representation/Details (source for Dublin Core standard element Description) | |
| Technical data | MD_DigitalTransferOptions (means and media by which the dataset is obtained): name (name of the media on which the dataset can be received), transferSize (estimated size of the transferred dataset), distributionFormat (provides information about the format in which the dataset may be obtained), fees (fees and terms for retrieving the dataset) | Distribution (media options, size, data format, and fees involved in distributing the data set): Distribution_Media, Distribution_Size, Distribution_Format, Fees | Format (the file format, physical medium, or dimensions of the resource) | | |
| Technical data | language (languages used within the dataset, languageCode ISO 639) | Data_Set_Language (language used in the preparation, storage, and description of the data) | Language (a language of the resource; recommended best practice is to use a controlled vocabulary such as RFC 4646 [RFC4646]) | ContentMetadata/Description/Representation/@language | |
| Persons and rights | MD_Distribution (distributor and options): distributorContact, onLine (information about online sources), distributorContact (party from whom the dataset may be obtained) | Data_Center (data center, organization, or institution responsible for distributing the data): Data_Center_Name, Data_Center_URL, Data Center Contact: Last_Name, First_Name, Middle_Name | | | |
| | CI_Citation: title (name by which the cited resource is known), alternateTitle (short name or other language name by which the cited information is known; example: "DCW" as an alternative title for "Digital Chart of the World") | Data_Set_Citation (allows the author to properly cite the data set producer; two functions: a) to indicate how this data set should be cited in professional scientific literature, and b) if this data set is a compilation of other data sets, to document and credit the data sets that were used in producing this compilation; citation for the data set itself, not articles related to the research results; can be repeated, subfields cannot be repeated): Dataset_Title | | | |
| Persons and rights | publisher (party who published the resource) | Data_Set_Citation: Dataset_Publisher (the name of the individual or organization that made the data set available for release) | Publisher (an entity responsible for making the resource available) | | |
| Linkage ? | sourceCitation (recommended reference to be used for the source data) | Reference (describes key bibliographic citations pertaining to the data set): Author, Publication_Date, Title, Series, Edition, Volume, Issue, Report_Number, Publication_Place, Publisher, Pages, ISBN, DOI, Online_Resource, Other_Reference_Details | Relation (a related resource) | Reference (published reference): TitleCitation, CitationDetail, URI (in ABCD per unit/scientific name) | |
| Linkage | linkage (information about on-line sources from which the dataset, specification, or community profile name and extended metadata elements can be obtained) | Data_Set_Citation: Online_Resource | Relation (a related resource) | | |
| Linkage | funcCode (function performed by the resource, CI_OnLineFunctionCode <<CodeList>>: download, information, offlineAccess, order, search) | Related_URL (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): URL_Content_Type (type, subtype), URL, Description | Relation (a related resource) | | |
| Identifier ?? | funcCode (function performed by the resource, CI_OnLineFunctionCode <<CodeList>>) | Related_URL (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): URL_Content_Type (type, subtype), URL, Description | ?? Resource Identifier (an unambiguous reference to the resource within a given context) | | |
| | (to do: look up in ISO) | Data_Set_Citation: Online_Resource | Resource Identifier (an unambiguous reference to the resource within a given context) | | |
| Persons and rights | useConstraints (constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations or warnings on using the resource or metadata) (MD_LegalConstraints) | Use_Constraints (describes how the data may or may not be used after access is granted, to assure the protection of privacy or intellectual property) | Rights Management (information about rights held in and over the resource) | ContentMetadata/IPRStatements: IPRDeclarations, Copyrights, Licences, TermsOfUseStatements, Disclaimers, Acknowledgements, Citations | |
| Persons and rights | accessConstraints (access constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations on obtaining the resource or metadata) (MD_LegalConstraints) | Access_Constraints (information about any constraints for accessing the data set) | Rights Management (information about rights held in and over the resource) | | |
| Persons and rights | otherConstraints (other restrictions and legal prerequisites for accessing and using the resource or metadata) (MD_LegalConstraints) | not in DIF | | | |
| Linkage | CI_OnlineResource (information about on-line sources from which the dataset, specification, or community profile name and extended metadata elements can be obtained; inherited from the parent object) | Related_URL (specifies links to Internet sites that contain information related to the data, as well as related Internet sites such as project home pages, related data archives/servers, metadata extensions, online software packages, web mapping services, and calibration/validation data): URL_Content_Type (type, subtype), URL, Description | Source (a related resource from which the described resource is derived) | ContentMetadata/Description/Representation/URI (URI pointing to an online source, related to the current project, which may or may not serve an updated version of the description data) | |
| | DS_Platform (vehicle or other support base holding the sensor) | Platform (or Source_Name; platform used to acquire the data, 11 categories of platforms): Source_Name (repeatable), Short_Name, Long_Name (from controlled platform keywords when using the GCMD metadata authoring tools) | Source (a related resource from which the described resource is derived) | | |
| Content description | keyword (commonly used word(s) or phrase(s) used to describe the subject) | Keyword (ancillary keyword; provide any words or phrases needed to further describe the data set) | Subject and Keywords | | |
| Content description | category (keywords describing the dataset) | Parameters: Category (default: EARTH SCIENCE), Topic, Variable: Level 1-3, Detailed_Variable | Subject and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes; recommended: controlled vocabulary) | | |
| Content description | DS_Sensor (device or piece of equipment which detects and records information) | Instrument (name of the instrument used to acquire the data, may be repeated: Earth Remote Sensing Instruments, In Situ/Laboratory Instruments, Solar/Space Observing Instruments): Sensor_Name: Short_Name, Long_Name | Subject and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes; recommended: controlled vocabulary) | | |
| Content description | not in ISO | Project (name of the scientific program, field campaign, or project from which the data were collected): Short_Name, Long_Name | Subject and Keywords (topic of the resource, represented using keywords, key phrases, or classification codes; recommended: controlled vocabulary) | | |
| Content description | not repeated in ISO | Entry_Title (should be descriptive enough so that when a user is presented with a list of titles the general content of the data set can be determined) | Title (name given to the resource) | ContentMetadata/Description/Representation/Title (source for the Dublin Core standard element Title) | |
| Persons and rights ? | citedResponsibleParty (name and position information for an individual or organization that is responsible for the resource) | Originating_Center (data center or data producer who originally generated the dataset) | | ContentMetadata/Owners: Organisation, Person, Roles, Addresses, TelephoneNumbers, EmailAddresses, URIs, LogoURI (entities having legal possession of the data collection content; here defined for the entire data collection, not for individual units; if an owner statement is present on the unit level, it should override this dataset-level statement) | |
| | | | Date (Date.Created): date of creation of the resource | ContentMetadata/RevisionData/DateCreated (source for Dublin Core standard element DateCreated) | |
| | | | Date (Date.Modified) | ContentMetadata/RevisionData/DateModified (source for Dublin Core standard element DateModified) | |
| | | | | ContentMetadata/Scope: GeoecologicalTerms, TaxonomicTerms, IconURI | |
| | | | | ContentMetadata/Version (number and date of current version): Major, Minor, Modifier | |
| | | | | Datasets/Dataset/ContentContacts/ | |
| | | | | Datasets/Dataset/DatasetGUID | |
| | | | | Datasets/Dataset/OtherProviders | |
| | | | | Datasets/Dataset/TechnicalContact/ | |
| | | | | Datasets/Dataset/Units/Unit/SourceID | |
| | | | | Datasets/Dataset/Units/Unit/SourceInstitutionID | |
| Identifier | | Entry_ID (unique document identifier of the metadata record) = Parent_DIF | | | |
| | MD_TopicCategoryCode (high-level geographic data thematic classification to assist in the grouping and search of available geographic data sets; can be used to group keywords as well; listed examples are not exhaustive. NOTE: It is understood there are overlaps between general categories and the user is encouraged to select the one most appropriate.) | ISO_Topic_Category (identifies the keywords in ISO 19115 - Geographic Information Metadata): Farming, Biota, Boundaries, Climatology/Meteorology/Atmosphere, Economy, Elevation, Environment, Geoscientific Information, Health, Imagery/Base Maps/Earth Cover, Intelligence/Military, Inland Waters, Location, Oceans, Planning Cadastre, Society, Structure, Transportation, Utilities/Communications | | | |
| | metadataStandardName (name of the metadata standard, including profile name, used) | Metadata_Name (current DIF standard name) | | | |
| | metadataStandardVersion (version of the metadata standard, i.e. version of the profile, used) | Metadata_Version (current DIF metadata standard) | | | |
| | | Data_Set_Progress (production status of the data set regarding its completeness): planned, in work, complete | | | |
| | | Data_Resolution (resolution of the data, which is the difference between two adjacent geographic, vertical, or temporal values) | | | |
| | | Quality (information about the quality of the data or any quality assurance procedures followed in producing the data) | | | |
| | | DIF_Revision_History (list of changes made to the DIF over time) | | | |
| | | Multimedia_Sample (provides information that will enable the display of a sample image, movie or sound clip within the DIF): File, URL, Format, Caption, Description | | | |
| | | Parent_DIF (allows generalized aggregated metadata records (parents) to be related to metadata records with highly specific information (children)) | | | |
| | | IDN_Node (the Internal Directory Name (IDN) Node field is used internally to identify association, responsibility and/or ownership of the dataset, service or supplemental information; not displayed to the user) | | | |
| | dateStamp (date that the metadata was created) | DIF_Creation_Date (date the metadata record was created) | | | |
| | | Last_DIF_Revision_Date (date the metadata record was last revised) | | | |
| | | Future_DIF_Revision_Date (allows for the specification of a future date at which the DIF should be reviewed for accuracy of scientific or technical content) | | | |
| | | Private (restricts the data set description from being publicly available): True or False (default, makes the description publicly available) | | | |
| | locale (provides information about an alternatively used localised character string for a linguistic extension; locale: combination of language, country and character set in which the dataset is available) | | | | |
| | Role name: spatialRepresentationInfo (digital representation of spatial information in the dataset) | | | | |
| | Role name: referenceSystemInfo (description of the spatial and temporal reference systems used in the dataset) | | | | |
| | Role name: metadataExtensionInfo (basic information about the resources to which the metadata applies) | | | | |
| | Role name: contentInfo (provides information about the feature catalogue and describes the coverage and image data characteristics) | | | | |
| | Role name: distributionInfo (provides information about the distributor of and options for obtaining the resource(s)) | | | | |
| | dataQualityInfo (provides overall assessment of quality of a resource(s)) | | | | |
| | Role name: portrayalCatalogueInfo (provides information about the catalogue of rules defined for the portrayal of a resource(s)) | | | | |
| | Role name: metadataConstraints (provides restrictions on the access and use of metadata) | | | | |
| | Role name: applicationSchemaInfo (provides information about the conceptual schema of a dataset) | | | | |
| | Role name: metadataMaintenance (provides information about the frequency of metadata updates, and the scope of those updates) | | | | |

Sources: http://gcmd.gsfc.nasa.gov/Aboutus/standards/difiso.html
http://gcmd.gsfc.nasa.gov/Aboutus/standards/dublin_to_dif.html
http://rs.tdwg.org/dwc/terms/history/dwctoabcd/index.htm
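
The DIF and Dublin Core columns of the mapping table above can be read as a crosswalk and applied mechanically. The following is a minimal sketch under several assumptions: DIF records are represented as flat dictionaries keyed by field name (real DIF records are nested XML), only a subset of the rows is included, `Distribution_Format` stands in for the Distribution group, and `Related_URL` is mapped to Relation although the table also lists Source and, tentatively, Resource Identifier for it.

```python
# Subset of the DIF -> Dublin Core correspondences from the table above.
DIF_TO_DC = {
    "Entry_ID": "identifier",
    "Entry_Title": "title",
    "Dataset_Creator": "creator",
    "Dataset_Publisher": "publisher",
    "Dataset_Release_Date": "date",
    "Data_Presentation_Form": "type",
    "Summary": "description",
    "Distribution_Format": "format",
    "Data_Set_Language": "language",
    "Keyword": "subject",
    "Use_Constraints": "rights",
    "Access_Constraints": "rights",
    "Location": "coverage",
    "Temporal_Coverage": "coverage",
    "Related_URL": "relation",  # assumption: one of several targets in the table
}


def dif_to_dublin_core(dif_record):
    """Map a flat DIF-like dict to Dublin Core elements.

    Returns the Dublin Core record (element -> list of values) and the list
    of DIF fields for which the crosswalk has no target element.
    """
    dc_record = {}
    unmapped = []
    for field, value in dif_record.items():
        element = DIF_TO_DC.get(field)
        if element is None:
            unmapped.append(field)
            continue
        values = value if isinstance(value, list) else [value]
        dc_record.setdefault(element, []).extend(values)
    return dc_record, unmapped
```

Because several DIF fields share one Dublin Core element (for example Location and Temporal_Coverage both end up in Coverage), the mapping loses information and cannot be reversed without extra conventions.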