Difference between revisions of "Data preparation"
LornaMorris (talk | contribs) (→Prepare Data for XML Transfer and ABCD Mapping) |
LornaMorris (talk | contribs) |
Revision as of 18:14, 4 November 2014
Prepare Data for XML Transfer and ABCD Mapping
Before you start with data preparation as described in the BioCASE documentation, the content of the data set needs to be prepared for the following reasons:
- Make content information comprehensible
- Add fields that are mandatory for ABCD mapping
The following preparation steps are recommended:
Convert abbreviations into complete words
In the screenshot below, abbreviations in the original data on the left were expanded to complete words on the right: RLB - Rote Liste Berlin (= Red List Berlin), RLBb - Rote Liste Brandenburg (= Red List Brandenburg)
Translate numbers or characters into text
In some cases numbers or characters do not represent a value but a qualitative statement. To make this information comprehensible, it is recommended to convert them into written text. You can use nested IF functions in Excel to convert numbers and characters (the argument separator depends on the regional settings; this example uses semicolons), e.g. =IF(A3="3";"3-gefährdet";IF(A3="1";"1- vom Aussterben bedroht";A3))
- Conversion of numbers
Numbers in the original data were supplemented with their meaning: 1 - vom Aussterben bedroht (= critically endangered), 2 - stark gefährdet (= endangered), 3 - gefährdet (= vulnerable), 4 - potentiell gefährdet (= near threatened)
- Conversion of characters
Characters in original data are turned into text (different ecological types)
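Outside Excel, the same translation can be sketched with a lookup table. The snippet below is a minimal illustration in Python, not part of the original workflow; the names RED_LIST_TEXT and translate_code are hypothetical, and the table only contains the four categories listed above. Values without an entry are returned unchanged, which mirrors the fallback to the cell value in the Excel formula.

```python
# Hypothetical lookup table built from the four categories listed above.
RED_LIST_TEXT = {
    "1": "1 - vom Aussterben bedroht",
    "2": "2 - stark gefährdet",
    "3": "3 - gefährdet",
    "4": "4 - potentiell gefährdet",
}


def translate_code(value):
    """Return the written-out category for a code, or the value unchanged."""
    key = str(value).strip()
    return RED_LIST_TEXT.get(key, value)


print(translate_code("3"))
print(translate_code(1))
print(translate_code("x"))  # no entry in the table, so the value is kept as is
```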
Convert values into the form required by ABCD
- Example: Transform lat/lon coordinates into decimal form (required for ABCD).
Latitude in degrees, minutes and seconds is transformed into decimal degrees
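The conversion itself is simple arithmetic: decimal degrees = degrees + minutes/60 + seconds/3600, with a negative sign for the southern and western hemispheres. A minimal Python sketch follows; the function name dms_to_decimal and the hemisphere letters N/S/E/W are assumptions for illustration, since the source data may encode the hemisphere differently.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    decimal = abs(degrees) + minutes / 60 + seconds / 3600
    # South and west are expressed as negative values in decimal notation.
    return -decimal if hemisphere in ("S", "W") else decimal


print(dms_to_decimal(52, 30, 0, "N"))  # 52.5
print(dms_to_decimal(13, 15, 0, "W"))  # -13.25
```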
Add columns with ‘unit of measurement’
Add columns with the unit of measurement, if missing, for each measurement. Otherwise the units cannot be allocated to the values correctly in ABCD. Example: header of a column in Excel: tarsus (mm) - length, width, height. Add columns with the header ‘Unit’ and enter ‘mm’ in every row. You can add a column with the same content in every row by selecting a constant in an SQL statement, e.g.: SELECT 'mm' AS [unit_tarsus_length]
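To try out the idea of a constant unit column, the following sketch runs such a query against an in-memory SQLite database from Python. The table specimens and its columns are hypothetical, and SQLite is only a stand-in for whatever database holds the data; identifier quoting may differ in other systems.

```python
import sqlite3

# In-memory database with a hypothetical measurement table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimens (unit_id TEXT, tarsus_length REAL)")
conn.executemany(
    "INSERT INTO specimens VALUES (?, ?)",
    [("U1", 21.4), ("U2", 19.8)],
)

# The string literal 'mm' is repeated in every row under the given alias.
rows = conn.execute(
    "SELECT unit_id, tarsus_length, 'mm' AS unit_tarsus_length "
    "FROM specimens ORDER BY unit_id"
).fetchall()

for row in rows:
    print(row)
```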
Add core information from metadata
Add core information from metadata to every unit, if missing in the unit table. You can add this information by using the SQL statement above.
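If the metadata is already stored in a table of its own with a single row, an alternative to typing constants is a cross join, which attaches that row to every unit. This is a different technique from the constant column and is shown only as a sketch; the tables metadata and units and all column names and values are hypothetical, and SQLite again stands in for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE metadata (dataset_title TEXT, owner TEXT);
    CREATE TABLE units (unit_id TEXT, scientific_name TEXT);
    INSERT INTO metadata VALUES ('Example dataset', 'Example institution');
    INSERT INTO units VALUES ('U1', 'Parus major'), ('U2', 'Pica pica');
    """
)

# With exactly one metadata row, the cross join repeats it for every unit.
unit_rows = conn.execute(
    "SELECT u.unit_id, u.scientific_name, m.dataset_title, m.owner "
    "FROM units AS u CROSS JOIN metadata AS m ORDER BY u.unit_id"
).fetchall()

for row in unit_rows:
    print(row)
```

Note that a metadata table with more than one row would multiply the unit rows, so this only fits the single-row case.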
Denormalise information
For repeatable elements in ABCD you need to denormalise the information. For more information, see the section “controlled denormalisation” in the BioCASE documentation wiki: http://wiki.bgbm.org/bps/index.php/Preparation