Difference between revisions of "Validation and Corrections"
Edited by LornaMorris (section: Running the automated correction)
Revision as of 18:08, 29 October 2014
Validation and Correction
Once the XML data file has been uploaded into the reBiND system the user can validate and perform automated and manual corrections to the file before publishing. Publishing the data makes it available to the public search interface and also to biodiversity networks, such as GBIF.
Validation
The figure below shows the result of clicking on the validation action for the file reBiND_Puffinus.xml. While the file is being validated, the information screen opens and displays a throbber. In the screenshot the validation is complete and the screen shows the result: there are 600 errors in the file.
After validation is complete it is possible to review the validation results in detail in the reBiND editor (a modified version of the eXide editor which comes bundled with the eXist software). To open this editor the user should click on the 'Edit' button in the list of actions below the data file (in this case reBiND_Puffinus.xml). This opens the data file in the editor - a screenshot of this is shown below:
In the left-hand panel a list of validation errors can be seen and in the main editor window the data file is displayed. Clicking on any individual validation error in the left-hand panel takes the user to the corresponding error in the data file. Errors are marked with a red error icon in the left-hand margin. It is possible to make manual edits to the file to fix these errors, but when there are so many errors within a file this would be labour intensive. In the next step (the automated correction) we show how these errors can be fixed automatically using the reBiND correction software.
Running the automated correction
'Start Correction' is the final action in the list below the data file. Clicking on this link takes the user to the following page:
A drop-down menu gives a list of available configuration files. The first correction configuration ('default correction') is suitable for most ABCD files. Alternative configurations can be uploaded by the administrator to run different automated corrections. The choice could depend, for example, on what sort of errors have been seen in the data file (in the validation step) or whether a different XML file has been used instead of the default ABCD data.
After clicking on 'Start Correction' a throbber appears as the correction modules (specified in the configuration file) are run. When the correction is complete a report is generated (see the following screenshot):
The output shows a link to the original data file, a link to an XML version of the report and a tabular view of the report showing the number of each type of correction made. The levels 'info', 'warning' and 'error' are used to indicate the effect of the change as follows:
- info - flags any minor change to the data from running the correction.
- warning - flags a change where it is uncertain that the new value is correct and it should be checked by the content administrator.
- error - flags a change that has been made that results in the file being invalid according to the associated schema.
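The tabular view described above can be thought of as a tally of report entries by correction type and level. The following Python sketch illustrates that idea only; it is not the reBiND implementation, and the entry format (pairs of correction type and level) and the function names `summarise` and `needs_review` are assumptions made for this example.

```python
from collections import Counter

# The three levels named in the report description above.
LEVELS = ("info", "warning", "error")


def summarise(entries):
    """Count report entries per (correction type, level) pair.

    `entries` is a hypothetical list of (correction_type, level) tuples;
    the real report is an XML document whose layout is not shown here.
    """
    counts = Counter()
    for correction_type, level in entries:
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        counts[(correction_type, level)] += 1
    return counts


def needs_review(counts):
    """Return the correction types that produced a 'warning' or 'error'."""
    return sorted({ctype for (ctype, level) in counts if level != "info"})


# Hypothetical entries, for illustration only.
entries = [
    ("Element Text Replacer", "warning"),
    ("Element Text Replacer", "warning"),
    ("Whitespace Cleaner", "info"),
]
counts = summarise(entries)
review = needs_review(counts)
```

In this sketch only 'info' entries are treated as safe to skip; anything flagged 'warning' or 'error' is listed for the content administrator to check.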
Clicking back in the browser and then opening the reBiND editor by clicking the action 'Edit' under the data file allows the user to review the results from the automated correction. A screenshot showing the results of the correction on the reBiND_Puffinus.xml data file is shown below:
In the left-hand panel a list of 'Issues' can be seen and in the main editor window the data file is displayed. Clicking on any 'Issue' in the left-hand panel takes the user to the corresponding change in the data file. In the example shown the first issue in the list has been clicked. This expands the 'Issue' and shows the 'Old Content' and the 'New Content'. In this case the problem was that the XML schema required the content to be of the type xs:dateTime, but the old content only gave a range of years. The automated correction was of a type called 'Element Text Replacer'. This type of correction replaces text matching a specific pattern (a regular expression) at a specified position within the XML file with some other text. In this example the lower year of the range is taken as the year, the date is assumed to be 1 January of that year and the time is assumed to be midnight. If this change is acceptable to the reviewer then they can click the checkbox to indicate they agree with the change.
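The year-range correction described above can be sketched in Python as follows. This is a minimal illustration of the idea, not the reBiND 'Element Text Replacer' itself: the regular expression, the function names and the element name `DateTime` are assumptions made for the example, and the real module is driven by the correction configuration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: a bare year range such as "1898-1902" or "1898 / 1902".
YEAR_RANGE = re.compile(r"^\s*(\d{4})\s*[-/]\s*(\d{4})\s*$")


def replace_year_range(text):
    """Turn a year range into an xs:dateTime value.

    The lower year is kept, the date is set to 1 January and the time
    to midnight. Text that is not a year range is returned unchanged.
    """
    match = YEAR_RANGE.match(text)
    if match is None:
        return text
    lower_year = min(int(match.group(1)), int(match.group(2)))
    return f"{lower_year:04d}-01-01T00:00:00"


def correct_elements(root, tag):
    """Apply the replacement to every element named `tag` under `root`.

    Returns (old content, new content) pairs, similar in spirit to the
    'Old Content' and 'New Content' shown for each issue in the editor.
    """
    changes = []
    for element in root.iter(tag):
        old = element.text or ""
        new = replace_year_range(old)
        if new != old:
            element.text = new
            changes.append((old, new))
    return changes


# Hypothetical document fragment; 'DateTime' is not a real ABCD element name.
root = ET.fromstring(
    "<Unit><DateTime>1898-1902</DateTime>"
    "<DateTime>2001-05-03T10:00:00</DateTime></Unit>"
)
changes = correct_elements(root, "DateTime")
```

Because the lower year is an assumption about the true date, a change like this is a natural candidate for the 'warning' level, so that a reviewer confirms it before publishing.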
If the change is not acceptable, the user should run another set of corrections on the original data file. To do this, the correction configuration must be changed in consultation with a technical administrator.
Default correction (selecting existing correction module)
Modify correction
Add additional modules
Screenshots are on the web-site - An overview of the reBiND interface showing correction and validation