Data Problems: Difference between revisions

*Download format and term definitions
**The columns in the downloaded file are not in a logical order. All identifier columns should be clustered together, as should the locality columns, the collecting event columns, and so on. Within a cluster the data elements can be in a loose order, but they should stay together.
**Several terms included in the download represent the same information but have only slightly different names (e.g., VerbatimEventDate and verbatimEventDate). These should be merged in the download file, or at least returned next to each other. (fixed in next release of portal)
**There is no document that defines the terms. One should be provided. Further, the definitions should have URI identifiers so that users can reuse them with confidence (for example, by including them in a meta.xml).
*Portal behavior
**When searching the portal, certain fields should not require an exact match. These include the Collector and Locality fields. There are others, but these were the most limiting.
**Higher taxonomy should be included to improve the search, family name being the most important. If it is not in the dataset from the provider, it should be added automatically upon ingestion into iDigBio. Without the higher taxonomy, a user will miss specimen records they are likely looking for. (improvements coming in next release of portal - using GBIF Nub taxonomy)
*Minor issues
**Terms should be evaluated for consistency. For example, the term “row number” contains a space.
**Ideally, a tsv download would be offered as well as a csv download. (support for tsv export format is coming in next release of portal)
|valign="top"| K. Seltmann, R. Rabeler, TTD TCN
|}
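
Until the download issues listed above are addressed in the portal, a user can work around some of them locally. The following is a minimal sketch, not part of the portal: it merges columns whose names differ only by letter case, groups columns into clusters, and writes the result as tab-separated text. The function names, the cluster lists, and the sample rows (including the catalog numbers) are made up for illustration, and a real download may need different handling.

```python
import csv
import io


def merge_case_variants(rows):
    """Merge columns whose names differ only by letter case.

    The first spelling seen is kept as the column name. For each row,
    the first non-empty value among the variants is used.
    """
    canonical = {}  # lowercased name -> first spelling seen
    merged = []
    for row in rows:
        out = {}
        for name, value in row.items():
            key = canonical.setdefault(name.lower(), name)
            if not out.get(key):  # keep an earlier non-empty value
                out[key] = value
        merged.append(out)
    return merged


def cluster_columns(fieldnames, clusters):
    """Order columns cluster by cluster; unlisted columns go last."""
    ordered = [c for group in clusters for c in group if c in fieldnames]
    rest = [c for c in fieldnames if c not in ordered]
    return ordered + rest


def to_tsv(rows, fieldnames):
    """Serialize rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample standing in for a downloaded csv file.
sample = (
    "VerbatimEventDate,locality,catalogNumber,verbatimEventDate,collector\n"
    "12 May 1950,Ann Arbor,MICH-1,,R. Smith\n"
    ",Gainesville,FLAS-2,June 1971,K. Jones\n"
)

rows = merge_case_variants(csv.DictReader(io.StringIO(sample)))

# Assumed clusters: identifiers, locality, collecting event.
clusters = [["catalogNumber"], ["locality"], ["collector", "VerbatimEventDate"]]
fieldnames = cluster_columns(list(rows[0]), clusters)
tsv = to_tsv(rows, fieldnames)
print(tsv)
```

Keeping the first spelling seen is an arbitrary choice; a term definition document with URI identifiers, as requested above, would give a better basis for picking the canonical name.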