Data Problems: Difference between revisions

Line 18: Line 18:
**Also, we would not be able to use label data in TraitBank if the occurrences are licensed. While we recognize licenses at the data set level, we do not implement them at the level of individual records. We discussed this and concluded that, like measurements and facts, occurrence records are unlikely to be protected by copyright, especially when they are presented in a commonly used standard like DwC. Of course, we won't know for sure until somebody files a lawsuit, but we decided to err on the side of openness. Is there any chance this issue could be brought up for discussion at iDigBio?
**Also, we would not be able to use label data in TraitBank if the occurrences are licensed. While we recognize licenses at the data set level, we do not implement them at the level of individual records. We discussed this and concluded that, like measurements and facts, occurrence records are unlikely to be protected by copyright, especially when they are presented in a commonly used standard like DwC. Of course, we won't know for sure until somebody files a lawsuit, but we decided to err on the side of openness. Is there any chance this issue could be brought up for discussion at iDigBio?
**We'll have a little more work to do before we're ready to import any of the iDigBio data.  I'll let you know if there is any progress on our end.
**We'll have a little more work to do before we're ready to import any of the iDigBio data.  I'll let you know if there is any progress on our end.
|valign="top"|K. Schultz, EOL
|valign="top"|K. Schultz, EOL (2014)
|-
|-
|valign="top"|
|valign="top"|
Line 24: Line 24:
*I found the data very difficult to work with for the pilot study on treehoppers. It took me over a week to clean it up, group like information together, and standardize it so it could be used in an analysis; this includes dates, common names, scientific names, and higher taxonomy. And, as Katja mentioned, if you search the portal on family name but the record doesn't have a higher taxonomic designation, you miss all those records, and no one wants to search by hundreds of genus or species names one by one to make sure they are all there. Records should absolutely contain Order, Suborder, Family, Subfamily, Tribe (if appropriate), and genus names.
*I found the data very difficult to work with for the pilot study on treehoppers. It took me over a week to clean it up, group like information together, and standardize it so it could be used in an analysis; this includes dates, common names, scientific names, and higher taxonomy. And, as Katja mentioned, if you search the portal on family name but the record doesn't have a higher taxonomic designation, you miss all those records, and no one wants to search by hundreds of genus or species names one by one to make sure they are all there. Records should absolutely contain Order, Suborder, Family, Subfamily, Tribe (if appropriate), and genus names.
*It seems that most people view these data as species page information. However, if you try to use them in an analysis, the format doesn't work well.
*It seems that most people view these data as species page information. However, if you try to use them in an analysis, the format doesn't work well.
|valign="top"| C. Johnson, AEC
|valign="top"| C. Johnson, AEC (2/2015)
|-
|-
|valign="top"|
|valign="top"|
Line 37: Line 37:
**Terms should be evaluated for continuity. The term “row number” contains a space.
**Terms should be evaluated for continuity. The term “row number” contains a space.
**Ideally, we would like a TSV as well as a CSV download. (Support for the TSV export format is coming in the next release of the portal.)
**Ideally, we would like a TSV as well as a CSV download. (Support for the TSV export format is coming in the next release of the portal.)
|valign="top"| K. Seltmann, R. Rabeler, TTD TCN
|valign="top"| K. Seltmann, R. Rabeler, TTD TCN (2/2015)
|-
|-
|valign="top"|
|valign="top"|
Line 51: Line 51:


-->  To me, one of the things that iDigBio should be concerned about is making the portal easy to use. If we want it to be the "one stop" for biodiversity data, we need to see what users can get from other portals and at least match that level of info. If it's easier to get the info by using a combination of other sources, folks still might do that. That's what I am trying to show in the screenshot examples I sent along: what we present should be at least as good as what you can get elsewhere. If you compare the results of the "label" view that you get in the iDigBio portal with that in the CPNWH portal, it's clear (at least to me...) that ours is inferior for the reasons that I pointed out.
-->  To me, one of the things that iDigBio should be concerned about is making the portal easy to use. If we want it to be the "one stop" for biodiversity data, we need to see what users can get from other portals and at least match that level of info. If it's easier to get the info by using a combination of other sources, folks still might do that. That's what I am trying to show in the screenshot examples I sent along: what we present should be at least as good as what you can get elsewhere. If you compare the results of the "label" view that you get in the iDigBio portal with that in the CPNWH portal, it's clear (at least to me...) that ours is inferior for the reasons that I pointed out.
|valign="top"| R. Rabeler, TTD TCN
|valign="top"| R. Rabeler, TTD TCN (2/2015)
|-
|-
|}
|}