Dataset Errata: Difference between revisions

:::FIXED, now locality reads, '''"Collected in the North Slope Foothills of the Cen-tral Brooks Range, .5 km E of Toolik Lake"''' Committed to GitHub --[[User:Dpaul|Dpaul]] 15:43, 2 July 2013 (EDT)


WIS-L-0012041_lg no datasetName in the csv file; no scientificName in the csv file; verbatimEventDate (format) in the csv file; dateIdentified (format) in the csv file
:::William Ulate FIXED this record and pushed the changes to GitHub on 2 July 2013. --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
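Missing-field problems like the one above can be caught before review. The following is only a minimal sketch: the list of required columns and the sample rows are assumptions for illustration, not the project's actual schema or data.

```python
import csv
import io

# Assumed set of required columns; the real list for this dataset may differ.
REQUIRED_FIELDS = ("datasetName", "scientificName")


def missing_fields(csv_text, required=REQUIRED_FIELDS):
    """Return {row_number: [field names]} for rows where a required value is empty or absent."""
    problems = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # row.get() yields None for absent columns or short rows, so fall back to "".
        missing = [name for name in required if not (row.get(name) or "").strip()]
        if missing:
            problems[row_number] = missing
    return problems


# Hypothetical two-row sample: row 1 lacks datasetName, row 2 lacks scientificName.
sample = "datasetName,scientificName\n,Genus species\nSome dataset,\n"
print(missing_fields(sample))
```

Row numbers count data rows only (the header is not counted), so they would need an offset of one to match line numbers in a text editor.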


WIS-L-0012045_lg verbatimCoordinates concatenation
:::the label image has '''Latitude 60° 33.436'N Longitude 172° 55.950'W'''
:::the csv had '''60° 33.436'N,172° 55.950'W'''
:::Deb removed the comma and put the field names back in to match for ''verbatim'' coordinates as '''Latitude 60° 33.436'N Longitude 172° 55.950'W''' --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
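The manual fix above (drop the added comma, restore the label terms) could be sketched as follows. This assumes the csv value is latitude, a comma, then longitude, with the latitude ending in N or S. The function name is hypothetical, and whether the label terms belong in ''verbatimCoordinates'' at all is still an open question on this page.

```python
import re


def label_style_coordinates(value):
    """Rewrite "<lat>,<lon>" as "Latitude <lat> Longitude <lon>"; return anything else unchanged."""
    if value.startswith("Latitude"):
        return value  # already carries the label terms
    # Split only on a comma that directly follows the N/S hemisphere letter,
    # so a comma inside the latitude itself (e.g. 66°54,02.9") is left alone.
    parts = re.split(r"(?<=[NS])\s*,\s*", value, maxsplit=1)
    if len(parts) != 2 or not parts[1]:
        return value
    return f"Latitude {parts[0]} Longitude {parts[1]}"


print(label_style_coordinates("60° 33.436'N,172° 55.950'W"))
```

Values that do not match the assumed pattern are returned untouched, so they can still be reviewed by hand against the label image.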


WIS-L-0012051_lg dateIdentified (format)  
:::FIXED (was 2003-November, changed to 2003-11) --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
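The ''dateIdentified'' format fix (2003-November to 2003-11) is mechanical enough to script. A minimal sketch, assuming values of the form year-monthname or year-monthname-day with English month names; the function name is hypothetical. This is meant for ''dateIdentified'' only, since ''verbatimEventDate'' should stay as printed on the label.

```python
# English month names mapped to two-digit month numbers, e.g. "november" -> "11".
MONTH_NAMES = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)
MONTHS = {name: f"{number:02d}" for number, name in enumerate(MONTH_NAMES, start=1)}


def normalize_date_identified(value):
    """Replace a spelled-out month in "YYYY-Month" or "YYYY-Month-D" with its number."""
    parts = value.strip().split("-")
    if len(parts) in (2, 3) and parts[1].lower() in MONTHS:
        parts[1] = MONTHS[parts[1].lower()]
        if len(parts) == 3 and parts[2].isdigit():
            parts[2] = parts[2].zfill(2)  # pad single-digit days
        return "-".join(parts)
    return value  # leave unrecognized formats for manual review


print(normalize_date_identified("2003-November"))
```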


WIS-L-0012055_lg verbatimEventDate (format)
:::FIXED, changed to match what is on the label 19 July 2003 --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012055_lg verbatimEventDate (19 July 2003) in the text file; but it is (2003-July-19) in the csv file
:::FIXED, and fixed the dateIdentified format too. --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012056_lg dateIdentified (format)  
:::FIXED, the dateIdentified format  --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012057_lg no datasetName
:::FIXED --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012064_lg verbatimCoordinates concatenation
:::looks okay? They are exactly as they appear on label, albeit with the field names included, reading: Latitude 66°54,02.9" N Longitude 159° 59' 02.2" W --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
:::TO BRYAN (from deb): OR do we want only: 66°54,02.9" N 159° 59' 02.2" W
:::From Deb: this will be challenging, it's been done several ways, with and w/o the label terms, and when "without" the lat lon label terms a comma was used (added). I have been removing the commas. --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012073_lg identifiedBy (By P. Y. Wong) in the csv file  
:::FIXED, removed the "By" --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)


WIS-L-0012074_lg county (null)
:::NOT FIXED, there is no county on the label. Thunder Bay is a District (not a county).--[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)




WIS-L-0012077_lg verbatimLocality contains verbatimCoordinates (Qianjin)
:::NOT FIXED, this is okay, they are embedded. --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
<br> '''Gold OCR Errors'''  


TENN-L-0000029_lg.txt adds a "1" to the scientificName ("Actinogyra muhlenbergii 1 (Ach.) Schol.").  


NY01075791_lg.txt converted "Müll" on the original label NY01075791_lg.jpg to "Mull" (the umlaut "ü" became "u"). We may want to do this, but if we do, it should be standardized and consistent across all the labels. Same for NY01075791_lg.txt, and several others in the series.
:::FIXED en masse (Mull to Müll), using GREP to find the occurrences and Notepad++ to fix them in the csv. --[[User:Dpaul|Dpaul]] 14:26, 3 July 2013 (EDT)
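
For reference, the same find-and-fix pass could be scripted instead of done by hand. This is a sketch only, not the procedure actually used (that was GREP plus Notepad++). It assumes "Mull" appears as a standalone word, and both function names are hypothetical.

```python
import re

# Match "Mull" only as a whole word, so words such as "Mullein" are not touched.
MULL = re.compile(r"\bMull\b")


def lines_with_mull(text):
    """Return (line_number, line) pairs that still contain the unaccented "Mull"."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if MULL.search(line)
    ]


def restore_umlaut(text):
    """Replace the standalone word "Mull" with "Müll"."""
    return MULL.sub("Müll", text)


# Illustrative input only, not taken from the dataset.
sample = "first line\nsecond line with Mull. Arg.\n"
print(lines_with_mull(sample))
print(restore_umlaut(sample))
```

Listing the matching lines first makes it possible to check each hit against the label image before anything is changed.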


<br> '''Silver Parsed CSV Files''' '''(Bryan: I do not get most of these. There should be OCR errors in silver. We do need to stay true to the OCR output.)'''