3
edits
Jegelewicz (talk | contribs) |
|||
(9 intermediate revisions by 3 users not shown) | |||
Line 86: | Line 86: | ||
'''Solution:''' | '''Solution:''' | ||
Excel does not have an easy solution for this problem. | |||
=== Georeference Metadata with no Associated Georeference === | === Georeference Metadata with no Associated Georeference === | ||
'''Problem:''' Metadata fields regarding coordinates, such as [https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters coordinateUncertaintyInMeters], [https://dwc.tdwg.org/terms/#dwc:georeferenceProtocol georeferenceProtocol], [https://dwc.tdwg.org/terms/#dwc:georeferenceSources georeferenceSources], [https://dwc.tdwg.org/terms/#dwc:georeferencedBy georeferencedBy], [https://dwc.tdwg.org/terms/#dwc:georeferenceRemarks georeferenceRemarks], and [https://dwc.tdwg.org/terms/#dwc:geodeticDatum geodeticDatum] are provided, but no coordinates are present. This is sometimes intentional, particularly when georeferencedBy and georeferencedRemarks are used to indicate whether a record was purposefully not georeferenced. However, it is rare that the other metadata fields can be used without associated coordinates (i.e., [https://dwc.tdwg.org/terms/#dwc:decimalLatitude decimalLatitude], [ https://dwc.tdwg.org/terms/#dwc:decimalLongitude decimalLongitude], or [https://dwc.tdwg.org/terms/#dwc:verbatimCoordinates verbatimCoordinates]). | '''Problem:''' Metadata fields regarding coordinates, such as [https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters coordinateUncertaintyInMeters], [https://dwc.tdwg.org/terms/#dwc:georeferenceProtocol georeferenceProtocol], [https://dwc.tdwg.org/terms/#dwc:georeferenceSources georeferenceSources], [https://dwc.tdwg.org/terms/#dwc:georeferencedBy georeferencedBy], [https://dwc.tdwg.org/terms/#dwc:georeferenceRemarks georeferenceRemarks], and [https://dwc.tdwg.org/terms/#dwc:geodeticDatum geodeticDatum] are provided, but no coordinates are present. This is sometimes intentional, particularly when georeferencedBy and georeferencedRemarks are used to indicate whether a record was purposefully not georeferenced. However, it is rare that the other metadata fields can be used without associated coordinates (i.e., [https://dwc.tdwg.org/terms/#dwc:decimalLatitude decimalLatitude], [ https://dwc.tdwg.org/terms/#dwc:decimalLongitude decimalLongitude], or [https://dwc.tdwg.org/terms/#dwc:verbatimCoordinates verbatimCoordinates]). | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Elevation is Unlikely === | === Elevation is Unlikely === | ||
Line 105: | Line 106: | ||
'''Problem:''' The sign of the latitude ([https://dwc.tdwg.org/terms/#dwc:decimalLatitude decimalLatitude]) or longitude ([https://dwc.tdwg.org/terms/#dwc:decimalLongitude decimalLongitude]) does not match the sign/hemisphere of the given country. For example, all longitudes in the U.S. should be negative. | '''Problem:''' The sign of the latitude ([https://dwc.tdwg.org/terms/#dwc:decimalLatitude decimalLatitude]) or longitude ([https://dwc.tdwg.org/terms/#dwc:decimalLongitude decimalLongitude]) does not match the sign/hemisphere of the given country. For example, all longitudes in the U.S. should be negative. | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Invalid Coordinates === | === Invalid Coordinates === | ||
'''Problem:''' Coordinates deviate from accepted ranges or formats, like decimalLatitude and decimalLongitude exceeding -90 to 90 and -180 to 180, respectively. verbatimCoordinates have to be valid values for coordinates in decimal degrees, degrees decimal minutes, degrees minutes second. | '''Problem:''' Coordinates deviate from accepted ranges or formats, like decimalLatitude and decimalLongitude exceeding -90 to 90 and -180 to 180, respectively. verbatimCoordinates have to be valid values for coordinates in decimal degrees, degrees decimal minutes, degrees minutes second. | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Lower Geography Values are Provided, but No Higher Geography === | === Lower Geography Values are Provided, but No Higher Geography === | ||
Line 133: | Line 134: | ||
|- | |- | ||
| || || Kerr || missing state and country | | || || Kerr || missing state and country | ||
|- | |||
| USA || Texas || Kerr || OK | |||
|} | |} | ||
Line 149: | Line 152: | ||
'''Problem:''' The provided value for [https://dwc.tdwg.org/terms/#dwc:country country] and [https://dwc.tdwg.org/terms/#dwc:countryCode countryCode] do not match. | '''Problem:''' The provided value for [https://dwc.tdwg.org/terms/#dwc:country country] and [https://dwc.tdwg.org/terms/#dwc:countryCode countryCode] do not match. | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Mismatched Geographic Terms === | === Mismatched Geographic Terms === | ||
'''Problem:''' A record has lower geographic terms (e.g., state/province, county) that do not exist under the provided higher geographic term(s). For example, country = Canada and stateProvince = Sussex. There is no Sussex province in Canada. | '''Problem:''' A record has lower geographic terms (e.g., state/province, county) that do not exist under the provided higher geographic term(s). For example, country = Canada and stateProvince = Sussex. There is no Sussex province in Canada. | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Missing Geodetic Datum === | === Missing Geodetic Datum === | ||
Line 232: | Line 235: | ||
'''Solution:''' | '''Solution:''' | ||
# Open your dataset in Excel and highlight Row 1. | |||
# From the Data toolbar, click "Filter". | |||
# Navigate to the column containing your higher taxonomy values (e.g. dwc:order) and select the arrow icon next to the column header indicating that this column can now be filtered on. | |||
# At the bottom of the pop-up window, deselect "Select All", scroll down, and then only check the "Blanks" box (you may need to scroll down in the pop-up window to locate it). | |||
# Resolve missing higher taxonomy as you are able. | |||
== Other Issues == | == Other Issues == | ||
Line 246: | Line 255: | ||
'''Problem:''' When transferring text files between Unix/Linux and DOS/Windows systems, line endings can become inconsistent. Unix/Linux systems typically use line feed (LF) characters, while DOS/Windows systems use carriage return (CR) and line feed (LF) combinations. This mismatch can result in extra characters appearing in the data, causing visual artifacts and processing errors. | '''Problem:''' When transferring text files between Unix/Linux and DOS/Windows systems, line endings can become inconsistent. Unix/Linux systems typically use line feed (LF) characters, while DOS/Windows systems use carriage return (CR) and line feed (LF) combinations. This mismatch can result in extra characters appearing in the data, causing visual artifacts and processing errors. | ||
'''Solution:''' | '''Solution:''' Excel does not have an easy solution for this problem. | ||
=== Invalid Individual Count === | === Invalid Individual Count === | ||
Line 338: | Line 347: | ||
| MaterialCitation || MaterialCitation | | MaterialCitation || MaterialCitation | ||
|} | |} | ||
[https://github.com/Jegelewicz/data_quality/blob/main/Excel_examples/2024%20Excel%20data%20quality%20checks.xlsx Sample file] |
edits