Biological Collections Data: best practices and trends for standards, digitization, and biodiversity informatics literacy for research use of collections data

Title: Biological Collections Data: best practices and trends for standards, digitization, and biodiversity informatics literacy for research use of collections data
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Paul, Deborah L., and Seltmann, Katja C.
Conference Name: Island Biology 2016
Date Published: 06/2016
Conference Location: Terceira Island, Azores
Abstract: For biological collections data to have the longest possible life and be usable by as many audiences as possible, we need high-quality, georeferenced, standardized data. The latest standards used for sharing (mobilizing) collections data include, but are not limited to, ABCD, Darwin Core, the Global Genome Biodiversity Network (GGBN) Data Standard, Audubon Core, and Ecological Metadata Language (EML). Transcribing data from existing specimens, imaging specimens and documents, collecting specimens, and using biological collections data all require identifiers. These unique strings make it possible to study and keep track of relationships between objects, both physical and digital. Some collections increasingly rely on researchers’ requests to decide what to digitize; this trend is referred to as “digitization on demand.” At the same time, collections and aggregators such as iDigBio (http://idigbio.org), GBIF (http://gbif.org), and VertNet (http://vertnet.org/) are providing data quality information and attempting to analyze specimen data for gaps. These data gaps may be, for example, taxonomic, geographic, or habitat-based. Such data gap analysis (DGA) can help collections prioritize digitization and can inform researchers about where to focus collecting and sampling efforts. Accurate georeferencing of specimen locality data using appropriate standards is critical because it facilitates better, faster research. Scientists are urged to contribute their georeferencing expertise and gazetteers. With ever more data available, scientists find they need to update their data skills. Groups such as Data Carpentry, Software Carpentry, and the Reproducible Science Curriculum now offer easy-to-access training designed specifically for beginner and intermediate-level learners and tailored to specific communities.