Comments
Very exciting paper!
With this new algorithm, those capturing images of herbarium sheets will be able to group the sheets into sets by previously defined label types, so sheets from a given collector or a given collecting event would group together. The new algorithm (5 seconds per sheet) is also at least 360 times faster than the old template-matching algorithm (1800+ seconds per sheet). Other research tells us (de la Cerda 2010; Elspeth Haston, private communication, 2012) that data entry staff are happier and more productive when entering ordered information. This makes sense given how repetitive the data-entry task is: if all the images in the stack to be digitized are already from the same collector, then many will be from the same collecting event, and for those all that will change is the taxonomic identification. Many herbaria (and other specimen collections) spend a lot of time organizing their collections before digitization (pre-digitization curation). For very large herbaria planning to use more industrial processes to capture herbarium sheet images, technology like this could eliminate the need to pre-sort the collection, or at least greatly reduce the amount of pre-sorting needed.
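To make the idea concrete, here is a minimal sketch of how imaged sheets might be batched by label type before data entry. It is not the paper's algorithm: it assumes the label-type classification has already been done, and the function name, sheet identifiers, and label-type names are all hypothetical. It also works through the speedup implied by the per-sheet timings quoted above.

```python
from collections import defaultdict


def group_by_label_type(sheets):
    """Group (sheet_id, label_type) pairs into one batch per label type.

    Sheets keep their original capture order within each batch.
    """
    batches = defaultdict(list)
    for sheet_id, label_type in sheets:
        batches[label_type].append(sheet_id)
    return dict(batches)


# Hypothetical classifier output: (sheet id, matched label type).
sheets = [
    ("sheet-001", "collector-A"),
    ("sheet-002", "collector-B"),
    ("sheet-003", "collector-A"),
]
batches = group_by_label_type(sheets)

# Speedup implied by the quoted timings (1800+ s versus 5 s per sheet).
speedup = 1800 / 5
```

Each batch could then be handed to data entry staff as one ordered stack, so that consecutive records share a collector and, in many cases, a collecting event.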