W03 - Analogue to digital: faster better cheaper
| Session Type: | Workshop |
| Full Title: | W03 - Analogue 2 digital: faster better cheaper |
| Short Title: | Analogue 2 digital |
| Organizer(s): | Quentin Groom, Botanic Garden Meise |
| Contributors: | Wouter Addink, Naturalis |
| | Dimitris Koureas, Natural History Museum, London |
| | Sarah Phillips, Royal Botanic Gardens, Kew |
| | Jeroen Bloothoofd, Picturae |
| | Fabian Reimeier, Botanischer Garten und Botanisches Museum Berlin |
| | Dominik Röpert, Botanischer Garten und Botanisches Museum Berlin |
| | Anton Güntsch, Botanischer Garten und Botanisches Museum Berlin |
| | Walter Berendsohn, Botanischer Garten und Botanisches Museum Berlin |
Unsolicited contributions considered? Yes
Abstract
With the acceleration of mass imaging, data capture has become a major bottleneck and cost factor in the digitisation of specimens. The benefits of digital data are manifold and include increasing the availability, discoverability and analysability of collections. Yet to gain these benefits, it is necessary to convert the analogue data written on labels to a digital format. Furthermore, a large amount of information about specimens can be inferred from their links with other specimens. This workshop will explore methods of label transcription and other forms of metadata enrichment. It will examine human-driven methods, such as crowdsourcing, and automated approaches, such as optical character recognition. Among automated methods, we will explore the use of web services and how they can be used in data capture pipelines. These services can, for instance, perform optical character recognition, image analysis and feature detection. The workshop will also look at ways in which information about specimens can be inferred, for example by combining data from multiple collections and creating collection itineraries. We will also examine how different tools could be combined in workflows to reduce the overall effort. A special focus will be on effective mechanisms for creating links to semantic concepts provided by external resources (e.g. persons, scientific names, geographic entities). Such links can significantly improve the reliability of information and increase the potential for data integration and advanced semantic inferencing. The goals of the workshop are to evaluate these methods and to understand their strengths, weaknesses and costs. The workshop will be an opportunity to share knowledge and suggest improvements to current practices.