W03 - Analogue to digital: faster better cheaper

Session Type: Workshop
Full Title: W03 - Analogue 2 digital: faster better cheaper
Short Title: Analogue 2 digital
Organizer(s): Quentin Groom, Botanic Garden Meise
Contributors: Wouter Addink, Naturalis
  Dimitris Koureas, Natural History Museum, London
  Sarah Phillips, Royal Botanic Gardens, Kew
  Jeroen Bloothoofd, Picturae
  Fabian Reimeier, Botanischer Garten und Botanisches Museum Berlin
  Dominik Röpert, Botanischer Garten und Botanisches Museum Berlin
  Anton Güntsch, Botanischer Garten und Botanisches Museum Berlin
  Walter Berendsohn, Botanischer Garten und Botanisches Museum Berlin


Unsolicited contributions considered? Yes

Abstract

With the acceleration in the speed of mass imaging, data capture has become a major bottleneck and cost factor in the digitisation of specimens. The benefits of digital data are manifold and include increasing the availability, discoverability and analysability of collections. Yet to realise these benefits, it is necessary to convert the analogue data written on labels to a digital format. Furthermore, a large amount of information about specimens can be inferred from their links with other specimens. This workshop will explore methods of label transcription and other forms of metadata enrichment. It will examine human-based methods, such as crowdsourcing, and automated approaches, such as optical character recognition. Among automated methods, we will explore the use of web services and how these can be used in data capture pipelines. These services can, for instance, be used to perform optical character recognition, image analysis and feature detection. The workshop will also look at ways in which information about specimens can be inferred, for example by combining data from multiple collections and creating collection itineraries. We will also examine how different tools could be combined in workflows to reduce the overall effort. A special focus will be on effective mechanisms for creating links to semantic concepts provided by external resources (e.g. persons, scientific names, geographic entities). Such links can significantly improve the reliability of information and increase the potential for data integration and advanced semantic inferencing. The goals of the workshop are to evaluate these methods and to understand their strengths, weaknesses and costs. The workshop will be an opportunity to share knowledge and suggest improvements to current practices.