W14 - Deep Learning for Biodiversity
| Session Type: | Workshop |
| Full Title: | W14 - Deep Learning for Biodiversity |
| Short Title: | Deep Learning for Biodiversity |
| Organizer(s): | Francisco Pando, Real Jardín Botánico-CSIC, Madrid, Spain |
| Contributors: | Erick Mata, CIC, Instituto Tecnológico de Costa Rica, San José, Costa Rica |
| | José Carranza-Rojas, CIC, Instituto Tecnológico de Costa Rica, San José, Costa Rica |
| | Lara Lloret, Instituto de Física de Cantabria-CSIC/UC, Santander, Spain |
| | Hervé GOEAU, AMAP joint research unit - CIRAD, Montpellier, France |
Unsolicited contributions considered? Yes
Abstract
Data-intensive science is emerging across multiple scientific domains, driven by the accumulation of large quantities of data and by the need for new analysis techniques to study them. Among the newest and most successful tools for exploiting these data are the so-called Deep Learning techniques. Deep Learning is a type of machine learning that trains a computer to perform human-like tasks, such as identifying images or finding patterns. To achieve this, Deep Learning techniques leverage very large datasets (training datasets) to find hidden structures within them and to make accurate predictions. There are two main types of learning: supervised learning, where the training dataset is already categorized (labeled) by experts so that the accuracy of the method can be evaluated, and unsupervised learning, where the training dataset is not labeled and the computer must determine, without any expert reference, whether the data contain some hidden structure.
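The distinction between supervised and unsupervised learning can be illustrated with a minimal sketch. It is not Deep Learning: it uses simple centroid-based methods (a nearest-centroid classifier and a basic k-means loop) on a handful of made-up 2-D measurements, purely to show what changes when labels are available. The data, the label names `species_a` and `species_b`, and all function names are hypothetical.

```python
# Toy illustration only: hypothetical 2-D measurements, not real data,
# and simple centroid-based methods rather than deep neural networks.
from statistics import mean


def centroid(points):
    """Mean position of a list of (x, y) points."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def sq_dist(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(point, centroids):
    """Key of the centroid closest to `point`."""
    return min(centroids, key=lambda k: sq_dist(point, centroids[k]))


def fit_supervised(labeled):
    """Supervised: expert labels define the groups; one centroid per label."""
    groups = {}
    for point, label in labeled:
        groups.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in groups.items()}


def accuracy(model, labeled):
    """Labels make it possible to score predictions against expert answers."""
    hits = sum(nearest(p, model) == label for p, label in labeled)
    return hits / len(labeled)


def fit_unsupervised(points, k=2, iterations=10):
    """Unsupervised: basic k-means; groups are inferred from the data alone."""
    centroids = {i: points[i] for i in range(k)}  # naive init: first k points
    for _ in range(iterations):
        clusters = {i: [] for i in range(k)}
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = {
            i: centroid(pts) if pts else centroids[i]
            for i, pts in clusters.items()
        }
    return centroids


# Hypothetical labeled training data (labels supplied by an expert).
labeled = [
    ((1.0, 1.2), "species_a"),
    ((0.8, 1.0), "species_a"),
    ((5.0, 5.1), "species_b"),
    ((5.2, 4.9), "species_b"),
]
# The same measurements with the labels removed.
unlabeled = [point for point, _ in labeled]

model = fit_supervised(labeled)
clusters = fit_unsupervised(unlabeled)
```

In the supervised case, `accuracy(model, labeled)` can be computed because expert labels exist. In the unsupervised case, `nearest(point, clusters)` returns only an arbitrary cluster index, and whether those clusters correspond to anything meaningful is left for an expert to judge.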
This workshop aims to present the state of the art in Deep Learning techniques applied to biodiversity studies. The potential of these methodologies for classification (e.g., species recognition in images) or pattern recognition (e.g., species distribution models) is enormous, since similar problems have already been addressed with these techniques in other areas with great success. Moreover, presentations on automated morphological trait recognition and species identification were featured at TDWG 2017*, showing promising results.
This workshop also aspires to initiate a community to work on common approaches, coordinate efforts, and seek collaborations to take advantage of these new techniques within the biodiversity community. Additionally, it can be a starting point for developing standards and recommended practices specific to this community. Topics for discussion and consensus-building exchanges include:
- How to involve and give credit to domain experts who provide labeled training datasets.
- How to elaborate standards to describe the methodologies and the nature of the training datasets of the identification platforms, so users can be more aware of the capabilities and limitations.
- Review of improvement techniques.
- How to develop standards to facilitate the exchange of data (and datasets).
- Pros and cons of different platforms for Deep Learning development and deployment.
- How to exploit unsupervised learning in the biodiversity domain.
The TDWG annual meeting is an excellent opportunity to gather those involved in these developments and to overcome these and other challenges.