Artificial intelligence could predict plant behaviour in the face of climate change

Science at work 18 August 2020
Herbarium digitization is paving the way for the use of artificial intelligence to characterize the development of hundreds of thousands of plant species identified throughout the world. Analysis of this data could help to create a model to anticipate climate change. A partnership involving CIRAD has just laid the foundations for this approach, which will require IT infrastructure and the development of genuinely interdisciplinary collaborations if it is to succeed.
An example of a herbarium specimen. The brightly coloured areas show the reproductive structures. Being clearly isolated, they are likely to be successfully annotated by machine learning algorithms © CIRA

For centuries, every new plant discovered has been systematically catalogued in herbaria. Today, several million specimens representing almost 400 000 plant species are archived in several thousand herbaria. The ongoing digitization of these naturalist catalogues is gradually transforming them into modern research tools. “It is now possible to invest in artificial intelligence approaches to automatically identify digitized specimens and to extract data such as the presence and number of buds, flowers and fruits”, says Pierre Bonnet, a botanist at CIRAD. This information can be used in particular to study the phenology of plants, in other words the dynamics of their development according to seasonal climate variations. “With access to all known herbaria, we could obtain data across very large geographical and temporal scales, which would enable us to develop phenological models capable of predicting plant behaviour in response to current climate change.” These models could help our society, especially agriculture, to adapt to climate warming.

Towards dedicated infrastructure

“With this goal in mind, along with INRIA and our US partners, we have just proposed a methodological approach based on machine learning, and more specifically on deep learning. This set of IT and statistical techniques can be used to automate the classification and analysis of vast datasets, in particular visual data”, says Pierre Bonnet. But this type of approach requires IT infrastructure to host and provide access to these large volumes of data, which consist in particular of high-resolution images. Analysis of these images also requires processors that are specifically designed for graphic data processing. At present, there are no such storage platforms dedicated to the application of machine learning to biology in general and to botany in particular, and most computing centres are currently equipped with processors that are not suitable for graphic data processing. To address this problem, it is essential that supranational organizations or institutions establish lasting cyberinfrastructure of this type to process and host research data.

An interdisciplinary workflow

Beyond the development of infrastructure, this ambitious long-term project also needs to bring together actors with the ability to make the information found in herbaria useable. “Close interdisciplinary collaboration between herbarium curators, botanists, ecologists and researchers in computer science is essential in order to develop and refine algorithms capable of leveraging digital collections”, says Pierre Bonnet. Although feasibility tests have already been conducted on a small number of species archived in virtual herbaria, there are still many challenges ahead. “Herbarium samples were not designed to be digitized and analysed using automated approaches.” Financing research projects that combine botany and information science could encourage specialists in artificial intelligence (AI) to address these problems. Their resolution will also have significant repercussions in other fields of application for AI.

Supporting herbaria digitization

Finally, to ensure this project has a global reach, botanists need to have access to a maximum number of plant specimens. In the Western world, private and public initiatives have enabled the digitization of millions of specimens. But numerous collections, especially in tropical regions where the flora is rich, have still not been digitized due to a lack of resources. “These herbaria are irreplaceable witnesses of the past”, insists Pierre Bonnet. Their digitization is a huge task, but in addition to feeding machine learning algorithms, these efforts will also help to protect the herbaria, to increase their use and to guarantee the transmission to future generations of this wealth of information on plant biodiversity.