The genomes of Arabica coffee and its parents finally deciphered

Results & impact 29 April 2024
An international collaboration, in which CIRAD took part, has established and studied the genome of Coffea arabica, the Arabica coffee plant, and its two parental species. The results published in Nature Genetics reveal the evolutionary history of this globally appreciated resource.
Manual harvesting of Arabica coffee cherries, in the Kintamani region of Indonesia © A. Rival, CIRAD

When you enjoy your cup of coffee, you may not realize the long history of the coffee plant, from its wild state to its cultivation across multiple continents... More than 60 scientists have studied the genome of different current populations of Arabica to understand its diversification into modern coffee varieties.

Once upon a time... the Coffea

Traded on the stock market and enjoyed all over the world, coffee holds a significant place in our daily lives and is also a vital economic resource for many countries in the global South. Arabica (Coffea arabica L.) represents about 60% of the world's coffee production, and its history is fascinating! Originating from the Ethiopian highlands, Arabica has been cultivated in Yemen since the 15th century. Without delving into the adventurous circumstances of its arrival in Asia and America, it is worth noting that in each new cultivation area, the initial population originated from a very small number of individuals or even just a few seeds. This leads specialists to say that the genetic base of the current global population of Arabica is narrow, meaning its diversity is low. These coveted Arabica coffee plants descend from a spontaneous hybridization that occurred over 500,000 years ago between two wild species, Coffea canephora (Robusta) and C. eugenioides.

Deciphered genomes and reconstructed genealogical relationships

The very low diversity of cultivated Arabica, through its two historical lineages - Typica and Bourbon - and their derived varieties, makes its production particularly vulnerable to climatic and plant health risks. To inform specialists working on modern variety improvement, scientists from 18 countries, collaborating under the initiative of IRD and Nestlé, tackled the sequencing of the three genomes (Arabica and its wild parents).

It took them ten years to trace, in the plants' genetic heritage, the episodes that led to the dozen best-known varieties such as Bourbon pointu, Moka, or Blue Mountain, and their relatives. "This would not have been possible without the long-standing partnership with Uganda, Brazil, and Colombia", emphasizes Valérie Poncet, a geneticist at UMR DIADE. "Thus, the sequenced individual of C. eugenioides lives in the Ugandan forests where it still coexists with C. canephora, two species studied with our partners from NARO. As for C. arabica, it was the Natural History Museum in London that provided the herbarium specimen that allowed Linnaeus to define the species."

Regarding Arabica, the main difficulty was sequencing the two subgenomes inherited from the parents and distinguishing them by comparing them to the current genomes of its two parental species. "This was made possible by accessing exceptional individuals from the living collections of IRD", adds Romain Guyot, a geneticist at research unit DIADE and another co-author. Scientists note that "neither of the two subgenomes dominates the other in terms of its expression; C. arabica is the result of a perfect cooperation between the two parents". Its taste quality is believed to be due to this balance.

Identification of genes of interest

Arabica owes its success to its delicate flavour and a taste unmatched by Robusta, which is more bitter but hardier. The authors of the study therefore looked at the gene families responsible for these sought-after qualities. They sequenced 40 individuals representing wild and cultivated diversity, mainly from EMBRAPA, the Brazilian Agricultural Research Corporation.

Among other things, this made it possible to characterize the descendants of the spontaneous hybridization between Arabica and Robusta that occurred in Timor about 100 years ago. These Indonesian hybrids have changed Arabica cultivation thanks to the resistance to rust - the main coffee tree disease - brought by Robusta. Thanks to scientists and breeders, other chapters of the great coffee story will continue to be written.


Salojärvi, J., Rambani, A., Yu, Z. et al. 2024. The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars. Nat Genet.

This news item has been adapted from an article originally published on the website of the Institut de recherche pour le développement (IRD).