A community effort to enhance the Arabidopsis thaliana reference genome
More than two years ago, TAIR took the lead in a sweeping volunteer-driven project to reannotate the Arabidopsis thaliana genome — an update known as TAIR12. The new annotation is the result of contributions by over 70 researchers based in labs from around the world, who came together to donate time, expertise, and in some cases computing power to the project. We look forward to sharing it with the world soon!
Collaboration and the spirit of science
Sometimes known as the “lab mouse” of plant biology, Arabidopsis thaliana has long served as a model organism for plant researchers due to its short lifecycle, relatively small genome, and ease of genetic manipulation. Its genome was the first of any flowering plant to be completely sequenced, and the sequence was published in December 2000. From that first genome release, TAIR has been an essential companion resource, providing a curated catalog of gene functions. Twenty-five years later, TAIR remains instrumental in plant research, and the organism it documents is as important as ever.
As sequencing methods and annotation technology have improved, the genome map has been periodically updated in a process known as reannotation, with the last major update being Araport11 in 2016. By late 2022, when we undertook the current work, the genome was overdue for another update — one that would reflect the latest technological advances and incorporate the updated and corrected gene information that had come to light in the intervening years.
But there was a problem: previous reannotation projects had benefited from dedicated grant funding. This time, no such resources were available. Understanding the importance of reannotation to continued progress in the field, Tanya Berardini, Director of TAIR and Chief Scientific Officer at Phoenix Bioinformatics, the nonprofit that houses TAIR, called on what she described as a “dream team” of researchers she hoped might assist, and they agreed.
Although TAIR and Phoenix Bioinformatics provided the infrastructure and organization for the project, dozens of researchers at labs based in Canada, China, Czech Republic, France, Germany, India, Italy, Japan, the Netherlands, Pakistan, Sweden, Turkey, the United Kingdom, and the United States contributed to the hands-on work of reannotation over a period of nearly three years.
“At the onset, we couldn’t imagine the amount of coordination it would take to reach the finish line, in addition to the scientific expertise and attention to detail for annotation and review,” said Berardini. “We are incredibly grateful to the individual researchers who gave their time and shared their knowledge to make TAIR12 possible. These efforts will benefit all of plant biology.”
The core project work was completed in spring 2025 and submitted to GenBank for review at the end of April 2025. It was still under review when the US government shutdown began on Oct. 1, 2025, and staff at NCBI were furloughed. Uncertainty over when the review would resume and be completed pushed us to look at other options. As a result, we have shifted the submission from GenBank to the European Nucleotide Archive (ENA), another member of the International Nucleotide Sequence Database Collaboration (INSDC), and are actively working with ENA staff to reach our goal of sharing the reannotated genome through a public resource.
The accompanying article, currently in preparation, presents a new Arabidopsis thaliana reference genome that represents a major advance in completeness and accuracy, said Korbinian Schneeberger of the Max Planck Institute for Plant Breeding Research, who is coordinating the writing of the manuscript. The assembly was generated by integrating 13 independently produced Columbia (Col-0) assemblies from the Arabidopsis community and was further refined by incorporating previously unassembled repetitive regions. The resulting reference sequence achieves near-finished quality across the entire genome. Gene annotation was carried out on this assembly and subsequently curated by over 70 members of the Arabidopsis research community, resulting in an annotation of exceptional accuracy.
“The completeness of both the assembly and annotation now enables, for the first time, a truly comprehensive view of the Arabidopsis genome. This new reference serves as a high-quality, community-driven resource that sets a new standard for the global plant genomics community,” Schneeberger said.
We look forward to sharing more details with our users and the broader research community soon, especially as the 25th anniversary of the original genome sequencing project draws near. Stand by for updates on the publication and release of the new annotation.




