Over the past two years, TAIR has taken the lead in organizing a community effort to reannotate the Arabidopsis thaliana genome. The result is TAIR12, recently released on the European Nucleotide Archive (ENA) and the National Center for Biotechnology Information (NCBI).
The project was made possible only through the work of nearly 100 volunteers based in labs all over the world who generously donated their time and expertise. Coordinating the effort was the TAIR team, with Phoenix Bioinformatics Bioinformatician Alyssa Proia doing a lot of the heavy lifting.
Proia was inspired to join the project because of TAIR’s invaluable role in empowering the global Arabidopsis community with high-quality, accessible data. A. thaliana is key to understanding plant genetics, with implications for fields from plant biology to pharmacology to crop research. TAIR, as the most comprehensive and trusted resource on A. thaliana, is essential to that work. TAIR12 incorporates the latest discoveries and evidence using the most advanced techniques and technologies. For researchers who use TAIR in their daily work, that means greater accuracy and reliability, superior experiment design, more precise hypotheses, and ultimately better, more efficient research.
Meet Alyssa Proia
Alyssa Proia led the integration of GFF3 annotation files from multiple collaborating teams, ensuring seamless merging of gene models and functional predictions to produce a cohesive, high-quality reference dataset for plant genomics research.
Why is reannotation important to the plant biology research community?
Reannotation is crucial for the plant biology research community because it refines gene models, resolves structural ambiguities, and integrates cutting-edge genomic and functional data, correcting outdated or incomplete annotations from prior releases. This ensures a more accurate foundation for interpreting the almost 27,000 genes that underpin plant development, metabolism, and environmental responses. Access to these improved annotations empowers researchers worldwide to translate these insights into sustainable agriculture and climate-resilient crops, and helps foster innovation and collaboration in a constantly evolving field.
What was the most challenging aspect of the work? The most rewarding?
The most challenging aspect of the TAIR12 reannotation work was integrating diverse GFF3 files from multiple teams, particularly as iterative updates arrived. This required repeated renumbering, format revisions, and alignment to INSDC standards after initial standardization efforts. This also demanded meticulous version control and cross-team coordination to avoid inconsistencies.
The most rewarding part was delivering a unified, standardized annotation file that spans all feature types (e.g., protein-coding genes, transposable elements, and lncRNAs), resulting in a more complete and user-friendly resource that streamlines downstream analyses for the plant genomics community.
What are your hopes for TAIR12 and its impact in the field?
I hope that TAIR12 becomes the gold standard for Arabidopsis genome annotation: a high-fidelity, unified resource that empowers researchers worldwide with accurate, accessible data. By resolving inconsistencies and incorporating diverse genomic insights, I envision it catalyzing breakthroughs in plant biology, such as precision breeding for resilient crops. Ultimately, I hope this benefits the agricultural community and global food security for generations to come.
There’s more to come! Watch this space for researcher profiles, publication information, and other news and updates on TAIR12.




