An enrichment model for wheat gene annotations using phylogeny, orthology and existing gene ontologies in 9 plant species

Conference: The 3rd Current Opinion Conference on Plant Genome Evolution, Amsterdam, The Netherlands
Abstract: Genome sequencing efforts for Triticum aestivum are producing massive numbers of contigs, preliminary assemblies and putative genes/proteins; nevertheless, their annotation is still in its infancy. Given the much larger percentage of annotated genes in previously sequenced plant genomes such as Arabidopsis thaliana and Oryza sativa, and the known phylogenetic and orthology relationships among these plant species and their corresponding genes, we propose an enrichment model that expands wheat gene annotations. Our sequence and annotation base includes data from Ensembl Plants for 9 plant species: Aegilops tauschii, Arabidopsis thaliana, Brachypodium distachyon, Brassica rapa, Hordeum vulgare, Oryza sativa subsp. japonica, Sorghum bicolor, Triticum urartu and Zea mays. Orthology relationships between wheat genes and the genes of each of the 9 plant species are predicted using an in-house software package. Next, ortholog cliques are identified such that every pair of genes within a clique is orthologous. Using the phylogenetic distances between wheat and each plant species to quantify the level of confidence for gene ontology (GO) assignments within each ortholog clique, new annotations are assigned to wheat genes such that either novel or more specific GO terms are associated with those genes. Overall, based on cliques of size 3 or larger, our model enriched the existing gene-GO term associations for 7,838 (8%) wheat genes, of which 2,139 had no previous annotation. For the particular case of ortholog cliques of size 10 (13 in total), where all 10 genes within a clique are tightly connected via pairwise orthology, 85 new and more specific GO terms were identified, a 65% increase over the 130 previously known GO terms. These observations are further supported for 4 out of the 10 plant species considered in this work by experimental evidence using expressologs (Patel et al., Plant J. 2012).
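The pipeline described in the abstract (pairwise orthology, clique detection, distance-weighted GO transfer) can be illustrated with a minimal Python sketch. This is not the authors' in-house software. The gene names, the phylogenetic distances, the weighting function 1 / (1 + distance) and the helper names are all assumptions made for illustration, and the GO identifiers serve only as placeholders. The sketch handles only GO terms that are new to the wheat gene; detecting "more specific" terms would additionally require the GO hierarchy, which is omitted here.

```python
def build_adjacency(pairs):
    """Turn a list of pairwise ortholog calls into an undirected adjacency map."""
    adj = {}
    for a, b in pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    return found


def transfer_go_terms(clique, species_of, go_terms, distance_to_wheat):
    """Score GO terms carried by non-wheat clique members for each wheat gene.

    The weight 1 / (1 + distance) is a hypothetical confidence function:
    closer species contribute more. Terms the wheat gene already has are skipped.
    """
    scores = {}
    for gene in clique:
        species = species_of[gene]
        if species == "wheat":
            continue
        weight = 1.0 / (1.0 + distance_to_wheat[species])
        for term in go_terms.get(gene, ()):
            scores[term] = scores.get(term, 0.0) + weight
    return {
        gene: {t: s for t, s in scores.items() if t not in go_terms.get(gene, ())}
        for gene in clique
        if species_of[gene] == "wheat"
    }


# Toy data (all hypothetical): one wheat gene and three candidate orthologs.
species_of = {"Ta1": "wheat", "Os1": "rice", "At1": "arabidopsis", "Zm1": "maize"}
ortholog_pairs = [("Ta1", "Os1"), ("Ta1", "At1"), ("Os1", "At1"), ("Ta1", "Zm1")]
go_terms = {"Os1": {"GO:0003677"}, "At1": {"GO:0003677", "GO:0006355"}}
distance_to_wheat = {"rice": 0.5, "arabidopsis": 1.5, "maize": 0.6}

adjacency = build_adjacency(ortholog_pairs)
# Keep only cliques of size 3 or larger, as in the abstract.
cliques = [c for c in maximal_cliques(adjacency) if len(c) >= 3]
result = transfer_go_terms(cliques[0], species_of, go_terms, distance_to_wheat)
print(cliques)
print(result)
```

In this toy input, Zm1 is orthologous to the wheat gene only, so it forms a clique of size 2 and is filtered out. The remaining clique transfers both GO terms to Ta1, with the term supported by both rice and Arabidopsis receiving the higher score.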
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21276405
Record identifier: dd263013-adbd-4878-ae67-379efbf7e787
Record created: 2015-10-09
Record modified: 2016-05-09