Improved sequence-based orthologs identification using genomic context information and their impact on pathway analysis in plants

Conference: International Plant and Animal Genome Conference Asia 2013, Jan. 11-16, 2013, San Diego, CA, USA
Abstract: With the advent of sequencing techniques, a deluge of plant genome projects has emerged, all calling for accurate, high-throughput comparative genomic approaches such as orthology prediction. The current incompleteness, polyploidy and low coverage of most plant genomes call for further improvements in orthology prediction using evolutionarily related information such as sequence variability and gene order. While the majority of orthology prediction approaches for large genome-scale datasets rely on reciprocal-best-BLAST-hits (RBBH), they tend to incorrectly predict paralogs as orthologs when genome sequences are incomplete or gene loss has occurred. In addition, there is increasing interest in identifying the orthologs most likely to have retained similar function. To address these issues, we have developed a high-throughput, multi-threaded computational approach that predicts orthologs using DNA and protein sequences and identifies which orthologs have similar genomic context and are therefore likely to have similar function. First, we predict putative orthologs as the RBBHs predicted in common by the DNA-based and protein-based searches. This dual approach is used whenever possible to reduce the number of false positives. Second, genomic context conservation is used to provide further support for ortholog assignment and to help identify missing orthologs. Orthologs are predicted to have a higher likelihood of being similar in function if their relative genomic context is conserved. Third, the list of putative orthologs for pairs of plant species (e.g. B. distachyon and S. bicolor) is used to explore pathway similarities for the same biological process and to discover putative enzymes omitted in some plant species.
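The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(query, subject, score)` hit format, the `window` parameter and the toy gene identifiers are all assumptions made for illustration, and the abstract does not say how context conservation is actually scored. The sketch computes reciprocal best hits separately from DNA-based and protein-based hit lists, keeps only the pairs found by both, and then counts how many neighbouring genes of a pair are themselves putative orthologs.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene.

    `hits` is an iterable of (query, subject, score) tuples, for example
    parsed from tabular BLAST output (hypothetical input format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def rbbh(hits_a_vs_b, hits_b_vs_a):
    """Return the set of (gene_a, gene_b) reciprocal best hits."""
    a_to_b = best_hits(hits_a_vs_b)
    b_to_a = best_hits(hits_b_vs_a)
    return {(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a}


def neighbours(order, gene, window):
    """Genes within `window` positions of `gene` in an ordered gene list."""
    pos = order.index(gene)
    return set(order[max(0, pos - window):pos] + order[pos + 1:pos + 1 + window])


def context_support(pair, order_a, order_b, orthologs, window=2):
    """Count ortholog pairs whose members both lie near `pair`.

    A higher count means the relative genomic context is better conserved.
    The window size and the simple count are illustrative assumptions.
    """
    gene_a, gene_b = pair
    near_a = neighbours(order_a, gene_a, window)
    near_b = neighbours(order_b, gene_b, window)
    return sum(1 for x, y in orthologs if x in near_a and y in near_b)


# Toy data (hypothetical gene identifiers and scores).
dna_ab = [("a1", "b1", 200), ("a2", "b2", 180), ("a3", "b9", 90)]
dna_ba = [("b1", "a1", 200), ("b2", "a2", 180), ("b9", "a3", 90)]
prot_ab = [("a1", "b1", 300), ("a2", "b2", 250), ("a3", "b3", 120)]
prot_ba = [("b1", "a1", 300), ("b2", "a2", 250), ("b3", "a3", 120)]

# Step 1: keep only pairs predicted by both the DNA and protein RBBHs.
putative = rbbh(dna_ab, dna_ba) & rbbh(prot_ab, prot_ba)

# Step 2: genomic context support for one putative pair.
order_a = ["a1", "a2", "a3"]  # gene order along a chromosome in species A
order_b = ["b1", "b2", "b3"]  # gene order along a chromosome in species B
support = context_support(("a1", "b1"), order_a, order_b, putative)
```

In this toy input the DNA and protein searches disagree on the partner of `a3`, so that gene is dropped from the putative set, which is the false-positive reduction the dual approach aims for. A real pipeline would also need to handle ties, score thresholds and genes present in only one data type.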
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: No
NPARC number: 23001298
Record identifier: 16590ee1-deca-4017-abfd-9b3d6c5be8e2
Record created: 2017-01-16
Record modified: 2017-01-17