The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads

DOI: http://doi.org/10.1111/j.1365-313X.2012.05093.x
Type: Article
Journal title: Plant Journal
ISSN: 0960-7412
Volume: 72
Issue: 3
Pages: 461-473; # of pages: 13
Subject: DNA Sequencing; Illumina; Industrial crops; Malpighiales; Whole-genome shotgun; Crops; DNA sequences; Flax; Linen; Proteins; Yarn; Genes; plant DNA; article; bacterial artificial chromosome; chemistry; chromosome map; contig mapping; DNA sequence; expressed sequence tag; flax; gene library; genetics; high throughput sequencing; methodology; molecular genetics; nucleotide sequence; plant genome; protein tertiary structure; Base Sequence; Chromosome Mapping; Chromosomes, Artificial, Bacterial; Contig Mapping; DNA, Plant; Expressed Sequence Tags; Flax; Gene Library; Genome, Plant; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Molecular Sequence Data; Protein Structure, Tertiary; Sequence Analysis, DNA; Arabidopsis thaliana; Linum usitatissimum; Malpighiales
Abstract: Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43 384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species. © 2012 Blackwell Publishing Ltd.
Publication date
Language: English
Affiliation: National Research Council Canada (NRC-CNRC); NRC Plant Biotechnology Institute (PBI-IBP)
Peer reviewed: Yes
NPARC number: 21269207
Record identifier: 5b86f63f-c054-418f-a0c1-6b87778682be
Record created: 2013-12-12
Record modified: 2016-05-09