TY  - JOUR
AB  - Next-generation sequencing enables the study of species without a sequenced genome at the 'omics' level. Custom transcriptome databases are generated and global expression profiles can be compared. However, the assembly of transcriptome sequence reads into contigs remains a daunting task. In this study, five different assembly programs, both traditional overlap-based, 'read-centric' assemblers and de Bruijn graph data structure-based assemblers, were compared. To this end, artificial read libraries with and without simulated sequencing errors were constructed from Arabidopsis thaliana, based on quantitative profiles of mature leaf tissue. The open source TGICL pipeline and the commercial CLC bio genomics workbench produced the best assemblies in terms of contig length, hybrid assemblies, redundancy reduction, and error tolerance. The mature leaf transcriptomes of the C-3 species Cleome spinosa and the C-4 species Cleome gynandra were assembled and analysed. The pathways and cellular processes tagged in the transcriptome assemblies reflect processes of a mature leaf. The databases are useful for extracting transcripts related to C-4 processes as full-length or nearly full-length sequences.
DA  - 2011
DO  - 10.1093/jxb/err029
KW  - Assembly
KW  - C-4
KW  - next-generation sequencing
KW  - transcriptome
LA  - eng
IS  - 9
M2  - 3093
PY  - 2011
SN  - 0022-0957
SP  - 3093-3102
T2  - Journal of Experimental Botany
TI  - Critical assessment of assembly strategies for non-model species mRNA-Seq data and application of next-generation sequencing to the comparison of C-3 and C-4 species
UR  - https://nbn-resolving.org/urn:nbn:de:0070-pub-29151618
Y2  - 2024-11-22T01:51:24
ER  -