Martin Trick (JIC), Andrea Harper (CNAP, University of York), Leah Clissold (TGAC) and Ian Bancroft (CNAP, University of York)
Young leaf material was harvested from a clone of Tree 35 in Denmark in 2013.
mRNA was extracted and a paired-end (but this time not strand-specific) Illumina RNA-Seq library was constructed. About 125 million read pairs were obtained from a single HiSeq 2500 lane; the raw data are available from The Sainsbury Laboratory’s FTP server, with details in the GitHub repository here. Trinity was again used to assemble transcripts from the complete set of reads, this time generating 242,115 assemblies. RSEM transcript abundance analysis was then carried out to select 130,978 principal isoforms, which constitute our new reference sequence. 96% of the transcripts were mapped to scaffolds in the Tree 35 genome assembly developed by TGAC. Candidate open reading frames were extracted, and the predicted peptides were queried against the UniProt protein database with BLASTP to produce a functional annotation.
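To illustrate the isoform selection step, here is a minimal Python sketch of one way to pick a principal isoform per Trinity gene from an RSEM `.isoforms.results` table. The selection rule is an assumption, not necessarily the criterion used for the reference: it keeps the transcript with the highest IsoPct for each gene and breaks ties on TPM. The function name and the transcript identifiers in the example are hypothetical.

```python
import csv
import io


def select_principal_isoforms(handle):
    """Return {gene_id: transcript_id}, keeping one isoform per gene.

    Assumed rule (not confirmed by the source): highest IsoPct wins,
    with TPM as the tie-breaker.
    """
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row["gene_id"]
        score = (float(row["IsoPct"]), float(row["TPM"]))
        # Keep this transcript if the gene is new or the score is higher.
        if gene not in best or score > best[gene][0]:
            best[gene] = (score, row["transcript_id"])
    return {gene: tid for gene, (_, tid) in best.items()}


# Hypothetical three-transcript table in RSEM's isoforms.results layout.
EXAMPLE = (
    "transcript_id\tgene_id\tlength\teffective_length\t"
    "expected_count\tTPM\tFPKM\tIsoPct\n"
    "c1_g1_i1\tc1_g1\t900\t750.0\t30.0\t4.0\t3.5\t25.0\n"
    "c1_g1_i2\tc1_g1\t1200\t1050.0\t90.0\t12.0\t10.5\t75.0\n"
    "c2_g1_i1\tc2_g1\t600\t450.0\t10.0\t2.0\t1.8\t100.0\n"
)

principal = select_principal_isoforms(io.StringIO(EXAMPLE))
for gene_id, transcript_id in sorted(principal.items()):
    print(gene_id, transcript_id)
```

With a real file, the same function can be called on an open file handle, for example `select_principal_isoforms(open("RSEM.isoforms.results"))`, where the file name is again only a placeholder.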
We have now sequenced the leaf transcriptomes of 186 trees that were sampled from across Denmark and phenotyped for disease symptoms by our colleague Erik Dahl Kjaer’s group. SNPs and expression levels relative to the Tree 35 reference have been calculated, and we are about to start on the association work.