mRNA-Seq analysis of Tree 35

The contributors

Martin Trick (JIC), Andrea Harper (JIC*), Leah Clissold (TGAC) and Ian Bancroft (JIC*)

*Present affiliation: University of York

The material

First flush leaf material was harvested from Tree 35 in Denmark in May 2013.

The analysis

mRNA was extracted and a strand-specific, paired-end Illumina RNA-Seq library was constructed. Around 193 million read pairs were obtained from a single HiSeq 2500 lane; these are now available from The Sainsbury Laboratory’s FTP server, with details in the GitHub repository here.

Trinity was used to assemble transcript fragments from a down-sampled set of 100 million read pairs, generating an unusually large number of assemblies: 517,056 (data/tree/master/ash_dieback/fraxinus_excelsior/tree35/assemblies/mRNA/tree35_trinity.fasta.gz). RSEM transcript abundance analysis was then carried out to select the 238,283 principal isoforms from these (data/tree/master/ash_dieback/fraxinus_excelsior/tree35/assemblies/mRNA/tree35_principal_isoforms.fasta.gz). This set will serve as our reference sequence for the association work that will soon be conducted on a diverse panel of trees from Denmark.
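As an illustration of the isoform selection step, the sketch below picks one isoform per gene from an RSEM-style isoforms.results table. It assumes the standard RSEM column names (transcript_id, gene_id, TPM, IsoPct) and assumes that "principal" means the isoform with the highest IsoPct within its gene, with TPM as a tie-breaker. The exact criterion used for Tree 35 is not stated here, and the identifiers in the example table are invented.

```python
import csv
import io


def select_principal_isoforms(rsem_table):
    """Return {gene_id: transcript_id}, keeping the most abundant isoform per gene.

    Assumption: "most abundant" = highest IsoPct, ties broken by TPM.
    """
    best = {}  # gene_id -> ((IsoPct, TPM), transcript_id)
    reader = csv.DictReader(rsem_table, delimiter="\t")
    for row in reader:
        key = (float(row["IsoPct"]), float(row["TPM"]))
        gene = row["gene_id"]
        # Strict comparison: on an exact tie the first isoform seen is kept.
        if gene not in best or key > best[gene][0]:
            best[gene] = (key, row["transcript_id"])
    return {gene: transcript for gene, (_, transcript) in best.items()}


# Hypothetical three-line table in the RSEM isoforms.results layout.
EXAMPLE = (
    "transcript_id\tgene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM\tIsoPct\n"
    "comp1_c0_seq1\tcomp1_c0\t900\t750.0\t40.0\t12.5\t10.1\t80.00\n"
    "comp1_c0_seq2\tcomp1_c0\t700\t550.0\t10.0\t3.1\t2.5\t20.00\n"
    "comp2_c0_seq1\tcomp2_c0\t1200\t1050.0\t5.0\t1.0\t0.8\t100.00\n"
)

principal = select_principal_isoforms(io.StringIO(EXAMPLE))
print(principal)
```

The resulting transcript identifiers could then be used to subset the Trinity FASTA file; that step is omitted here.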

Candidate open reading frames were extracted and the predicted peptides (data/tree/master/ash_dieback/fraxinus_excelsior/tree35/annotations/mRNA/tree35_predicted_peptides.fasta.gz) were queried with BLASTP against the protein database, creating a first draft functional annotation (data/tree/master/ash_dieback/fraxinus_excelsior/tree35/annotations/mRNA/annotated_peptides.txt). A significant number of proteins apparently originating from Bradyrhizobium sp. BTAi1 were found; the validity and significance of this finding are currently unknown. Because of this issue and the unusual complexity of the assemblies, these data should be treated with caution; another Tree 35 sample is due to be sequenced soon. Finally, the transcript assemblies were queried with BLASTN against the Tree 35 genome scaffolds developed by TGAC. Around 75% of the transcripts were unambiguously located to scaffolds, thus creating an extra layer of annotation for the genomic assembly (data/tree/master/ash_dieback/fraxinus_excelsior/tree35/blast/mRNA/tree35_transcripts_vs_Nornex_s1v1.gz).
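One way to decide whether a transcript is "unambiguously located" can be sketched as follows. This is an assumption-laden example rather than the procedure actually used: it assumes BLASTN results in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, bit score in column 12), and it calls a transcript unambiguous when its best-scoring scaffold beats the runner-up scaffold by a chosen ratio. The min_ratio threshold and the example hits are hypothetical.

```python
import csv
import io
from collections import defaultdict


def unambiguous_locations(blast_rows, min_ratio=1.1):
    """Map each transcript to a scaffold when its best hit is clearly the best.

    Assumption: a placement is unambiguous if the top scaffold's bit score is
    at least min_ratio times that of the second-best scaffold, or if only one
    scaffold was hit.
    """
    best = defaultdict(dict)  # transcript -> {scaffold: best bit score}
    for row in csv.reader(blast_rows, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if bitscore > best[query].get(subject, 0.0):
            best[query][subject] = bitscore

    located = {}
    for query, scores in best.items():
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        top_scaffold, top_score = ranked[0]
        if len(ranked) == 1 or top_score >= min_ratio * ranked[1][1]:
            located[query] = top_scaffold
    return located, len(best)


# Hypothetical hits: t1 is clear, t2 is split between two scaffolds, t3 has one hit.
EXAMPLE = (
    "t1\tscafA\t99.0\t500\t5\t0\t1\t500\t100\t599\t1e-100\t900\n"
    "t1\tscafB\t90.0\t200\t20\t0\t1\t200\t50\t249\t1e-20\t150\n"
    "t2\tscafA\t95.0\t300\t15\t0\t1\t300\t700\t999\t1e-60\t400\n"
    "t2\tscafC\t95.0\t300\t15\t0\t1\t300\t10\t309\t1e-59\t395\n"
    "t3\tscafD\t98.0\t150\t3\t0\t1\t150\t20\t169\t1e-40\t250\n"
)

located, n_with_hits = unambiguous_locations(io.StringIO(EXAMPLE))
fraction = len(located) / n_with_hits
print(located, f"{fraction:.0%} of transcripts with hits placed")
```

Note that the fraction here is computed over transcripts with at least one hit; a figure such as the 75% quoted above would instead use the full set of transcripts as the denominator.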