Monthly Archives: December 2012

Phylogenetic tree of Chalara fraxinea based on ITS / 5.8S rRNA sequences

The contributors

Darren Soanes at the University of Exeter

The analysis

DNA sequences containing internal transcribed spacers (ITS) and 5.8S rRNA were obtained from GenBank. The Chalara fraxinea sequence came from isolate LWF_LB (accession number JX667707.1). ITS / 5.8S rRNA sequences from a number of filamentous ascomycetes, especially Leotiomycetes, were also obtained from GenBank (sequences cited in Wang et al., 2006 – http://www.ncbi.nlm.nih.gov/pubmed/16837216). Sequences were aligned using MUSCLE, and positions containing gaps were removed from the alignment. PhyML was used to create the phylogenetic tree (100 bootstraps) using a GTR substitution model and eight gamma rate categories. Two trees were produced:

1: general filamentous ascomycetes:

 

2: Leotiomycetes, with Stagonospora and Cochliobolus as outgroup:

 

The interpretation

The trees show that Chalara fraxinea lies within the Hymenoscyphus clade of the Leotiomycetes (with good branch support). The most closely related species is Hymenoscyphus scutula.
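
The gap-removal step described under "The analysis" can be sketched in a few lines of Python. This is only an illustration of the idea (drop every alignment column in which any sequence has a gap), not the script that was actually used; the function name remove_gap_columns and the toy sequences are made up for the example.

```python
def remove_gap_columns(alignment, gap_chars="-"):
    """Drop every column in which at least one sequence has a gap.

    `alignment` maps sequence names to aligned sequences of equal length.
    The name and interface are hypothetical, chosen for this sketch.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    names = list(alignment)
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    # Transpose the kept columns back into one string per sequence.
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Toy alignment (invented data, not the ITS / 5.8S rRNA sequences).
toy = {"seq_a": "ACG-TTA", "seq_b": "ACGGT-A", "seq_c": "A-GGTTA"}
trimmed = remove_gap_columns(toy)
print(trimmed)
```

Removing all gapped columns is a strict filter: it keeps only positions present in every sequence, which shortens the alignment but avoids having the tree influenced by poorly aligned regions.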

 

 

AT2 RNA-seq transcripts blastx-ed against eukaryotic protein database also show high fungal enrichment

The contributors

Diane Saunders at TSL

The material

We used the AT2 assembled transcripts (in the github repository at mixed_material/ashwellthorpe_AT2/assemblies/AT2_trinity_version2.fasta), which were assembled from reads generated from RNA extracted from pith material of an infected ash branch.

The analysis

We carried out blastx of these transcripts (with default settings) against our custom database of eukaryotic proteins from representative clades. A list of content for this database is available on request.

The interpretation

The blastx (output in the github repository at data/ash_dieback/mixed_material/ashwellthorpe_AT2/blast/AT2_blastx_ed/AT2_trinity_version2_blastx_ed.out) illustrates that ~15% of the assembled transcripts have top hits to sequences from fungal organisms.
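
For anyone wanting to recompute a figure like this, here is a minimal Python sketch of the tally. It rests on assumptions that are not part of the analysis above: that the search output is in the 12-column tabular format (BLAST+ -outfmt 6, with hits for each query listed best first), and that a lookup table from database protein IDs to taxonomic groups is available. The names top_hit_groups, percent and group_of_subject, and all the data, are hypothetical.

```python
from collections import Counter


def top_hit_groups(tabular_lines, group_of_subject):
    """Count taxonomic groups over the first-listed hit of each query."""
    top_hit = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        # Assumes hits are listed best first, so keep only the first one seen.
        top_hit.setdefault(query, subject)
    return Counter(group_of_subject.get(s, "unknown") for s in top_hit.values())


def percent(counts, group, total_transcripts):
    """Share of all transcripts (including those with no hit) in `group`."""
    return 100.0 * counts[group] / total_transcripts


# Invented example: four transcripts, one of which (t4) has no hit at all.
lines = [
    "t1\tfungal_prot_1\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300",
    "t1\tplant_prot_9\t60.0\t180\t72\t0\t1\t540\t1\t180\t1e-20\t120",
    "t2\tplant_prot_2\t92.0\t150\t12\t0\t1\t450\t1\t150\t1e-60\t350",
    "t3\tfungal_prot_7\t70.0\t300\t90\t0\t1\t900\t1\t300\t1e-40\t250",
]
group_of_subject = {
    "fungal_prot_1": "fungi",
    "fungal_prot_7": "fungi",
    "plant_prot_2": "plants",
    "plant_prot_9": "plants",
}
counts = top_hit_groups(lines, group_of_subject)
print(percent(counts, "fungi", total_transcripts=4))
```

Dividing by the total number of assembled transcripts, rather than by the number with a hit, matches how the percentage is phrased in the text ("of the assembled transcripts").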

 

How useful is the AT1 assembly?

Chalara ash dieback was first confirmed in the natural environment in the UK in late autumn, based on samples from Ashwellthorpe Wood near Norwich. We decided early on in this project that speed would be critical given the emergency nature of the problem: we would generate genetic sequences as rapidly as possible, release them to the community, and prompt the crowdsourcing exercise we have been publicizing since Friday.

The normal procedure would be to culture the pathogen and sequence the genome and transcriptome from cultured material and controlled laboratory infections. Here, we decided to take the unusual step of directly sequencing the “interaction transcriptome” of a lesion dissected from an infected ash twig. This was the most rapid way to generate useful information without going through standard laboratory culturing: the shortest route from the wood to the sequencer to the computer. The question that many of you must be asking is: how useful is this data? This post addresses this question and summarizes the preliminary analyses that the TSL team has produced.

How much of the data is of fungal origin?

Diane Saunders’ analysis indicates that ~30% of the assembled transcripts have top hits to fungal sequences. The proportion of fungal sequences is probably even higher given that >15% of the assembly contigs do not hit any known sequences.

In our experience with Phytophthora pathosystems, transcriptome sequencing of infected plant tissue typically yields 1-2% sequences from the pathogen, although up to 20% pathogen sequences can be recovered. Thus, the ~30% fungal sequences recovered here is unusually high. Perhaps ash pith yields less plant RNA than leaves or roots, so that plant transcripts ended up underrepresented in the sample.
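
The remark that the true fungal proportion is probably higher can be made concrete with a little arithmetic. The sketch below takes the rounded figures from this post at face value (30% fungal top hits, 15% of contigs with no hit; the post gives the latter as a lower bound, so the upper figure would rise with it). The function name fungal_share_bounds is made up for the example.

```python
def fungal_share_bounds(fungal_pct, no_hit_pct):
    """Bracket the fungal share of an assembly, in percent.

    lower:      only contigs with a fungal top hit are fungal
    upper:      every contig without any hit is also fungal
    among_hits: fungal share among contigs that hit something
    """
    lower = fungal_pct
    upper = fungal_pct + no_hit_pct
    among_hits = 100.0 * fungal_pct / (100.0 - no_hit_pct)
    return lower, upper, among_hits


lower, upper, among_hits = fungal_share_bounds(30, 15)
print(f"{lower}% to {upper}% of contigs; {among_hits:.1f}% of contigs with a hit")
# prints: 30% to 45% of contigs; 35.3% of contigs with a hit
```

Where the real value falls within this range depends on what the unassigned contigs turn out to be, which cannot be read off the blastx results alone.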

What proportion of the fungal sequences are from Chalara fraxinea?

A BLASTN search of the 116 C. fraxinea sequences in GenBank against the AT1 assembly was very informative. The AT1 fungal sequences are mostly C. fraxinea transcripts (or from a very closely related taxon). For example, the cobalamin-independent methionine synthase-like protein gene matched AT1 at 565 out of 569 nucleotides (99%), and the elongation factor-1 alpha (EF1a) gene at 768 out of 768 (100%).

If the fungal sequences were from mixed taxa then we would expect multiple divergent hits for the genes above. This doesn’t seem to be generally the case. It is likely that there are sequences from other organisms besides C. fraxinea and ash, but perhaps these were not abundant enough to assemble into long contigs.
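
One way to picture the "multiple divergent hits" argument is a small Python sketch that flags any query gene matching more than one contig where at least one match is clearly below near-identity. Only the two identity figures for the methionine synthase-like and EF1a genes come from this post; the contig names, the third gene, the 98% threshold and the function names are all invented for illustration, and this is not the procedure that was actually followed.

```python
from collections import defaultdict


def percent_identity(matches, aligned_length):
    return 100.0 * matches / aligned_length


def genes_with_divergent_hits(hits, near_identical=98.0):
    """hits: iterable of (gene, contig, matches, aligned_length) tuples.

    A gene is flagged when it matches two or more contigs and at least one
    of those matches falls below the (hypothetical) near-identity threshold.
    """
    by_gene = defaultdict(dict)
    for gene, contig, matches, length in hits:
        identity = percent_identity(matches, length)
        # Keep the best identity seen for each gene/contig pair.
        by_gene[gene][contig] = max(identity, by_gene[gene].get(contig, 0.0))
    flagged = []
    for gene, contigs in by_gene.items():
        divergent = [c for c, ident in contigs.items() if ident < near_identical]
        if len(contigs) > 1 and divergent:
            flagged.append(gene)
    return sorted(flagged)


hits = [
    ("methionine_synthase_like", "contig_a", 565, 569),  # figures from the post
    ("EF1a", "contig_b", 768, 768),                      # figures from the post
    ("made_up_gene", "contig_c", 500, 500),              # invented mixed-taxa case
    ("made_up_gene", "contig_d", 450, 500),              # invented mixed-taxa case
]
print(genes_with_divergent_hits(hits))
print(round(percent_identity(565, 569), 1))
```

With these inputs only the invented gene is flagged, which is the single-taxon pattern described above for the real housekeeping genes.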

How good is the AT1 assembly?

RNAseq assemblies of short reads vary tremendously in quality. The AT1 assembly appears to contain a reasonable proportion of full-length CDS assemblies. For example, comp1171, a fungal polyketide synthase, is 7724 bp and includes a full-length CDS of 2479 amino acids.
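
As a quick sanity check on the comp1171 numbers, the sketch below works out how much of the contig the CDS occupies. It assumes the 2479 amino acids exclude the stop codon, which the post does not state; the function name cds_footprint is made up for the example.

```python
def cds_footprint(contig_bp, protein_aa, count_stop_codon=True):
    """Return (CDS length in bp, bp left over for flanking/untranslated sequence)."""
    cds_bp = 3 * protein_aa + (3 if count_stop_codon else 0)
    if cds_bp > contig_bp:
        raise ValueError("CDS cannot be longer than the contig that contains it")
    return cds_bp, contig_bp - cds_bp


cds_bp, flanking_bp = cds_footprint(contig_bp=7724, protein_aa=2479)
print(cds_bp, flanking_bp)  # prints: 7440 284
```

Under that assumption the CDS accounts for 7440 of the 7724 bp, leaving 284 bp outside the coding region, so the two figures quoted above are consistent with each other.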

Are there interesting genes you could already highlight?

A full-length Nep1-like protein (NLP, comp507) with similarity to actinoporin toxins is highlighted here. The polyketide synthase comp1171 mentioned above could also synthesize a toxin. In addition, comp8971 encodes a full-length secreted protein with four LysM domains that belongs to a well-known family of fungal effectors.

How do you know these genes are from C. fraxinea?

At this point we don’t know for sure, although it is likely that the sequences originate from C. fraxinea given the comments above about reduced complexity in housekeeping gene sequences. Once we have pure cultures and genomic reads we will be able to address this. Of course, those of you who have already generated genomic sequences of C. fraxinea could easily answer this question.

It should be noted that it would be informative if some of the interesting sequences, such as the toxins, turn out to originate from another organism especially if they consistently associate with Chalara ash dieback. Perhaps this is a complex pathosystem that involves multiple organisms. We just know very little at this point.

What’s next?

More ash dieback transcriptomes from independent samples and genome sequences of British isolates. This is all work in progress, and we will post the data on OADB as soon as they are available.

A more general lesson?

There is a lesson from this first dataset. Whenever new or suspect plant diseases arise, we should immediately sequence transcriptomes from field-collected diseased tissue. These days the cost of an RNAseq lane is reasonable and the assemblies are pretty decent. The data generated should rapidly provide valuable information about the nature of the pathogen and offer an initial insight into its genes. Whenever time is of the essence, transcriptome sequencing should be initiated as soon as possible.

Posted by @KamounLab

Useful links

#openashdb

#ashdieback

GitHub wiki

Kamoun, S. 2012. Genomics of emerging plant pathogens: too little, too late. Microbiology Today, 39:140.

Nep1-like proteins (NLPs) identified in AT1 assembled transcripts using tblastn

The contributors

Suomeng Dong and Sophien Kamoun at TSL

The material

We used the AT1 assembled transcripts (in the github repository at mixed_material/ashwellthorpe_AT1/assemblies/AT1_trinity.fasta), which were assembled from reads generated from RNA extracted from pith material of an infected ash branch.

The analysis

We collated 7 fungal and oomycete Nep1-like protein (NLP) sequences and used these in a tblastn against the AT1 assembled transcripts (with default settings). The full details of the 7 fungal and oomycete NLP sequences can be provided upon request.

The interpretation

The tblastn highlighted three AT1 transcripts with high similarity to known NLPs. The output can be found in the github repository (mixed_material/ashwellthorpe_AT1/blast/AT1_tblastn_NLPs/NLPs_7_tblastn_AT1) along with the sequences of the three transcripts (mixed_material/ashwellthorpe_AT1/blast/AT1_tblastn_NLPs/NLP_transcripts_AT1).

We identified one AT1 transcript with high similarity to NLPs, which contains all conserved residues and likely encodes a full-length protein sequence. For this transcript we extracted the coding sequence (mixed_material/ashwellthorpe_AT1/blast/AT1_tblastn_NLPs/AT1_NLP1.fan) and the corresponding protein sequence (mixed_material/ashwellthorpe_AT1/blast/AT1_tblastn_NLPs/AT1_NLP1.faa). The protein sequence was then used for protein structure modelling using Phyre2, with the published NLPpya_3GNU as a template. This analysis shows that the predicted protein we identified carries all the key residues for NLP cytolytic function.
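
Going from a coding sequence to the corresponding protein sequence, as in the step above, amounts to translating codons until the first stop. Here is a minimal Python sketch of that step, assuming the standard genetic code; it is not the tool that was used to produce the files in the repository, and translate_cds and the toy sequence are invented for the example.

```python
from itertools import product

# Standard genetic code, written in the usual T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(cds), 3):
        # Codons with ambiguous bases (e.g. N) are reported as X.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy sequence (invented, not the AT1 NLP coding sequence).
print(translate_cds("ATGGCTTGGTAA"))  # prints: MAW
```

Comparing the translated length with the length of the extracted coding sequence is a simple way to confirm that the open reading frame runs uninterrupted to its stop codon.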

 

 

AT1 RNA-seq transcripts blastx-ed against eukaryotic protein database show high fungal enrichment

The contributors

Diane Saunders at TSL

The material

We used the AT1 assembled transcripts (in the github repository at mixed_material/ashwellthorpe_AT1/assemblies/AT1_trinity.fasta), which were assembled from reads generated from RNA extracted from pith material of an infected ash branch.

The analysis

We carried out blastx of these transcripts (with default settings) against our custom database of eukaryotic proteins from representative clades. A list of content for this database is available on request.

The interpretation

The blastx (output in the github repository at mixed_material/ashwellthorpe_AT1/blast/AT1_blastx_ed/AT1_trinity_blastx_ed.out) illustrates that ~30% of the assembled transcripts have top hits to sequences from fungal organisms.