
TGAC Ash tree assemblies (Tree 18 & 35)

The Contributors

Bernardo Clavijo and Team,
The Genome Analysis Centre (TGAC), Norwich

Assembly summary:

Tree    Number of scaffolds  Total sequence (Mbp)  % of Ns  N50 (Kbp)
Tree18  37,452               865.1                 5.1      180.4
Tree35  29,847               845.6                 1.4      137.6
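For readers unfamiliar with the summary columns, the following is a minimal sketch of how N50 and % of Ns are conventionally computed. The function names and the example lengths are illustrative assumptions, not part of the TGAC pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (bp).

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def percent_n(sequence):
    """Return the percentage of N (gap) characters in a sequence."""
    if not sequence:
        return 0.0
    return 100.0 * sequence.upper().count("N") / len(sequence)


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [500, 400, 300, 200, 100]
example_n50 = n50(example_lengths)
```

Applied to the real assemblies, the lengths would be read from the scaffold FASTA files and the result divided by 1,000 to give Kbp.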

Contig assembly

Contigs were assembled from 250bp paired-end reads, generated with a PCR-free protocol, using the DISCOVAR de novo software [1]. We used KAT [2] spectra-cn plots to QC motif representation, and tailored our data generation towards maximum-complexity, precisely sized, low-bias sampling.

Haplotype filter

Expectation maximisation heuristics based on k-mer spectra of the raw reads were applied to the contigs to create a mosaic genome representation by collapsing the haplotypes into one choice per locus. The filtered set of contigs represents all homozygous content and roughly half of the heterozygous content, which simplifies the scaffolding stage.

Scaffolding

Nextera long mate pair (LMP) libraries were constructed, QC’d, and chosen for sequencing as described in TGAC’s published method [3], and pre-processed with a pipeline based on NextClip [4]. Haplotype-filtered contigs were scaffolded using SOAPdenovo2 [5]. SOAPdenovo2 replaces N-stretches (gaps) in contigs with Cs and Gs during scaffolding; to correct for this, contigs were mapped back to the scaffolds and the gaps converted back to Ns.
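The gap restoration step can be pictured with a small sketch. It assumes the simplest possible case, in which an original contig has been placed on the scaffold without indels, on the forward strand, at a known offset. The function name and that placement model are assumptions for illustration; the actual TGAC mapping procedure is not described here.

```python
def restore_gaps(scaffold, contig, offset):
    """Copy N positions from the original contig back onto the scaffold.

    scaffold: scaffold sequence in which contig gaps were filled with C/G
    contig:   original contig sequence, still containing its N-stretches
    offset:   0-based scaffold position where the contig starts
              (assumed ungapped, forward-strand placement)
    """
    bases = list(scaffold)
    for i, base in enumerate(contig):
        position = offset + i
        if base in "Nn" and 0 <= position < len(bases):
            bases[position] = "N"
    return "".join(bases)


# Hypothetical sequences, for illustration only.
fixed = restore_gaps("AAACGCGTTT", "ACNNNT", 2)
```

In this toy case the three bases that replaced the contig gap are turned back into Ns, giving "AAACNNNTTT".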

Contamination screening and filtering

Scaffolds shorter than 1kbp were removed. The remaining scaffolds were checked for contamination against NCBI’s nucleotide database using BLAST+ and the results joined to NCBI’s taxonomy database. Results were filtered to show hits of >98% identity over >90% of their length. From this list, scaffolds identified as contamination were removed.
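The filtering thresholds can be sketched as follows. The record layout and names are hypothetical, and the sketch assumes that "their length" refers to the scaffold length and that coverage is approximated by alignment length divided by scaffold length; the original pipeline may have defined coverage differently.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One BLAST hit for a scaffold (hypothetical record layout)."""
    scaffold_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    scaffold_length: int


def is_candidate(hit, min_identity=98.0, min_coverage=0.90):
    """True if the hit exceeds both thresholds quoted in the text."""
    coverage = hit.alignment_length / hit.scaffold_length
    return hit.percent_identity > min_identity and coverage > min_coverage


def candidate_scaffolds(hits):
    """Return sorted IDs of scaffolds with at least one qualifying hit."""
    return sorted({hit.scaffold_id for hit in hits if is_candidate(hit)})


# Hypothetical hits, for illustration only.
example_hits = [
    BlastHit("scaf1", "subjectA", 99.5, 950, 1000),   # passes both filters
    BlastHit("scaf2", "subjectA", 99.0, 500, 1000),   # covers only 50%
    BlastHit("scaf3", "subjectB", 97.0, 1000, 1000),  # identity too low
]
flagged = candidate_scaffolds(example_hits)
```

The resulting list is only a shortlist: as described above, the decision that a scaffold is contamination was made after joining the hits to the taxonomy database.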

Assemblies are available to download from the OADB FTP site
Tree 18 assembly

Tree 35 assembly

Both genomes are available to blast query at
TGAC ash genome blast site

1) http://www.broadinstitute.org/software/discovar/blog/
2) http://www.tgac.ac.uk/KAT/
3) D. Heavens, G. G. Accinelli, B. Clavijo, and M. D. Clark, “A method to simultaneously construct up to 12 differently sized Illumina Nextera long mate pair libraries with reduced DNA input, time, and cost.,” BioTechniques, vol. 59, no. 1, pp. 42–45, 2015.
4) R. M. Leggett, B. J. Clavijo, L. Clissold, M. D. Clark, and M. Caccamo, “NextClip: an analysis and read preparation tool for Nextera long mate pair libraries,” Bioinformatics, p. btt702, 2013.
5) R. Luo, B. Liu, Y. Xie, Z. Li, W. Huang, J. Yuan, G. He, Y. Chen, Q. Pan, Y. Liu, J. Tang, G. Wu, H. Zhang, Y. Shi, Y. Liu, C. Yu, B. Wang, Y. Lu, C. Han, D. W. Cheung, S.-M. Yiu, S. Peng, Z. Xiaoqian, G. Liu, X. Liao, Y. Li, H. Yang, J. Wang, T.-W. Lam, and J. Wang, “SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.,” Gigascience, vol. 1, no. 1, p. 18, 2012.

Contact: Bernardo Clavijo, Algorithms Team Leader, TGAC.
Bernardo.Clavijo@tgac.ac.uk

Genome sequencing of 9 species from the genus Hymenoscyphus

The Contributors

Christine Sambles, Karen Moore, Exeter Sequencing Service, Mick Kershaw, Chris Thornton, Murray Grant and David Studholme.
University of Exeter, Devon.

Summary

Genome sequencing and assembly of 9 different species from the Hymenoscyphus genus.

Table 1: The 9 sequenced species with NCBI GenBank and Short Read Archive (SRA) accession numbers:

Species Strain GenBank accession SRA accession
Hymenoscyphus fructigenus CBS 650.92 LKUV00000000 SRX1322313
Hymenoscyphus scutula CBS 480.97 LKTO00000000 SRX1322311
Hymenoscyphus varicosporoides CBS 651.66 LLCF00000000 SRX1322314
Hymenoscyphus repandus CBS 341.76 LLCE00000000 SRX1322310
Hymenoscyphus salicellus CBS 111550 LLCD00000000 SRX1322295
Hymenoscyphus fraxineus CBS 133217 LLCC00000000 SRX1322294
Hymenoscyphus infarciens CBS 122016 LLCB00000000 SRX1322117
Hymenoscyphus herbarum (Calycina herbarum) CBS 466.73 LLEY00000000 SRX1325539
Hymenoscyphus laetus* CBS 340.76 LLCA00000000 SRX1322158

*ITS phylogenetic analysis suggests this is not a Hymenoscyphus spp. but is more likely to be a species from the Phaeosphaeria genus.

Table 2: Assembly statistics of the 9 sequenced Hymenoscyphus spp. genomes:

Species Strain Genome size (bp) # contigs N50 % GC
Hymenoscyphus fructigenus CBS 650.92 61,124,938 504 373,842 43.09
Hymenoscyphus scutula CBS 480.97 63,226,382 2,591 72,955 41.07
Hymenoscyphus varicosporoides CBS 651.66 31,978,795 206 585,492 46.36
Hymenoscyphus repandus CBS 341.76 42,813,648 925 226,427 45.61
Hymenoscyphus salicellus CBS 111550 57,952,665 9,749 9,574 44.34
Hymenoscyphus fraxineus CBS 133217 51,524,987 4,749 24,374 43.65
Hymenoscyphus infarciens CBS 122016 68,152,485 1,714 156,341 37.56
Hymenoscyphus herbarum CBS 466.73 69,308,136 529 325,154 41.67
Hymenoscyphus laetus* CBS 340.76 36,473,783 471 257,815 51.89

Fig 1: ITS phylogenetic tree of 8 sequenced genomes and other members of the Helotiales order, not including H. laetus (putatively a Phaeosphaeria spp.)


Comment: Ash pathogen Hymenoscyphus pseudoalbidus renamed Hymenoscyphus fraxineus

J Allan Downie, John Innes Centre

On this web site (in line with current usage to date) the Chalara ash dieback pathogen has been referred to as Chalara fraxinea (asexual morph) and Hymenoscyphus pseudoalbidus (sexual morph). In line with the recent publication altering the nomenclature of this fungus (Baral et al 2014), it is suggested that on this web site the new name Hymenoscyphus fraxineus should now be used.

Genome sequencing of 23 strains of H. pseudoalbidus from Europe

The Contributors

Georgios Koutsovoulos, Mark Blaxter
The University of Edinburgh

Adam Vivian-Smith & Ari Hietala
Skog og Landskap – Norwegian Forest and Landscape Institute
Bioforsk – Norwegian Institute for Agricultural and Environmental Research

Renaud Ioos & colleagues
Unité de Mycologie, Laboratoire de la Santé des Végétaux, Domaine de Pixérécourt, Malzéville, France

Summary

Genome assembly summary of 23 Strains of H. pseudoalbidus sequenced at The University of Edinburgh

Table 1: 23 strains of H. pseudoalbidus from Europe

Strain Year Country
2008-81-6      2008      NORWAY
2008-125/2      2008      NORWAY
2008-139/1      2008      NORWAY
2008-142/5      2008      NORWAY
2008-148/4      2008      NORWAY
2008-152/4      2008      NORWAY
2009-86/3      2009      NORWAY
2010-189/4      2010      NORWAY
2010-189/5      2010      NORWAY
2011-11/1      2011      NORWAY
2012-24/1      2012      NORWAY
2012-38/2/2      2012      NORWAY
2012-42/1/1      2012      NORWAY
CBS122191      ?      AUSTRIA
CBS122503      2011      POLAND
CBS122504      2005      POLAND
CBS122505      2000      POLAND
CBS122507      2000      POLAND
FON-M-1      2009      FRANCE
GIR-M-2      2009      FRANCE
LAN-M-1      2009      FRANCE
MIG-M-1      2009      FRANCE
LSVM82      2008      FRANCE

Table 2: Summary of assembly stats

Strain       Contigs (>500)  Span of contigs (Mb)  N50    GC%    Diff. from K1  Indels from K1  Span of indels
2008-125/2   6267            53.5                  24303  42.67  193829         37014           75714
2008-139/1   6041            53.5                  24414  42.69  174930         32599           68413
2008-142/5   6138            53.5                  25254  42.72  180469         32502           69266
2008-148/4   7560            51.8                  18776  43.18  175814         32875           68400
2008-152/4   6736            52.8                  22327  42.9   176916         34024           70682
2008-81/6    7387            51.6                  19538  43.19  187222         36522           72017
2009-86/3    6160            53.5                  25295  42.72  164775         32603           64923
2010-189/4   6348            53.3                  23729  42.78  179776         33438           70108
2010-189/5   6331            53.4                  24880  42.77  179106         33429           69312
2011-11/1    6118            53.4                  25094  42.73  188201         37279           75145
2012-24/1    6317            53.3                  24080  42.77  177522         33805           69614
2012-38/2/2  6094            53.5                  24711  42.67  174888         33466           67026
2012-42/1/1  6516            53.1                  23078  42.83  173764         32218           67008
CBS122191    6252            53.5                  24172  42.71  181217         32820           69030
CBS122503    6700            53.2                  22119  42.77  195226         35872           74082
CBS122504    6365            56.7                  30182  41.76  191934         33400           68770
CBS122505    6422            53.5                  23708  42.67  172436         32440           66584
CBS122507    6387            53.1                  23038  42.79  180121         34599           70054
FON-M-1      6436            53.3                  23037  42.73  179944         35230           70160
GIR-M-2      4770            51.9                  34647  43.59  196420         37771           95254
LAN-M-1      6534            52.5                  23965  43.15  168898         31707           64028
LSVM82       35509           77.3                  19479  38.33  171274         34758           92767
MIG-M-1      6498            52.8                  23204  42.91  200012         36898           78521

Reports and Raw data

Reports and presentations of complete assemblies are available at OADB github Assembly reports of European strains

Raw data were submitted to the European Nucleotide Archive (ENA) under accession ERP006093.
Further information on the raw data and assembled genomes is available at Genome assemblies of European strains
and at http://www.ebi.ac.uk/ena/data/view/ERS480843-ERS480865

20 UK isolates sequenced and submitted by The Genome Analysis Centre

The Contributors

Mark McMullan, Matt Clark, Louisa Williamson, James Lipscombe, Rachel Piddock, Fiore Cugliandolo, Fiona Fraser, Tom Barker, Mario Caccamo
The Genome Analysis Centre, Norwich

Summary

Hymenoscyphus pseudoalbidus (Chalara fraxinea) was isolated from across Great Britain and sent to TGAC by the Food and Environment Research Agency (FERA). FERA purified DNA from each isolate and sent it to TGAC, where it was QC’d, libraries were constructed, and sequencing was carried out on a HiSeq2500 paired-end run (150bp). The 20 isolates were sequenced over two lanes (20 isolates per lane) to an average of ~50x depth per isolate.
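As a rough guide to what ~50x depth means in terms of data volume, the standard relation depth = (number of reads x read length) / genome size can be sketched as below. The genome size of 53.5 Mb is an assumption borrowed from the contig spans reported for the European strains on this page; it is not a figure supplied for the UK isolates, and the function names are illustrative.

```python
import math


def mean_depth(read_pairs, read_length_bp, genome_size_bp):
    """Mean sequencing depth for paired-end data (two reads per pair)."""
    return 2 * read_pairs * read_length_bp / genome_size_bp


def read_pairs_needed(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of read pairs that reaches the target depth."""
    return math.ceil(target_depth * genome_size_bp / (2 * read_length_bp))


# Assumed genome size (53.5 Mb); read length and depth are from the text.
pairs = read_pairs_needed(50, 150, 53_500_000)
```

Under these assumptions each isolate would need roughly 8.9 million 150bp read pairs to reach 50x.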
The 20 isolates are:
FERA_cc086
FERA_cc087
FERA_cc088
FERA_cc095
FERA_cc096
FERA_cc097
FERA_cc099
FERA_cc101
FERA_cc105
FERA_cc106
FERA_cc107
FERA_cc108
FERA_cc111
FERA_cc113
FERA_cc118
FERA_cc119
FERA_cc121
FERA_cc133
FERA_cc134
FERA_cc139

Their sequence information is available at OADB github

Downstream analyses (genome assembly, phylogenies and selection analyses) have been done at TGAC (MM).
These analyses await data from European isolates sequenced in Edinburgh.

Analysis of UK ash diversity set: morphological traits and disease susceptibility

The Contributors

Robert J. Saville, Tom Passey, Judit Linka, Karen Russell and Richard J. Harrison
Genetics and Crop Improvement, East Malling Research

Introduction

As part of the Nornex consortium EMR has been screening a collection of UK ash clones collected as part of historic DEFRA projects and by members of the Future Trees Trust. A partial analysis of the diversity of UK ash has previously been reported (Sutherland et al. 2010). Throughout the year, trees were evaluated for floral sexual morphology, leaf emergence, senescence and presence of potential Hymenoscyphus pseudoalbidus infection. The ultimate aim of this work is to identify putative resistant trees and ascertain whether previously reported correlations between senescence date and disease tolerance could be observed in UK ash material (Kjaer et al. 2012; McKinney et al. 2011; McKinney et al. 2012).

Material

Ash populations, described in the downloadable spreadsheet, are for the most part duplicate populations, planted in two phases in 2008-2009.

Methods and Results

Tree sex was determined using the trait descriptors shown in Figure 1 (below). Data is presented in the supplementary excel file, in the tab labeled Tree Sex. These data are valuable for future breeding and selection of both males and females that display resistance.

Figure 1. Flower types observed on ash (Fraxinus excelsior L.). a) male flower (prior to anthesis), b) hermaphrodite flower with rudimentary gynoecium (functionally male), c) hermaphrodite flower, d) hermaphrodite flower with vestigial anthers (functionally female) and, e) female flower.

Leaf emergence was scored based on the trait descriptors shown in Figure 2 (below). Data is presented in the supplementary excel file, in the tab labeled Leaf Emergence. These data may be useful when identifying traits correlated with local niche (i.e. altitude/ latitude), which may be important for the successful introduction of resistant material in future.

Figure 2. Leaf emergence scored on a five point scale based on level of emergence.

Senescence was recorded using three different descriptors (listed in Tables 1-3). These traits were leaf loss, leaf colour and rachis retention, all of which may be significantly related to disease escape. These data are presented in the supplementary excel file, in the tab labeled Senescence.

Table 1: Trait descriptor for leaf loss

Leaf Loss Trait Description
1                   no leaf loss
2                   1-25% leaf loss
3                   26-50% leaf loss
4                   51-75% leaf loss
5                   76-99% leaf loss
6                   100% leaf loss

 

Table 2: Trait descriptor for leaf colour scale (adapted from McKinney et al. 2011)

Leaf Colour Trait Description
1          dark green leaves
2          ~25% yellow leaves
3          ~50% yellow leaves
4          ~75% yellow leaves
5          completely yellow and fading leaves
6          necrosis (brown leaves)

 

Table 3: Trait descriptor for rachis retention (to assess disease escape significance)

Rachis retention scale Description
Y          rachis detach easily when pulled through hand
N          rachis do not detach when pulled through hand

Disease observations

Disease was recorded throughout the 2013 season; barring a single pre-existing lesion, no symptoms of foliar disease were observed during the growing season. However, in early 2014 disease assessment of dormant trees revealed several accessions with lesions on first-year wood, i.e. infection that occurred during 2013 but was not observed in that year's assessments. These results are presented in the Disease tab of the supplementary excel file, as are full diary records of dates of recording in the season diary tab. Subsequent isolation from the leading edge of suspect lesions confirmed the presence of cultures consistent with H. pseudoalbidus. PCR validation is underway, though all hallmarks of both the lesions and the subsequent cultures are indicative of H. pseudoalbidus being present.

Link to raw data at OADB github supplementary excel file

References

Kjaer, E.D. et al., 2012. Adaptive potential of ash (Fraxinus excelsior) populations against the novel emerging pathogen Hymenoscyphus pseudoalbidus. Evolutionary Applications, 5(3), pp.219–228.
McKinney, L. V et al., 2011. Presence of natural genetic resistance in Fraxinus excelsior (Oleraceae) to Chalara fraxinea (Ascomycota): an emerging infectious disease. Heredity, 106(5), pp.788–97.
McKinney, L. V. et al., 2012. Genetic resistance to Hymenoscyphus pseudoalbidus limits fungal growth and symptom occurrence in Fraxinus excelsior. Forest Pathology, 42(1), pp.69–74.
Sutherland, B.G. et al., 2010. Molecular biodiversity and population structure in common ash (Fraxinus excelsior L.) in Britain: implications for conservation. Molecular ecology, 19(11), pp.2196–211.

The mitochondrial genome of H. pseudoalbidus

The Contributors

Rachel Glover, FERA.

The material

In order to identify sequences potentially originating from the
mitochondrial genome of H. pseudoalbidus we downloaded the 248
fully sequenced ascomycete mitochondrial genomes from
Genbank and used these sequences as a BLAST database to screen the
genomic contigs for potential mitochondrial origin.

The result

Fifty-seven contigs
were identified with significant similarity to ascomycete mitochondrial
sequences. Further examination of these 57 contigs showed that many
contigs were identical but in reverse complement or extending by a few
hundred base pairs. These contigs were collapsed to form a dataset of 45
contigs ranging in length from 109-14,731bp and GC-contents ranging from
9.2-45.9 % (Figure 1). Most of the contigs >5kb fall into a GC content
range of 30-40 %, typical of AT-rich mitochondrial sequences. It may be
that the AT-rich repeat islands discussed above are mitochondrial in
origin: the mitochondrial genome will be more prevalent in the
sequence dataset, which would explain the increase in abundance of those
sequences.

Figure 1. Contigs identified as potentially mitochondrial in origin, by similarity search. A plot of length vs GC content.
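The collapsing of redundant contigs and the length versus GC summary plotted in Figure 1 can be sketched as follows. This is a simplified illustration: it only removes contigs that are exactly contained in a longer contig in either orientation, and the function names and example sequences are assumptions rather than the procedure actually used.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq):
    """GC content as a percentage of the A/C/G/T bases in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def collapse_contigs(contigs):
    """Drop contigs contained in a longer kept contig, on either strand."""
    kept = []
    for seq in sorted((c.upper() for c in contigs), key=len, reverse=True):
        rc = reverse_complement(seq)
        if any(seq in longer or rc in longer for longer in kept):
            continue  # duplicate, reverse-complement copy, or sub-contig
        kept.append(seq)
    return kept


# Hypothetical contigs: the second is the reverse complement of the
# first, and the third is contained within the first.
example = ["AAACCCGT", "ACGGGTTT", "CCCG", "TTTTAAAAGG"]
collapsed = collapse_contigs(example)
summary = [(len(seq), gc_percent(seq)) for seq in collapsed]
```

Contigs that overlap only partially would not be merged by this sketch; handling those would need an overlap-aware assembler.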

The total length of the 45 mitochondrial contigs is
156,026bp with no significant overlap. If this preliminary estimate is accurate H. pseudoalbidus would have the largest
mitochondrial genome sequenced from the ascomycetes so far (see Figure 2), although we expect the size to reduce with further work.

Figure 2. Histogram of mitochondrial genome length for all sequenced ascomycete mitochondrial genomes.

Interpretation

A number of factors have prevented the construction of a finished
mitochondrial genome at this time. Firstly, the potential mitochondrial
contigs were identified based upon similarity based searches against
current ascomycete mitochondrial genomes. The similarity based approach
to finding mitochondrial sequences within a nuclear genome sequencing
project may have misidentified some of these contigs as mitochondrial
when in fact they are nuclear integrations of portions of the true
mitochondrial genome (NUMTs). This is likely to have artificially
inflated our estimate of the size of the H. pseudoalbidus mitochondrial
genome. Annotation of the potential mitochondrial contigs is in progress
and there are early indications of a very large number of introns
(intronic ORFs) present in the mitochondrial genome of H. pseudoalbidus.
The second complicating factor in attempting to assemble the
mitochondrial genome at this time is the large number of AT repeats
present in the sequences we have identified as being mitochondrial in
origin. The repeats are likely to be collapsed and appear to be at the
ends of the contigs we have identified, preventing further assembly
without additional sequencing.

Identification of protein-coding genes putatively involved in infection by combining metagenomics analysis and protein orthologue clustering.

Contributors

Christine Sambles and David Studholme. University of Exeter, Devon.

Introduction

In order to identify fungal protein-coding genes associated with Fraxinus:Hymenoscyphus in planta interactions, we took an orthologue clustering approach. Identifying fungal transcripts that are present in four samples taken from infected ash, and removing transcripts that are also present in the KW1 isolate, could reveal some infection-related transcripts from H. pseudoalbidus. Additionally, F. excelsior transcripts present in the infected material and absent from F. excelsior with no signs of infection could identify transcripts involved in the plant’s response to infection by H. pseudoalbidus.

Material

Transcriptome assemblies:

F. excelsior: ATU1

C. fraxinea:  KW1

Mixed material: AT1AT2UptonHolt

Output from BLASTX searches against GenBank:

F. excelsior: ATU1

C. fraxinea: KW1

Mixed material: AT1AT2UptonHolt

Methods & Results

We used MEGAN as previously described (http://oadb.tsl.ac.uk/?p=704), to assign transcripts to taxonomic bins. These transcripts came from five transcript assemblies:

  • 1 H. pseudoalbidus isolate (KW1) and
  • 4 mixed material (AT1, AT2, Holt & Upton).

This resulted in 36,945 transcripts being allocated to the bin for order Helotiales.

The longest open reading frame for each Helotiales-binned transcript (Table 1) was translated into a predicted protein sequence. These protein sequences were clustered using OrthoMCL.
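The translation step can be illustrated with a short sketch. It assumes the standard genetic code, searches all six reading frames, and defines an ORF as running from the first ATG after a stop codon to the next in-frame stop (or to the end of the transcript if no stop is reached). These choices, and the function names, are assumptions; the exact ORF definition used in the analysis is not stated.

```python
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate codon by codon; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf_protein(transcript):
    """Longest Met-initiated peptide over all six reading frames."""
    seq = transcript.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


# Hypothetical transcript, for illustration only.
peptide = longest_orf_protein("CCATGGCTTAAGG")
```

Because transcripts from de novo assemblies are often incomplete, the sketch deliberately accepts ORFs that run off the 3' end without a stop codon.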

Table 1: Numbers of transcripts and percentages of all transcripts for each sample or isolate that were binned to the order Helotiales using MEGAN.


Sample             AT1     AT2    Holt   Upton   KW1     ATU1
Helotiales         8,214   7,403  6,930  7,410   6,561   0
% all transcripts  15.61%  8.80%  6.44%  12.25%  31.75%  0.00%

OrthoMCL analysis

Between 4,548 and 5,551 proteins were clustered from each sample; the number of protein clusters was 6,505 in total. A Venn diagram of the clustered proteins can be seen in Figure 1.


Fig 1: Venn diagram of Helotiales-binned proteins clustered with OrthoMCL for one H. pseudoalbidus isolate (KW1) and four mixed material samples from H. pseudoalbidus infected F. excelsior (AT1, AT2, Holt and Upton).

There was a core set of 3,118 protein clusters from detectable transcripts. A set of 113 protein clusters was identified only in H. pseudoalbidus samples that were from infected F. excelsior (AT1, AT2, Holt & Upton) and 33 only identified in KW1, a H. pseudoalbidus isolate. These will be referred to as the ‘in planta’ and ‘ex planta’ groups respectively.
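The grouping of clusters can be expressed as simple set logic. In the sketch below, "core" is taken to mean present in all five assemblies and "in planta" to mean present in at least one infected sample but absent from KW1; both definitions are assumptions about how the Venn regions were combined. The cluster memberships used as an example are the four CFEM-containing clusters listed later in this section.

```python
INFECTED = frozenset({"AT1", "AT2", "HOLT", "UPTON"})
ISOLATE = "KW1"


def classify_cluster(samples):
    """Assign a cluster to a group based on the samples it contains."""
    present = {sample.upper() for sample in samples}
    if not present:
        raise ValueError("a cluster must contain at least one sample")
    if present == INFECTED | {ISOLATE}:
        return "core"
    if present == {ISOLATE}:
        return "ex planta"
    if ISOLATE not in present:
        return "in planta"
    return "shared"  # KW1 plus some, but not all, infected samples


# Memberships of the CFEM-containing clusters reported in this section.
cfem_clusters = {
    "HELO2454": ["AT1", "AT2", "HOLT", "UPTON"],
    "HELO4337": ["AT1", "AT2", "HOLT", "UPTON", "KW1"],
    "HELO5213": ["AT1", "HOLT", "UPTON", "KW1"],
    "HELO5952": ["AT2", "UPTON", "KW1"],
}
groups = {cid: classify_cluster(s) for cid, s in cfem_clusters.items()}
```

Under these definitions only HELO2454 falls into the in planta group, which agrees with the statement that just one of the four CFEM clusters is absent from KW1.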

The 113 protein clusters found only in H. pseudoalbidus infected F. excelsior (in planta) contained a total of 565 transcripts (459 excluding isoforms).  We annotated the transcript sequences based on results of BLASTX searches. Additionally the GO, EC, KEGG, PFAM and CAZy (Carbohydrate-Active enzymes) databases were used to annotate the full set of 565 transcripts.

GO, EC and KEGG annotation were inferred using annot8r (Schmid and Blaxter 2008), PFAM domains were identified with Pfam scan (a wrapper script around hmmpfam) and CAZy-family members were annotated using the CAZYmes Analysis Toolkit (CAT) (Park, Karpinets et al. 2010).

GO analysis revealed a reduction in growth-related proteins and an increase in cell differentiation and proliferation proteins in infected material (Fig 2).

Figure 2: Gene Ontology (GO) analysis of the pan-proteome (KW1, AT1, AT2, Upton, Holt) compared to in planta proteins. The in planta proteins were translated from Helotiales-binned transcripts (MEGAN) and were identified only in H. pseudoalbidus samples that were from infected F. excelsior (AT1, AT2, Holt & Upton). The pan-proteome proteins were also translated from Helotiales-binned transcripts (MEGAN) and include the isolate, KW1.

PFAM and CAZy analysis of the 565 transcripts of the pan-proteome resulted in 88 PFAM domains/families and the following CAZy families:  

  • Glycosyl hydrolases family 18 (Pfam: Glyco_hydro_18, PF00704)
  • Alcohol dehydrogenase GroES-like domain (Pfam: ADH_N, PF08240) & Zinc-binding dehydrogenase (Pfam: ADH_zinc_N, PF00107)
  • alpha/beta hydrolase fold (Pfam: Abhydrolase_3, PF07859)
  • Protein of unknown function, a putative transmembrane protein from bacteria. It is likely to be conserved between Mycobacterium species (Pfam: DUF2029, PF09594) &  PAP2 superfamily (Pfam: PAP2_3, PF14378)
  • Regulator of chromosome condensation (RCC1) repeat (Pfam: RCC1, PF00415)
  • Chalcone-flavanone isomerase (Pfam: Chalcone, PF02431)
  • Myosin head (motor domain) (Pfam: Myosin_head, PF00063) & Chitin synthase (Pfam: Chitin_synth_2, PF03142)
  • RhgB_N | fn3_3 | CBM-like

BLASTX hits from the in planta transcripts included putative CFEM domain-containing protein (Marssonina brunnea) and Galactose mutarotase-like protein (Glarea lozoyensis). The Galactose mutarotase-like protein is of interest as it is also similar to rhamnogalacturonate lyase found in Aspergillus spp. and is known to degrade plant cell walls by cleaving the pectin backbone (de Vries and Visser 2001). Some CFEM-containing proteins are proposed to have important roles in fungal pathogenesis (Kulkarni, Kelkar et al. 2003).

Comparisons of Pfam domain content among samples

PFAM domains and families in the ‘pan-proteome’ of KW1, AT1, AT2, Holt & Upton were identified using the hmmpfam wrapper script, Pfam scan. These were compared to the PFAM annotation of the ‘in planta’ group to identify over-representation of specific domains within this group. The domains and families in which >80% annotations were present in the ‘in planta’ group when compared to the ‘pan-proteome’ are shown in Table 1.
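The over-representation criterion is a simple ratio: the number of annotations of a domain in the in planta group divided by the number in the whole pan-proteome. The sketch below applies it to the four count pairs quoted later in this section; the function names are illustrative.

```python
def in_planta_percent(in_planta_count, pan_count):
    """Percentage of pan-proteome annotations found in the in planta group."""
    return round(100.0 * in_planta_count / pan_count, 2)


def over_represented(counts, threshold=80.0):
    """Domains whose in planta share exceeds the threshold percentage."""
    return sorted(
        domain
        for domain, (in_planta, pan) in counts.items()
        if pan > 0 and 100.0 * in_planta / pan > threshold
    )


# (in planta count, pan-proteome count) pairs quoted in this section.
reported_counts = {
    "Porphobil_deam": (6, 6),
    "Copper-bind": (3, 3),
    "Hpt": (5, 5),
    "CFEM": (5, 14),
}
selected = over_represented(reported_counts)
cfem_share = in_planta_percent(5, 14)
```

With these inputs the CFEM domain, at 35.71%, falls below the 80% threshold while the other three domains pass it.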

Table 1: Pfam domains and families in which >80% ‘pan-proteome’ annotations were present in the ‘in planta’ group (http://pfam.sanger.ac.uk/).


Domain/Family      Name                                                              Pfam accession
ATP12              ATP12 chaperone protein                                           PF07542
BOP1NT             BOP1NT (NUC169) domain                                            PF08145
iPGM_N             BPG-independent PGAM N-terminus                                   PF06415
CDC37_M            Cdc37 Hsp90 binding domain                                        PF08565
CDC37_N            Cdc37 N terminal kinase binding domain                            PF03234
CDC37_C            Cdc37 C terminal domain                                           PF08564
Chalcone           Chalcone-flavanone isomerase                                      PF02431
Copper-bind        Copper binding proteins, plastocyanin/azurin family               PF00127
Sdh5               Flavinator of succinate dehydrogenase                             PF03937
HD_3               HD domain                                                         PF13023
Hpt                Hpt domain                                                        PF01627
Metalloenzyme      Metalloenzyme superfamily                                         PF01676
CENP-I             Mis6                                                              PF07778
Myosin_tail_1      Myosin tail                                                       PF01576
TRM                N2,N2-dimethylguanosine tRNA methyltransferase                    PF02005
Es2                Nuclear protein Es2                                               PF09751
Tom37              Outer mitochondrial membrane transport complex protein            PF10568
PAP2_3             PAP2 superfamily                                                  PF14378
PMC2NT             PMC2NT (NUC016) domain                                            PF08066
Porphobil_deam     Porphobilinogen deaminase, dipyromethane cofactor binding domain  PF01379
Porphobil_deam(C)  Porphobilinogen deaminase C-terminal domain                       PF03900
DUF2012            Protein of unknown function                                       PF09430
DUF775             Protein of unknown function                                       PF05603
Prp31_C            Prp31 C terminal domain                                           PF09785
Ribosomal_L32p     Ribosomal L32p protein family                                     PF01783

Several of the Pfam hits struck us as interesting; these are described below. The pairs of numbers in brackets are the number found within the in planta group / number found in entire ‘pan-proteome’:

Porphobil_deam and Porphobil_deamC (6/6) were found in two AT1 isoforms, AT2, two Holt isoforms and Upton. There were no peptides with this domain in the Helotiales-binned KW1 proteome. Heme-biosynthetic porphobilinogen deaminase protects Aspergillus nidulans from nitrosative stress. In A. nidulans, a novel NO-tolerant (nitric oxide-tolerant) protein, PBG-D (the heme biosynthesis enzyme porphobilinogen deaminase), modulates the reduction of environmental NO and nitrite by flavohemoglobin (FHB, encoded by fhbA and fhbB) and nitrite reductase (NiR, encoded by niiA) (Zhou, Narukami et al. 2012). NO is part of the plant hypersensitive response, a localized programmed cell death that confines the pathogen to the site of attempted infection (Mur, Carver et al. 2006).

Proteins matching the ‘copper binding proteins, plastocyanin/azurin’ family (Pfam: Copper-bind, PF00127) (3/3) domain were found in AT1, Holt & Upton. OrthoMCL clustered an AT2 protein with them, but the assembled transcript was incomplete at the 5’ end and the PF00127 domain was therefore not present. BLASTX searches indicated an amino acid sequence similarity to cupredoxin from Glarea lozoyensis, and HHpred predicts similarity to cucumber stellacyanin. Due to the amino acid sequence similarity between the phytocyanins and fungal laccases, this may potentially be a laccase. White-rot fungi (e.g. Trametes cinnabarina, Trametes versicolor and Phlebia radiata) are reported to produce laccases which degrade lignin (Tuor, Winterhalter et al. 1995; Eggert, Temp et al. 1997), and laccase-mediated detoxification of phytoalexins generated by the plant defence systems has been observed in Botrytis cinerea (Pezet, Pont et al. 1991; Sbaghi, Jeandet et al. 1996; Adrian, Rajaei et al. 1998; Breuil, Jeandet et al. 1999).

The Hpt domain (Pfam: Hpt, PF01627) (5/5) was identified in two AT1 isoforms, AT2, Upton & Holt.  The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signalling systems (Catlett, Yoder et al. 2003).

Although below the threshold of 80%, 35.71% (5/14) of the CFEM domains identified in the ‘pan-proteome’ of KW1, AT1, AT2, Holt & Upton were present in the ‘in planta’ group and none were present in the ‘ex planta’ group. The CFEM domains were distributed across 4 clusters, only one of which is not present in KW1:

ClusterID:         Clustered protein present in:

HELO2454:         AT1, AT2, HOLT, UPTON

HELO4337:         AT1, AT2, HOLT, UPTON, KW1

HELO5213:         AT1, HOLT, UPTON, KW1

HELO5952:         AT2, UPTON, KW1

 

Fig 2: Phylogenetic tree of H. pseudoalbidus sequences from four OrthoMCL clusters where at least one sequence in the cluster contains a CFEM domain (Pfam: PF05730). The names of full-length proteins are shown in black; in grey are names of shorter length proteins from incomplete transcript assembly that lack a CFEM domain but that cluster with CFEM domain sequences due to sequence similarity and inferred orthology. Orthologue clustering was performed on all translated transcripts binned to the Helotiales using MEGAN from the one H. pseudoalbidus isolate (KW1) and all four H. pseudoalbidus samples that were from infected F. excelsior (AT1, AT2, Holt & Upton).

The 33 clusters (representing 72 peptides) in the ex planta group which were only identified in the isolate KW1 were annotated with PFAM as previously described. This resulted in identification of 17 Pfam domains/families (Table 2).

Table 2: Pfam domains/families identified in the ex planta group


Domain/Family   Name                                                       Pfam accession
COX1            Cytochrome C and Quinol oxidase polypeptide I              PF00115
DASH_Spc34      DASH complex subunit Spc34                                 PF08657
Pentapeptide_4  Pentapeptide repeats                                       PF13599
Vac7            Vacuolar segregation subunit 7 P                           PF12751
DHQ_synthase    3-dehydroquinate synthase                                  PF01761
LtrA            Bacterial low temperature requirement A protein            PF06772
FSH1            Serine hydrolase                                           PF03959
Tyrosinase      Common central domain of tyrosinase                        PF00264
Glyco_hydro_47  Glycosyl hydrolase family 47                               PF01532
DUF202          Domain of unknown function                                 PF02656
SET             SET domain                                                 PF00856
Abhydrolase_1   alpha/beta hydrolase fold                                  PF00561
adh_short_C2    Enoyl-(Acyl carrier protein) reductase                     PF13561
Glyco_hydro_3   Glycosyl hydrolase family 3 N terminal domain              PF00933
ADH_zinc_N      Zinc-binding dehydrogenase                                 PF00107
AAA             ATPase family associated with various cellular activities  PF00004
adh_short       short chain dehydrogenase                                  PF00106

This low number of peptides not identified in any of the H. pseudoalbidus infected ash samples limits the ability to perform any comparative analysis.

Conclusions

Proteins putatively involved in plant-pathogen interactions have been identified from groups of translated transcripts exclusively found in planta and were not identified in isolate KW1. They included a copper binding protein within the plastocyanin/azurin family, porphobilinogen deaminase, a CFEM domain-containing protein and a Galactose mutarotase-like protein.

References

Adrian, M., H. Rajaei, et al. (1998). "Resveratrol Oxidation in Botrytis cinerea Conidia." Phytopathology 88: 472-476.

Breuil, A. C., P. Jeandet, et al. (1999). "Characterization of a Pterostilbene Dehydrodimer Produced by Laccase of Botrytis cinerea." Phytopathology 89: 298-302.

Catlett, N. L., O. C. Yoder, et al. (2003). "Whole-genome analysis of two-component signal transduction genes in fungal pathogens." Eukaryotic cell 2: 1151-1161.

de Vries, R. P. and J. Visser (2001). "Aspergillus Enzymes Involved in Degradation of Plant  Cell Wall Polysaccharides." Microbiology and Molecular Biology Reviews 65: 497-522.

Eggert, C., U. Temp, et al. (1997). "Laccase is essential for lignin degradation by the white-rot fungus Pycnoporus cinnabarinus." FEBS Letters 407: 89-92.

Kulkarni, R. D., H. S. Kelkar, et al. (2003). "An eight-cysteine-containing CFEM domain unique to a group of fungal membrane proteins." Trends in Biochemical Sciences 28: 118-121.

Mur, L. A. J., T. L. W. Carver, et al. (2006). "NO way to live; the various roles of nitric oxide in plant-pathogen interactions." Journal of experimental botany 57: 489-505.

Park, B. H., T. V. Karpinets, et al. (2010). "CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database." Glycobiology 20: 1574-1584.

Pezet, R., V. Pont, et al. (1991). "Evidence for oxidative detoxication of pterostilbene and resveratrol by a laccase-like stilbene oxidase produced by Botrytis cinerea." Physiological and Molecular Plant Pathology 39: 441-450.

Sbaghi, M., P. Jeandet, et al. (1996). "Degradation of stilbene‐type phytoalexins in relation to the pathogenicity of Botrytis cinerea to grapevines." Plant Pathology: 139-144.

Schmid, R. and M. L. Blaxter (2008). "annot8r: GO, EC and KEGG annotation of EST datasets." BMC bioinformatics 9: 180.

Tuor, U., K. Winterhalter, et al. (1995). "Enzymes of white-rot fungi involved in lignin degradation and ecological determinants for wood decay." Journal of Biotechnology 41: 1-17.

Zhou, S., T. Narukami, et al. (2012). "Heme-Biosynthetic Porphobilinogen Deaminase Protects Aspergillus nidulans from Nitrosative Stress." Applied and Environmental Microbiology 78: 103-109.
