The mitochondrial genome of H. pseudoalbidus

The Contributors

Rachel Glover, FERA.

The material

To identify sequences potentially originating from the
mitochondrial genome of H. pseudoalbidus, we downloaded the 248
fully sequenced ascomycete mitochondrial genomes from
GenBank and used them as a BLAST database against which the
genomic contigs were screened for potential mitochondrial origin.
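
The screening step can be sketched as a filter over BLAST tabular output. This is a minimal illustration and not the pipeline actually used: it assumes the search was run with the standard 12-column tabular format (-outfmt 6 in BLAST+), and the function name, the contig and subject identifiers, and the e-value cutoff of 1e-10 are all hypothetical choices made for the example.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def candidate_contigs(tabular_text, max_evalue=1e-10):
    """Return the query contig IDs with at least one hit at or below max_evalue.

    The cutoff is an assumed value for illustration, not one taken from the report.
    """
    hits = set()
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines (as written by -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(COLUMNS, row))
        if float(record["evalue"]) <= max_evalue:
            hits.add(record["qseqid"])
    return hits


# Two made-up hits: one strong, one too weak to pass the cutoff.
example = "\n".join([
    "contig_1\tmito_A\t92.5\t400\t30\t0\t1\t400\t1000\t1399\t1e-150\t600",
    "contig_2\tmito_B\t78.0\t60\t13\t0\t5\t64\t200\t259\t0.002\t40.1",
])
print(sorted(candidate_contigs(example)))
```

With the two made-up rows above, only contig_1 passes the assumed cutoff. In practice the threshold would need tuning, since a permissive cutoff admits more nuclear sequences with weak similarity to mitochondrial genes.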

The result

Fifty-seven contigs
were identified with significant similarity to ascomycete mitochondrial
sequences. Further examination of these 57 contigs showed that many
were identical to another contig but in reverse complement, or extended
another contig by only a few hundred base pairs. These contigs were
collapsed to form a dataset of 45 contigs ranging in length from 109 to
14,731 bp, with GC content ranging from 9.2 to 45.9 % (Figure 1). Most of
the contigs >5 kb fall into a GC content range of 30-40 %, typical of
AT-rich mitochondrial sequences. It may be that the AT-rich repeat
islands discussed above are mitochondrial in origin: because the
mitochondrial genome is more prevalent in the sequence dataset, this
would explain their increased abundance.

Figure 1. Contigs identified as potentially mitochondrial in origin, by similarity search. A plot of length vs GC content.
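
The collapsing of redundant contigs and the GC content calculation behind Figure 1 can be sketched as follows. This is an illustrative simplification under stated assumptions: it only removes contigs that are exactly contained in a longer contig in either orientation, whereas real data would usually need alignment-based comparison to tolerate mismatches. The function names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq):
    """Return GC content as a percentage of sequence length."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def collapse_contigs(contigs):
    """Drop contigs exactly contained in a longer kept contig, in either strand.

    `contigs` maps a contig name to its sequence.
    """
    kept = {}
    # Visit longest contigs first so shorter ones are tested against them.
    for name, seq in sorted(contigs.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        seq = seq.upper()
        rc = reverse_complement(seq)
        if any(seq in other or rc in other for other in kept.values()):
            continue
        kept[name] = seq
    return kept


# Toy data: "b" is the reverse complement of "a", "c" is contained in "a".
toy = {
    "a": "ATGCATTTAACG",
    "b": reverse_complement("ATGCATTTAACG"),
    "c": "GCATTT",
    "d": "GGGCCC",
}
collapsed = collapse_contigs(toy)
for name, seq in collapsed.items():
    print(name, len(seq), round(gc_percent(seq), 1))
```

The (length, GC %) pairs printed for the surviving contigs are the quantities plotted in a figure of this kind.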

The 45 mitochondrial contigs total 156,026 bp, with no significant
overlap between them. If this preliminary estimate is accurate, H. pseudoalbidus would have the largest
mitochondrial genome sequenced from the ascomycetes so far (see Figure 2), although we expect the size estimate to decrease with further work.

Figure 2. Histogram of mitochondrial contig length for all sequenced ascomycete mitochondrial genomes.


A number of factors have prevented the construction of a finished
mitochondrial genome at this time. Firstly, the potential mitochondrial
contigs were identified by similarity searches against current
ascomycete mitochondrial genomes. This similarity-based approach
to finding mitochondrial sequences within a nuclear genome sequencing
project may have misidentified some contigs as mitochondrial
when they are in fact nuclear integrations of portions of the true
mitochondrial genome (NUMTs). This is likely to have artificially
inflated our estimate of the size of the H. pseudoalbidus mitochondrial
genome. Annotation of the potential mitochondrial contigs is in progress,
and there are early indications of a very large number of introns
(intronic ORFs) in the mitochondrial genome of H. pseudoalbidus.
The second complicating factor is the large number of AT repeats
in the sequences we have identified as mitochondrial in origin.
These repeats are likely to have been collapsed during assembly and
appear to lie at the ends of the contigs we have identified, preventing
further assembly without additional sequencing.
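
One simple way to check whether contigs terminate in AT repeats is to measure the AT fraction of a fixed window at each end. The sketch below is a hypothetical illustration and not the analysis performed here: the 50 bp window and the 0.9 threshold are assumed values, and a high AT fraction alone does not prove that a repeat is present.

```python
def at_fraction(seq):
    """Return the fraction of A and T bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_ends(seq, window=50, threshold=0.9):
    """Return (left_is_at_rich, right_is_at_rich) for the terminal windows.

    `window` must be a positive number of bases; both defaults are
    assumptions chosen for illustration.
    """
    left = seq[:window]
    right = seq[-window:]
    return at_fraction(left) >= threshold, at_fraction(right) >= threshold


# Toy contig: AT repeat at the left end, GC-rich sequence at the right end.
toy_contig = "AT" * 30 + "GCGC" * 25 + "GGCC" * 15
print(at_rich_ends(toy_contig))
```

Contigs flagged at one or both ends would be natural candidates for targeted additional sequencing across the repeat.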