After the shotgun stage, reads were assembled with parallel phrap

After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution, Dupfinisher [49], or by sequencing cloned bridging PCR fragments with subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, by PCR, and by Bubble PCR primer walks (J.-F. Chang, unpublished). A total of 60 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using the Polisher software developed at JGI [50]. The error rate of the completed genome sequence is less than one in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 2,171.8 × coverage of the genome. The final assembly contained 384,925 pyrosequence and 63,008,730 Illumina reads.

Genome annotation

Genes were identified using Prodigal [51] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [52]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes – Expert Review (IMG-ER) platform [53].

Genome properties

The genome consists of a 2,272,954 bp long chromosome with a GC content of 35.9% (Table 3 and Figure 3). Of the 2,181 genes predicted, 2,105 were protein-coding genes and 76 were RNA genes; 56 pseudogenes were also identified. The majority of the protein-coding genes (65.5%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.

Table 3 Genome Statistics

Figure 3 Graphical circular map of the chromosome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 4 Number of genes associated with the general COG functional categories

Acknowledgements

This work was performed under the auspices of the US Department of Energy Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No.
