<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2164-15-567</ui>
	<ji>1471-2164</ji>
	<fm>
		<dochead>Research article</dochead>
		<bibl>
			<title>
				<p>Enhanced whole genome sequence and annotation of <it>Clostridium stercorarium</it> DSM8532<sup>T</sup> using RNA-seq transcriptomics and high-throughput proteomics</p>
			</title>
			<aug>
				<au id="A1" ca="yes"><snm>Schellenberg</snm><mi>J</mi><fnm>John</fnm><insr iid="I1"/><email>john.schellenberg@ad.umanitoba.ca</email></au>
				<au id="A2"><snm>Verbeke</snm><mi>J</mi><fnm>Tobin</fnm><insr iid="I1"/><email>umverbe0@cc.umanitoba.ca</email></au>
				<au id="A3"><snm>McQueen</snm><fnm>Peter</fnm><insr iid="I2"/><email>mcqueen.p@gmail.com</email></au>
				<au id="A4"><snm>Krokhin</snm><mi>V</mi><fnm>Oleg</fnm><insr iid="I2"/><email>krokhino@cc.umanitoba.ca</email></au>
				<au id="A5"><snm>Zhang</snm><fnm>Xiangli</fnm><insr iid="I3"/><email>xiangli.zhang@ad.umanitoba.ca</email></au>
				<au id="A6"><snm>Alvare</snm><fnm>Graham</fnm><insr iid="I3"/><email>Graham.Alvare@umanitoba.ca</email></au>
				<au id="A7"><snm>Fristensky</snm><fnm>Brian</fnm><insr iid="I3"/><email>brian.fristensky@ad.umanitoba.ca</email></au>
				<au id="A8"><snm>Thallinger</snm><mi>G</mi><fnm>Gerhard</fnm><insr iid="I5"/><insr iid="I6"/><email>gerhard.thallinger@tugraz.at</email></au>
				<au id="A9"><snm>Henrissat</snm><fnm>Bernard</fnm><insr iid="I7"/><insr iid="I8"/><email>bernard.henrissat@afmb.univ-mrs.fr</email></au>
				<au id="A10"><snm>Wilkins</snm><mi>A</mi><fnm>John</fnm><insr iid="I2"/><email>jwilkin@cc.umanitoba.ca</email></au>
				<au id="A11"><snm>Levin</snm><mi>B</mi><fnm>David</fnm><insr iid="I4"/><email>David.Levin@ad.umanitoba.ca</email></au>
				<au id="A12"><snm>Sparling</snm><fnm>Richard</fnm><insr iid="I1"/><email>sparlng@cc.umanitoba.ca</email></au>
			</aug>
			<insg>
				<ins id="I1"><p>Department of Microbiology, University of Manitoba, Winnipeg, Canada</p></ins>
				<ins id="I2"><p>Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Canada</p></ins>
				<ins id="I3"><p>Department of Plant Sciences, University of Manitoba, Winnipeg, Canada</p></ins>
				<ins id="I4"><p>Department of Biosystems Engineering, University of Manitoba, Winnipeg, Canada</p></ins>
				<ins id="I5"><p>Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology (ACIB), Graz, Austria</p></ins>
				<ins id="I6"><p>Institute for Genomics and Bioinformatics, Graz University of Technology, Graz, Austria</p></ins>
				<ins id="I7"><p>Architecture et Fonction des Macromol&#233;cules Biologiques, Universit&#233; Aix-Marseille, Marseille, France</p></ins>
				<ins id="I8"><p>Centre National de Recherche Scientifique, UMR 7257, 163 ave. de Luminy, Marseille 13288, France</p></ins>
			</insg>
			<source>BMC Genomics</source>
			<section><title><p>Prokaryote microbial genomics </p></title></section><issn>1471-2164</issn>
			<pubdate>2014</pubdate>
			<volume>15</volume>
			<issue>1</issue>
			<fpage>567</fpage>
			<url>http://www.biomedcentral.com/1471-2164/15/567</url>
			<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-15-567</pubid><pubid idtype="pmpid">24998381</pubid></pubidlist></xrefbib>
		</bibl>
		<history><rec><date><day>18</day><month>7</month><year>2013</year></date></rec><acc><date><day>26</day><month>6</month><year>2014</year></date></acc><pub><date><day>7</day><month>7</month><year>2014</year></date></pub></history>
		<cpyrt><year>2014</year><collab>Schellenberg et al.; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.</note></cpyrt>
		<kwdg>
			<kwd>Genome</kwd>
			<kwd>Proteome</kwd>
			<kwd>Transcriptome</kwd>
			<kwd>RNA-seq</kwd>
			<kwd>Tandem mass spectrometry</kwd>
			<kwd>Proteogenomics</kwd>
			<kwd>Glycolysis</kwd>
			<kwd>Pentose phosphate pathway</kwd>
			<kwd>Transaldolase</kwd>
		</kwdg>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st><p>Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is tempered by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. <it>Clostridium stercorarium</it> DSM8532<sup>T</sup> is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st><p>A paired-end Roche/454 WGS assembly was closed through application of an <it>in silico</it> algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st><p>Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts; however, these preparations enabled both WGS enhancement and transcriptome profiling in a single Illumina run. No detection of transaldolase by any method despite xylose utilization by this organism indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st><p>Consolidated bioprocessing (CBP) refers to single-vessel microbial transformation of inexpensive biomass such as agricultural or forestry cellulosic wastes into fuels or other useful chemicals. This approach is based on the power of specific microbes or consortia to simultaneously degrade and transform plant cell wall components into ethanol or other molecules of interest <abbrgrp>
					<abbr bid="B1">1</abbr>
					<abbr bid="B2">2</abbr>
				</abbrgrp>. Cellulolytic clostridia such as <it>Clostridium thermocellum</it> and <it>C. stercorarium</it> are among the most widely studied organisms for CBP, producing a wide range of cellulases, xylanases and other lignocellulolytic enzymes <abbrgrp>
					<abbr bid="B3">3</abbr>
				</abbrgrp>. However, multiple end products resulting from branching metabolic pathways and low overall ethanol production limit the feasibility of industrial CBP using these organisms. For example, selected strains of <it>C. stercorarium</it> have been shown to produce up to ~0.4%&#160;w/v (80-100&#160;mM) ethanol in laboratory batch cultures <abbrgrp>
					<abbr bid="B4">4</abbr>
				</abbrgrp>. Improvements of at least an order of magnitude will be required to rival current yeast/starch-based processes for bioethanol production.</p><p>Two main strategies have emerged to increase ethanol production by these organisms. First, genetic modification has been applied and found to modestly improve yields of ethanol or other biofuels, usually through knocking out elements in undesired metabolic pathways <abbrgrp>
					<abbr bid="B5">5</abbr>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. Second, defined co-cultures of organisms with contrasting or potentially synergistic enzymes for lignocellulose degradation and utilization have been applied, again with modest improvements in biofuels production <abbrgrp>
					<abbr bid="B7">7</abbr>
					<abbr bid="B8">8</abbr>
					<abbr bid="B9">9</abbr>
				</abbrgrp>. Central to both of these strategies is a refined conceptual framework and well-defined lignocellulolytic and central metabolic pathways for organisms of interest <abbrgrp>
					<abbr bid="B10">10</abbr>
				</abbrgrp>. To this end, application of next-generation systems biology tools, including genome sequencing and transcriptional/protein profiling, has expanded rapidly in the past 10&#160;years <abbrgrp>
					<abbr bid="B4">4</abbr>
					<abbr bid="B11">11</abbr>
					<abbr bid="B12">12</abbr>
					<abbr bid="B13">13</abbr>
				</abbrgrp>, along with increasingly sophisticated techniques for integrating and visualizing these vast datasets <abbrgrp>
					<abbr bid="B12">12</abbr>
					<abbr bid="B14">14</abbr>
					<abbr bid="B15">15</abbr>
				</abbrgrp>. Basic techniques in this field are in constant flux and yielding ever more detailed information. For example, increasingly powerful mass spectrometers are reducing the importance of gel-based separation or laser desorption techniques in proteomics <abbrgrp>
					<abbr bid="B12">12</abbr>
				</abbrgrp> and microarrays for transcriptomics are increasingly displaced by next-generation RNA sequencing (RNA-seq) <abbrgrp>
					<abbr bid="B16">16</abbr>
				</abbrgrp>. Genome sequencing has become trivial at a technical level, as evidenced by the steady accumulation of brief announcements in the literature; however, the currency of in-depth molecular profiling provides an opportunity to improve, confirm and contextualize genome sequence data. This information is critical for designing and interpreting effects of metabolic engineering or co-culture experiments to improve biofuel yields.</p><p>With the ultimate goal of developing &#8220;designer co-cultures&#8221; for biofuels production, our group has recently published genome- and proteome-level descriptions of central metabolism in biofuels organisms of interest <abbrgrp>
					<abbr bid="B10">10</abbr>
					<abbr bid="B11">11</abbr>
					<abbr bid="B17">17</abbr>
				</abbrgrp>. The ethanologenic thermophile <it>Clostridium stercorarium</it> DSM8532<sup>T</sup> has been investigated extensively to characterize its complement of lignocellulolytic enzymes <abbrgrp>
					<abbr bid="B4">4</abbr>
					<abbr bid="B18">18</abbr>
					<abbr bid="B19">19</abbr>
					<abbr bid="B20">20</abbr>
					<abbr bid="B21">21</abbr>
					<abbr bid="B22">22</abbr>
					<abbr bid="B23">23</abbr>
					<abbr bid="B24">24</abbr>
					<abbr bid="B25">25</abbr>
					<abbr bid="B26">26</abbr>
					<abbr bid="B27">27</abbr>
					<abbr bid="B28">28</abbr>
					<abbr bid="B29">29</abbr>
					<abbr bid="B30">30</abbr>
					<abbr bid="B31">31</abbr>
				</abbrgrp>; however, substrate transport and central metabolic pathways for this organism have not been described in detail. Preferential hemicellulolysis and xylose utilization by <it>C. stercorarium</it>
				<abbrgrp>
					<abbr bid="B3">3</abbr>
				</abbrgrp> suggest that it may be compatible in co-culture with <it>C. thermocellum</it>, a rapid cellulose-degrader that does not metabolize xylose. Therefore, the goal of this study was to define metabolic potential encoded by the whole genome sequence of <it>C. stercorarium</it> DSM8532<sup>T</sup>, in the context of high-throughput molecular profiling using RNA-seq and tandem mass spectrometry.</p>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<sec>
				<st>
					<p>Anaerobic culture</p>
				</st><p>
					<it>C. stercorarium</it> DSM8532<sup>T</sup> was acquired from Deutsche Sammlung von Mikroorganismen und Zellkulturen (Braunschweig, Germany) and sub-cultured to single colonies on simplified 1191 agar: (w/v) 0.15% potassium phosphate (KH<sub>2</sub>PO<sub>4</sub>), 0.335% sodium phosphate (Na<sub>2</sub>HPO<sub>4</sub>), 0.05% ammonium chloride (NH<sub>4</sub>Cl), 0.018% magnesium chloride (MgCl<sub>2</sub>), 0.1%&#160;L-cysteine, 1&#160;ml of 0.025%&#160;w/v resazurin solution (all from Sigma-Aldrich, Oakville, Canada), 0.2% yeast extract, and 0.8% agar (both from BD, Mississauga, Canada), pH&#160;7.2. Resuspended stock was spread onto non-reduced plates under normal aerobic conditions in a biosafety cabinet and incubated in jars with GasPak EZ sachets (BD) for 72&#160;h at 65&#176;C. Growth from plates inoculated with single colonies was transferred aseptically to 50&#160;ml liquid culture (as above, without agar) using a syringe and needle in nitrogen-gassed butyl-stoppered serum bottles with cellobiose or xylose (both from Sigma-Aldrich) added aseptically after autoclaving (1&#160;ml of filter-sterilized, degassed 10% solution, final concentration 0.2%&#160;w/v). Serial transfers were inoculated with 5&#160;ml (10%) overnight pre-cultures (18&#8211;24&#160;h, OD<sub>600</sub>&#8201;~&#8201;0.8). Concentrations of hydrogen and carbon dioxide were determined using a Varian benchtop gas chromatograph (Agilent, Mississauga, Canada) using standard curves made with degassed butyl-stoppered bottles containing known concentrations of each gas (both from Welders Supplies, Winnipeg, Canada). Concentrations of liquid components (cellobiose, xylose, acetate, lactate and ethanol, all from Sigma-Aldrich) were determined by high-pressure liquid chromatography (HPLC) using an isocratic pump (model #1515) and refractive index detector (model #2414, Waters, Milford, USA), with standard curves derived from stock solutions of each component. 
Genomic DNA was isolated from overnight cultures growing on 0.2% cellobiose using the Genomic DNA Wizard kit (Promega, Madison, USA) according to supplier&#8217;s protocol. RNA for transcriptomics was isolated from mid-exponential phase culture (1 &#215; 50&#160;ml, 12&#160;h, OD<sub>600</sub>&#8201;~&#8201;0.4), on 0.2% cellobiose. Protein was isolated from pooled mid-exponential phase cultures (25 &#215; 50&#160;ml, 12&#160;h, OD<sub>600</sub>&#8201;~&#8201;0.3) on 0.2% xylose, as detailed below.</p>
			</sec>
			<sec>
				<st>
					<p>Whole-genome sequencing, assembly and annotation</p>
				</st><p>The genome was sequenced using the GS-FLX Titanium platform (Roche/454, Branford, USA) and the resulting 8 kilobase paired-end library of 358,837 reads was assembled using Newbler (v2.6), resulting in 26-fold coverage and 120 contigs, of which 33 were joined by Newbler into a single large scaffold encompassing 96% of total sequence with 33 gaps. Remaining contigs represented repetitive sequences (16S, 23S and different transposons). Gaps resulting from these repetitive sequences were resolved by <it>in silico</it> gap filling, where contigs generated from a gap-specific assembly were integrated into the circular scaffold with a custom R (version 2.15.1) <abbrgrp>
						<abbr bid="B32">32</abbr>
					</abbrgrp> script (Thallinger <it>et al.</it>, manuscript in preparation). One ambiguous region and one gap remained, with the latter closed by gap edge-specific primer design using Primer-BLAST (<url>http://www.ncbi.nlm.nih.gov/tools/primer-blast</url>) followed by bidirectional Sanger sequencing of resulting amplicon using the ABI3100 Genetic Analyzer (Life Technologies, Burlington, Canada). Origin of replication was determined with originx <abbrgrp>
						<abbr bid="B33">33</abbr>
					</abbrgrp>, and the genome was rearranged to start at this position with the chromosomal replication initiator DnaA as the first protein. Raw sequencing reads (.sff file) were submitted to the Sequence Read Archive of NCBI (<url>http://www.ncbi.nlm.nih.gov/sra</url>), with accession SRX481570.</p><p>Automated gene-calling and functional annotation (EC number, COG, Pfam, TIGRfam, KEGG, Metacyc) was carried out through submission of sequence assembly to Joint Genome Institute&#8217;s Integrated Microbial Genomes Expert Review (IMG/ER) <abbrgrp>
						<abbr bid="B34">34</abbr>
					</abbrgrp>. Lignocellulolytic enzymes were identified and categorized by comparison with the Carbohydrate-Active Enzyme (CAZy) database <abbrgrp>
						<abbr bid="B35">35</abbr>
					</abbrgrp>. Identification and classification of transporters was carried out initially using the Transporter Classification function in IMG/ER, which is based on the TCDB database (<url>http://www.tcdb.org</url>) <abbrgrp>
						<abbr bid="B36">36</abbr>
					</abbrgrp>. The subset of ABC transporters with predicted carbohydrate uptake activity was established through analysis of functional annotations in IMG/ER and further characterized using the ABC transporter database (<url>http://www-abcdb.biotoul.fr</url>) <abbrgrp>
						<abbr bid="B37">37</abbr>
					</abbrgrp>. Functional annotation of well-characterized enzymes was used to create categories for specific metabolic pathways of interest (glycolysis, pentose phosphate pathway, pyruvate/PEP conversion), as previously described <abbrgrp>
						<abbr bid="B10">10</abbr>
						<abbr bid="B17">17</abbr>
					</abbrgrp>. Hydrogenases and other enzymes important in co-factor recycling and energy balance were identified by homology to known clostridial enzymes <abbrgrp>
						<abbr bid="B10">10</abbr>
						<abbr bid="B17">17</abbr>
						<abbr bid="B38">38</abbr>
					</abbrgrp>. Nearest neighbour (top bit score in maximum overlap) was established by BLAST using the NCBI website.</p><p>Annotated coding sequences were evaluated through submission of draft assembly to GenePRIMP annotation improvement platform of the Joint Genome Institute (<url>http://geneprimp.jgi-psf.org</url>) <abbrgrp>
						<abbr bid="B39">39</abbr>
					</abbrgrp> to identify long, short, unique, dubious, split or missed genes. Automatically generated annotation information was downloaded from IMG/ER to construct the feature file (.tbl) required for NCBI submission using sort and concatenation functions in Excel. Manual curation of all automatically generated product names was carried out using NCBI instructions (<url>http://www.ncbi.nlm.nih.gov/genbank</url>), with further annotation based on database comparisons and sequence improvements using transcriptomic and proteomic data as described below. The .sqn file for NCBI submission was generated using tbl2asn and adjusted through error-reporting, followed by provisional submissions and further error correction. The final closed circle assembly and annotation was approved by NCBI on January 10, 2013 and first public draft released on March 31, 2013 with accession [GenBank: CP003992]. The RefSeq accession is [GenBank: NC_020887]. In order to independently confirm specific observations regarding whole genome sequence and specific metabolic pathways, an independently-derived whole genome sequence of the same strain <abbrgrp>
						<abbr bid="B40">40</abbr>
					</abbrgrp> was accessed [GenBank: NC_020134].</p>
			</sec>
			<sec>
				<st>
					<p>RNA-seq transcriptomics and correction of homopolymers</p>
				</st><p>RNA was isolated using the ChargeSwitch magnetic bead-based technique for RNA extraction (Life Technologies, Burlington, Canada), including DNaseI treatment of crude extracts, according to manufacturer&#8217;s directions. Briefly, total RNA was quantified using a NanoDrop Spectrophotometer ND-1000 (NanoDrop Technologies, Wilmington, USA) and integrity assessed using a 2100 Bioanalyzer (Agilent Technologies, Mississauga, Canada). Ribosomal RNA depletion was performed using 1&#160;&#956;g of total RNA with the Metabacteria Ribo-Zero rRNA Removal Kit (Mandell Scientific, Guelph, Canada). After rRNA depletion, remaining RNA was purified using the RiboMinus Concentration Module (Life Technologies), with final elution in 17&#160;&#956;l of &#8220;Fragment, Prime, Finish&#8221; mix instead of water, followed by fragmenting and priming for cDNA synthesis. Starting at the &#8220;First strand cDNA synthesis&#8221; step of the protocol for TruSeq Stranded mRNA Sample Prep Kit (Illumina, San Diego, USA), samples were converted into a library suitable for cluster generation and DNA sequencing. Library quality was assessed using a LabChip and Light Cycler 480 II (Roche, Mississauga, Canada) for size and an Infinite M200 Fluorimeter (Tecan, Mannedorf, Switzerland) for quantification. cDNA transcripts (2 &#215; 100&#160;bp) were sequenced with the Illumina HiSeq 2000 platform by McGill University and Genome Quebec Innovation Center. A total of 260 million reads (52 Gb) with an average Phred quality score of 34 (100% passed filter) were sequenced, with an expected false discovery rate (base-calling error) of 0.05% based on the quality control plot. Raw reads (.fastq file) were pre-processed using a custom script incorporating Trimmomatic (<url>http://www.usadellab.org</url>) with default settings (see Additional file <supplr sid="S1">1</supplr>). 
This algorithm trims adapters, removes leading or trailing low quality and N bases (Phred score&#8201;&lt;&#8201;3), scans reads in a 4&#160;bp sliding window and cuts when average quality score falls below 15, and removes all reads shorter than 36&#160;bp. Tophat <abbrgrp>
						<abbr bid="B41">41</abbr>
					</abbrgrp> was used for read alignment based on the reference genome annotation, genome sequence and paired end insertion information. Final reads with greater than 2 mismatches, gaps or indels were discarded. Coverage of the genome sequence by RNA-seq transcripts was determined by generating a &#8220;base pair map&#8221; of the .bam alignment using the bam2depth function in SAMtools (<url>http://samtools.sourceforge.net</url>) <abbrgrp>
						<abbr bid="B42">42</abbr>
					</abbrgrp>. Raw Illumina sequencing reads in .fastq format were submitted to the Sequence Read Archive of NCBI (<url>http://www.ncbi.nlm.nih.gov/sra</url>) with accession SRX481592.</p>
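				<p>The quality-trimming steps described above can be illustrated with a minimal Python sketch. This is not the custom script from Additional file 1 and not Trimmomatic itself: adapter clipping is omitted, the sliding-window step is simplified, and the function name <it>trim_read</it> and its parameter names are hypothetical. Only the thresholds (quality 3 at read edges, a 4-base window with mean quality 15, and a 36 bp minimum length) are taken from the text.

```python
def trim_read(seq, quals, edge_min=3, window=4, window_min=15, min_len=36):
    """Trim one read; return (seq, quals), or None if the read is dropped.

    seq   -- read sequence as a string
    quals -- list of integer Phred scores, one per base
    All parameter names are illustrative assumptions.
    """
    def is_bad(base, q):
        # An edge base is removed if it is N or its quality is below edge_min.
        return base == "N" or edge_min > q

    # Step 1: strip low-quality or N bases from both ends.
    start, end = 0, len(seq)
    while end > start and is_bad(seq[start], quals[start]):
        start += 1
    while end > start and is_bad(seq[end - 1], quals[end - 1]):
        end -= 1
    seq, quals = seq[start:end], quals[start:end]

    # Step 2: scan with a sliding window and cut at the first window
    # whose mean quality falls below window_min (simplified rule).
    cut = len(seq)
    for i in range(0, len(seq) - window + 1):
        mean_q = sum(quals[i:i + window]) / window
        if window_min > mean_q:
            cut = i
            break
    seq, quals = seq[:cut], quals[:cut]

    # Step 3: discard reads shorter than min_len after trimming.
    if min_len > len(seq):
        return None
    return seq, quals
```

In this sketch a read whose quality collapses near the 3&#8242; end is cut at the first failing window, and any read left with fewer than 36 bases is dropped rather than aligned.</p>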
				<suppl id="S1">
					<title>
						<p>Additional file 1</p>
					</title>
					<text>
						<p>
							<b>Custom Scripts.</b>
						</p>
					</text>
					<file name="1471-2164-15-567-S1.docx">
   <p>Click here for file</p>
</file>
				</suppl><p>Transcriptomic datasets were compared with the genome sequence in order to correct homopolymer errors in 454 pyrosequencing, confirm coding regions and parse improvement suggestions identified through GenePRIMP and NCBI error reporting. For RNA-seq data, individual Illumina reads were mapped to the genome sequence using CLC Genomics Workbench (CLC Bio, Aarhus, Denmark), resulting in a list of sequence variants that were manually checked against the mapping of 454 reads and further validated in reference to a concurrent genome sequence for this organism (GenBank Accession: NC_020134) <abbrgrp>
						<abbr bid="B40">40</abbr>
					</abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>2D-HPLC MS/MS proteomics and proteogenomics</p>
				</st><p>Protein was extracted from PBS-washed cell pellets by sonication in the presence of detergent, digested using trypsin, cleaned and fractionated as previously described <abbrgrp>
						<abbr bid="B11">11</abbr>
						<abbr bid="B43">43</abbr>
						<abbr bid="B44">44</abbr>
					</abbrgrp>. Resuspended peptide fractions were subjected to two-dimensional HPLC (40 1-minute fractions collected at pH&#160;10, pairwise concatenated into 20 fractions, with 1-hour gradients for each at pH&#160;2 formic acid) coupled to tandem mass spectrometry (MS/MS) <abbrgrp>
						<abbr bid="B45">45</abbr>
					</abbrgrp>, using the TripleTOF 5600 platform (AB Sciex, Concord, Canada). Results were concatenated from 20 LC-MS/MS runs of 70&#160;minutes each and converted from native WIFF format to Mascot generic format (.mgf) using the Analyst built-in conversion utility. The collision-induced dissociation (CID) spectra in this file were analyzed using X!tandem (2012.10.01.1) against a database of annotated proteins (.fasta format), using the following search settings: fixed modification C&#8201;+&#8201;57.021; parent mass error: &#177;20&#160;ppm; fragment mass error: &#177;0.05&#160;Da. Peptides with an expectation value of log(e)&#8201;&lt;&#8201;&#8722;1 were reported <abbrgrp>
						<abbr bid="B46">46</abbr>
					</abbrgrp>. In order to determine coverage of the genome by MS/MS peptides, a &#8220;base pair map&#8221; assigning ion current values to the genome sequence was created using a custom Python script (see Additional file <supplr sid="S1">1</supplr>). Mass spectra (<url>http://hs2.proteome.ca/tandem/archive/csterc2dproteogenomic.mgf.txt</url>) are available at the Manitoba Centre for Proteomics and Systems Biology Global Proteome Machine server.</p><p>Proteogenomics analysis was carried out as previously described <abbrgrp>
						<abbr bid="B17">17</abbr>
					</abbrgrp>, with the goal of determining whether non-specific hits provide any information about protein coding regions not captured in automatic annotation procedures. An alternative database based on raw 454 reads instead of annotated proteins was created. Quality information was discarded, and each read was translated in all 6 reading frames into peptide-source elements between STOP codons only (no START codon required). Each element was subjected to an <it>in silico</it> single-missed cleavage tryptic digestion, resulting in a non-redundant proteogenomic peptide database. This collection of peptides was assembled into .fasta format (<url>http://hs2.proteome.ca/tandem/archive/naive454csterc.fasta</url>) and analyzed using X!tandem as described above. It is important to note that &#8220;proteins&#8221; in this database have no connection to actual assembled-annotated proteins. Rather, they are collections of connected proteogenomic tryptic peptides for purposes of identifying MS/MS spectra. The results file was parsed into non-redundant member peptides, filtered to exclude peptides containing variable post-translational modifications, and scanned against the genome annotation. Unassigned peptides were analyzed using the TblastN function in IMG/ER in order to assign them to potential source proteins in related organisms.</p>
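				<p>The construction of the read-based peptide database can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it assumes the standard genetic code, a simple trypsin rule (cleavage after every K or R, with no proline exception), and a hypothetical minimum peptide length. The function names <it>six_frame_segments</it>, <it>tryptic_peptides</it> and <it>build_peptide_database</it> are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def six_frame_segments(read):
    """Yield peptide-source segments between stop codons in all six frames."""
    read = read.upper()
    reverse = read.translate(COMPLEMENT)[::-1]
    for strand in (read, reverse):
        for frame in range(3):
            codons = (strand[i:i + 3] for i in range(frame, len(strand) - 2, 3))
            # Codons containing ambiguous bases are translated as X.
            protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
            for segment in protein.split("*"):
                if segment:
                    yield segment


def tryptic_peptides(protein, missed=1, min_len=6):
    """Digest after K/R, allowing up to `missed` missed cleavages.

    min_len is a hypothetical filter, not a value from the study.
    """
    pieces, current = [], ""
    for aa in protein:
        current += aa
        if aa in "KR":
            pieces.append(current)
            current = ""
    if current:
        pieces.append(current)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_len:
                peptides.add(peptide)
    return peptides


def build_peptide_database(reads, min_len=6):
    """Return the non-redundant set of peptides from all reads."""
    database = set()
    for read in reads:
        for segment in six_frame_segments(read):
            database.update(tryptic_peptides(segment, min_len=min_len))
    return database
```

Because the result is a set, peptides that recur across overlapping reads are stored once, which mirrors the non-redundant character of the database described above.</p>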
			</sec>
			<sec>
				<st>
					<p>Coverage of genome by transcriptome and proteome</p>
				</st><p>Coverage of each locus in the genome by RNA-seq reads and MS/MS peptides was calculated by comparing base pair maps to gene regions using a custom R script (see Additional file <supplr sid="S1">1</supplr>). This analysis also created .fasta files in which each base observed in reads/peptides is retained and each unobserved base is replaced by a dash. To visualize genome coverage, these .fasta files were uploaded along with the genome record (.gbk file) to the Gview visualization platform (<url>http://www.gview.ca</url>) <abbrgrp>
						<abbr bid="B15">15</abbr>
					</abbrgrp>, and rendered using the BLAST atlas function.</p>
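				<p>The per-locus coverage calculation can be sketched as follows. The analysis in this study used a custom R script (Additional file 1); the Python version below is only an illustration of the idea. It assumes the base pair map is available as a list of per-base depths and that gene coordinates are 1-based and inclusive. The function names <it>gene_coverage</it> and <it>mask_uncovered</it> are hypothetical.

```python
def gene_coverage(depth, genes):
    """Return the fraction of covered bases for each gene.

    depth -- list of per-base depths; index 0 is genome position 1
    genes -- dict mapping locus name to (start, end), 1-based inclusive
    """
    coverage = {}
    for name, (start, end) in genes.items():
        region = depth[start - 1:end]
        covered = sum(1 for d in region if d > 0)
        coverage[name] = covered / len(region)
    return coverage


def mask_uncovered(genome, depth):
    """Keep observed bases and replace unobserved bases with a dash."""
    return "".join(base if d > 0 else "-" for base, d in zip(genome, depth))
```

For a toy genome ACGTACGT with depths 1, 0, 2, 0, 0, 0, 3, 3, the masked sequence is A-G---GT, and a gene spanning positions 1 to 4 has a coverage fraction of 0.5.</p>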
			</sec>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Whole genome sequence and gene coverage by reads/peptides</p>
				</st><p>A closed genome sequence was generated from a single 454 pyrosequencing run, using a novel <it>in silico</it> technique for gap-closing and integration of repetitive regions (Thallinger <it>et al.</it>, manuscript in preparation). Wet-lab sequence determination was required for a single gap of 29 base pairs only, with one region of 2 kilobases containing 10 ambiguous bases left unresolved. The final whole genome sequence is 2.97 megabases, with 2,580 protein-coding regions and 61 non-coding (RNA) genes, including 9 ribosomal RNA, 48 transfer RNA genes and 4 miscellaneous RNAs (see Additional file <supplr sid="S2">2</supplr>: Table S1). Genome sequence coverage by RNA-seq reads and mass spectrometry peptides was extensive. Almost all genes (2575/2641 or 97.5%) were completely covered by RNA-seq reads, indicating extensive baseline transcription and/or possible residual DNA in RNA preps (Additional file <supplr sid="S2">2</supplr>: Table S1). About one-third of protein-coding genes (815/2580 or 32%) had no peptide coverage (Additional file <supplr sid="S2">2</supplr>: Table S1). Visualization of transcriptomic and proteomic coverage using Gview confirmed even distribution of RNA-seq reads and MS/MS peptides across the entire genome (Figure&#160;<figr fid="F1">1</figr>).</p>
				<suppl id="S2">
					<title>
						<p>Additional file 2: Table S1</p>
					</title>
					<text>
						<p>Summary of gene regions in <it>C. stercorarium</it> DSM8532<sup>T</sup> and coverage by RNA-seq reads and MS/MS peptides.</p>
					</text>
					<file name="1471-2164-15-567-S2.txt">
   <p>Click here for file</p>
</file>
				</suppl>
				<fig id="F1"><title><p>Figure 1</p></title><caption><p><it>Complete genome, transcriptome and proteome of C. stercorarium DSM8532</it><sup><it>T</it></sup>. Inner ring shows all genes in complete genome (positive strand in blue, negative strand in red)</p></caption><text>
   <p><b><it>Complete genome, transcriptome and proteome of C. stercorarium DSM8532</it></b><sup><b><it>T</it></b></sup><b>.</b> Inner ring shows all genes in complete genome (positive strand in blue, negative strand in red). Middle ring shows nearly complete coverage of genes by sequence reads generated by Illumina RNA-seq. Outer ring shows extensive and evenly distributed coverage of coding regions by peptides detected by MS/MS.</p>
</text><graphic file="1471-2164-15-567-1"/></fig>
			</sec>
			<sec>
				<st>
					<p>Genome improvement by RNA-seq</p>
				</st><p>A total of 94 alternative coding regions were identified by GenePRIMP and/or NCBI (Additional file <supplr sid="S3">3</supplr>: Table S2). Assembled RNA-seq reads were used to identify and correct 35 errors in genome sequence data due to inaccurate 454 sequencing of homopolymer stretches (Table&#160;<tblr tid="T1">1</tblr>). Several bases identified by RNA-seq simply contradicted the genome sequence, indicating error by either method or small-scale mutations in working stocks of lab strains. Sequence corrections by RNA-seq corroborated five suggested joins and one suggested extension by GenePRIMP/NCBI (Table&#160;<tblr tid="T1">1</tblr>). Although the remaining differences were not independently confirmed in this study (i.e., by Sanger sequencing of contradictory sequence regions), comparison with a recently-published genome sequence <abbrgrp>
						<abbr bid="B40">40</abbr>
					</abbrgrp> confirmed the majority of corrections suggested by RNA-seq (Table&#160;<tblr tid="T1">1</tblr>).</p>
				<suppl id="S3">
					<title>
						<p>Additional file 3: Table S2</p>
					</title>
					<text>
						<p>
							<it>C. stercorarium</it> DSM8532<sup>T</sup> genome sequence edits suggested by GenePRIMP/NCBI.</p>
					</text>
					<file name="1471-2164-15-567-S3.txt">
   <p>Click here for file</p>
</file>
				</suppl>
				<table id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>
							<b>Base changes, additions and deletions in </b><b>
								<it>C. stercorarium </it>
							</b><b>DSM8532</b>
							<sup>
								<b>T </b>
							</sup><b>genome sequence suggested by RNA-seq transcriptome</b>
						</p>
					</caption>
					<tgroup align="left" cols="6">
						<colspec align="center" colname="c1" colnum="1" colwidth="1*"/>
						<colspec align="center" colname="c2" colnum="2" colwidth="1*"/>
						<colspec align="center" colname="c3" colnum="3" colwidth="1*"/>
						<colspec align="center" colname="c4" colnum="4" colwidth="1*"/>
						<colspec align="center" colname="c5" colnum="5" colwidth="1*"/>
						<colspec align="center" colname="c6" colnum="6" colwidth="1*"/>
						<thead valign="top">
							<row rowsep="1">
								<entry align="center" colname="c1">
									<p>
										<b>Gene/interval</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>Former/final start position</b>
										<sup>
											<b>1</b>
										</sup>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>Original</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>Corrected</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>Offset</b>
										<sup>
											<b>2</b>
										</sup>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Corroborated by alternate genome</b>
									</p>
								</entry>
							</row>
						</thead>
						<tfoot>
							<p>
								<sup>1</sup>Indicates position in genome of corrected sequence (or former position in case of deleted base). <sup>2</sup>Offset based on inserted or deleted base(s). NA&#8201;=&#8201;Not applied; indicates that the final version of the sequence was not changed based on RNA-seq (see Additional file <supplr sid="S1">1</supplr>). *Rows in bold indicate that RNA-seq data corroborate changes suggested by GenePRIMP/NCBI (see Additional file <supplr sid="S2">2</supplr>: Table S1).</p>
						</tfoot>
						<tbody valign="top">
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0020/21</p>
								</entry>
								<entry align="center" colname="c2">
									<p>21928</p>
								</entry>
								<entry align="center" colname="c3">
									<p>G</p>
								</entry>
								<entry align="center" colname="c4">
									<p>-</p>
								</entry>
								<entry align="center" colname="c5">
									<p>&#8722;1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0024</p>
								</entry>
								<entry align="center" colname="c2">
									<p>25417</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>G</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0030/1</p>
								</entry>
								<entry align="center" colname="c2">
									<p>31758</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>-</p>
								</entry>
								<entry align="center" colname="c5">
									<p>&#8722;1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0035/6</p>
								</entry>
								<entry align="center" colname="c2">
									<p>37728</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0081</p>
								</entry>
								<entry align="center" colname="c2">
									<p>88853</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0081</p>
								</entry>
								<entry align="center" colname="c2">
									<p>88886</p>
								</entry>
								<entry align="center" colname="c3">
									<p>C</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>
										<b>Clst_0130*</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>149305</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>-</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>A</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>1</b>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Y</b>
									</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0131/2</p>
								</entry>
								<entry align="center" colname="c2">
									<p>150526</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0269</p>
								</entry>
								<entry align="center" colname="c2">
									<p>313700</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>G</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>
										<b>Clst_0746*</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>820762</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>-</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>A</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>1</b>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Y</b>
									</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0755/6</p>
								</entry>
								<entry align="center" colname="c2">
									<p>828349</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0791/2</p>
								</entry>
								<entry align="center" colname="c2">
									<p>868841</p>
								</entry>
								<entry align="center" colname="c3">
									<p>C</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0847</p>
								</entry>
								<entry align="center" colname="c2">
									<p>937715</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>C</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0897/8</p>
								</entry>
								<entry align="center" colname="c2">
									<p>995753</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0899/900</p>
								</entry>
								<entry align="center" colname="c2">
									<p>998883</p>
								</entry>
								<entry align="center" colname="c3">
									<p>C</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0972/3</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1090847</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>
										<b>Clst_0992*</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>1116366</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>-</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>A</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>1</b>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Y</b>
									</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>
										<b>Clst_1060*</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>1201128</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>-</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>A</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>1</b>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Y</b>
									</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>
										<b>Clst_1117*</b>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>1271246</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>AT</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>-</b>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>&#8722;2</b>
									</p>
								</entry>
								<entry align="center" colname="c6">
									<p>
										<b>Y</b>
									</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1187</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1342072</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1286/7</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1442396</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1298/9</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1457047</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1339/40</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1499569</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>-</p>
								</entry>
								<entry align="center" colname="c5">
									<p>&#8722;1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1341</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1502214</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1420/1</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1596278</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1435</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1606866</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1435</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1606893</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>-</p>
								</entry>
								<entry align="center" colname="c5">
									<p>&#8722;1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1474/5</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1649400</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1511/2</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1694436</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1524/5</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1707519</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>-</p>
								</entry>
								<entry align="center" colname="c5">
									<p>&#8722;1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1542/3</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1724606</p>
								</entry>
								<entry align="center" colname="c3">
									<p>C</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1543</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1725709</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>C</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1605/6</p>
								</entry>
								<entry align="center" colname="c2">
									<p>1809359</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1771/3</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2005558</p>
								</entry>
								<entry align="center" colname="c3">
									<p>A</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1893</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2151838</p>
								</entry>
								<entry align="center" colname="c3">
									<p>AA</p>
								</entry>
								<entry align="center" colname="c4">
									<p>GG</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2041/2</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2322549</p>
								</entry>
								<entry align="center" colname="c3">
									<p>T</p>
								</entry>
								<entry align="center" colname="c4">
									<p>A</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2046</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2314413</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>C</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2051</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2322540</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>C</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2090/1</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2365983</p>
								</entry>
								<entry align="center" colname="c3">
									<p>G</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2360</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2656461</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>CTC</p>
								</entry>
								<entry align="center" colname="c5">
									<p>NA</p>
								</entry>
								<entry align="center" colname="c6">
									<p>N</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry align="center" colname="c1">
									<p>Clst_2564/5</p>
								</entry>
								<entry align="center" colname="c2">
									<p>2854567</p>
								</entry>
								<entry align="center" colname="c3">
									<p>-</p>
								</entry>
								<entry align="center" colname="c4">
									<p>T</p>
								</entry>
								<entry align="center" colname="c5">
									<p>1</p>
								</entry>
								<entry align="center" colname="c6">
									<p>Y</p>
								</entry>
							</row>
						</tbody>
					</tgroup>
				</table>
			</sec>
			<sec>
				<st>
					<p>Proteogenomic confirmation of genome annotation</p>
				</st><p>By comparing the proteogenomic database to annotated proteins, our analysis yielded 6,611 peptides aligning with annotated proteins and 312 peptides that did not align (expectation value cutoff for all peptides log(e)&#8201;&lt;&#8201;&#8722;1). Confidence in peptide sequence assignment was further strengthened by the strong correlation between computed hydrophobicity and retention time for assigned peptides (R<sup>2</sup>&#8201;=&#8201;0.93) (Figure&#160;<figr fid="F2">2</figr>). Correlation of hydrophobicity and retention time for unassigned peptides was much weaker (R<sup>2</sup>&#8201;=&#8201;0.15). Of the unassigned peptides, only two (DLAYKGQIPGVR and ICGRPHAYMR) were found in the same reading frame with nearby coordinates, indicating that they identify a protein missed in the annotation. These peptides were found to align with similar coordinates in the <it>C. thermocellum</it> DSM1237 protein A3DJI5 (30S ribosomal protein S14). One peptide (FMPELDILQK) supported an alternate reading frame that was also identified by GenePRIMP. This analysis provides observational support for annotation modifications suggested in the genome improvement workflow (Additional file <supplr sid="S3">3</supplr>: Table S2). Two more unassigned peptides (FLNEDLPLEER and MDMSQYLGIFVEESR) supported 5&#8242; extensions of annotated proteins that were not suggested by GenePRIMP. Correlation of hydrophobicity and retention time for these five &#8220;proteogenomic&#8221; peptides was similar to that of assigned peptides (R<sup>2</sup>&#8201;=&#8201;0.97). Although these observations were not independently confirmed, they are corroborated by the annotation of the alternate genome for this organism <abbrgrp>
						<abbr bid="B40">40</abbr>
					</abbrgrp>.</p>
				<fig id="F2"><title><p>Figure 2</p></title><caption><p>
   <b>
      <it>Proteogenomic analysis of C. stercorarium DSM8532</it>
   </b>
   <sup>
      <b>
         <it>T</it>
      </b>
   </sup>
</p></caption><text>
   <p><b><it>Proteogenomic analysis of C. stercorarium DSM8532</it></b><sup><b><it>T</it></b></sup><b>.</b> Correlation of hydrophobicity and retention time in formic acid (FA) for peptides mapped to a 6-reading frame database derived from raw 454 reads of <it>C. stercorarium</it> whole genome sequence, including peptides assigned to matching proteins of the annotated genome (black), peptides not matching annotation (red) and &#8220;proteogenomic&#8221; peptides resulting in changes to annotation (yellow).</p>
</text><graphic file="1471-2164-15-567-2"/></fig>
			</sec>
			<sec>
				<st>
					<p>Corroboration of pseudogenes</p>
				</st><p>A total of 34 genes identified as pseudogenes in the automatic annotation or through the genome improvement pipeline had no peptides associated with them under the culture conditions in this study and were identified as pseudogenes in the final annotated genome. Two other loci (Clst_0108 and Clst_1866) had at least some peptide coverage (2% and 8%, respectively) and were not annotated as pseudogenes. Most of the alternatives suggested by GenePRIMP/NCBI (76/94 or 83%) were identified as pseudogenes; however, none of the alternatives corroborated by RNA-seq/proteogenomic analysis were (Additional file <supplr sid="S3">3</supplr>: Table S2). These included three suggested gene joins (Clst_0130/1, Clst_0746/7 and Clst_0991/2) that were initially predicted to be pseudogenes, but had at least some coverage by peptides (19%, 6% and 1%, respectively) and were found to have intact reading frames once RNA-seq corrections were applied. Three other cases of suggested gene joins resulting in putative pseudogenes (Clst_0150/1, Clst_0580/1 and Clst_1874/5) were covered by peptides (0% and 5%, 0% and 2%, 45% and 5%, respectively, for unjoined genes) and were not joined in the final annotation. BLAST analysis was also performed to ensure that peptides observed for putative pseudogenes with peptide coverage were not misattributed due to the presence of homologous genes elsewhere in the genome. Given inherent biases and contrasts between different gene-calling and annotation improvement algorithms, further in-depth proteomic analysis of sequenced organisms under different culture conditions will be required to test whether annotated pseudogenes are actually coding regions.</p>
			</sec>
			<sec>
				<st>
					<p>Carbohydrate-active enzymes</p>
				</st><p>A total of 106 genes encoding proteins with predicted activity on carbohydrates were identified through the CAZy database, including 67 glycoside hydrolases (GH), 18 glycosyltransferases (GT), 10 carbohydrate esterases (CE), 5 polysaccharide lyases (PL) and 8 genes with carbohydrate-binding motifs (CBM)/surface layer homology (SLH) domains only (Additional file <supplr sid="S4">4</supplr>: Table S3). GH enzymes from 32 different CAZy families were observed. Seventeen of these proteins, including 12 GH enzymes, were modular with multiple catalytic regions and/or one or more CBM/SLH domains. These results confirm 17 previously sequenced biomass-degrading enzymes identified for this organism, including cellulases (<it>celYZ</it>), cellobiose phosphorylases (<it>cepAB</it>), xylanases (<it>xynABC</it>), xylosidases (<it>xylAB, bxlAB, bglZ</it>), arabinofuranosidases (<it>arfAB</it>), a galactosidase (<it>agaA</it>), a pectate lyase (<it>pelA</it>) and an &#945;-rhamnosidase (<it>ramA</it>) (98&#8211;100% identical to coding regions identified in the genome sequence). Peptides were observed for 85/105 or 81% of genes in this category, with highest peptide coverage (45&#8211;54%) of previously described genes for ArfB, BglZ, CepB and BxlA (Additional file <supplr sid="S4">4</supplr>: Table S3).</p>
				<suppl id="S4">
					<title>
						<p>Additional file 4: Table S3</p>
					</title>
					<text>
						<p>Annotation of carbohydrate-active enzymes in <it>C. stercorarium</it> DSM8532<sup>T</sup> and coverage by MS/MS peptides.</p>
					</text>
					<file name="1471-2164-15-567-S4.txt">
   <p>Click here for file</p>
</file>
				</suppl>
			</sec>
			<sec>
				<st>
					<p>Putative ABC-type carbohydrate transporters</p>
				</st><p>Of 372 enzymes with Transporter Classifications, 242 belong to the ATP-binding cassette (ABC) superfamily 3.A.1, and 118 of this subset are predicted carbohydrate importers based on automated annotation, organized in 42 contiguous clusters (Table&#160;<tblr tid="T2">2</tblr>). Although annotations of transmembrane (M) vs. solute-binding (S) enzymes were often inconsistent between IMG and ABCdb, 114 out of 118 enzymes fell into these two categories, leaving only 4 nucleotide-binding (N) enzyme-coding loci with predicted carbohydrate uptake activity (Table&#160;<tblr tid="T2">2</tblr>). Most of these were from Carbohydrate Uptake Transporter (CUT) family 1 (3.A.1.1), with 8/118 in 2 clusters belonging to CUT family 2 (3.A.1.2). Most CUT genes were organized in groups of 2&#8211;5 adjacently located coding regions, with all but two of these clusters containing M and S enzymes only. One cluster contained 5 genes, including 2&#160;N, 2&#160;M and an S enzyme (family CUT2), while four enzymes were not co-located with other CUT enzymes (1&#160;N, 1&#160;M and 2&#160;S enzymes). Fewer than half of these proteins (54/118 or 46%) were covered by peptides under these culture conditions. Peptides were observed for every gene in only 8 clusters, suggesting that these specific protein clusters may be of particular relevance for transport <it>in vivo</it> (Figure&#160;<figr fid="F3">3</figr>A, Table&#160;<tblr tid="T2">2</tblr>).</p>
				<table id="T2">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>
							<b>Organization and observation of ABC-type carbohydrate uptake transporters in genome, transcriptome and proteome of </b><b>
								<it>C. stercorarium </it>
							</b><b>DSM8532</b>
							<sup>
								<b>T</b>
							</sup>
						</p>
					</caption>
					<tgroup align="left" cols="5">
						<colspec align="center" colname="c1" colnum="1" colwidth="1*"/>
						<colspec align="center" colname="c2" colnum="2" colwidth="1*"/>
						<colspec align="center" colname="c3" colnum="3" colwidth="1*"/>
						<colspec align="center" colname="c4" colnum="4" colwidth="1*"/>
						<colspec align="center" colname="c5" colnum="5" colwidth="1*"/>
						<thead valign="top">
							<row>
								<entry align="center" colname="c1" nameend="c2" namest="c1" rowsep="1">
									<p>
										<b>Cluster</b>
									</p>
								</entry>
								<entry align="center" colname="c3" nameend="c4" namest="c3" rowsep="1">
									<p>
										<b>Order of genes in cluster</b>
										<sup>
											<b>2</b>
										</sup>
									</p>
								</entry>
								<entry align="center" colname="c5">
									<p>
										<b>Peptide coverage (%):</b>
									</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry align="center" colname="c1">
									<p>
										<b>Loci range</b>
										<sup>
											<b>1</b>
										</sup>
									</p>
								</entry>
								<entry align="center" colname="c2">
									<p>
										<b>Strand</b>
									</p>
								</entry>
								<entry align="center" colname="c3">
									<p>
										<b>IMG/ER</b>
									</p>
								</entry>
								<entry align="center" colname="c4">
									<p>
										<b>ABCdb</b>
									</p>
								</entry>
								<entry colname="c5"/>
							</row>
						</thead>
						<tfoot>
							<p>
								<sup>1</sup>Range of loci in genome for genes in cluster (e.g. Clst_0059-Clst_0061), <sup>2</sup>Order of genes in cluster according to IMG Expert Review annotation platform (IMG/ER) or the ABC transporter database (ABCdb), S&#8201;=&#8201;Solute-binding protein, M&#8201;=&#8201;Transmembrane domain, N&#8201;=&#8201;Nucleotide (ATP)-binding domain, X&#8201;=&#8201;No match in ABCdb, *Indicates that genes in cluster belong to Carbohydrate Uptake Transporter family 2 (CUT2).</p>
						</tfoot>
						<tbody valign="top">
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0059-0061</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-X</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/4/11</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0109-0112</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-X-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0194-0196</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>53/10/13</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0200-0202</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/2</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0209-0211</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>4/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0215-0217</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/37</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0218-0221</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>8/16/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0228-0230</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/36</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0432-0434</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/10</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0444-0446</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/5</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0456-0460*</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>N-N-M-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>N-N-M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>12/2/0/2/9</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0472-0473</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0476</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0479-0481</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>34/0/5</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0582</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0627-0629</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>24/3/7</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0666-0667</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>X-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0673-0674</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0797-0799</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>16/3/4</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0805-0807</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-X</p>
								</entry>
								<entry align="center" colname="c5">
									<p>18/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0848-0850</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>4/0/34</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0934-0936</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>24/0/3</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0969-0971</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/6</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_0993-0995</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/4</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1007-1008</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1068-1070</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>5/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1073-1075</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/18</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1077-1079</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>3/4/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1083-1085</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/18</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1566-1567</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1587-1589</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-S-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/29</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_1635-1637</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>24/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2117-2119</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2139-2141</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/15</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2159-2161</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>14/22/28</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2245-2247</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/36</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2458-2460*</p>
								</entry>
								<entry align="center" colname="c2">
									<p>+</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M-N-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-N-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>59/69/9</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2539-2541</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>28/0/0</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2544</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S</p>
								</entry>
								<entry align="center" colname="c5">
									<p>23</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2579</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>N</p>
								</entry>
								<entry align="center" colname="c4">
									<p>N</p>
								</entry>
								<entry align="center" colname="c5">
									<p>41</p>
								</entry>
							</row>
							<row>
								<entry align="center" colname="c1">
									<p>Clst_2595-2597</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-S-M</p>
								</entry>
								<entry align="center" colname="c4">
									<p>S-M-M</p>
								</entry>
								<entry align="center" colname="c5">
									<p>0/0/43</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry align="center" colname="c1">
									<p>Clst_2619-2621</p>
								</entry>
								<entry align="center" colname="c2">
									<p>&#8722;</p>
								</entry>
								<entry align="center" colname="c3">
									<p>S-M-S</p>
								</entry>
								<entry align="center" colname="c4">
									<p>M-M-X</p>
								</entry>
								<entry align="center" colname="c5">
									<p>28/4/9</p>
								</entry>
							</row>
						</tbody>
					</tgroup>
				</table>
				<fig id="F3"><title><p>Figure 3</p></title><caption><p>
   <it>Structure of ABC transporters and genes involved in hydrogen production and co-factor recycling</it>
</p></caption><text>
   <p><b><it>Structure of ABC transporters and genes involved in hydrogen production and co-factor recycling. </it></b><b>A</b>. Seven ABC transporters for which all proteins were observed, including solute-binding (blue), membrane (green) and nucleotide-binding (red) subunits, as defined by NCBI. <it>Note:</it> Clst_2458-60 is in Carbohydrate Uptake Transporter Group 2 (CUT2) while others are CUT1. <b>B</b><it>.</it> Main enzymatic subunits (white squares) and co-located genes (coloured squares) for six clusters designated as hydrogenases and/or oxidoreductases.</p>
</text><graphic file="1471-2164-15-567-3"/></fig>
			</sec>
			<sec>
				<st>
					<p>Glycolysis and pentose phosphate pathway</p>
				</st><p>With the exception of transaldolase, all genes with functional annotations associated with central glycolytic and pentose phosphate pathways were observed in the genome. Three phosphofructokinases (Pfk) were annotated and all had some coverage by peptides (Figure&#160;<figr fid="F4">4</figr>A), while only 4 out of 5 annotated phosphoglucomutases (Pgm) were observed in the proteome. Extensive coverage of xylose isomerase (Xyi) and xylulose kinase (Xyk) by peptides was observed, indicating pentose utilization by this organism (Figure&#160;<figr fid="F4">4</figr>B). Only one of two annotated copies of the transketolase A and B genes was observed in the proteome (Clst_2184/5). Although glucose-6-phosphate dehydrogenase (Gpd) converting glucose-6P to gluconolactone-6P was observed, neither gene annotated as 6-phosphogluconolactonase (Pgl) had any associated peptides under these culture conditions, indicating a possible alternative source of 6P-D-gluconate for D-ribulose-5P synthesis. No known transaldolase (Tal) was observed in the genome or through proteogenomic analysis, indicating the existence of an alternative pathway for sedoheptulose-7P (S-7P) degradation in the pentose phosphate pathway.</p>
				<fig id="F4"><title><p>Figure 4</p></title><caption><p>
   <it>Partial glycolytic and pentose phosphate pathway enzymes in C. stercorarium DSM8532</it>
   <sup>
      <it>T</it>
   </sup>
</p></caption><text>
   <p><b><it>Partial glycolytic and pentose phosphate pathway enzymes in C. stercorarium DSM8532</it></b><sup><b><it>T</it></b></sup><b><it>.</it></b> Boxes show metabolic intermediates while lines indicate catalysis, with segmented lines joined by hollow circles indicating multi-enzyme complexes and parallel lines indicating enzymes with identical annotated functions (not all shown for visual clarity). Filled boxes indicate enzyme co-factors. Numbers indicate locus tag (i.e. Clst_####). Colour of lines indicates percent coverage of gene by mapped peptides. <b>A</b>. Glycolysis, cellobiose/glucose to PEP, <b>B</b>. Pentose phosphate pathway, oxidative and non-oxidative. Gene symbols: <it>cep</it>&#8201;=&#8201;cellobiose phosphorylase, <it>pmu</it>&#8201;=&#8201;phosphoglucomutase, <it>glk</it>&#8201;=&#8201;glucokinase, <it>gpi</it>&#8201;=&#8201;glucose-6-phosphate isomerase, <it>pfk</it>&#8201;=&#8201;phosphofructokinase, <it>fba</it>&#8201;=&#8201;fructose bisphosphate aldolase, <it>tpi</it>&#8201;=&#8201;triosephosphate isomerase, <it>gap</it>&#8201;=&#8201;glyceraldehyde-3-phosphate dehydrogenase, <it>pgk</it>&#8201;=&#8201;phosphoglycerate kinase, <it>pgm</it>&#8201;=&#8201;phosphoglycerate mutase, <it>eno</it>&#8201;=&#8201;enolase, <it>ari</it>&#8201;=&#8201;arabinose isomerase, <it>rik</it>&#8201;=&#8201;ribulose kinase, <it>xyi</it>&#8201;=&#8201;xylose isomerase, <it>xyk</it>&#8201;=&#8201;xylulose kinase, <it>rep</it>&#8201;=&#8201;ribulose-5-phosphate 4-epimerase, <it>rpe</it>&#8201;=&#8201;ribulose-5-phosphate 3-epimerase, <it>gpd</it>&#8201;=&#8201;glucose-6-phosphate 1-dehydrogenase, <it>pgl</it>&#8201;=&#8201;6-phosphogluconolactonase, <it>pdg</it>&#8201;=&#8201;6-phosphogluconate dehydrogenase, <it>rpi</it>&#8201;=&#8201;ribose-5-phosphate isomerase B, <it>trk</it>&#8201;=&#8201;transketolase, <it>tal</it>&#8201;=&#8201;transaldolase.</p>
</text><graphic file="1471-2164-15-567-4"/></fig>
			</sec>
			<sec>
				<st>
					<p>PEP/pyruvate conversion and co-factor recycling</p>
				</st><p>Several potential pathways for PEP/pyruvate conversion were observed, including pyruvate kinase (Pyk), pyruvate phosphate dikinase (Ppd), and PEP carboxykinase (Pep) in combination with either the malate dehydrogenase/malic enzyme (Mdh/Mle) shunt or oxaloacetate decarboxylase (Oad) (Figure&#160;<figr fid="F5">5</figr>A). Conversion of pyruvate to acetyl-CoA may occur through two pyruvate:ferredoxin oxidoreductases (Por) or the pyruvate dehydrogenase complex (Pdh), all of which were observed in the proteome. The bifunctional type IV alcohol/acetaldehyde dehydrogenase (AdhE), a single aldehyde dehydrogenase (Ald) and 6 alcohol dehydrogenases (Adh) were observed with varying levels of peptide coverage. Lactate dehydrogenase (Ldh) and phosphate acetyltransferase/acetate kinase (Pat/Ack) were also observed in the genome and proteome. Oxoglutarate synthesis from oxaloacetate via citrate/isocitrate was also indicated by observation of the required enzymes in the proteome (Figure&#160;<figr fid="F5">5</figr>A).</p>
				<fig id="F5"><title><p>Figure 5</p></title><caption><p>
   <it>PEP/pyruvate conversion and co-factor recycling enzymes in C. stercorarium DSM8532</it>
   <sup>
      <it>T</it>
   </sup>
</p></caption><text>
   <p><b><it>PEP/pyruvate conversion and co-factor recycling enzymes in C. stercorarium DSM8532</it></b><sup><b><it>T</it></b></sup><b><it>.</it></b> Details as for Figure&#160;<figr fid="F4">4</figr>. <b>A</b>. PEP/pyruvate conversion to lactate, acetate, ethanol or oxoglutarate. <b>B</b>. Co-factor recycling and hydrogenases. Gene symbols: <it>pyk</it>&#8201;=&#8201;pyruvate kinase, <it>ppd</it>&#8201;=&#8201;pyruvate phosphate dikinase, <it>ldh</it>&#8201;=&#8201;lactate dehydrogenase, <it>pdh</it>&#8201;=&#8201;pyruvate dehydrogenase complex, <it>por</it>&#8201;=&#8201;pyruvate:ferredoxin oxidoreductase, <it>pat</it>&#8201;=&#8201;phosphate acetyltransferase, <it>ack</it>&#8201;=&#8201;acetate kinase, <it>ald</it>&#8201;=&#8201;aldehyde dehydrogenase, <it>adh</it>&#8201;=&#8201;alcohol dehydrogenase, <it>adhE</it>&#8201;=&#8201;type IV bifunctional acetaldehyde/alcohol dehydrogenase, <it>pep</it>&#8201;=&#8201;phosphoenolpyruvate carboxykinase, <it>mdh</it>&#8201;=&#8201;malate dehydrogenase, <it>mle</it>&#8201;=&#8201;malic enzyme, <it>oad</it>&#8201;=&#8201;oxaloacetate decarboxylase, <it>cst</it>&#8201;=&#8201;citrate synthase, <it>aco</it>&#8201;=&#8201;aconitase, <it>idh</it>&#8201;=&#8201;isocitrate dehydrogenase, <it>fdx</it>&#8201;=&#8201;ferredoxin, <it>nfo</it>&#8201;=&#8201;NADH:ferredoxin oxidoreductase, <it>nfn</it>&#8201;=&#8201;NADH:ferredoxin:NADP<sup>+</sup> oxidoreductase, <it>nfh</it>&#8201;=&#8201;nickel-iron hydrogenase, <it>ffh</it>&#8201;=&#8201;monomeric iron-only hydrogenase, <it>fhm</it>&#8201;=&#8201;multimeric iron-only hydrogenase, <it>fhb</it>&#8201;=&#8201;bifunctional NADH:ferredoxin hydrogenase.</p>
</text><graphic file="1471-2164-15-567-5"/></fig><p>Ferredoxin-mediated regeneration of energy intermediates and/or hydrogen production was the annotated function of 21 coding regions, 18 of which are organized in 5 clusters (Figure&#160;<figr fid="F5">5</figr>B). Three Fe-Fe hydrogenases (two monomeric and one tetrameric), a trimeric bifunctional NADH-oxidizing hydrogenase co-located with ferredoxin (Fdx), and a newly described dimeric enzyme complex coupling reduction of NADP<sup>+</sup> with oxidation of ferredoxin and NADH <abbrgrp>
						<abbr bid="B47">47</abbr>
					</abbrgrp> (illustrated in Figure&#160;<figr fid="F3">3</figr>B) were all observed in the proteome. A dimeric Ni-Fe hydrogenase and the 6-enzyme complex NADH:ferredoxin oxidoreductase (Nfo), also known as Rnf, were co-located in the genome; however, only some components of these clusters were covered by peptides, indicating that they may play a less important role in metabolism under these culture conditions (Figure&#160;<figr fid="F5">5</figr>B).</p>
			</sec>
			<sec>
				<st>
					<p>Confirmation of end products predicted through pathway analysis</p>
				</st><p>To confirm the function of lignocellulolytic, transport and central metabolic pathways, <it>C. stercorarium</it> DSM8532<sup>T</sup> was cultured on cellobiose, xylose and xylan (purified hemicellulose) and harvested at mid-exponential phase (Figure&#160;<figr fid="F6">6</figr>A). Results confirmed growth on xylan, consistent with the presence of several xylanolytic enzymes (Additional file <supplr sid="S3">3</supplr>: Table S2). The profile of observed end products (carbon dioxide, lactate, acetate, ethanol) is consistent with the presence of the required enzymes in pyruvate conversion pathways, while detection of hydrogen in the gas phase is consistent with the hydrogenases and oxidoreductases detected in this organism. Growth on xylose is consistent with the presence of a known xylose transporter (XylFGH) and an intact pentose phosphate pathway. Given the absence of transaldolase in the genome, alternative pathways for degradation of sedoheptulose-7P (S-7P) are proposed (Figure&#160;<figr fid="F6">6</figr>B).</p>
				<fig id="F6"><title><p>Figure 6</p></title><caption><p>
   <it>End products of C. stercorarium DSM8532</it>
   <sup>
      <it>T</it>
   </sup>
</p></caption><text>
   <p><b><it>End products of C. stercorarium DSM8532</it></b><sup><b><it>T</it></b></sup><b><it>. </it></b><b>A</b>. Gas (top) and liquid (bottom) end products on cellobiose, xylose and xylan at mid-exponential phase (12&#160;h anaerobic culture at 60&#176;C). Mean and standard deviation of 25 replicates. <b>B</b>. Hypothetical alternatives to transaldolase for S-7P degradation in <it>C. stercorarium</it> DSM8532<sup>T</sup>. S-7P may be phosphorylated by one of three phosphofructokinases (<it>pfk</it>) and subsequently cleaved to E-4P and DHA-P by fructose bisphosphate aldolase (<it>fba</it>). Alternatively, S-7P may be cleaved to PEP and E-4P by 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase (<it>pda</it>), an enzyme usually active in creating an S-7P-like molecule from PEP/E-4P in the shikimate pathway.</p>
</text><graphic file="1471-2164-15-567-6"/></fig>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st><p>We report a complete genome sequence for <it>C. stercorarium</it> DSM8532<sup>T</sup>, generated using a single sequencing platform and <it>in silico</it> gap-filling. This approach represents a significant advance and a saving in wet-lab procedures for gap-closing; however, a key requirement of the <it>in silico</it> technique is complete coverage of the genome by sequencing reads (Thallinger <it>et al.</it>, manuscript in preparation). Validation of the approach was provided by the recent release of a concurrent sequence where extensive wet-lab gap-closing was applied <abbrgrp>
					<abbr bid="B40">40</abbr>
				</abbrgrp>. Although a comparison of the two versions is beyond the scope of this study, a preliminary analysis indicates very few differences between the two, mostly found in an ambiguous region of our sequence.</p><p>Genome-wide studies increasingly employ complementary omics data in order to improve annotation and provide enhanced insight into bacterial metabolism <abbrgrp>
					<abbr bid="B12">12</abbr>
					<abbr bid="B48">48</abbr>
					<abbr bid="B49">49</abbr>
					<abbr bid="B50">50</abbr>
					<abbr bid="B51">51</abbr>
					<abbr bid="B52">52</abbr>
					<abbr bid="B53">53</abbr>
				</abbrgrp>. We report extensive coverage of the genome by transcripts and peptides using updated techniques (sequence-based RNA profiling, gel-free 2D proteomics). Complete coverage of virtually all genes by RNA-seq reads may indicate a background level of transcription for the entire genome. Although DNase I treatment of cell extracts was performed, some residual DNA was observed in the RNA prep using fluorometric assays, and may have been sequenced at a background level <abbrgrp>
					<abbr bid="B54">54</abbr>
				</abbrgrp>. Therefore, further work will be required to ensure absence of DNA or subtraction of background signal. Despite this limitation, RNA-seq data were effectively applied to improve the genome sequence and correct homopolymer errors resulting from 454 sequencing, corroborating several suggestions for sequence changes proposed by the genome annotation improvement pipeline GenePRIMP and NCBI prior to sequence submission. Our results indicate that Illumina RNA-seq for WGS improvement may supplant Illumina DNA sequencing for this purpose <abbrgrp>
					<abbr bid="B55">55</abbr>
				</abbrgrp>, since genome coverage using RNA prepared as described in this study was sufficient to correct the majority of the homopolymer errors, even in non-coding regions.</p><p>Proteogenomic analysis resulted in several suggested improvements to the genome sequence, corroborating an inserted gene and an alternative reading frame suggested by GenePRIMP, as well as probable gene extensions not captured by other genome improvement procedures. Overall, proteogenomic analysis supports the quality of the automated genome annotation, in that only two cases of previously unannotated genes were observed, both of which were also highlighted by GenePRIMP.</p><p>Most of the genome improvements suggested by GenePRIMP and/or NCBI resulted in putative pseudogenes. However, a subset of suggestions (16/94) were not identified as pseudogenes, and half of these (8/16) were corroborated by RNA-seq or proteogenomic analysis. This finding indicates that genome improvement suggestions by GenePRIMP or NCBI algorithms may be more reliable when they do not result in pseudogenes. An increased effort by bioinformaticians to identify erroneous pseudogenes in existing databases would therefore be desirable, and these results underline the importance of using multiple data sources to improve genome annotation. Further, observation of peptides through proteomics provides direct insight into whether a particular coding region should or should not be defined as a pseudogene. Although only a single culture condition was used in this study, most coding regions defined as pseudogenes had no peptides observed, corroborating automatic annotation by IMG/ER, GenePRIMP and/or NCBI.</p><p>Since the purpose of this study was an improved whole genome sequence and annotation, we included all data meeting minimum quality standards to determine coverage of gene regions by RNA-seq and MS/MS proteomics data. Further work will be required to determine DNA contamination and/or background noise related to RNA-seq <abbrgrp>
					<abbr bid="B54">54</abbr>
				</abbrgrp>. High-throughput proteomics of HPLC-fractionated peptides in this study has provided exceptional depth of coding region coverage, comparable to other in-depth studies <abbrgrp>
					<abbr bid="B12">12</abbr>
				</abbrgrp>. However, signals generated by different proteomics and transcriptomics platforms may be biased by a number of factors <abbrgrp>
					<abbr bid="B56">56</abbr>
					<abbr bid="B57">57</abbr>
				</abbrgrp>. For example, samples are from cultures growing on hexose (transcriptome) or pentose (proteome), and cannot be strictly compared due to undetermined effects on the regulatory milieu, gene expression, relative abundance of transporters and branching central metabolic pathways more generally <abbrgrp>
					<abbr bid="B12">12</abbr>
				</abbrgrp>. Further work will be required to determine the relation of these signals to cellular metabolism and to each other as the technologies used to measure them continue to evolve <abbrgrp>
					<abbr bid="B16">16</abbr>
					<abbr bid="B58">58</abbr>
					<abbr bid="B59">59</abbr>
				</abbrgrp>.</p><p>Lignocellulolytic enzymes are the only previously well-characterized components of <it>C. stercorarium</it> DSM8532<sup>T</sup> metabolism. In this study, some of the most important predicted cellulases were not detected in the proteome. Many lignocellulolytic enzymes contain transmembrane domains (25/106 or 24%), signal peptides (29/106 or 28%) or both (20/106 or 19%). Since proteomic profiles were generated from cell pellets of cultures grown on soluble sugars, detection of this enzyme class may be limited, either because the enzymes are present in culture supernatants only or because they are not expressed in the absence of lignocellulosic material <abbrgrp>
					<abbr bid="B12">12</abbr>
				</abbrgrp>. Further studies on complex substrates such as cellulose or hemicellulose will be required to determine their relative importance in lignocellulolysis. Although many of this organism&#8217;s nearest phylogenetic (16S) neighbours are cellulosome-encoders, CelYZ of <it>C. stercorarium</it> most closely resembles a similar pair in <it>C. phytofermentans</it>, another cellulolytic organism without a cellulosome. Homologous enzymes containing cellulosomal dockerin domains are grouped together with another cellulase and an adjoining cellulosomal scaffoldin in <it>C. papyrosolvens</it> and <it>C. cellulolyticum</it>, while homologs are also present as non-dockerin-containing enzymes in <it>C. thermocellum</it>, although widely separated from each other on the chromosome. These observations indicate the complex interweaving and reiteration of coding sequences across phylogenetically close but functionally divergent organisms, providing insight into horizontal gene transfer and small replicon-mediated evolution.</p><p>All but one ABC-type CUT cluster contained transmembrane and solute-binding components only, an arrangement frequently observed in Gram-positive organisms <abbrgrp>
					<abbr bid="B60">60</abbr>
				</abbrgrp>. We focused on a subset of 7 CUT clusters where every gene in the cluster was detected by MS/MS. Most clusters consisted of 2 transmembrane genes (COG0395 and COG1175) and a single solute-binding gene (COG4209) with clostridial homologues in <it>C. termitidis</it>, <it>C. phytofermentans</it>, <it>C. papyrosolvens</it> and <it>Thermoanaerobacterium xylanolyticum</it>. Surprisingly, several genes in these clusters, including Clst_0194-6, had near neighbours in <it>Treponema</it> (50-60% identical at amino acid-level), an otherwise distantly-related spirochaete. The nearest homologue for the nucleotide-binding gene was from <it>C. thermocellum</it> (78% identical). The single CUT2 cluster was homologous to the known xylose transporter XylFGH of <it>Thermoanaerobacter pseudethanolicus</it> and the newly described <it>T. thermohydrosulfuricus</it> WC1 <abbrgrp>
					<abbr bid="B17">17</abbr>
				</abbrgrp>.</p><p>Xylose transport and utilization by this organism confirms the importance of the pentose phosphate pathway; however, the lack of a known encoded transaldolase (EC 2.2.1.2, EC 4.1.2.-) indicates an alternative pathway for S-7P degradation. Two possibilities are proposed, the first involving one of three encoded phosphofructokinases (Pfk) and a bifunctional fructose bisphosphate aldolase (Fba) <abbrgrp>
					<abbr bid="B61">61</abbr>
				</abbrgrp> (Figure&#160;<figr fid="F6">6</figr>B). Recently, this degradation was reported to be &#8220;forced&#8221; in a transaldolase-knockout strain of <it>E. coli</it>, possibly due to accumulation of S-7P, which was converted to sedoheptulose 1,7-2P by Pfk and then cleaved to dihydroxyacetone-P (DHA-P) and erythrose-4P (E-4P) by Fba <abbrgrp>
					<abbr bid="B62">62</abbr>
				</abbrgrp>. Of 345 clostridia with whole genomes in IMG, 62 (18%) do not have an annotated transaldolase; however, few previous studies have shown xylose utilization by confirmed transaldolase-deficient strains <abbrgrp>
					<abbr bid="B63">63</abbr>
				</abbrgrp>. Annotated Pfk genes may have distinct biological roles and use either pyrophosphate or ATP as a phosphate donor <abbrgrp>
					<abbr bid="B64">64</abbr>
				</abbrgrp>; we therefore hypothesize that one of the three Pfk enzymes may be involved specifically in transformation of S-7P to S-1,7-2P. An alternative hypothesis is that 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase (Pda), which in the shikimate pathway creates an S-7P-like molecule from PEP and E-4P, might catalyze the degradation of S-7P to PEP and E-4P during xylose utilization (Figure&#160;<figr fid="F6">6</figr>B). Further work will be required to define these potentially significant observations for pentose utilization in clostridia.</p><p>Multiple pathways for PEP/pyruvate conversion and co-factor recycling are consistent with previous literature on thermophilic clostridia <abbrgrp>
					<abbr bid="B10">10</abbr>
				</abbrgrp> and demonstrated end products for this organism during anaerobic culture. All expressed clusters are likely to contribute to hydrogen production and NADH oxidation, including a bifurcating NADH-Fd hydrogenase (Clst_0900, Clst_0902-4) <abbrgrp>
					<abbr bid="B65">65</abbr>
				</abbrgrp> with similarity to the designated bifurcating hydrogenase in <it>C. thermocellum</it> ATCC 27405, and a dimeric three-way oxidoreductase involving NADH, ferredoxin and NADP (Clst_0500-1) <abbrgrp>
					<abbr bid="B47">47</abbr>
				</abbrgrp>. Further study will be required to determine which pathways are most highly expressed and how they are regulated during metabolism, resulting in observed end product profiles for <it>C. stercorarium</it> on cellobiose, xylose and xylan. These profiles largely confirm previous culture-based literature <abbrgrp>
					<abbr bid="B4">4</abbr>
					<abbr bid="B20">20</abbr>
					<abbr bid="B66">66</abbr>
					<abbr bid="B67">67</abbr>
				</abbrgrp>, including generally elevated production of ethanol relative to lactate and acetate. Further testing of the robustness and reproducibility of molecular techniques in relation to culture-based parameters will help to determine a theoretical baseline of expected values for mid-exponential cells, linking expression profiles to strain performance in terms of efficient substrate utilization and ethanol production.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st><p>We report a finished WGS for this well-characterized type strain in the context of detailed information about coverage of annotated gene regions using Illumina RNA-seq and high-throughput 2D MS/MS. To our knowledge, this is the first time a WGS has been enhanced using these advanced techniques. Our approach may represent an updated model for better definition of the molecular systems biology of an organism in an era where WGS have proliferated rapidly. Understanding the influence of environmental factors on expression of inter-connected enzymatic pathways will be critical to evaluate and improve ethanol production by selected organisms and consortia in consolidated bioprocessing.</p>
		</sec>
		<sec>
			<st>
				<p>Competing interests</p>
			</st><p>The authors declare that they have no competing interests.</p>
		</sec>
		<sec>
			<st>
				<p>Authors&#8217; contributions</p>
			</st><p>JS conceived the study, performed culture-based work, prepared cell extracts for molecular analysis, coordinated genome annotation and submission, conducted genome gap-closing, analyzed proteomic and transcriptomic data, and wrote the paper. TV conceived the study, carried out manual genome annotation and wrote the paper. PM and OVK prepared samples for mass spectrometry and carried out proteomics analysis. XZ, GA and BF coordinated bioinformatics for genomics, transcriptomics and proteomics. GGT coordinated genome improvements, including <it>in silico</it> gap-closing and integration of RNA-seq data. BH conducted genome annotation. JW and DB conceived the study and provided laboratory equipment/reagents. RS conceived the study, coordinated culture-based and molecular work, and wrote the paper. All authors read and approved the final manuscript.</p>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st><p>Vic Spicer conceived and implemented the proteogenomics workflow. This study was funded by Genome Canada as part of MGCB2: &#8220;Microbial Genomics for Biofuels and Co-products from Biorefining Processes&#8221; (<url>http://www.microbialrefinery.com</url>), and supported in part by the EU-FP7 COST Action SeqAhead [EC Grant BM1006] and the Austrian Centre of Industrial Biotechnology (ACIB) funded by FFG, BMWFJ, BMVIT, ZIT, SFG, and Zukunftsstiftung Tirol within the Austrian COMET program [FFG Grant 824186]. The authors would like to acknowledge scientists and staff at Genome Qu&#233;bec, McGill University, Montr&#233;al, Canada, for RNA processing and next-generation RNA sequencing, and at the Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, Winnipeg, Canada, for protein processing, liquid chromatography and mass spectrometry. Thanks to R.C. Carere, T. Rydzak, and N. Bjorklund for useful discussion.</p>
			</sec>
		</ack>
		<refgrp><bibl id="B1"><title><p>Plant cell walls to ethanol</p></title><aug><au><snm>Jordan</snm><fnm>DB</fnm></au><au><snm>Bowman</snm><fnm>MJ</fnm></au><au><snm>Braker</snm><fnm>JD</fnm></au><au><snm>Dien</snm><fnm>BS</fnm></au><au><snm>Hector</snm><fnm>RE</fnm></au><au><snm>Lee</snm><fnm>CC</fnm></au><au><snm>Mertens</snm><fnm>JA</fnm></au><au><snm>Wagschal</snm><fnm>K</fnm></au></aug><source>Biochem J</source><pubdate>2012</pubdate><volume>442</volume><fpage>241</fpage><lpage>252</lpage></bibl><bibl id="B2"><title><p>Challenges for biohydrogen production via direct lignocellulose fermentation</p></title><aug><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Carere</snm><fnm>CR</fnm></au><au><snm>Cicek</snm><fnm>N</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>Int J Hyd Energy</source><pubdate>2009</pubdate><volume>34</volume><fpage>7390</fpage><lpage>7403</lpage></bibl><bibl id="B3"><title><p>Bacterial cellulose hydrolysis in anaerobic environmental subsystems - <it>Clostridium thermocellum</it> and <it>Clostridium stercorarium</it>, thermophilic plant-fiber degraders</p></title><aug><au><snm>Zverlov</snm><fnm>VV</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au></aug><source>Ann N Y Acad Sci</source><pubdate>2008</pubdate><volume>1125</volume><fpage>298</fpage><lpage>307</lpage></bibl><bibl id="B4"><title><p>Molecular characterization of four strains of the cellulolytic thermophile <it>Clostridium stercorarium</it></p></title><aug><au><snm>Schwarz</snm><fnm>W</fnm></au><au><snm>Bronnenmeier</snm><fnm>K</fnm></au><au><snm>Landmann</snm><fnm>B</fnm></au><au><snm>Wanner</snm><fnm>G</fnm></au><au><snm>Staudenbauer</snm><fnm>W</fnm></au><au><snm>Kurose</snm><fnm>N</fnm></au><au><snm>Takayama</snm><fnm>T</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>1995</pubdate><volume>59</volume><fpage>1661</fpage><lpage>1665</lpage></bibl><bibl id="B5"><title><p>Combined inactivation of the <it>Clostridium cellulolyticum</it> lactate and malate 
dehydrogenase genes substantially increases ethanol yield from cellulose and switchgrass fermentations</p></title><aug><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Tschaplinski</snm><fnm>TJ</fnm></au><au><snm>Engle</snm><fnm>NL</fnm></au><au><snm>Hamilton</snm><fnm>CY</fnm></au><au><snm>Rodriguez</snm><fnm>M</fnm><suf>Jr</suf></au><au><snm>Liao</snm><fnm>JC</fnm></au><au><snm>Schadt</snm><fnm>CW</fnm></au><au><snm>Guss</snm><fnm>AM</fnm></au><au><snm>Yang</snm><fnm>Y</fnm></au><au><snm>Graham</snm><fnm>DE</fnm></au></aug><source>Biotechnol Biofuels</source><pubdate>2012</pubdate><volume>5</volume><fpage>2</fpage></bibl><bibl id="B6"><title><p>Engineered microbial systems for enhanced conversion of lignocellulosic biomass</p></title><aug><au><snm>Elkins</snm><fnm>JG</fnm></au><au><snm>Raman</snm><fnm>B</fnm></au><au><snm>Keller</snm><fnm>M</fnm></au></aug><source>Curr Opin Biotechnol</source><pubdate>2010</pubdate><volume>21</volume><fpage>657</fpage><lpage>662</lpage></bibl><bibl id="B7"><title><p>Improved ethanol production from various carbohydrates through anaerobic thermophilic co-culture</p></title><aug><au><snm>Xu</snm><fnm>L</fnm></au><au><snm>Tschirner</snm><fnm>U</fnm></au></aug><source>Biores Tech</source><pubdate>2011</pubdate><volume>102</volume><fpage>10065</fpage><lpage>10071</lpage></bibl><bibl id="B8"><title><p>Mechanisms of enhanced cellulosic bioethanol fermentation by co-cultivation of <it>Clostridium</it> and <it>Thermoanaerobacter</it> spp</p></title><aug><au><snm>He</snm><fnm>Q</fnm></au><au><snm>Hemme</snm><fnm>CL</fnm></au><au><snm>Jiang</snm><fnm>H</fnm></au><au><snm>He</snm><fnm>Z</fnm></au><au><snm>Zhou</snm><fnm>J</fnm></au></aug><source>Biores Tech</source><pubdate>2011</pubdate><volume>102</volume><fpage>9586</fpage><lpage>9592</lpage></bibl><bibl id="B9"><title><p>Consortia-mediated bioprocessing of cellulose to ethanol with a symbiotic <it>Clostridium phytofermentans</it>/yeast 
co-culture</p></title><aug><au><snm>Zuroff</snm><fnm>TR</fnm></au><au><snm>Xiques</snm><fnm>SB</fnm></au><au><snm>Curtis</snm><fnm>WR</fnm></au></aug><source>Biotechnol Biofuels</source><pubdate>2013</pubdate><volume>6</volume><fpage>59</fpage></bibl><bibl id="B10"><title><p>Linking genome content to biofuel production yields: a meta-analysis of major catabolic pathways among select H2 and ethanol-producing bacteria</p></title><aug><au><snm>Carere</snm><fnm>CR</fnm></au><au><snm>Rydzak</snm><fnm>T</fnm></au><au><snm>Verbeke</snm><fnm>TJ</fnm></au><au><snm>Cicek</snm><fnm>N</fnm></au><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>BMC Microbiol</source><pubdate>2012</pubdate><volume>12</volume><fpage>295</fpage></bibl><bibl id="B11"><title><p>Proteomic analysis of <it>Clostridium thermocellum</it> core metabolism: relative protein expression profiles and growth phase-dependent changes in protein expression</p></title><aug><au><snm>Rydzak</snm><fnm>T</fnm></au><au><snm>McQueen</snm><fnm>PD</fnm></au><au><snm>Krokhin</snm><fnm>OV</fnm></au><au><snm>Spicer</snm><fnm>V</fnm></au><au><snm>Ezzati</snm><fnm>P</fnm></au><au><snm>Dwivedi</snm><fnm>RC</fnm></au><au><snm>Shamshurin</snm><fnm>D</fnm></au><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Wilkins</snm><fnm>JA</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>BMC Microbiol</source><pubdate>2012</pubdate><volume>12</volume><fpage>214</fpage></bibl><bibl id="B12"><title><p>Proteome-wide systems analysis of a cellulosic biofuel-producing microbe</p></title><aug><au><snm>Tolonen</snm><fnm>AC</fnm></au><au><snm>Haas</snm><fnm>W</fnm></au><au><snm>Chilaka</snm><fnm>AC</fnm></au><au><snm>Aach</snm><fnm>J</fnm></au><au><snm>Gygi</snm><fnm>SP</fnm></au><au><snm>Church</snm><fnm>GM</fnm></au></aug><source>Mol Syst Biol</source><pubdate>2011</pubdate><volume>7</volume><fpage>461</fpage></bibl><bibl id="B13"><title><p><it>Clostridium thermocellum</it> ATCC27405 
transcriptomic, metabolomic and proteomic profiles after ethanol stress</p></title><aug><au><snm>Yang</snm><fnm>S</fnm></au><au><snm>Giannone</snm><fnm>RJ</fnm></au><au><snm>Dice</snm><fnm>L</fnm></au><au><snm>Yang</snm><fnm>ZK</fnm></au><au><snm>Engle</snm><fnm>NL</fnm></au><au><snm>Tschaplinski</snm><fnm>TJ</fnm></au><au><snm>Hettich</snm><fnm>RL</fnm></au><au><snm>Brown</snm><fnm>SD</fnm></au></aug><source>BMC Genomics</source><pubdate>2012</pubdate><volume>13</volume><fpage>336</fpage></bibl><bibl id="B14"><title><p>VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data</p></title><aug><au><snm>Peterson</snm><fnm>ES</fnm></au><au><snm>McCue</snm><fnm>LA</fnm></au><au><snm>Schrimpe-Rutledge</snm><fnm>AC</fnm></au><au><snm>Jensen</snm><fnm>JL</fnm></au><au><snm>Walker</snm><fnm>H</fnm></au><au><snm>Kobold</snm><fnm>MA</fnm></au><au><snm>Webb</snm><fnm>SR</fnm></au><au><snm>Payne</snm><fnm>SH</fnm></au><au><snm>Ansong</snm><fnm>C</fnm></au><au><snm>Adkins</snm><fnm>JN</fnm></au><au><snm>Cannon</snm><fnm>WR</fnm></au><au><snm>Webb-Robertson</snm><fnm>B-JM</fnm></au></aug><source>BMC Genomics</source><pubdate>2012</pubdate><volume>13</volume><fpage>131</fpage></bibl><bibl id="B15"><title><p>Interactive microbial genome visualization with GView</p></title><aug><au><snm>Petkau</snm><fnm>A</fnm></au><au><snm>Stuart-Edwards</snm><fnm>M</fnm></au><au><snm>Stothard</snm><fnm>P</fnm></au><au><snm>Van Domselaar</snm><fnm>G</fnm></au></aug><source>Bioinformatics</source><pubdate>2010</pubdate><volume>26</volume><fpage>3125</fpage><lpage>3126</lpage></bibl><bibl id="B16"><title><p>Prokaryotic whole-transcriptome analysis: deep sequencing and tiling arrays</p></title><aug><au><snm>Siezen</snm><fnm>RJ</fnm></au><au><snm>Wilson</snm><fnm>G</fnm></au><au><snm>Todt</snm><fnm>T</fnm></au></aug><source>Microb 
Biotechnol</source><pubdate>2010</pubdate><volume>3</volume><fpage>125</fpage><lpage>130</lpage></bibl><bibl id="B17"><title><p>Genomic evaluation of <it>Thermoanaerobacter</it> spp. for the construction of designer co-cultures to improve lignocellulosic biofuel production</p></title><aug><au><snm>Verbeke</snm><fnm>TJ</fnm></au><au><snm>Zhang</snm><fnm>X</fnm></au><au><snm>Henrissat</snm><fnm>B</fnm></au><au><snm>Spicer</snm><fnm>V</fnm></au><au><snm>Rydzak</snm><fnm>T</fnm></au><au><snm>Krokhin</snm><fnm>OV</fnm></au><au><snm>Fristensky</snm><fnm>B</fnm></au><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>PLoS One</source><pubdate>2013</pubdate><volume>8</volume><fpage>e59362</fpage></bibl><bibl id="B18"><title><p>Structure of the <it>Clostridium stercorarium</it> gene celY encoding the exo-1, 4-{beta}-glucanase Avicelase II</p></title><aug><au><snm>Bronnenmeier</snm><fnm>K</fnm></au><au><snm>Kundt</snm><fnm>K</fnm></au><au><snm>Riedel</snm><fnm>K</fnm></au></aug><source>Microbiol</source><pubdate>1997</pubdate><volume>143</volume><fpage>891</fpage><lpage>898</lpage></bibl><bibl id="B19"><title><p>Crystallization and preliminary X-ray analysis of xylanase B from <it>Clostridium stercorarium</it></p></title><aug><au><snm>Nishimoto</snm><fnm>M</fnm></au><au><snm>Fushinobu</snm><fnm>S</fnm></au><au><snm>Miyanaga</snm><fnm>A</fnm></au><au><snm>Wakagi</snm><fnm>T</fnm></au><au><snm>Shoun</snm><fnm>H</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au><au><snm>Nirasawa</snm><fnm>S</fnm></au><au><snm>Kitaoka</snm><fnm>M</fnm></au><au><snm>Hayashi</snm><fnm>K</fnm></au></aug><source>Acta Crystallogr D Biol Crystallogr</source><pubdate>2004</pubdate><volume>60</volume><fpage>342</fpage><lpage>343</lpage></bibl><bibl id="B20"><title><p>Enzyme system of <it>Clostridium stercorarium</it> for hydrolysis of arabinoxylan: reconstitution of the in vivo system from recombinant 
enzymes</p></title><aug><au><snm>Adelsberger</snm><fnm>H</fnm></au><au><snm>Hertel</snm><fnm>C</fnm></au><au><snm>Glawischnig</snm><fnm>E</fnm></au><au><snm>Zverlov</snm><fnm>VV</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au></aug><source>Microbiol</source><pubdate>2004</pubdate><volume>150</volume><fpage>2257</fpage><lpage>2266</lpage></bibl><bibl id="B21"><title><p>Nucleotide sequence of the <it>Clostridium stercorarium</it> xynB gene encoding an extremely thermostable xylanase, and characterization of the translated product</p></title><aug><au><snm>Fukumura</snm><fnm>M</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Shimada</snm><fnm>K</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>1995</pubdate><volume>59</volume><fpage>40</fpage><lpage>46</lpage></bibl><bibl id="B22"><title><p>A novel thermophilic pectate lyase containing two catalytic modules of <it>Clostridium stercorarium</it></p></title><aug><au><snm>Hla</snm><fnm>S</fnm></au><au><snm>Kurokawa</snm><fnm>K</fnm></au><au><snm>Suryani</snm><fnm></fnm></au><au><snm>Kimura</snm><fnm>T</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>2005</pubdate><volume>69</volume><fpage>2138</fpage><lpage>2145</lpage></bibl><bibl id="B23"><title><p>Sequence analysis of the <it>Clostridium stercorarium</it> celZ gene encoding a thermoactive cellulase (Avicelase I): identification of catalytic and cellulose-binding domains</p></title><aug><au><snm>Jauris</snm><fnm>S</fnm></au><au><snm>R&#252;cknagel</snm><fnm>KP</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au><au><snm>Kratzsch</snm><fnm>P</fnm></au><au><snm>Bronnenmeier</snm><fnm>K</fnm></au><au><snm>Staudenbauer</snm><fnm>WL</fnm></au></aug><source>Mol Gen Genet</source><pubdate>1990</pubdate><volume>223</volume><fpage>258</fpage><lpage>267</lpage></bibl><bibl id="B24"><title><p>Purification and properties of a cellobiose 
phosphorylase (CepA) and a cellodextrin phosphorylase (CepB) from the cellulolytic thermophile <it>Clostridium stercorarium</it></p></title><aug><au><snm>Reichenbecher</snm><fnm>M</fnm></au><au><snm>Lottspeich</snm><fnm>F</fnm></au><au><snm>Bronnenmeier</snm><fnm>K</fnm></au></aug><source>Eur J Biochem</source><pubdate>1997</pubdate><volume>247</volume><fpage>262</fpage><lpage>267</lpage></bibl><bibl id="B25"><title><p>Nucleotide sequence of the <it>Clostridium stercorarium</it> xylA gene encoding a bifunctional protein with &#946;-d-xylosidase and &#945;-L-arabinofuranosidase activities, and properties of the translated product</p></title><aug><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Yoshikawa</snm><fnm>K</fnm></au><au><snm>Kojima</snm><fnm>Y</fnm></au><au><snm>Karita</snm><fnm>S</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au><au><snm>Shimada</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>1993</pubdate><volume>57</volume><fpage>268</fpage><lpage>272</lpage></bibl><bibl id="B26"><title><p>
   Cloning and expression of <it>Clostridium stercorarium</it> cellulase genes in <it>Escherichia coli</it>
</p></title><aug><au><snm>Schwarz</snm><fnm>W</fnm></au><au><snm>Jauris</snm><fnm>S</fnm></au><au><snm>Kouba</snm><fnm>M</fnm></au><au><snm>Bronnenmeier</snm><fnm>K</fnm></au><au><snm>Staudenbauer</snm><fnm>WL</fnm></au></aug><source>Biotechnol Lett</source><pubdate>1989</pubdate><volume>11</volume><fpage>461</fpage><lpage>466</lpage></bibl><bibl id="B27"><title><p>
   Cloning, sequencing, and expression of the gene encoding the <it>Clostridium stercorarium</it> &#945;-Galactosidase Aga36A in <it>Escherichia coli</it>
</p></title><aug><au><snm>Suryani</snm><fnm></fnm></au><au><snm>Kimura</snm><fnm>T</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>2003</pubdate><volume>67</volume><fpage>2160</fpage><lpage>2166</lpage></bibl><bibl id="B28"><title><p>Sequencing and expression of the gene encoding the <it>Clostridium stercorarium</it> beta-xylosidase Xyl43B in Escherichia coli</p></title><aug><au><snm>Suryani</snm><fnm></fnm></au><au><snm>Kimura</snm><fnm>T</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>2004</pubdate><volume>68</volume><fpage>609</fpage><lpage>614</lpage></bibl><bibl id="B29"><title><p>Nucleotide sequence of arfB of <it>Clostridium stercorarium</it>, and prediction of catalytic residues of &#945;&#8210;L&#8210;arabinofuranosidases based on local similarity with several families of glycosyl hydrolases</p></title><aug><au><snm>Zverlov</snm><fnm>VV</fnm></au><au><snm>Liebl</snm><fnm>W</fnm></au><au><snm>Bachleitner</snm><fnm>M</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au></aug><source>FEMS Microbiol Lett</source><pubdate>1998</pubdate><volume>164</volume><fpage>337</fpage><lpage>343</lpage></bibl><bibl id="B30"><title><p>The thermostable &#945;&#8210;L&#8210;rhamnosidase RamA of <it>Clostridium stercorarium</it>: biochemical characterization and primary structure of a bacterial &#945;&#8210;L&#8210;rhamnoside hydrolase, a new type of inverting glycoside hydrolase</p></title><aug><au><snm>Zverlov</snm><fnm>V</fnm></au><au><snm>Hertel</snm><fnm>C</fnm></au><au><snm>Bronnenmeier</snm><fnm>K</fnm></au><au><snm>Hroch</snm><fnm>A</fnm></au><au><snm>Kellermann</snm><fnm>J</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au></aug><source>Mol Microbiol</source><pubdate>2000</pubdate><volume>35</volume><fpage>173</fpage><lpage>179</lpage></bibl><bibl id="B31"><title><p>Cloning, sequencing, and 
expression of the gene encoding the <it>Clostridium stercorarium</it> xylanase C in <it>Escherichia coli</it></p></title><aug><au><snm>Ali</snm><fnm>MK</fnm></au><au><snm>Fukumura</snm><fnm>M</fnm></au><au><snm>Sakano</snm><fnm>K</fnm></au><au><snm>Karita</snm><fnm>S</fnm></au><au><snm>Kimura</snm><fnm>T</fnm></au><au><snm>Sakka</snm><fnm>K</fnm></au><au><snm>Ohmiya</snm><fnm>K</fnm></au></aug><source>Biosci Biotech Biochem</source><pubdate>1999</pubdate><volume>63</volume><fpage>1596</fpage><lpage>1604</lpage></bibl><bibl id="B32"><title><p>R: A Language and Environment for Statistical Computing</p></title><aug><au><cnm>R Core Team</cnm></au></aug><source>R Foundation for Statistical Computing</source><pubdate>2013</pubdate><note>Vienna, Austria</note></bibl><bibl id="B33"><title><p>Origin of replication in circular prokaryotic chromosomes</p></title><aug><au><snm>Worning</snm><fnm>P</fnm></au><au><snm>Jensen</snm><fnm>LJ</fnm></au><au><snm>Hallin</snm><fnm>PF</fnm></au><au><snm>St&#230;rfeldt</snm><fnm>HH</fnm></au><au><snm>Ussery</snm><fnm>DW</fnm></au></aug><source>Environ Microbiol</source><pubdate>2006</pubdate><volume>8</volume><issue>2</issue><fpage>353</fpage><lpage>361</lpage></bibl><bibl id="B34"><title><p>IMG ER: a system for microbial genome annotation expert review and curation</p></title><aug><au><snm>Markowitz</snm><fnm>VM</fnm></au><au><snm>Mavromatis</snm><fnm>K</fnm></au><au><snm>Ivanova</snm><fnm>NN</fnm></au><au><snm>Chen</snm><fnm>I-MA</fnm></au><au><snm>Chu</snm><fnm>K</fnm></au><au><snm>Kyrpides</snm><fnm>NC</fnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>25</volume><fpage>2271</fpage><lpage>2278</lpage></bibl><bibl id="B35"><title><p>The Carbohydrate-Active EnZymes database (CAZy): an expert resource for 
Glycogenomics</p></title><aug><au><snm>Cantarel</snm><fnm>BL</fnm></au><au><snm>Coutinho</snm><fnm>PM</fnm></au><au><snm>Rancurel</snm><fnm>C</fnm></au><au><snm>Bernard</snm><fnm>T</fnm></au><au><snm>Lombard</snm><fnm>V</fnm></au><au><snm>Henrissat</snm><fnm>B</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>D233</fpage><lpage>D238</lpage></bibl><bibl id="B36"><title><p>The transporter classification database: recent advances</p></title><aug><au><snm>Saier</snm><fnm>MH</fnm></au><au><snm>Yen</snm><fnm>MR</fnm></au><au><snm>Noto</snm><fnm>K</fnm></au><au><snm>Tamang</snm><fnm>DG</fnm></au><au><snm>Elkan</snm><fnm>C</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>D274</fpage><lpage>D278</lpage></bibl><bibl id="B37"><title><p>ABCdb: an online resource for ABC transporter repertories from sequenced archaeal and bacterial genomes</p></title><aug><au><snm>Fichant</snm><fnm>G</fnm></au><au><snm>Basse</snm><fnm>M-J</fnm></au><au><snm>Quentin</snm><fnm>Y</fnm></au></aug><source>FEMS Microbiol Lett</source><pubdate>2007</pubdate><volume>256</volume><fpage>333</fpage><lpage>339</lpage></bibl><bibl id="B38"><title><p>The surprising diversity of clostridial hydrogenases: a comparative genomic perspective</p></title><aug><au><snm>Calusinska</snm><fnm>M</fnm></au><au><snm>Happe</snm><fnm>T</fnm></au><au><snm>Joris</snm><fnm>B</fnm></au><au><snm>Wilmotte</snm><fnm>A</fnm></au></aug><source>Microbiol</source><pubdate>2010</pubdate><volume>156</volume><fpage>1575</fpage><lpage>1588</lpage></bibl><bibl id="B39"><title><p>GenePRIMP: a gene prediction improvement pipeline for prokaryotic 
genomes</p></title><aug><au><snm>Pati</snm><fnm>A</fnm></au><au><snm>Ivanova</snm><fnm>NN</fnm></au><au><snm>Mikhailova</snm><fnm>N</fnm></au><au><snm>Ovchinnikova</snm><fnm>G</fnm></au><au><snm>Hooper</snm><fnm>SD</fnm></au><au><snm>Lykidis</snm><fnm>A</fnm></au><au><snm>Kyrpides</snm><fnm>NC</fnm></au></aug><source>Nat Meth</source><pubdate>2010</pubdate><volume>7</volume><fpage>455</fpage><lpage>457</lpage></bibl><bibl id="B40"><title><p>Complete genome sequence of <it>Clostridium stercorarium</it> subsp. <it>stercorarium</it> strain DSM 8532, a thermophilic degrader of plant cell wall fibers</p></title><aug><au><snm>Poehlein</snm><fnm>A</fnm></au><au><snm>Zverlov</snm><fnm>VV</fnm></au><au><snm>Daniel</snm><fnm>R</fnm></au><au><snm>Schwarz</snm><fnm>WH</fnm></au><au><snm>Liebl</snm><fnm>W</fnm></au></aug><source>Genome Announc</source><pubdate>2013</pubdate><volume>1</volume><fpage>e00073</fpage><lpage>13</lpage></bibl><bibl id="B41"><title><p>TopHat: discovering splice junctions with RNA-Seq</p></title><aug><au><snm>Trapnell</snm><fnm>C</fnm></au><au><snm>Pachter</snm><fnm>L</fnm></au><au><snm>Salzberg</snm><fnm>SL</fnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>25</volume><fpage>1105</fpage><lpage>1111</lpage></bibl><bibl id="B42"><title><p>The sequence alignment/map format and SAMtools</p></title><aug><au><snm>Li</snm><fnm>H</fnm></au><au><snm>Handsaker</snm><fnm>B</fnm></au><au><snm>Wysoker</snm><fnm>A</fnm></au><au><snm>Fennell</snm><fnm>T</fnm></au><au><snm>Ruan</snm><fnm>J</fnm></au><au><snm>Homer</snm><fnm>N</fnm></au><au><snm>Marth</snm><fnm>G</fnm></au><au><snm>Abecasis</snm><fnm>G</fnm></au><au><snm>Durbin</snm><fnm>R</fnm></au><au><cnm>1000 Genome Project Data Processing Subgroup</cnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>25</volume><fpage>2078</fpage><lpage>2079</lpage></bibl><bibl id="B43"><title><p>End-product induced metabolic shifts in <it>Clostridium thermocellum</it> ATCC 
27405</p></title><aug><au><snm>Rydzak</snm><fnm>T</fnm></au><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Cicek</snm><fnm>N</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>Appl Microbiol Biotechnol</source><pubdate>2011</pubdate><volume>92</volume><fpage>199</fpage><lpage>209</lpage></bibl><bibl id="B44"><title><p><it>Thermoanaerobacter thermohydrosulfuricus</it> WC1 shows protein complement stability during fermentation of key lignocellulose-derived substrates</p></title><aug><au><snm>Verbeke</snm><fnm>TJ</fnm></au><au><snm>Spicer</snm><fnm>V</fnm></au><au><snm>Krokhin</snm><fnm>OV</fnm></au><au><snm>Zhang</snm><fnm>X</fnm></au><au><snm>Schellenberg</snm><fnm>JJ</fnm></au><au><snm>Fristensky</snm><fnm>B</fnm></au><au><snm>Wilkins</snm><fnm>JA</fnm></au><au><snm>Levin</snm><fnm>DB</fnm></au><au><snm>Sparling</snm><fnm>R</fnm></au></aug><source>Appl Environ Microbiol</source><pubdate>2014</pubdate><volume>80</volume><fpage>1602</fpage><lpage>1615</lpage></bibl><bibl id="B45"><title><p>Practical implementation of 2D HPLC scheme with accurate peptide retention prediction in both dimensions for high-throughput bottom-up proteomics</p></title><aug><au><snm>Dwivedi</snm><fnm>RC</fnm></au><au><snm>Spicer</snm><fnm>V</fnm></au><au><snm>Harder</snm><fnm>M</fnm></au><au><snm>Antonovici</snm><fnm>M</fnm></au><au><snm>Ens</snm><fnm>W</fnm></au><au><snm>Standing</snm><fnm>KG</fnm></au><au><snm>Wilkins</snm><fnm>JA</fnm></au><au><snm>Krokhin</snm><fnm>OV</fnm></au></aug><source>Anal Chem</source><pubdate>2008</pubdate><volume>80</volume><fpage>7036</fpage><lpage>7042</lpage></bibl><bibl id="B46"><title><p>A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes</p></title><aug><au><snm>Feny&#246;</snm><fnm>D</fnm></au><au><snm>Beavis</snm><fnm>RC</fnm></au></aug><source>Anal Chem</source><pubdate>2003</pubdate><volume>75</volume><fpage>768</fpage><lpage>774</lpage></bibl><bibl 
id="B47"><title><p>NADP<sup>+</sup> reduction with reduced ferredoxin and NADP<sup>+</sup> reduction with NADH are coupled via an electron-bifurcating enzyme complex in <it>Clostridium kluyveri</it></p></title><aug><au><snm>Wang</snm><fnm>S</fnm></au><au><snm>Huang</snm><fnm>H</fnm></au><au><snm>Moll</snm><fnm>J</fnm></au><au><snm>Thauer</snm><fnm>RK</fnm></au></aug><source>J Bacteriol</source><pubdate>2010</pubdate><volume>192</volume><fpage>5115</fpage><lpage>5123</lpage></bibl><bibl id="B48"><title><p>Complete genome and proteome of <it>Acholeplasma laidlawii</it></p></title><aug><au><snm>Lazarev</snm><fnm>VN</fnm></au><au><snm>Levitskii</snm><fnm>SA</fnm></au><au><snm>Basovskii</snm><fnm>YI</fnm></au><au><snm>Chukin</snm><fnm>MM</fnm></au><au><snm>Akopian</snm><fnm>TA</fnm></au><au><snm>Vereshchagin</snm><fnm>VV</fnm></au><au><snm>Kostrjukova</snm><fnm>ES</fnm></au><au><snm>Kovaleva</snm><fnm>GY</fnm></au><au><snm>Kazanov</snm><fnm>MD</fnm></au><au><snm>Malko</snm><fnm>DB</fnm></au><au><snm>Vitreschak</snm><fnm>AG</fnm></au><au><snm>Sernova</snm><fnm>NV</fnm></au><au><snm>Gelfand</snm><fnm>MS</fnm></au><au><snm>Demina</snm><fnm>IA</fnm></au><au><snm>Serebryakova</snm><fnm>MV</fnm></au><au><snm>Galyamina</snm><fnm>MA</fnm></au><au><snm>Vtyurin</snm><fnm>NN</fnm></au><au><snm>Rogov</snm><fnm>SI</fnm></au><au><snm>Alexeev</snm><fnm>DG</fnm></au><au><snm>Ladygina</snm><fnm>VG</fnm></au><au><snm>Govorun</snm><fnm>VM</fnm></au></aug><source>J Bacteriol</source><pubdate>2011</pubdate><volume>193</volume><fpage>4943</fpage><lpage>4953</lpage></bibl><bibl id="B49"><title><p>Comparative multi-omics systems analysis of <it>Escherichia coli</it> strains B and 
K-12</p></title><aug><au><snm>Yoon</snm><fnm>SH</fnm></au><au><snm>Han</snm><fnm>M-J</fnm></au><au><snm>Jeong</snm><fnm>H</fnm></au><au><snm>Lee</snm><fnm>CH</fnm></au><au><snm>Xia</snm><fnm>X-X</fnm></au><au><snm>Lee</snm><fnm>D-H</fnm></au><au><snm>Shim</snm><fnm>JH</fnm></au><au><snm>Lee</snm><fnm>SY</fnm></au><au><snm>Oh</snm><fnm>TK</fnm></au><au><snm>Kim</snm><fnm>JF</fnm></au></aug><source>Genome Biol</source><pubdate>2012</pubdate><volume>13</volume><fpage>R37</fpage></bibl><bibl id="B50"><title><p>Comparative omics-driven genome annotation refinement: application across <it>Yersiniae</it></p></title><aug><au><snm>Schrimpe-Rutledge</snm><fnm>AC</fnm></au><au><snm>Jones</snm><fnm>MB</fnm></au><au><snm>Chauhan</snm><fnm>S</fnm></au><au><snm>Purvine</snm><fnm>SO</fnm></au><au><snm>Sanford</snm><fnm>JA</fnm></au><au><snm>Monroe</snm><fnm>ME</fnm></au><au><snm>Brewer</snm><fnm>HM</fnm></au><au><snm>Payne</snm><fnm>SH</fnm></au><au><snm>Ansong</snm><fnm>C</fnm></au><au><snm>Frank</snm><fnm>BC</fnm></au><au><snm>Smith</snm><fnm>RD</fnm></au><au><snm>Peterson</snm><fnm>SN</fnm></au><au><snm>Motin</snm><fnm>VL</fnm></au><au><snm>Adkins</snm><fnm>JN</fnm></au></aug><source>PLoS One</source><pubdate>2012</pubdate><volume>7</volume><fpage>e33903</fpage></bibl><bibl id="B51"><title><p>Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in <it>Clostridium acetobutylicum</it> EA 2018</p></title><aug><au><snm>Hu</snm><fnm>S</fnm></au><au><snm>Zheng</snm><fnm>H</fnm></au><au><snm>Gu</snm><fnm>Y</fnm></au><au><snm>Zhao</snm><fnm>J</fnm></au><au><snm>Zhang</snm><fnm>W</fnm></au><au><snm>Yang</snm><fnm>Y</fnm></au><au><snm>Wang</snm><fnm>S</fnm></au><au><snm>Zhao</snm><fnm>G</fnm></au><au><snm>Yang</snm><fnm>S</fnm></au><au><snm>Jiang</snm><fnm>W</fnm></au></aug><source>BMC Genomics</source><pubdate>2011</pubdate><volume>12</volume><fpage>93</fpage></bibl><bibl id="B52"><title><p>Evidence-based 
annotation of gene function in <it>Shewanella oneidensis</it> MR-1 using genome-wide fitness profiling across 121 conditions</p></title><aug><au><snm>Deutschbauer</snm><fnm>A</fnm></au><au><snm>Price</snm><fnm>MN</fnm></au><au><snm>Wetmore</snm><fnm>KM</fnm></au><au><snm>Shao</snm><fnm>W</fnm></au><au><snm>Baumohl</snm><fnm>JK</fnm></au><au><snm>Xu</snm><fnm>Z</fnm></au><au><snm>Nguyen</snm><fnm>M</fnm></au><au><snm>Tamse</snm><fnm>R</fnm></au><au><snm>Davis</snm><fnm>RW</fnm></au><au><snm>Arkin</snm><fnm>AP</fnm></au></aug><source>PLoS Genet</source><pubdate>2011</pubdate><volume>7</volume><fpage>e1002385</fpage></bibl><bibl id="B53"><title><p>Improved genome annotation for <it>Zymomonas mobilis</it></p></title><aug><au><snm>Yang</snm><fnm>S</fnm></au><au><snm>Pappas</snm><fnm>KM</fnm></au><au><snm>Hauser</snm><fnm>LJ</fnm></au><au><snm>Land</snm><fnm>ML</fnm></au><au><snm>Chen</snm><fnm>G-L</fnm></au><au><snm>Hurst</snm><fnm>GB</fnm></au><au><snm>Pan</snm><fnm>C</fnm></au><au><snm>Kouvelis</snm><fnm>VN</fnm></au><au><snm>Typas</snm><fnm>MA</fnm></au><au><snm>Pelletier</snm><fnm>DA</fnm></au><au><snm>Klingeman</snm><fnm>DM</fnm></au><au><snm>Chang</snm><fnm>Y-J</fnm></au><au><snm>Samatova</snm><fnm>NF</fnm></au><au><snm>Brown</snm><fnm>SD</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2009</pubdate><volume>27</volume><fpage>893</fpage><lpage>894</lpage></bibl><bibl id="B54"><title><p>How deep is deep enough for RNA-Seq profiling of bacterial transcriptomes?</p></title><aug><au><snm>Haas</snm><fnm>BJ</fnm></au><au><snm>Chin</snm><fnm>M</fnm></au><au><snm>Nusbaum</snm><fnm>C</fnm></au><au><snm>Birren</snm><fnm>BW</fnm></au><au><snm>Livny</snm><fnm>J</fnm></au></aug><source>BMC Genomics</source><pubdate>2012</pubdate><volume>13</volume><fpage>734</fpage></bibl><bibl id="B55"><title><p>Optimizing hybrid assembly of next-generation sequence data from <it>Enterococcus faecium</it>: a microbe with highly divergent 
genome</p></title><aug><au><snm>Wang</snm><fnm>Y</fnm></au><au><snm>Yu</snm><fnm>Y</fnm></au><au><snm>Pan</snm><fnm>B</fnm></au><au><snm>Hao</snm><fnm>P</fnm></au><au><snm>Li</snm><fnm>Y</fnm></au><au><snm>Shao</snm><fnm>Z</fnm></au><au><snm>Xu</snm><fnm>X</fnm></au><au><snm>Li</snm><fnm>X</fnm></au></aug><source>BMC Syst Biol</source><pubdate>2012</pubdate><volume>6</volume><fpage>S21</fpage></bibl><bibl id="B56"><title><p>Quantitative iTRAQ secretome analysis of Cellulolytic <it>Thermobifida fusca</it></p></title><aug><au><snm>Adav</snm><fnm>SS</fnm></au><au><snm>Ng</snm><fnm>CS</fnm></au><au><snm>Arulmani</snm><fnm>M</fnm></au><au><snm>Sze</snm><fnm>SK</fnm></au></aug><source>J Proteome Res</source><pubdate>2010</pubdate><volume>9</volume><fpage>3016</fpage><lpage>3024</lpage></bibl><bibl id="B57"><title><p>Quantitative proteomics as a new piece of the systems biology puzzle</p></title><aug><au><snm>Bachi</snm><fnm>A</fnm></au><au><snm>Bonaldi</snm><fnm>T</fnm></au></aug><source>J Proteomics</source><pubdate>2008</pubdate><volume>71</volume><fpage>357</fpage><lpage>367</lpage></bibl><bibl id="B58"><title><p>
Quantifying <it>E. coli</it> proteome and transcriptome with single-molecule sensitivity in single cells
</p></title><aug><au><snm>Taniguchi</snm><fnm>Y</fnm></au><au><snm>Choi</snm><fnm>PJ</fnm></au><au><snm>Li</snm><fnm>GW</fnm></au><au><snm>Chen</snm><fnm>H</fnm></au><au><snm>Babu</snm><fnm>M</fnm></au><au><snm>Hearn</snm><fnm>J</fnm></au><au><snm>Emili</snm><fnm>A</fnm></au><au><snm>Xie</snm><fnm>XS</fnm></au></aug><source>Science</source><pubdate>2010</pubdate><volume>329</volume><fpage>533</fpage><lpage>538</lpage></bibl><bibl id="B59"><title><p>Transcriptome and proteome exploration to model translation efficiency and protein stability in <it>Lactococcus lactis</it></p></title><aug><au><snm>Dressaire</snm><fnm>C</fnm></au><au><snm>Gitton</snm><fnm>C</fnm></au><au><snm>Loubi&#232;re</snm><fnm>P</fnm></au><au><snm>Monnet</snm><fnm>V</fnm></au><au><snm>Queinnec</snm><fnm>I</fnm></au><au><snm>Cocaign-Bousquet</snm><fnm>M</fnm></au></aug><source>PLoS Comput Biol</source><pubdate>2009</pubdate><volume>5</volume><fpage>e1000606</fpage></bibl><bibl id="B60"><title><p>Structure, function, and evolution of bacterial ATP-binding cassette systems</p></title><aug><au><snm>Davidson</snm><fnm>AL</fnm></au><au><snm>Dassa</snm><fnm>E</fnm></au><au><snm>Orelle</snm><fnm>C</fnm></au><au><snm>Chen</snm><fnm>J</fnm></au></aug><source>Microbiol Mol Biol Rev</source><pubdate>2008</pubdate><volume>72</volume><fpage>317</fpage><lpage>364</lpage></bibl><bibl id="B61"><title><p>Chloroplast class I and class II aldolases are bifunctional for fructose-1,6-biphosphate and sedoheptulose-1,7-biphosphate cleavage in the Calvin cycle</p></title><aug><au><snm>Flechner</snm><fnm>A</fnm></au><au><snm>Gross</snm><fnm>W</fnm></au><au><snm>Martin</snm><fnm>WF</fnm></au><au><snm>Schnarrenberger</snm><fnm>C</fnm></au></aug><source>FEBS Lett</source><pubdate>1999</pubdate><volume>447</volume><fpage>200</fpage><lpage>202</lpage></bibl><bibl id="B62"><title><p>Systematic phenome analysis of <it>Escherichia coli</it> multiple-knockout mutants reveals hidden reactions in central carbon 
metabolism</p></title><aug><au><snm>Nakahigashi</snm><fnm>K</fnm></au><au><snm>Toya</snm><fnm>Y</fnm></au><au><snm>Ishii</snm><fnm>N</fnm></au><au><snm>Soga</snm><fnm>T</fnm></au><au><snm>Hasegawa</snm><fnm>M</fnm></au><au><snm>Watanabe</snm><fnm>H</fnm></au><au><snm>Takai</snm><fnm>Y</fnm></au><au><snm>Honma</snm><fnm>M</fnm></au><au><snm>Mori</snm><fnm>H</fnm></au><au><snm>Tomita</snm><fnm>M</fnm></au></aug><source>Mol Syst Biol</source><pubdate>2009</pubdate><volume>5</volume><fpage>306</fpage></bibl><bibl id="B63"><title><p><it>Acetivibrio cellulosolvens</it> is a synonym for <it>Acetivibrio cellulolyticus</it>: emendation of the genus <it>Acetivibrio</it></p></title><aug><au><snm>Murray</snm><fnm>WD</fnm></au></aug><source>Int J Syst Evol Microbiol</source><pubdate>1986</pubdate><volume>36</volume><fpage>314</fpage><lpage>316</lpage></bibl><bibl id="B64"><title><p>Rampant horizontal gene transfer and phospho-donor change in the evolution of the phosphofructokinase</p></title><aug><au><snm>Bapteste</snm><fnm>E</fnm></au><au><snm>Moreira</snm><fnm>D</fnm></au><au><snm>Philippe</snm><fnm>H</fnm></au></aug><source>Gene</source><pubdate>2003</pubdate><volume>318</volume><fpage>185</fpage><lpage>191</lpage></bibl><bibl id="B65"><title><p>The iron-hydrogenase of <it>Thermotoga maritima</it> Utilizes Ferredoxin and NADH synergistically: a new perspective on anaerobic hydrogen production</p></title><aug><au><snm>Schut</snm><fnm>GJ</fnm></au><au><snm>Adams</snm><fnm>MWW</fnm></au></aug><source>J Bacteriol</source><pubdate>2009</pubdate><volume>191</volume><fpage>4451</fpage><lpage>4457</lpage></bibl><bibl id="B66"><title><p>Isolation and characterization of <it>Clostridium stercorarium</it> sp. 
nov., cellulolytic thermophile</p></title><aug><au><snm>Madden</snm><fnm>R</fnm></au></aug><source>Int J Syst Bacteriol</source><pubdate>1983</pubdate><volume>33</volume><fpage>837</fpage><lpage>840</lpage></bibl><bibl id="B67"><title><p>Transfer of <it>Thermobacteroides leptospartum</it> and <it>Clostridium thermolacticum</it> as <it>Clostridium stercorarium</it> subsp. <it>leptospartum</it> subsp. nov., comb. nov. and <it>C. stercorarium</it> subsp. <it>thermolacticum</it> subsp. nov., comb. nov</p></title><aug><au><snm>Fardeau</snm><fnm>M</fnm></au><au><snm>Ollivier</snm><fnm>B</fnm></au><au><snm>Garcia</snm><fnm>J</fnm></au><au><snm>Patel</snm><fnm>B</fnm></au></aug><source>Int J Syst Evol Microbiol</source><pubdate>2001</pubdate><volume>51</volume><fpage>1127</fpage><lpage>1131</lpage></bibl></refgrp>
	</bm>
</art>