Data Analysis

Proteogenomic analysis of Toxoplasma gondii (comparing genome releases 6 and 10) to demonstrate the performance of the ProteoAnnotator software


Project Description: This project tests the performance of the ProteoAnnotator software by comparing the annotation of Toxoplasma gondii gene models from genome releases 6 and 10.

Sample processing protocol: The samples were generated from Toxoplasma gondii strain RH parasites. Tachyzoites were separated by 1D SDS-PAGE on a 12% (v/v) acrylamide gel, from which 16 gel bands were excised and digested with trypsin. The digests were then pooled into eight samples for LC-MS/MS analysis. Peptide mixtures were analyzed by on-line nanoflow liquid chromatography using the nanoACQUITY-nLC system (Waters MS technologies, Manchester, UK) coupled to an LTQ-Orbitrap Velos (ThermoFisher Scientific, Bremen, Germany) mass spectrometer equipped with the manufacturer's nanospray ion source.


Data processing protocol: Thermo raw files were converted to MGF for searching using ProteoWizard. Searches were performed with the ProteoAnnotator pipeline, which embeds the OMSSA and X!Tandem search engines, wrapped by the SearchGUI software. Search parameters were: precursor tolerance 5 ppm; fragment tolerance 0.5 Da (the default); fixed modification of carbamidomethyl on cysteine; and variable modification of oxidation of methionine. Other parameters were left at their defaults, as described on the SearchGUI website. Post-processing involved combining the search engine results according to PMID: 19253293 and performing protein inference using an update to the algorithm described in PMID: 23813117, followed by bespoke statistical processing developed for the ProteoAnnotator software.

Several different search databases were used. First, a combined search database was created from Toxoplasma gondii ME49 strain gene models release 6 (downloaded from EuPathDB), predicted gene models from AUGUSTUS, and predicted gene models from GLIMMER (both sets of predictions built from the Toxoplasma gondii ME49 genome sequence). Second, the data were searched against Toxoplasma gondii ME49 strain gene models release 10 (downloaded from EuPathDB) combined with the same two sets of predictions (AUGUSTUS and GLIMMER) as before, to demonstrate the improvement in genome annotation that has occurred between release 6 (2009) and release 10 of the T. gondii genome.
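To make the two tolerances quoted in the search parameters concrete, the following is a minimal Python sketch of how a relative (ppm) precursor tolerance and an absolute (Da) fragment tolerance translate into matching windows. It is an illustration only and is not taken from ProteoAnnotator, SearchGUI, OMSSA, or X!Tandem; the function names (ppm_window, within_ppm, within_da) are hypothetical, and each search engine applies its own matching rules internally.

```python
# Illustrative sketch only: hypothetical helper names, not part of any
# search engine or of the ProteoAnnotator pipeline.

PRECURSOR_TOL_PPM = 5.0   # precursor tolerance stated in the protocol
FRAGMENT_TOL_DA = 0.5     # fragment tolerance stated in the protocol


def ppm_window(mz, tol_ppm=PRECURSOR_TOL_PPM):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed_mz, theoretical_mz, tol_ppm=PRECURSOR_TOL_PPM):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def within_da(observed_mz, theoretical_mz, tol_da=FRAGMENT_TOL_DA):
    """True if the observed m/z lies within an absolute tolerance in Da."""
    return abs(observed_mz - theoretical_mz) <= tol_da


if __name__ == "__main__":
    low, high = ppm_window(1000.0)
    print(f"5 ppm window at m/z 1000: {low:.4f} to {high:.4f}")
    print(within_ppm(1000.004, 1000.0))  # 4 ppm error
    print(within_ppm(1000.006, 1000.0))  # 6 ppm error
    print(within_da(500.3, 500.0))       # 0.3 Da error
```

Because the ppm tolerance scales with m/z, the precursor window at m/z 1000 is only about 0.005 wide on each side, whereas the 0.5 Da fragment window is fixed regardless of mass.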