Ghali, F, Krishna, R, Perkins, S, Collins, A, Xia, D, Wastling, J and Jones, AR (2014) ProteoAnnotator: open source proteogenomics annotation software supporting PSI standards. Proteomics, 14 (23-24). 2731-2741. ISSN 1615-9853

J Wastling - ProteoAnnotator - Open Source proteogenomics annotation software supporting PSI standards.pdf - Published Version
Available under License Creative Commons Attribution.

The recent massive increase in genome sequencing capability is producing enormous advances in our understanding of biological systems. However, there is a bottleneck in genome annotation, that is, in determining the structure of all transcribed genes. Experimental data from MS studies can play a major role in confirming and correcting gene structure, an approach known as proteogenomics. There are technical and practical challenges to overcome, since proteogenomics requires pipelines comprising a complex set of interconnected modules as well as bespoke routines, for example in protein inference and statistics. We introduce a complete, open source pipeline for proteogenomics, called ProteoAnnotator, which incorporates a graphical user interface and implements the Proteomics Standards Initiative mzIdentML standard at each analysis stage. All steps are included as standalone modules with the mzIdentML library, allowing other groups to re-use the whole pipeline or its constituent parts within other tools. We have developed new modules for pre-processing and combining multiple search databases; for performing peptide-level statistics on mzIdentML files; for scoring grouped protein identifications matched to a given genomic locus, to validate that updates to the official gene models are statistically sound; and for mapping end results back onto the genome. ProteoAnnotator is available from [URL missing in source]. All MS data have been deposited in the ProteomeXchange with identifiers PXD001042 and PXD001390.

Item Type: Article
Uncontrolled Keywords: Open source, ProteoAnnotator, Proteogenomics, Proteomics Standards Initiative, mzIdentML, Genomics, Proteins, Proteomics, Software

Subjects: Q Science > QH Natural history

Divisions: Faculty of Natural Sciences > School of Life Sciences
Related URLs:
Depositing User: Symplectic
Date Deposited: 18 May 2017 08:40
Last Modified: 18 May 2017 08:40
