WP2: Standards development
Workpackage No.:
WP2
Workpackage Title:
Standards development
Activity Type:
COORD
Objectives:
The work package will develop and maintain the standard formats required for capturing and sharing data from
mass-spectrometry (MS) based proteomics studies, building upon the efforts of the HUPO PSI. At the moment,
the PSI is developing four XML-based data formats, each capturing a different component of the proteomics
workflow: mzML (mass spectra, version 1.1), TraML (transitions from multiple/selected reaction monitoring
experiments, version 0.9), mzIdentML (peptide and protein identifications, version 1.0) and mzQuantML
(quantification of peptides and proteins from mass spectra, initial requirements drafted).
The main aim of this work package is to develop and release mzQuantML to capture data from all main
quantitative methods in mass spectrometry proteomics. The WP will also contribute to the ongoing
maintenance of release versions of mzML, TraML and mzIdentML, helping with bug fixes, updates to controlled
vocabularies and documentation.
Description:
Description of work
Task 1: Development of mzQuantML
The WP will drive the development of the XML Schema (XSD) for mzQuantML in the context of the PSI Proteome
Informatics (PSI-PI) work group (http://www.psidev.info/index.php?q=node/319). The task of developing a
standard format for quantitative data is considerable, due to the range of different methods employed for
quantifying peptides and proteins from mass spectra, for example based on differential labeling of mixed
samples or by label-free analysis of parallel runs (2).
At the last spring PSI meeting (April 2009), it was decided that mzQuantML would be developed in a modular
manner. First, a core format will be defined to capture overall relative or absolute values of peptides and
proteins, and to relate these values back to the originating samples. Small modular extensions of the core will
then be released to capture the required metadata describing how the values were calculated. It is generally
accepted that standard formats are difficult to define until a particular technology or methodology has stabilized.
As such, modules will be released first for well-established quantitative techniques (Level 1: spectral counting,
differential isotope labeling). The technologies that are more challenging to represent (Level 2: label-free
analysis based on extracted ion chromatograms) will be developed next. New methods for MS-based quantitation
are reported on a monthly basis. Those methods that become well-established in the coming years will be
developed as Level 3 of mzQuantML.
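The layering described above (a core that ties each value to its originating sample, plus an optional module describing how the values were calculated) can be pictured with a minimal sketch. This is only an illustration under stated assumptions: the class and field names below are hypothetical and are not taken from any mzQuantML draft, and Python dataclasses stand in for what would actually be XML Schema types.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class QuantValue:
    """One relative or absolute value for a peptide or protein in one sample."""
    analyte_id: str    # hypothetical peptide or protein identifier
    sample_id: str     # links the value back to its originating sample
    value: float
    is_relative: bool  # True for ratios, False for absolute amounts


@dataclass
class MethodModule:
    """Hypothetical extension holding metadata on how values were calculated."""
    level: int         # 1, 2 or 3, following the staged release plan
    technique: str     # e.g. "spectral counting"
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class QuantDocument:
    """Core record (samples and values) plus an optional method module."""
    samples: List[str]
    values: List[QuantValue] = field(default_factory=list)
    module: Optional[MethodModule] = None

    def add_value(self, v: QuantValue) -> None:
        # Core requirement: every value must trace back to a declared sample.
        if v.sample_id not in self.samples:
            raise ValueError(f"unknown sample: {v.sample_id}")
        self.values.append(v)

    def values_for_sample(self, sample_id: str) -> List[QuantValue]:
        # Return all values recorded for the given sample.
        return [v for v in self.values if v.sample_id == sample_id]
```

In this sketch the core is usable on its own, and a module can be attached later without changing how values and samples are stored, which mirrors the intent of releasing small extensions after the core format.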
The main development activities for mzQuantML include: (i) interacting with bench scientists and extensive
reading of the literature to define accurate community requirements and a set of comprehensive use cases;
(ii) building drafts of the schema, a supporting controlled vocabulary and example files to exercise each use
case; (iii) coordinating regular conference calls with the other stakeholders (open-source software developers,
commercial software vendors and proteome informatics researchers) to ensure that the PSI-PI works as a
coherent development team; (iv) defining and adhering to a strict release schedule for modules to allow software
developers to schedule implementation of mzQuantML; (v) presenting at conferences and development
meetings to receive feedback from those not engaged in the PSI, including laboratory scientists.
Task 2: Maintenance of mzML, TraML and mzIdentML
The work package will also contribute to the routine maintenance of the already released versions of the
standards mzML, TraML and mzIdentML. In this context, regular updates to the PSI MS controlled vocabulary
(http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=MS) will be made to accommodate, for instance,
new experimental technologies or new software platforms. Documentation will also need to be updated
periodically. It is expected that the majority of the maintenance work will be devoted to mzIdentML, developed by
the PSI-PI working group, since it is the most recently released standard. On the other hand, updates to mzML
and TraML, developed by the PSI mass spectrometry (PSI-MS) working group, should be less time-consuming
since they are currently much more stable.
WP2 will be coordinated by Dr. Andy Jones of the University of Liverpool, who chairs the PSI-PI working
group. AJ will attend PSI-PI meetings and conference calls, ensuring that development milestones are met.
All participants contribute significant resources to this work package, to ensure that the developed standards
meet the needs of the ProteomeXchange consortium and the wider community. The significant overall resource
investment in this work package aims to overcome a limiting factor in standards development, namely the
irregular progress caused by the voluntary nature of contributions, which dominated the first years of the PSI's
existence. With dedicated funding support from the recently completed EU ProDaC grant, we managed to stratify
the PSI standards development and deliver standards to a predefined timetable.
Lead Partner:
Deliverables
Milestones
| Milestone No. | Milestone Name |
|---|---|
| MS2 | Framework for representation of the first full quantitative datasets |


