Difference between revisions of "MSblender"

From Marcotte Lab

Revision as of 14:15, 30 January 2011

MSblender is a statistical tool that merges peptide identification results from multiple database search engines, using a multivariate modelling approach. We will present this work at RECOMB-CP 2011 in March 2011.

Authors

Prerequisites

We tested our code on Mac OS X (10.5 Leopard) and Ubuntu Linux (10.04 and later). We do not support the MS Windows platform yet. To run MSblender, install the following programs and packages on your machine.

  • Python (2.5 or later)
  • gcc (we used version 4.4.3, but we believe that our ANSI C code does not depend on a specific version of gcc)
  • GNU Scientific Library (version 1.13 or later)
    • If you use Ubuntu (or Debian) Linux, install the 'gsl-bin' and 'libgsl0-*' packages.
  • (Optional) matplotlib (Python graph library). Only required for the 'pre/plot-his_list.py' script.
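
The prerequisites above can be checked before compiling with a short script. The following is only a minimal sketch and is not part of MSblender: the function name check_prerequisites is hypothetical, it assumes a Python 3 interpreter (for shutil.which), and it assumes that the GSL development package provides the gsl-config helper on your PATH, which may not be true for every installation.

```python
import shutil
import sys


def check_prerequisites(min_python=(2, 5), tools=("gcc", "gsl-config")):
    """Return a dict mapping each prerequisite name to True (found) or False.

    Hypothetical helper, not shipped with MSblender. 'gsl-config' is assumed
    to be installed together with the GNU Scientific Library headers.
    """
    # Compare (major, minor) of the running interpreter with the minimum.
    status = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for name, ok in sorted(check_prerequisites().items()):
        print("%-12s %s" % (name, "found" if ok else "MISSING"))
```

A "MISSING" line only means the tool was not found on PATH; it does not check the installed version of gcc or GSL.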

Installation

  • Download the source code from GitHub. Alternatively, you can download it from http://www.marcottelab.org/users/MSblender/src/MSblender-current.tgz .
  • Enter the 'c/' directory and execute the './compile' script. The GNU Scientific Library must be installed before you run this script. It generates the 'msblender' and 'msblender.h.gch' files in the same directory.
  • That's it. You are now ready to run MSblender.

How to use

MSblender works in three steps: pre-processing, modelling, and post-processing.

Pre-processing

First, MSblender converts the results of various search engines into a unified tab-delimited text format called 'hit_list'. It then converts the 'hit_list' file into an input file for the MSblender modelling program.

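Currently, MSblender supports results (and scores) from several search engines, including:

  • MyriMatch (http://www.ncbi.nlm.nih.gov/pubmed/17269722), mvh
  • MSGFDB (http://www.ncbi.nlm.nih.gov/pubmed/20829449), -log(SpecProb)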

For example, you can convert a SEQUEST pepxml file to an Xcorr hit_list file as below:

$ ../src/MSblender-20110130/pre/sequest_pepxml-to-xcorr_hit_list.py test.sequest.pepxml 
Write test.sequest.xcorr_hit_list ... 
$ 

The hit_list file looks like this:

# pepxml: test.sequest.pepxml
#Spectrum_id	Charge	PrecursorMz	MassDiff	Peptide	Protein	MissedCleavages	Score(Xcorr)
MSups_5ul.04192.04194.2	2	577.843127	0.006395	MLVVLLQANR	ANXA5_HUMAN_UPS|P08758|5000|0.5|319	0	0.524123
MSups_5ul.07228.07228.4	4	689.596178	0.002584	SLLSNVEGDNAVPMQHNNRPTQPLK	CAH1_HUMAN_UPS|P00915|5000|50000|260	1	2.518871
MSups_5ul.11647.11647.2	2	592.839464	-0.000197	ADGLAVIGVLMK	CAH1_HUMAN_UPS|P00915|5000|50000|260	0	2.787324
MSups_5ul.05651.05651.3	3	549.303576	-0.003018	VWPHKDYPLIPVGK	CATA_HUMAN_UPS|P04040|5000|5000|526	1	2.593570
....
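
To illustrate the layout shown above, here is a minimal sketch of how a hit_list file could be read in Python. It is not MSblender's own parser: the names Hit and parse_hit_list are hypothetical, and the sketch assumes eight tab-separated columns in the order given by the header line, with lines starting with '#' treated as comments. Lines that do not have eight fields (such as the '....' placeholder) are skipped.

```python
from collections import namedtuple

# Hypothetical record type; field order follows the '#Spectrum_id ...' header.
Hit = namedtuple(
    "Hit",
    "spectrum_id charge precursor_mz mass_diff peptide protein "
    "missed_cleavages score",
)


def parse_hit_list(lines):
    """Parse an iterable of hit_list lines into a list of Hit records."""
    hits = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # blank line or comment/header line
        fields = line.split("\t")
        if len(fields) != 8:
            continue  # not a data row (assumption: always eight columns)
        hits.append(
            Hit(
                fields[0],
                int(fields[1]),
                float(fields[2]),
                float(fields[3]),
                fields[4],
                fields[5],
                int(fields[6]),
                float(fields[7]),
            )
        )
    return hits


# Two rows copied from the example above, joined with tabs.
SAMPLE = (
    "# pepxml: test.sequest.pepxml\n"
    "MSups_5ul.04192.04194.2\t2\t577.843127\t0.006395\tMLVVLLQANR\t"
    "ANXA5_HUMAN_UPS|P08758|5000|0.5|319\t0\t0.524123\n"
    "MSups_5ul.07228.07228.4\t4\t689.596178\t0.002584\t"
    "SLLSNVEGDNAVPMQHNNRPTQPLK\tCAH1_HUMAN_UPS|P00915|5000|50000|260\t1\t2.518871\n"
)

if __name__ == "__main__":
    for hit in parse_hit_list(SAMPLE.splitlines()):
        print(hit.spectrum_id, hit.peptide, hit.score)
```

To read a real file, pass an open file handle instead of the sample lines, for example parse_hit_list(open("test.sequest.xcorr_hit_list")).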

Citation

  • T. Kwon*, H. Choi*, C. Vogel, A.I. Nesvizhskii, and E.M. Marcotte, MSblender: a probabilistic approach for integrating peptide identifications from multiple database search engines. Submitted.

See also