Comments on LC-MS/MS Presets

The default MS-BLAST search settings (the ones shown when you open the front page) have been optimized for sequence-similarity identifications with relatively small and accurate peptide queries, which are typically produced by careful interpretation of good-quality MS/MS spectra from "hand-picked" precursors. These settings maximize the sensitivity of sequence-similarity identifications and strictly follow the MS-BLAST scoring scheme (Habermann et al., 2004). Please note, however, that the scheme relies on threshold scores determined in computational experiments with small model queries. These thresholds can hardly be extrapolated to the ca. 100-fold larger queries produced by automated interpretation of hundreds of MS/MS spectra from data-dependent LC-MS/MS (Waridel et al., 2007), and therefore the MS-BLAST settings required some practical adjustments. The adjustments are applied by activating the LC-MS/MS Presets box, which changes several important MS-BLAST settings:

  1. S and S2 are the score thresholds for, respectively, the highest-scoring HSP and all other HSPs (high-scoring segment pairs) reported for a database search hit. Setting the thresholds at reasonably high values prevents MS-BLAST from reporting many weakly matching HSPs that clog the scoring scheme and increase the false positive rate. If necessary, weaker alignments can still be included by lowering the corresponding thresholds.
  2. B, V and hspmax are, respectively, the numbers of allowed alignments, descriptions and HSPs. By default, they have been set to the arbitrary value of 1000, which, in our experience, should suffice for processing queries compiled by the automated interpretation of 500-600 MS/MS spectra. Note that unnecessarily high settings slow down the searches considerably and only increase the number of reported non-confident alignments. Use higher settings only if MS-BLAST produces a warning message that the limit for B, V or hspmax was exceeded.
  3. Filtering of low complexity sequences has been engaged by setting it to default. This filter effectively eliminates low complexity sequence stretches that are common in human and sheep keratin peptides, which are ubiquitous contaminants in the analysis of in-gel digests. The filter does not delete these peptides, but replaces the corresponding regions with zero-scoring X symbols, which can be recognized in the input query reported at the top of the MS-BLAST output page. Although low complexity filtering reduces the number of keratin-related hits, it could accidentally eliminate bona fide proteins. Hence, repeating the search with the low complexity filter turned off is recommended for the most complete characterization of the analyzed sample. Note, however, that this search might require setting higher B, V and hspmax values.
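
The X-substitution described in item 3 can be illustrated with a toy sketch. This is not the filter that MS-BLAST actually applies: the function name, the window length, the "few distinct residues" criterion and the example sequences are all assumptions chosen only to show how a low complexity stretch is masked with X while the rest of the query is left intact.

```python
def mask_low_complexity(sequence, window=8, max_distinct=2):
    """Toy low complexity mask (hypothetical, NOT the MS-BLAST filter).

    Any window of `window` residues that contains at most `max_distinct`
    different residues is replaced by X symbols; everything else is kept.
    """
    masked = list(sequence)
    for start in range(len(sequence) - window + 1):
        chunk = sequence[start:start + window]
        if len(set(chunk)) <= max_distinct:
            # Overwrite the whole window; overlapping windows simply
            # extend the masked region.
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# Made-up example: a glycine/serine-rich stretch (keratin-like in spirit)
# flanked by two ordinary-looking peptide sequences.
query = "LVNELTEFAK" + "GGSGGSGGSGGS" + "TLLDELR"
masked_query = mask_low_complexity(query)
```

In this sketch the masked residues stay in the query at their original positions, mirroring the behaviour described above: the peptide is not deleted, so the masked region remains visible as a run of X symbols when the query is echoed back.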

References:

  • Habermann, B., Oegema, J., Sunyaev, S., and Shevchenko, A. (2004). The power and the limitations of cross-species protein identification by mass spectrometry-driven sequence similarity searches. Mol Cell Proteomics 3, 238-249.
  • Waridel, P., Frank, A., Thomas, H., Surendranath, V., Sunyaev, S., Pevzner, P., and Shevchenko, A. (2007). Sequence similarity-driven proteomics in organisms with unknown genomes by LC-MS/MS and automated de novo sequencing. Proteomics 7, in press.

Last modified: 06/26/2007