TY - JOUR
T1 - UVnovo
T2 - A de Novo Sequencing Algorithm Using Single Series of Fragment Ions via Chromophore Tagging and 351 nm Ultraviolet Photodissociation Mass Spectrometry
AU - Robotham, Scott A.
AU - Horton, Andrew P.
AU - Cannon, Joe R.
AU - Cotham, Victoria C.
AU - Marcotte, Edward M.
AU - Brodbelt, Jennifer S.
N1 - Funding Information:
Funding from the NSF (Grant CHE-1402753) and the Welch Foundation (Grant F-1155) to J.S.B. and from the NIH, NSF, ARO Award W911NF-12-1-0390, Welch Foundation (Grant F-1515), DARPA, and DTRA to E.M.M. is acknowledged. J.R.C. acknowledges support from NIH Grant 1K12GM102745. Our data is publicly available at ProteomeXchange with accession ID: PXD003767. UVnovo is available for download at https://github.com/marcottelab/UVnovo. We thank Dr. William Press for helping to guide our earliest efforts in computational de novo sequence analysis, including suggesting the method of spectral normalization and the use of the HMM.
Publisher Copyright:
© 2016 American Chemical Society.
PY - 2016/4/5
Y1 - 2016/4/5
N2 - De novo peptide sequencing by mass spectrometry represents an important strategy for characterizing novel peptides and proteins, in which a peptide's amino acid sequence is inferred directly from the precursor peptide mass and tandem mass spectrum (MS/MS or MS3) fragment ions, without comparison to a reference proteome. This method is ideal for organisms or samples lacking a complete or well-annotated reference sequence set. One of the major barriers to de novo spectral interpretation arises from confusion of N- and C-terminal ion series due to the symmetry between b and y ion pairs created by collisional activation methods (or c, z ions for electron-based activation methods). This is known as the "antisymmetric path problem" and leads to inverted amino acid subsequences within a de novo reconstruction. Here, we combine several key strategies for de novo peptide sequencing into a single high-throughput pipeline: high-efficiency carbamylation blocks lysine side chains, and subsequent tryptic digestion and N-terminal peptide derivatization with the ultraviolet chromophore AMCA yield peptides susceptible to 351 nm ultraviolet photodissociation (UVPD). UVPD-MS/MS of the AMCA-modified peptides then predominantly produces y ions in the MS/MS spectra, specifically addressing the antisymmetric path problem. Finally, the program UVnovo applies a random forest algorithm to automatically learn from and then interpret UVPD mass spectra, passing results to a hidden Markov model for de novo sequence prediction and scoring. We show this combined strategy provides high-performance de novo peptide sequencing, enabling the de novo sequencing of thousands of peptides from an Escherichia coli lysate at high confidence.
UR - http://www.scopus.com/inward/record.url?scp=84963860840&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963860840&partnerID=8YFLogxK
U2 - 10.1021/acs.analchem.6b00261
DO - 10.1021/acs.analchem.6b00261
M3 - Article
C2 - 26938041
AN - SCOPUS:84963860840
VL - 88
SP - 3990
EP - 3997
JO - Analytical Chemistry
JF - Analytical Chemistry
SN - 0003-2700
IS - 7
ER -