Generating Visualizations

The purpose of peptagram is to visually compare peptide hits in proteins across different proteomics experiments.

Peptagram visualizes peptide hits from the results of Morpheus, Mascot, MaxQuant, ProteinProphet/PeptideProphet, X!Tandem and ProteinPilot.

Some test data is provided in example_data.zip. If you unzip this in the peptagram directory, you should get a sub-directory example_data, which contains one sub-directory per search engine: mascot, morpheus etc.

For each search engine, there is a separate program to generate a peptagram. Let's look at one in detail in the next section.

Example: Making Morpheus Peptagrams

Morpheus is a fast search engine designed for high-quality data. As Morpheus does not come with a bundled viewer, peptagram provides a unique tool to view Morpheus search results.

The scripts that process Morpheus have morpheus_peptagram in their name:

  • mac_morpheus_peptagram.command - which can be clicked in Finder on Mac OS X
  • win_morpheus_peptagram.bat - which can be clicked in Windows File Explorer
  • do_morpheus_peptagram.py - which is run on the command-line as python do_morpheus_peptagram.py -i

Test

To run an automated test, unzip example_data.zip. It should create an example_data directory containing sub-directories such as morpheus, mascot etc. Then to run the test:

python do_morpheus_peptagram.py test

Create a Peptagram with the GUI

To use morpheus_peptagram, first start the program. You should get a window that looks something like this:

To load the Morpheus files, click + PSMs.tsv files and select all the .PSMs.tsv files that you want to compare. morpheus_peptagram will figure out the corresponding .protein_group.tsv files from these filenames.
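The filename matching described above can be pictured as a simple suffix swap. The following is only a sketch of that idea, not peptagram's actual code: the helper name guess_protein_group_path is hypothetical, and it assumes the two files sit in the same directory and differ only in their suffix.

```python
from pathlib import Path

PSM_SUFFIX = ".PSMs.tsv"
PROTEIN_SUFFIX = ".protein_group.tsv"


def guess_protein_group_path(psm_path):
    """Hypothetical helper: derive the .protein_group.tsv path from a .PSMs.tsv path.

    Assumes both files share a directory and a common filename prefix.
    """
    path = Path(psm_path)
    if not path.name.endswith(PSM_SUFFIX):
        raise ValueError(f"not a {PSM_SUFFIX} file: {path.name}")
    # Strip the PSM suffix, then append the protein-group suffix.
    prefix = path.name[: -len(PSM_SUFFIX)]
    return path.with_name(prefix + PROTEIN_SUFFIX)
```

If your files follow this naming, checking that the derived path exists before you start is a quick way to catch a missing .protein_group.tsv file.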

Once selected, you'll see a list of files:

You can now reorder the .PSMs.tsv files into your preferred order by dragging the ☰ icon.

Then you can scroll down to the bottom, and click the submit button:

If any errors are encountered, they'll appear below the submit button. Hopefully the last line of the error message will help you troubleshoot the problem, and then you can click submit again.

If it worked, you'll get a link to a newly created directory containing your peptagram:

Later, you might want to tweak the options, such as loading the spectra from the original .mzML files, or restricting the display to matches by the Q-score.

Available Peptagram builders

  • morpheus_peptagram - processes Morpheus search engine results. Requires the modifications.tsv file for modified peptides, and optionally the .mzML file if you want to display spectra
  • mascot_peptagram - processes Mascot .dat files. Shows the spectra for each PSM. Requires the original .fasta file to get the full length protein sequences
  • maxquant_peptagram - processes MaxQuant txt/summary directories. Shows the matched ions in the spectra only. Requires the original .fasta file to get the full length protein sequences
  • prophet_peptagram - processes the TPP's prot.xml and pep.xml files. Requires the original .fasta file to get the full length protein sequences
  • xtandem_peptagram - processes X!Tandem search results. Shows the spectra for each PSM.
  • pilot_peptagram - processes ProteinPilot .txt/.csv result files. Requires the original .fasta file to get the full length protein sequences

Managing the size of the peptagrams

Because peptagrams can easily get very large, there are a number of options to filter out low-quality matches using quality scores such as pep and ionscore.

Some of the programs group similar proteins together (Morpheus, MaxQuant, ProteinProphet). This effectively reduces the number of proteins, and therefore the size of the resulting peptagram. For the other search engines, you can load a text file of contaminant seqids to exclude. You can also load a second text file of seqids, in which case only those proteins will be shown.
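To make the two seqid lists concrete, here is a rough sketch of how an exclude list and a show-only list might be applied. It is an illustration rather than peptagram's implementation: the function names, the one-seqid-per-line file layout, and the dictionary of proteins keyed by seqid are all assumptions.

```python
def read_seqids(lines):
    """Collect seqids from text lines, assuming one seqid per line.

    Blank lines and surrounding whitespace are ignored.
    """
    return {line.strip() for line in lines if line.strip()}


def filter_proteins(proteins, exclude=None, include=None):
    """Return only the proteins that pass both seqid lists.

    proteins: dict mapping seqid -> any per-protein data (assumed structure)
    exclude:  set of seqids to drop, e.g. contaminants (optional)
    include:  set of seqids to keep; if given, nothing else is shown (optional)
    """
    kept = {}
    for seqid, data in proteins.items():
        if exclude and seqid in exclude:
            continue
        if include is not None and seqid not in include:
            continue
        kept[seqid] = data
    return kept
```

With a file on disk, the set could be built as read_seqids(open("contaminants.txt")), where contaminants.txt is a placeholder filename.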

You can also limit the display to tryptic, semi-tryptic or modified peptides.
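For readers unfamiliar with the terms, a peptide is usually called tryptic when both of its ends coincide with trypsin cleavage sites, and semi-tryptic when only one end does. The sketch below shows one simplified way to count tryptic ends. It is not taken from peptagram: the function name is hypothetical, and the rule used here (cleavage after K or R, with protein termini counted as valid ends) deliberately ignores refinements such as the proline exception.

```python
def count_tryptic_ends(protein, start, end):
    """Count how many ends of the peptide protein[start:end] look tryptic.

    Simplified rule (an assumption): trypsin cleaves after K or R, and the
    protein's own N- and C-termini also count as valid peptide ends.
    Returns 2 (tryptic), 1 (semi-tryptic) or 0 (non-tryptic).
    """
    # N-terminal end: the residue before the peptide must be K or R.
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    # C-terminal end: the last residue of the peptide must be K or R.
    c_term_ok = end == len(protein) or protein[end - 1] in "KR"
    return int(n_term_ok) + int(c_term_ok)
```

For the toy sequence MKAAARCCCK, the peptide AAAR (positions 2 to 6) has two tryptic ends, while AAR (positions 3 to 6) has only one.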

Combining Different Peptagrams

After you've created the peptagrams using the scripts described above, you can recombine and edit them using reorder_peptagram.

With reorder_peptagram, you can load existing peptagrams, and stack all the rows of the different peptagrams against each other.

You can then reorder and relabel each row before saving the result as a new peptagram.

The one catch is that the sequence IDs of your proteins have to match across the peptagrams. One way to ensure this is to use the same FASTA database for every search, and to make sure that each sequence ID is a single word separated by a space from the description.
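As an illustration of the "single word" convention, the sketch below pulls the sequence ID out of a FASTA header line by taking everything up to the first whitespace. The helper name is hypothetical, and whether each search engine reports exactly this token as the seqid is an assumption you should check against your own results.

```python
def seqid_from_header(header):
    """Return the first whitespace-delimited word of a FASTA header line.

    The leading '>' is dropped; an empty header yields an empty string.
    """
    text = header.lstrip(">").strip()
    return text.split(None, 1)[0] if text else ""


# Example header in a common UniProt-like style (illustrative only).
example = ">sp|P12345|ABC_HUMAN Example protein OS=Homo sapiens"
print(seqid_from_header(example))
```

If two searches were run against databases whose headers give different first words for the same protein, those rows will not line up in reorder_peptagram.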