Help

Help Pages

The PepX help pages contain a tutorial, detailed explanations of the methodology, the structural classifications and the BriX peptide annotations, as well as an example of how to interact with PepX and how to use the PepX search. You can browse through the items by using the navigation links below each help item, or by using the menu in the left pane.

Tutorial

This tutorial walks you through the process of finding hits for the thrombin-trypsin inhibitor complex.

Searching for the keywords “thrombin” and “inhibitor” provides a list of hits. For the selected entry 1BTH various types of information are shown, as well as a listing of the clusters the complex belongs to. General properties of the PDB entry are accompanied by 3D views of the full complex and detailed views of the peptide binding site generated by Yasara. The binding energy between protein and peptide as calculated by the FoldX force field is shown together with details for the hydrogen bond interactions.  Various statistics regarding the secondary structure content and flexibility parameters for the binding site are listed, as well as direct links to relevant databases. The peptides are annotated with naturally occurring backbone variations using fragment clusters from the BriX database.

fig2.png

Methodology

Selection Strategy

PepX was constructed from the Protein Data Bank. We filtered for protein-peptide complexes that meet the following requirements:

  1. X-ray structures with a resolution better than 2.5 Å (that is, a numerical value below 2.5 Å)
  2. peptides with a size from 5 to 35 amino acids
  3. peptides containing natural amino acids only
  4. receptors with a minimum size of 35 amino acids
  5. the first unit in the PDB in case of crystallographic symmetry
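The criteria above can be sketched as a simple filter. This is only an illustration, not the code used to build PepX: the `Complex` record and its field names are hypothetical, the size bounds are assumed to be inclusive, and "natural amino acids" is assumed to mean the 20 standard residues.

```python
from dataclasses import dataclass

# Assumption: "natural amino acids" means the 20 standard one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass
class Complex:
    """Hypothetical record describing one candidate protein-peptide complex."""
    method: str           # experimental method, e.g. "X-RAY"
    resolution: float     # resolution in Angstrom
    peptide_seq: str      # peptide sequence in one-letter codes
    receptor_length: int  # number of residues in the receptor


def passes_selection(c: Complex) -> bool:
    """Apply criteria 1-4; criterion 5 (first unit only) needs the PDB file itself."""
    return (
        c.method == "X-RAY"
        and c.resolution < 2.5
        and 5 <= len(c.peptide_seq) <= 35
        and set(c.peptide_seq) <= STANDARD_AMINO_ACIDS
        and c.receptor_length >= 35
    )
```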

Peptide Definition

In PepX, we define a peptide as follows:

Clustering Algorithm

All the protein-peptide complexes in PepX were clustered on their binding sites using Hierarchical Agglomerative Clustering, the same algorithm used to construct BriX. The distance matrix used in the clustering contains the RMSD values between any two protein-peptide binding sites, computed with Mustang.
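To make the distance measure concrete, the sketch below computes the RMSD between two sets of atom coordinates and fills a pairwise distance matrix. It assumes the structures are already superposed and that atoms are paired in order; in PepX this alignment step is handled by Mustang, which is not reproduced here. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, paired in order."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def distance_matrix(structures):
    """Symmetric matrix of pairwise RMSD values for a list of coordinate lists."""
    n = len(structures)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(structures[i], structures[j])
    return matrix
```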

Alignment

The Alignment value expresses the percentage of the binding site of the protein-peptide complex that is used in clustering. The higher the alignment, the more of the binding site is used in clustering, and thus the more clusters there will be.

Threshold

The Threshold value is the maximum allowed Root Mean Square Deviation (RMSD) between two PDB entries. The threshold value is expressed in Ångström (Å). For tighter clustering (generating more clusters), choose a small value (e.g. 1 Å). If you need fewer clusters, choose a higher value (e.g. 2 Å).
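The effect of the threshold can be illustrated with a small agglomerative clustering sketch. The linkage criterion used by PepX is not stated on this page; complete linkage is assumed here because it guarantees that no two members of a cluster are further apart than the threshold. The distance values are made up for illustration, and the Alignment parameter is not modelled.

```python
def cluster(dist, threshold):
    """Agglomerative clustering on a symmetric distance matrix (list of lists).

    Assumption: complete linkage, i.e. two clusters are merged only if the
    largest pairwise distance between their members is within the threshold.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # no remaining pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Made-up RMSD values (in Angstrom) between four binding sites.
example = [
    [0.0, 0.5, 1.8, 3.0],
    [0.5, 0.0, 1.5, 3.2],
    [1.8, 1.5, 0.0, 2.9],
    [3.0, 3.2, 2.9, 0.0],
]
tight = cluster(example, 1.0)  # [[0, 1], [2], [3]]
loose = cluster(example, 2.0)  # [[0, 1, 2], [3]]
```

With these values, the 1 Å threshold yields three clusters and the 2 Å threshold yields two, matching the rule that a smaller threshold generates more clusters.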

Structural Classifications

SCOP

The SCOP database aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known, including all entries in the Protein Data Bank (PDB). SCOP information was obtained from the database website. More detailed information can be found at http://scop.mrc-lmb.cam.ac.uk/scop/

Classification

Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, superfamily and fold, described below.

CATH

The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Only X-ray structures solved to a resolution better than 4.0 Å are considered, together with NMR structures. All non-proteins, models, and structures with more than 30% “C-alpha only” are excluded from CATH. Protein structures are classified using a combination of automated and manual procedures. CATH information was obtained from the database website. More detailed information can be found at http://www.cathdb.info/

Classification

There are four major levels in this hierarchy: Class, Architecture, Topology (fold family) and Homologous superfamily.
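CATH identifiers are commonly written as four dot-separated numbers, one per level of this hierarchy. The sketch below splits such an identifier into its levels; the class and function names are hypothetical, the code used in the comment is purely illustrative, and whether PepX exposes CATH annotations in exactly this form is an assumption.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """The four major CATH levels, in hierarchical order."""
    cath_class: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted CATH identifier such as "3.30.70.270" into its levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))
```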

Interact with PepX

XML-based API

Description

All information contained within the PepX database is exposed as XML (Extensible Markup Language). Visiting one of the URLs described here returns an XML file with the requested data, following the REST style of data exchange. For example, calling the URL http://pepx.switchlab.org/clusters.xml?threshold=2&alignment=75 serves an XML file with a description of the clusters for a threshold of 2 Å and an alignment of 75%. The XML interface is implemented for clusters, PDBs and BriX classes providing backbone variations on the peptides.
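A minimal sketch of calling this endpoint from a script is shown below. Only the URL pattern is taken from the example above; the helper names are hypothetical, and the structure of the returned XML is not described here, so the sketch stops at parsing the response into an element tree.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://pepx.switchlab.org"


def clusters_url(threshold, alignment):
    """Build the clusters URL for a threshold (in Angstrom) and an alignment (in %)."""
    query = urlencode({"threshold": threshold, "alignment": alignment})
    return f"{BASE_URL}/clusters.xml?{query}"


def fetch_xml(url, timeout=30):
    """Download a URL and parse the response body as XML (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return ET.fromstring(response.read())
```

For example, `clusters_url(2, 75)` reproduces the URL given above, and `fetch_xml(clusters_url(2, 75))` would return the root element of the cluster description, provided the server is reachable.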

Example

In this example, we retrieve all the PDB entries in XML, pick PDB 1N7F, find all the clusters this entry belongs to, and return a list of the BriX class IDs that cover the peptide of this complex.

pdbs.xml

First, we retrieve an XML file of all PDB entries at http://pepx.switchlab.org/pdbs.xml. Each entry looks like the image below (for PDB 1N7F).

pdbs.png
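Looking up the internal id of an entry could be scripted roughly as follows. The real layout of pdbs.xml is only shown in the image above, so the element names `pdb`, `id` and `code` in this sketch are assumptions, and the sample document is made up apart from the pairing of 1N7F with id 20900, which is taken from this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names are assumed, and the second entry
# is a placeholder rather than a real PDB entry.
SAMPLE_PDBS_XML = """<pdbs>
  <pdb><id>20900</id><code>1N7F</code></pdb>
  <pdb><id>1</id><code>0XXX</code></pdb>
</pdbs>"""


def find_internal_id(xml_text, pdb_code):
    """Return the internal id of the entry with the given PDB code, or None."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("pdb"):
        if entry.findtext("code", "").upper() == pdb_code.upper():
            return int(entry.findtext("id"))
    return None
```

If the actual element names differ, only the three string literals passed to `iter` and `findtext` need to change.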

pdb.xml/<id>

With the id field of this PDB entry, we can query for all the information regarding that PDB at http://pepx.switchlab.org/pdb.xml/20900.

pdb.png

pdb-clusters.xml/<id>

To retrieve the clusters at the different thresholds and alignments this PDB is part of, visit http://pepx.switchlab.org/pdb-clusters.xml/20900.

clusters.png

brix.xml/<id>

Finally, you can get all the BriX classes that cover the peptide of this protein-peptide complex at http://pepx.switchlab.org/brix.xml/20900

brix.png
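The three per-entry URLs used in this example share one pattern, which can be captured in a small helper. This is a sketch with a hypothetical function name; it only builds the URLs shown above and does not interpret the XML they return.

```python
BASE_URL = "http://pepx.switchlab.org"

# Resource names taken from the URLs in this example.
ENTRY_RESOURCES = ("pdb", "pdb-clusters", "brix")


def entry_urls(internal_id):
    """Map each per-entry resource name to its URL for the given internal id."""
    return {
        name: f"{BASE_URL}/{name}.xml/{internal_id}"
        for name in ENTRY_RESOURCES
    }
```

Calling `entry_urls(20900)` gives the three URLs for PDB 1N7F that are used in the steps above.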

Download Raw Files

Go to the download page.

BriX Peptide Annotations

brix.png

Given the scarcity of protein-peptide structures and their obvious relevance for drug design, we provide an additional service for peptide design. Since it was recently shown that protein-peptide interactions can be reliably mimicked using interacting fragments from monomeric proteins, it is possible to provide structural variations of peptide ligands using protein fragments. Each ligand peptide in the PepX dataset is associated with its corresponding structural class from the database of protein fragment clusters, BriX. In these fragment clusters, sets of protein fragments with highly similar backbone structure are grouped. Each protein fragment class represents a natural variation on a typical backbone conformation. Mapped on protein-peptide pairs, these structural classes can be used to model and design alternative peptides with slightly adapted backbone conformation that better fit given amino acid sequences.

pdz_covered_brix.png

PDZ domain covered with BriX fragments. To find out more about the covering strategies, check out the video in Related Work.

Search

(A) A simple, Google-like search on the contents of the database is implemented. The search accepts everything, from keywords to PDB identifiers. (B) Guided search uses structural classifications of SCOP and CATH and keywords from PDB and Pfam. (C) Tag clouds are generated from the various structural classification annotations in the database.

fig2.png