Methodology

Selection Strategy

PepX was constructed from the Protein Data Bank. We filtered for protein-peptide complexes that meet the following requirements:

  1. X-ray structures with a resolution better than 2.5 Å (i.e. a numerical value below 2.5 Å)
  2. peptides with a size from 5 to 35 amino acids
  3. peptides containing natural amino acids only
  4. receptors with a minimum size of 35 amino acids
  5. the first unit in the PDB in case of crystallographic symmetry
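
The first four criteria can be illustrated with a minimal sketch. The record type `Complex`, its field names, and the function `passes_filters` are hypothetical and are not part of PepX. The sketch assumes that the size ranges are inclusive and that the resolution cutoff is strict. Criterion 5 (choice of the first unit under crystallographic symmetry) is not modelled here.

```python
from dataclasses import dataclass

# The 20 standard amino acids in one-letter code.
NATURAL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass
class Complex:
    """Hypothetical summary of one protein-peptide complex."""
    method: str           # experimental method, e.g. "X-RAY DIFFRACTION"
    resolution: float     # resolution in Å
    peptide_seq: str      # peptide sequence, one-letter codes
    receptor_length: int  # number of amino acids in the receptor


def passes_filters(c: Complex) -> bool:
    """Apply criteria 1 to 4 (bounds assumed inclusive, cutoff strict)."""
    return (
        c.method == "X-RAY DIFFRACTION"
        and c.resolution < 2.5
        and 5 <= len(c.peptide_seq) <= 35
        and set(c.peptide_seq) <= NATURAL_AMINO_ACIDS
        and c.receptor_length >= 35
    )


good = Complex("X-RAY DIFFRACTION", 1.8, "ACDEFG", 120)
too_coarse = Complex("X-RAY DIFFRACTION", 3.0, "ACDEFG", 120)
too_short = Complex("X-RAY DIFFRACTION", 1.8, "ACD", 120)
print(passes_filters(good), passes_filters(too_coarse), passes_filters(too_short))
```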

Peptide Definition

In PepX, we define a peptide as follows:

Clustering Algorithm

All the protein-peptide complexes in PepX were clustered on their binding sites using Hierarchical Agglomerative Clustering, the same algorithm used to construct BriX. The distance matrix used in the clustering contains the RMSD values between any two protein-peptide binding sites, computed with Mustang.
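
To make the distance matrix concrete, the sketch below computes pairwise RMSD values for a few toy binding sites. It is only an illustration: it assumes the sites have already been superposed and matched atom by atom, which is the step Mustang performs in PepX and which is not reproduced here. The function names `rmsd` and `distance_matrix` and the coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the two coordinate sets are already superposed and that
    position i in `a` corresponds to position i in `b`.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def distance_matrix(sites):
    """Symmetric matrix of pairwise RMSD values with a zero diagonal."""
    n = len(sites)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = rmsd(sites[i], sites[j])
    return dist


# Toy data: three "binding sites" of two atoms each.
sites = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
for row in distance_matrix(sites):
    print(["%.2f" % value for value in row])
```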

Alignment

The Alignment value expresses the percentage of the binding site of the protein-peptide complex that is used in clustering. The higher the alignment, the more of the binding site is used in clustering, and thus the more clusters there will be.

Threshold

The Threshold value is the maximum allowed root-mean-square deviation (RMSD) between two PDB structures, expressed in Ångström (Å). For tighter clustering, which generates more clusters, choose a small value (e.g. 1 Å). If you need fewer clusters, choose a higher value (e.g. 2 Å).
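
The effect of the threshold on the number of clusters can be sketched with a small agglomerative clustering routine. This is an assumption-laden illustration, not the PepX implementation. It uses complete linkage, so that the threshold acts as the largest RMSD permitted between any two members of one cluster; the linkage criterion actually used in PepX is not stated here. The function `cluster` and the RMSD values are hypothetical.

```python
def cluster(dist, threshold):
    """Agglomerative clustering on a distance matrix, stopped at `threshold`.

    Complete linkage is assumed: the distance between two clusters is the
    largest pairwise distance between their members.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # merging further would exceed the allowed RMSD
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical pairwise RMSD values (in Å) for four binding sites.
rmsd_matrix = [
    [0.0, 0.8, 1.6, 3.0],
    [0.8, 0.0, 1.5, 3.1],
    [1.6, 1.5, 0.0, 2.9],
    [3.0, 3.1, 2.9, 0.0],
]
print(cluster(rmsd_matrix, 1.0))  # [[0, 1], [2], [3]]
print(cluster(rmsd_matrix, 2.0))  # [[0, 1, 2], [3]]
```

With these toy values, a threshold of 1 Å yields three clusters and a threshold of 2 Å yields two, in line with the rule that a smaller threshold produces more clusters.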