Data Sources
- PDB
The Brookhaven Protein Data Bank (PDB) archive contains information about experimentally determined structures of proteins, nucleic acids, and complex assemblies.
- Used to: Download the 3D coordinates of protein-peptide complexes.
- Location: www.pdb.org
- Reference: The RCSB PDB information portal for structural genomics. Kouranov A, Xie L, de la Cruz J, Chen L, Westbrook J, Bourne PE, Berman HM. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D302-5.

- SCOP
The Structural Classification of Proteins (SCOP) database was created by manual inspection and several automated methods. It aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.
- Used to: Classify protein structures in a structural hierarchy.
- Location: scop.mrc-lmb.cam.ac.uk/scop/
- Reference: SCOP: a structural classification of proteins database for the investigation of sequences and structures. Murzin AG, Brenner SE, Hubbard T, Chothia C. J Mol Biol. 1995 Apr 7;247(4):536-40.

- CATH
CATH is a hierarchical classification of protein domain structures, which clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H).
- Used to: Classify protein structures in a structural hierarchy.
- Location: www.cathdb.info
- Reference: The CATH classification revisited--architectures reviewed and new ways to characterize structural divergence in superfamilies. Alison L. Cuff; Ian Sillitoe; Tony Lewis; Oliver C. Redfern; Richard Garratt; Janet Thornton; Christine A. Orengo (2008) Nucleic Acids Research.
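
The four CATH levels can be illustrated with a minimal Python sketch. It assumes the dotted four-number node identifiers used by CATH (for example 3.40.50.300); the names CathNode, parse_cath_id and deepest_shared_level are hypothetical helpers written for this illustration and are not part of any CATH software.

```python
from typing import NamedTuple, Optional

# Level names in hierarchy order, as listed in the CATH description.
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathNode(NamedTuple):
    """One CATH classification, one integer per level (hypothetical type)."""
    cls: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_id(node_id: str) -> CathNode:
    """Split a dotted identifier such as '3.40.50.300' into its four levels."""
    parts = node_id.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {node_id!r}")
    return CathNode(*(int(p) for p in parts))


def deepest_shared_level(a: CathNode, b: CathNode) -> Optional[str]:
    """Return the name of the deepest level two domains share, or None."""
    shared = None
    for name, x, y in zip(LEVELS, a, b):
        if x != y:
            break
        shared = name
    return shared
```

Two domains with identifiers 3.40.50.300 and 3.40.50.720 agree on the first three numbers, so under this sketch they share Class, Architecture and Topology but fall into different homologous superfamilies.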

- Pfam
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
- Used to: Annotate proteins with family information.
- Location: pfam.sanger.ac.uk/
- Reference: The Pfam protein families database. Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A. Nucleic Acids Res. 2008 Jan;36(Database issue):D281-8. Epub 2007 Nov 26.

- BriX
The BriX database contains a structural classification of protein fragments. The library comprises fragments ranging from 4 to 14 amino acids in length, clustered at 6 different distance thresholds. This has led to an alphabet of around 2000 frequently observed letters, or structural classes, per chain length.
- Used to: Annotate peptides with structurally similar protein fragments.
- Location: brix.vub.ac.be
- Reference: Reconstruction of protein backbones from the BriX collection of canonical protein fragments. Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J. PLoS Comput Biol. 2008 May 23;4(5):e1000083.

- UniProt
The UniProt Knowledge Base is a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.
- Used to: Annotate proteins with functional and sequence information.
- Location: www.uniprot.org
- Reference: The Universal Protein Resource (UniProt) 2009. UniProt Consortium. Nucleic Acids Res. 2009 Jan;37(Database issue):D169-74. Epub 2008 Oct 4.

- 3did
The database of 3D interacting domains (3did) is a collection of protein interactions for which high-resolution 3D structures are known. Besides interactions between globular domains, 3did also contains a hand-curated set of transient peptide-mediated interactions.
- Used to: Link protein-peptide complexes to 3did annotations.
- Location: 3did.irbbarcelona.org
- Reference: 3did Update: domain-domain and peptide-mediated interactions of known 3D structure. Stein A, Panjkovich A, Aloy P. Nucleic Acids Res. 2009 Jan;37(Database issue):D300-4. Epub 2008 Oct 25.

- Mustang
MUSTANG is a MUltiple STructural AligNment alGorithm that, given a set of protein structures, constructs a multiple alignment using the spatial information of the C-alpha atoms in the set.
- Used to: Align binding sites of clustered complexes.
- Location: www.cs.mu.oz.au/~arun/mustang/
- Reference: MUSTANG: a multiple structural alignment algorithm. Konagurthu AS, Whisstock JC, Stuckey PJ, Lesk AM. Proteins. 2006 Aug 15;64(3):559-74.

- FoldX
FoldX is an empirical force field optimised for energy calculations and protein design.
- Used to: Calculate binding energies between protein and ligand.
- Location: foldx.crg.es
- Reference: The FoldX web server: an online force field. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W382-8.

- SwissKnife
SwissKnife is an object-oriented Perl library to handle Swiss-Prot entries.
- Used to: Parse the UniProt Knowledge Base.
- Location: swissknife.sourceforge.net/docs/
- Reference: Swissknife - 'lazy parsing' of SWISS-PROT entries. Hermjakob H, Fleischmann W, Apweiler R. Bioinformatics. 1999 Sep;15(9):771-2.

- Yasara
Yasara is a molecular-graphics, -modeling and -simulation program.
- Used to: Generate images of 3D structures of protein-peptide complexes and selected binding sites.
- Location: www.yasara.org
- Reference: Increasing the precision of comparative models with YASARA NOVA--a self-parameterizing force field. Krieger E, Koraimann G, Vriend G. Proteins. 2002 May 15;47(3):393-402.

- Hierarchical agglomeration
The BriX database was built using a hierarchical agglomeration algorithm.
- Used to: Cluster binding sites of protein-peptide complexes.
- Location: brix.vub.ac.be
- Reference: Reconstruction of protein backbones from the BriX collection of canonical protein fragments. Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J. PLoS Comput Biol. 2008 May 23;4(5):e1000083.
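
As a rough illustration of hierarchical agglomeration, the Python sketch below merges the two closest clusters repeatedly until no pair lies within a distance threshold. It is not the BriX implementation: the average-linkage criterion, the precomputed distance matrix and the function name agglomerate are all assumptions made for this example, since the linkage rule and distance measure are not specified here.

```python
from itertools import combinations
from typing import List, Sequence


def agglomerate(dist: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Cluster items 0..n-1 bottom-up from a symmetric distance matrix.

    Assumption: average linkage, i.e. the distance between two clusters is
    the mean of all pairwise distances between their members.
    """
    # Start with every item in its own cluster.
    clusters = [[i] for i in range(len(dist))]

    def linkage(a: List[int], b: List[int]) -> float:
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Stop once even the closest pair is further apart than the cutoff.
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy distance matrix: items 0 and 1 are close, as are items 2 and 3.
toy = [
    [0.0, 1.0, 10.0, 10.0],
    [1.0, 0.0, 10.0, 10.0],
    [10.0, 10.0, 0.0, 1.0],
    [10.0, 10.0, 1.0, 0.0],
]
print(agglomerate(toy, threshold=2.0))
```

Raising the threshold merges more clusters and lowering it keeps more of them separate. This exhaustive pair search is only practical for small inputs.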
