Geometric potential function for model selection in protein structure and protein-protein binding prediction


 

Protein structure prediction and protein-protein docking are two fundamental problems in molecular biology. Solving these two problems requires an effective potential function to select the correct models from an ensemble of alternative conformations.

 

Because residues on the protein surface, in the interior, and even in different regions of a binding interface have different local environments and experience different solvation, we develop an empirical statistical potential function that depends on both packing and distance. By using an accurate geometric model, our potential function also eliminates implausible neighbor contacts and spurious interactions between two residues when a third residue lies between them.

 

Our geometric packing potential can be used to effectively discriminate native or near-native structures from misfolded structures for both protein folding and protein-protein docking.

 

 

 

Reference

Xiang Li and Jie Liang. Geometric packing potential function for model selection in protein structure and protein-protein binding predictions. submitted.

 

Downloads:

·         Linux: geometric.tgz (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-53))

·         Windows: upon request

 

Usage:  geometric.exe -(options) path filename

·        Options (single letters, combined into one argument such as -fop):

❖     -f: read a single file

❖     -l: read a file list; all listed files must be in the same directory as the list file

❖     -p: the input is in pdb structure format. The program first computes the alpha contacts and then scores them

❖     -c: the input is in contact format. All pairs are included. Use this option if the alpha contacts were pre-computed

❖     -i: include only interface contacts in the scoring

❖     -o: write the contacts to a file. Only effective when combined with option 'p'. Saving the contacts is helpful if you expect to rescore the same structures later

❖     -d: score by distance only, independent of packing. Useful when packing information is unavailable

·        path: the full path of the directory that contains the file, such as “/dump2/decoy/docking/Capri/Target1”, or “.”

·        filename: with option 'f', the filename must be the name of a pdb file;

           with option 'l', the filename must be the name of a list file containing pdb file names, one name per line.

 

·     NOTE:

❖     'f' and 'l' are mutually exclusive;

❖     'p' and 'c' are mutually exclusive;

❖     'o' can only be used together with 'p'. If option 'i' is also given, only interfacial contacts are output;

❖     the file “geometric.pt”, which contains the geometric potential table, must be placed in the same directory as “geometric.exe”.
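The option rules in the NOTE list can be checked before a run is launched. The following is a minimal Python sketch and not part of the geometric package: the helper names check_options and build_command are hypothetical, and the only rules it encodes are the ones stated above.

```python
def check_options(options):
    """Raise ValueError if the option letters break the rules in the NOTE list."""
    letters = set(options)
    unknown = letters - set("flpcoid")
    if unknown:
        raise ValueError("unknown option letters: %s" % sorted(unknown))
    if {"f", "l"} <= letters:
        raise ValueError("'f' and 'l' are mutually exclusive")
    if {"p", "c"} <= letters:
        raise ValueError("'p' and 'c' are mutually exclusive")
    if "o" in letters and "p" not in letters:
        raise ValueError("'o' can only be used together with 'p'")


def build_command(options, path, filename):
    """Return the argument list for one run, e.g. for subprocess.run()."""
    check_options(options)
    return ["geometric.exe", "-" + options, path, filename]


if __name__ == "__main__":
    # Prints the argument list for the "compute, output, and score" case.
    print(build_command("fop", "/decoy/4state/1ctf", "1ctf.pdb"))
```

The sketch only assembles and validates the arguments; it does not check that “geometric.pt” sits next to the executable.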

 

Usage examples:

1.      Scoring protein structures

1)       to compute the alpha contacts of ./1ctf.pdb and then score it:

➢        geometric.exe -fp . 1ctf.pdb

2)       to compute the alpha contacts of /decoy/4state/1ctf/1ctf.pdb and then score it, while also writing the alpha contacts to the file /decoy/4state/1ctf/1ctf.contacts:

➢        geometric.exe -fop /decoy/4state/1ctf 1ctf.pdb

3)       to score /decoy/4state/1ctf/1ctf.pdb when the file 1ctf.contacts already exists:

➢        geometric.exe -fc /decoy/4state/1ctf 1ctf.pdb

4)       to compute the alpha contacts of all the listed pdb files in /decoy/4state/1ctf and then score them, while also writing the alpha contacts to the files /decoy/4state/1ctf/*.contacts:

➢        geometric.exe -lop /decoy/4state/1ctf list

❖       The file “list” must contain one structure file name per line, as follows:

 

1ctf.pdb

1ctf.a10043_r.pdb

1ctf.a10146_r.pdb

… …

Individual structure files may use an extension other than “.pdb”.

❖       The output file “1ctf.scores”, which contains the scores for all the structures in the list, is saved in the parent directory, i.e., /decoy/4state.

❖       If the path is just “.”, the output file is named “scores” and is saved in the current directory (“./”).

❖       If a file named “rmsds” exists in /decoy/4state/1ctf/, the RMSD values are also written to the scores file. The file “rmsds” must be formatted as follows:

1ctf          0.000

1ctf.a18001_r 5.277

1ctf.a18002_r 4.185

1ctf.a18005_r 6.839

1ctf.a18015_r 5.123
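The output location and the “rmsds” layout described above can be sketched in Python. This is an illustration under stated assumptions, not code from the geometric package: the helper names expected_scores_path and read_rmsds are hypothetical, the base name of the scores file is passed in by the caller because the text does not say how the program derives it, and the “rmsds” file is assumed to be whitespace-separated with one name and one value per line, as in the sample.

```python
import posixpath


def expected_scores_path(path, base_name):
    """Return where the scores file of a list run is expected to be saved."""
    if path == ".":
        # Special case from the notes: the file is named "scores" in "./".
        return posixpath.join(".", "scores")
    parent = posixpath.dirname(posixpath.normpath(path))
    return posixpath.join(parent, base_name + ".scores")


def read_rmsds(text):
    """Parse the contents of an "rmsds" file into {structure name: RMSD}."""
    rmsds = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # assumption: blank or malformed lines are skipped
        name, value = fields
        rmsds[name] = float(value)
    return rmsds


if __name__ == "__main__":
    print(expected_scores_path("/decoy/4state/1ctf", "1ctf"))
    print(read_rmsds("1ctf          0.000\n1ctf.a18001_r 5.277\n"))
```

Reading the “rmsds” file this way makes it easy to confirm that every structure name in the list has an RMSD entry before the scoring run is started.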

 

5)       to score all the listed pdb files in /decoy/4state/1ctf when the contacts files already exist:

➢        geometric.exe -lc /decoy/4state/1ctf list

 

2.      Scoring protein-protein binding

6)       to compute the alpha contacts of ./capri.1.pdb and then score it:

➢        geometric.exe -fpi . capri.1.pdb

7)       to compute the alpha contacts of /decoy/Capri/Target1/capri.1.pdb and then score it, while also writing the alpha contacts to the file /decoy/Capri/Target1/capri.1.contacts:

➢        geometric.exe -fopi /decoy/Capri/Target1 capri.1.pdb

8)       to score /decoy/Capri/Target1/capri.1.pdb when the file capri.1.contacts already exists:

➢        geometric.exe -fci /decoy/Capri/Target1 capri.1.pdb

9)       to compute the alpha contacts of all the listed pdb files in /decoy/Capri/Target1 and then score them, while also writing the alpha contacts to the files /decoy/Capri/Target1/*.interface.contacts:

➢        geometric.exe -lopi /decoy/Capri/Target1 list

❖       The file “list” must contain one structure file name per line, as follows:

capri.1.pdb

capri.1.a10043_r.pdb

capri.1.a10146_r.pdb

… …

 

Individual structure files may use an extension other than “.pdb”.

❖       The output file “capri.1.scores”, which contains the scores for all the structures in the list, is saved in the parent directory, i.e., /decoy/Capri.

❖       If the path is just “.”, the output file is named “scores” and is saved in the current directory (“./”).

❖       If a file named “rmsds” exists in /decoy/Capri/Target1/, the RMSD values are also written to the scores file. The file “rmsds” must be formatted as follows:

capri.1          0.000

capri.1.a18001_r 5.277

capri.1.a18002_r 4.185

capri.1.a18005_r 6.839

capri.1.a18015_r 5.123

 

 

10)      to score all the listed pdb files in /decoy/Capri/Target1 when the contacts files already exist:

➢        geometric.exe -lci /decoy/Capri/Target1 list
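For the list-based runs (option 'l'), the list file can be generated instead of typed by hand. The following is a minimal Python sketch, assuming all structures sit in one directory and can be selected by a file name suffix; write_list_file is a hypothetical helper and not part of the geometric package.

```python
import os


def write_list_file(directory, list_name="list", suffix=".pdb"):
    """Write one structure file name per line into directory/list_name.

    Returns the names that were written, in sorted order.
    """
    names = sorted(
        name
        for name in os.listdir(directory)
        if name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )
    list_path = os.path.join(directory, list_name)
    with open(list_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names
```

The list file is written into the same directory as the structures, which is what option 'l' requires. It can then be passed as the last argument, as in geometric.exe -lci /decoy/Capri/Target1 list.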