************************************************************************
*                                                                       *
*                           The ECEPP Package                           *
*                                                                       *
*************************************************************************

What the Package Does
---------------------

The program performs the following calculations:
  1) Single Energy Evaluation.
  2) Single Energy Minimization.
  3) Energy Evaluation of Multiple Input Conformations.
  4) Energy Minimization of Multiple Input Conformations.
  5) Monte Carlo Search using a generalized MCM (EDMC) algorithm.
  6) Produce an energy map for a pair of dihedral angles.
  7) Carry out an rms deviation analysis.
  8) Variable Target Function Procedure for structure determination.

Getting Started and Compiling the Eceppak Package
-------------------------------------------------
See the file "README" in the main eceppak directory.

How To Run this program
-----------------------

 - The script that runs the program is called recepp.s. When you source the
   cshrc file, an alias is set up that selects the script for the correct
   architecture. The scripts are stored in the corresponding subdirectory of
   eceppak/Scripts.

   To run the program you must supply a set of arguments; the number of
   arguments depends on the architecture. To get precise information about
   the arguments that should be used, type
                 recepp.s

   IMPORTANT: If "recepp.s" is not recognized, you need to source the cshrc
              file. If the command does not execute properly, check your
              cshrc file; it may have been set up incorrectly. See the
              section "Getting Started and Compiling the Eceppak Package"
              above for this setup.


What's New
----------

* The old set of ECEPP Input files has been replaced by a more flexible 
    file structure.
* The main input file now contains a series of cards that define the type 
    of run and parameters.

* The residue data file has been enhanced.
    This file contains the ECEPP/3 residues and other non-standard ones.
    There are 72 residues (including N-methyl residues), and new end groups 
    defined.
    The file is found under eceppak/Data/Residue/rsdata.

    Among the changes introduced in rsdata are:
    (a) Data on loop closing pairs was added. The program uses a general 
        treatment for these pairs (introduced by A.Liwo).
    (b) It includes N-methyl residues.
    (c) Hydration atom types were added in the description of atoms 
        (old hrs.data).
    (d) Description of 1-4 interactions is included in a more general format.
    (e) C' was replaced by C, NP in PRO and HPRO was replaced by N to increase 
        compatibility with PDB format.
    (f) Atom type of protons in COOH groups (ASP, GLU, meASP, meGLU and 
        Carboxyl-End terminal) changed to type 1 (as in ECEPP/3, no H-bonding 
        allowed).

* Hydration parameters for different surface models are provided under
    the subdirectory eceppak/data/Hydration_files. The SRFOPT set (srfopt.set) 
    of parameters is defined as the default. Other sets can be used by 
    modifying the recepp.s script (eceppak/Script/$ARCHITECTURE/recepp.s).
    
Examples
--------
   The input files provided as examples (directory eceppak/Test) 
   give an idea of the calculations the program can perform.
   There are several subdirectories, corresponding to the different
   types of runs eceppak can perform.
    
    
FILE(S)                       EXPLANATION
-------                       -----------

enk_sol.inp       Calculation of surface solvation energy.
                  To execute type:
                  "recepp.s ENERGY enk_sol ENK_sol dummy dummy"

enk_checkgrad.inp Checking Gradient calculation.
                  "recepp.s CHECKGRAD enk_checkgrad ENKGRAD  dummy dummy"

enk_sp.inp        Calculate energy using a soft-sphere potential.
                  "recepp.s ENERGY enk_sp ENKSP dummy dummy"

enk.inp           EDMC run.
                  "recepp.s EDMC enk enk_out dummy dummy"

mebmt.inp         Minimization (with output from minimizer).
                  "recepp.s MINIMIZE mpa1ot MPA1OT dummy dummy"

avian.inp         ECEPP/3 and solvation energy.
                  "recepp.s ENERGY avian AVIAN dummy dummy"

cala6.inp         Cyclic peptide and solvation energy.
                  "recepp.s ENERGY cala6 CALA6 dummy dummy"

hisp1.inp         EDMC run with two possible states for PRO (UP and DOWN)
                  and HIS (HID and HIE) residues.
                  "recepp.s EDMC hisp1 HISP1 dummy dummy"

cys1.inp          Input sequence with 1-letter code.
                  "recepp.s ENERGY cys1 CYS1 dummy dummy"

three_let.inp     Input sequence with 3-letter code.
                  "recepp.s ENERGY three_let THREE_LET dummy dummy"

CPEP.inp          Energy minimization of multiple input conformations.
outo.CPEP         Set of conformations to be minimized.
                  "recepp.s MINIMIZE CPEP CPEPout CPEP dummy"

ala_map.inp       Energy map.

ala_rms1.inp      RMS deviation analysis; generation of a reference
                  conformation.
outo.ala_rms      Input conformations for comparison in ECEPP format.
ala_HELIX.pdb     Input for reference conformation generation in PDB format
                  To execute type:
                  "recepp.s RMS_FIT ala_rms1 ala_rms1 ala_rms ala_HELIX"
                  As a result you get, among others, a file xray.ala_HELIX
                  that can be saved for future use.

ala_rms2.inp      RMS deviation analysis; comparison of a conformation
                  (file in pdb format) with the reference one.
ala135.pdb        Input conformation for comparison in PDB format (with end groups).
xray.ala_HELIX    Reference conformation for comparison (in ECEPP format).
                  To execute type:
                  "recepp.s RMS_FIT ala_rms2 ALA_RMS2 ala135 ala_HELIX"

timbck.inp        Calculate upper and lower bounds for distance-constraint
tim.pdb           runs from a PDB file.
                  "recepp.s BOUNDS timbck TIMBCK tim"

vtf_tim.inp       Example of a run using the Variable Target Function procedure.
outo.vtf_tim      Usually the constraints come from NMR experiments.
bounds.timbck     "recepp.s VTF vtf_tim VTFOUT dummy timbck"


tim_sp.inp        Example of a Monte Carlo run combining distance constraints
bounds.timbck     and a soft-sphere potential  (NMR refinement).

Output files for comparison with your results are provided in the directory
test_output. 
NOTE: We have noticed that large differences can occur between EDMC runs
on different architectures. This appears to be related to machine precision. 
In general, a single energy calculation will tell you whether the ECEPP/3
energy function is working correctly. For EDMC runs, check whether the program
produces a sequence of improving energies.

                         *******************
                         *     TABLE 1     *
                         *******************
Conventions:
-----------
Residues can be specified using the ECEPP list number, a three-letter code or a
one-letter code.

----------------------------------------------------------------------
                             ECEPP    ECEPP        3-letter  1-letter 
    RESIDUE                 LIST No.   KIND           code     code
----------------------------------------------------------------------
                                                                        
ALANINE                      1        -1               ALA      A  
ASPARTIC ACID                2        -2               ASP      D  
CYSTINE                      3        -3               CYS      C_ 
GLUTAMIC ACID                4        -4               GLU      E  
PHENYLALANINE                5        -5               PHE      F  
GLYCINE                      6         6               GLY      G  
HISTIDINE (HID)              7        -7               HIS      H  
ISOLEUCINE                   8        -8               ILE      I  
LYSINE                       9        -9               LYS      K  
LEUCINE                     10       -10               LEU      L  
METHIONINE                  11       -11               MET      M  
ASPARAGINE                  12       -12               ASN      N  
PROLINE-DOWN                13        13               PRO      P  
GLUTAMINE                   14       -14               GLN      Q  
ARGININE                    15       -15               ARG      R  
SERINE                      16       -16               SER      S  
THREONINE                   17       -17               THR      T  
VALINE                      18       -18               VAL      V  
TRYPTOPHAN                  19       -19               TRP      W  
TYROSINE                    20       -20               TYR      Y  
CYSTEINE                    21       -21               CYX      C  
HYDROXYPRO-DOWN             22       -22               HPD      P< 
NORLEUCINE                  23       -23               NOR      N< 
ORNITHINE                   24       -24               ORN      O  
HISTIDINE (HIE)             25       -26               HIE      H- 
BENZYL-ASPARTATE            26       -30               BZD      B< 
ORNITHINE +                 27       -25               OR+      O+ 
HISTIDINE+ (HIP)            28       -27               HI+      H+ 
LYSINE +                    29       -28               LY+      K+ 
ARGININE +                  30       -29               AR+      R+ 
ASPARTIC ACID -             31       -31               AS-      D- 
GLUTAMIC ACID -             32       -32               GL-      E- 
PROLINE-UP                  33        13               PRU      P% 
AZETIDIN                    34        13               AZE      P* 
HYDROXYPRO-UP               35       -22               HPU      P> 
TYROSINE -                  36       -36               TY-      Y- 
AMINOBUTYRIC ACI            37       -33               ABU      Z< 
AMINOISOBUTYRIC             38       -38               AIB      Z> 
SERINOLA                    39       -39               SLA      S< 
allo-ISOLEUCINE             40       -40               AIL      I* 
AMINOBUTYRIC LOO            41       -41               ASU      U< 
SXRAYIN1                    42       -42               SXY      X  
SLLXRAYIN                   43       -43               SLX      X* 
GLUTAMIC LOOP               44       -44               GLP      E_ 
LYSINE LOOP                 45       -45               LYP      K_ 
DAB LOOP                    46       -46               DAB      B_ 
GLYCINE LOOP                47        47               GYP      G_ 
LEUCINE LOOP                48       -48               LEP      L_ 
ASPARTIC LOOP               49       -49               ASX      D_ 
M-DUMMY50(mGLY)             50       -50               M50      @50
MeALANINE                   51       -51               M-A      @A 
MeASPARTIC ACID             52       -52               M-D      @D 
MeCYSTINE                   53       -53               M-C      @C_
MeGLUTAMIC ACID             54       -54               M-E      @E 
MePHENYLALANINE             55       -55               M-F      @F 
SARCOSINE                   56       -56               SAR      @G 
MeHISTIDINE                 57       -57               M-H      @H 
MeISOLEUCINE                58       -58               M-I      @I 
MeLYSINE                    59       -59               M-K      @K 
MeLEUCINE                   60       -60               M-L      @L 
MeMETHIONINE                61       -61               M-M      @M 
MeASPARAGINE                62       -62               M-N      @N 
MeDUMMY63                   63       -63               M63      @63
MeGLUTAMINE                 64       -64               M-Q      @Q 
MeARGININE                  65       -65               M-R      @R 
MeSERINE                    66       -66               M-S      @S 
MeTHREONINE                 67       -67               M-T      @T 
MeVALINE                    68       -68               M-V      @V 
MeTRYPTOPHAN                69       -69               M-W      @W 
MeTYROSINE                  70       -70               M-Y      @Y 
Me-BMT                      71       -71               BMT      @Z 
MeORNITHINE                 72       -72               MOR      @O 
----------------------------------------------------------------------


                             ECEPP    ECEPP        3-letter  1-letter 
END GROUPS                   LIST No.   KIND           code     code
----------------------------------------------------------------------

AMINO - H2                   1         1               H2N      H  
AMINO - H3+                  2         2               H3N      H+ 
AMINO -CH3                   3         3               CH3      M  
AMINO-COCH3                  4        -4               ACE      A  
FORMYL                       5        -5               FYL      F  
END-PRO,CIS-H                6        -6               CHP      P- 
END-PRO,TRANS-H              7        -7               THP      P  
END-H2+-PRO                  8        -8               AHP      P+ 
PYROGLUTAMIC                 9        -9               PGL      G  
AMINO (CYCLIZING            10        10               HN-      H_ 
CARBOXYL - COOH             11       -11               CXH      O  
CARBOXYL - O                12        12               OCC      O- 
CARBOXYL-CH3                13        13               CCC      L  
CARBOXYL-NH2                14       -14               NCC      N  
CARBOXYL-NHCH3              15       -15               NME      C  
N, N - DIMETHYL             16       -16               DME      D  
METHYL ESTER                17       -17               MES      T  
ETHYL ESTER                 18       -18               EES      E  
AMINO-T-BOC                 19        -9               BOC      B  
CARBOXYL(CYCLIZI            20        20               CXL      O_ 
MPA (HALF S-S)              21       -21               MPA      R_ 
DMP (HALF S-S)              22       -22               DMP      D_ 
CPP(AX) (HALF S-            23       -23               CPP      C_ 
CARBOXYL-CH2F               24        24               CHF      S  
OCA(AX) (HALF S-            25       -25               OCA      A_ 
OCA(EQ) (HALF S-            26       -26               OCE      E_ 
SCA(AX) (HALF S-            27       -27               SCA      S_ 
SCA(EQ) (HALF S-            28       -28               SCE      T_ 
CPP(EQ) (HALF S-            29       -29               CPE      F_ 
DANSYL                      30       -30               DAN      W  
CARBOXYL                    31        31               CXX      X  
AMINO-CYNAMONIC             32       -32               CYN      Y  

________________________________________________________________________
Note:
----
`@' is used to indicate N-methyl residues.
`_' is generally used to indicate a bridging residue (e.g. C_ indicates 
CYSTINE).
`+' and `-' are used to indicate a charged residue (e.g. K+ indicates a
charged lysine residue).
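
As an illustration of the three naming conventions, the following Python
sketch translates a sequence into ECEPP list numbers using a handful of rows
copied from Table 1. The names RESIDUES and to_list_numbers are hypothetical
helpers and are not part of ECEPPAK; only the codes themselves come from the
table.

```python
# A few rows copied from Table 1: (ECEPP list no., 3-letter code, 1-letter code).
# Only a subset is included here; extend it from the table as needed.
RESIDUES = [
    (1, "ALA", "A"),
    (6, "GLY", "G"),
    (9, "LYS", "K"),
    (13, "PRO", "P"),
    (29, "LY+", "K+"),
    (33, "PRU", "P%"),
    (56, "SAR", "@G"),
]

BY_THREE = {three: number for number, three, _ in RESIDUES}
BY_ONE = {one: number for number, _, one in RESIDUES}


def to_list_numbers(codes, res_code="THREE_LETTER"):
    """Translate residue codes into ECEPP list numbers (hypothetical helper)."""
    table = BY_THREE if res_code == "THREE_LETTER" else BY_ONE
    # The manual states that the input parser ignores case, so normalize first.
    return [table[code.upper()] for code in codes]


print(to_list_numbers(["Ala", "gly", "LY+"]))
print(to_list_numbers(["A", "P%", "@G"], res_code="ONE_LETTER"))
```

The res_code argument mirrors the THREE_LETTER and ONE_LETTER options of the
RES_CODE keyword; residue and end-group codes overlap (e.g. A), so a real
lookup would need separate tables for the two.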

Description of the input file:
-----------------------------
The general input to the program is given through a file containing 
a set of instructions. The program uses a parser to read these instructions. 
The parser reads and interprets the first 78 characters of each line. No
distinction is made between lower-case and upper-case letters.
The symbols # and ! indicate the beginning of a comment. 
When either of these symbols is encountered, the parser ignores the 
rest of the line. 
Instructions related to a given procedure are collected into 
so-called "Data Groups". A Data Group is identified by a main keyword 
that has the symbol '$' as its first character, e.g. $EDMC or $CNTRL. 
The keyword $end (or $END) must also be present to indicate the end of the 
Data Group.
Any word between the main keyword and $end is considered an 
instruction. 
This is an example of a Data Group:

$CNTRL
runtyp=Energy
$end
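
The parsing rules described above can be sketched in a few lines of Python.
This is only an illustration of the stated rules (78-character limit,
case-insensitive keywords, # and ! comments, $NAME ... $end groups); it is not
the ECEPPAK parser, and the function name parse_data_groups and the
whitespace-based tokenization are assumptions.

```python
def parse_data_groups(text):
    """Split input text into {group_name: [instruction words]} (illustrative only)."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw[:78]                      # only the first 78 characters are read
        for marker in "#!":                  # '#' and '!' start a comment
            line = line.split(marker, 1)[0]
        for word in line.split():            # assumption: words are whitespace-separated
            upper = word.upper()             # keywords are case-insensitive
            if upper == "$END":
                current = None
            elif upper.startswith("$"):
                current = upper
                groups[current] = []
            elif current is not None:
                groups[current].append(word)
    return groups


example = """\
$CNTRL          ! essential data group
runtyp=Energy   # type of run
$end
"""
print(parse_data_groups(example))
```

A check for the three essential groups could then be written as
{"$CNTRL", "$SEQ", "$GEOM"} <= set(groups).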

The following list contains the Data Groups already defined in ECEPPAK:

$BOUNDS, $BOUND_DEF, $BRIDGE, $CNTRL, $DIST_CONST, $EDMC, $FFIELD, 
$GEOM, $GRID, $MINIM, $REGIONS, $RMSFIT, $SCAN, $SELEC_PDB, $SEQ, 
$SPEC, $ENERCALC, $VTF, $WINDOWS, $OVERLAP_GRP and $OMCIS.
		

Three of the Data Groups are considered essential and without them
the program will abort. They are: $CNTRL, $SEQ and $GEOM.

$CNTRL is used, mainly, to indicate the type of calculation the user
  wants to perform.

$SEQ  provides the sequence of the molecule under study.

$GEOM contains the set of internal variables (dihedral angles) of the
      initial conformation. 



Description of the Data Groups
------------------------------
$CNTRL  
This Data group is used to define the type of calculation the user would like to 
carry out. Also, there are a few instructions, common to different modules, that 
are defined here. The data group is essential. The program will not proceed
if the data group is not found.

Keywords of this data group are:
   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

   RUNTYP      =                       Define the type of calculation.

                 ENERGY               -Compute energy.
                 CHECKGRAD            -Check analytical gradient vs. numerical.
                 MINIMIZE             -Carry out energy minimization.
                 EDMC                 -Carry out EDMC/MCM Monte Carlo search.
                 RMS_FIT              -Compute rms deviations and fitting.
                 BOUNDS               -Compute upper and lower bounds from
                                       a reference conformation and generate
                                       a constraint file for future use.
                 VTF                  -Carry out a Variable Target Function
                                       study.

   VERBOSE                             Print all information available.


   CHISCAN                             Carry out a systematic search, with energy
                                       minimization, for low-energy conformations
                                       of side-chain dihedral angles. The keyword
                                       RUNTYP = MINIMIZE is required. The set of
                                       dihedral angles to be scanned should be
                                       specified using the data group $SCAN.
                                       NSTEP should also be specified.

           NSTEP = number              Number of steps for the side-chain search
                                       using the CHISCAN option, e.g. if NSTEP=6
                                       the angles are searched in increments of
                                       60 degrees.

   PRINT_CART                          To request printing of Cartesian coord.  

       OUTFORMAT   =                    Format required for the output file
                                        containing the Cartesian coordinates.
                     ECEPP              ECEPP format.
                     PDB                PDB format.
                     AMBER              AMBER (history) format.
                     CNDO               CNDO format.
                     CA_PDB             PDB format (CA atoms only).
                     SEL_PDB            PDB format (selected atoms only).
                                        These atoms should be specified within
                                        the $SELEC_PDB data group.

       FILE        =  name_of_file      Filename of the output Cartesian file. 
                                        In the case of multiple conformations, a
                                        sequence of files is written 
                                        as name_of_fileNNN.*, where NNN is an
                                        integer from 000 to 999. 
 
       NO_HYDRG_IN_PDB                  Omit printing H atoms in PDB files.


   NRES        = number                 Number of residues in the specified
                                        molecule. It is not essential; the
                                        program computes this value from 
                                        the sequence (see $SEQ data group).

   RES_CODE    =                        Specifies the input format of the sequence.
                 ECEPP                  ECEPP numbers are used. Default. 
                 THREE_LETTER           Sequence specified using a three-letter code.
                 ONE_LETTER             Sequence specified using a one-letter code.

   VAR_ANGLES  =                        Used to define the set of variables.
                 ALL                    All dihedral angles are variable. Default.
                 BACK                   Only the backbone dihedral angles are variable.
                 SIDE                   Only the side-chain dihedral angles are variable.
                 SPEC                   Variable dihedral angles are specified through
                                        the $SPEC data group.
                 NONE                   All dihedral angles are fixed.
                 PHPS                   Only PHI and PSI backbone dihedral angles.
                 BKSD                   Backbone dihedral angles.

   VAR_RES     =  number                Used to define as variables a group of 
                                        dihedral angles from specific residues.
                                        VAR_RES represents the number of residues that
                                        contain variable dihedral angles.
                                        The information of the specific residues 
                                        (sequence position) is entered through 
                                        the $SPEC data group.
                                        The set of dihedral angles to be varied is 
                                        defined by selecting a proper value of VAR_ANGLES.
                                        NOTE: Since the keyword VAR_RES works in combination 
                                        with VAR_ANGLES, VAR_ANGLES cannot be set to SPEC.

   TIME        =  number                Estimated CPU time of the run. Program
                                        will end when this time limit is reached.
                                        Default is  10.0**10 sec.

   EMINIMA     =  number                Used to avoid printing high-energy
                                        conformations during multiple evaluations
                                        of energies or minimizations.
                                        Works in conjunction with the data groups
                                        $ENERCALC or $VTF. 
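
As a small worked example of the NSTEP keyword used with CHISCAN, the Python
sketch below derives the angular increment and a grid of starting values for
one side-chain dihedral angle. The function name chiscan_angles is
hypothetical, and the choice of -180 degrees as the first grid point is an
assumption; the manual specifies only the increment (360/NSTEP).

```python
def chiscan_angles(nstep):
    """Return the dihedral-angle grid implied by NSTEP (hypothetical helper)."""
    increment = 360.0 / nstep
    # Assumption: the grid starts at -180 degrees; the manual gives only the increment.
    return [-180.0 + i * increment for i in range(nstep)]


# NSTEP=6 corresponds to increments of 60 degrees, as stated for CHISCAN.
print(chiscan_angles(6))
```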

NOTE: The following keywords of the $CNTRL data group are kept for consistency 
with previous versions, but their use is not recommended. They have been 
incorporated into other data groups.  

   SURFACE_OUT                          Print exposed surface for atoms. 
                                        The keyword SOLVATION= SURFACE must be
                                        specified in the data group $FFIELD.

   MULT_CONF    =                       This flag is used to request the energy 
                                        evaluation or minimization of multiple
                                        conformations.  
                  READ                  Conformations are read from file (outo.*).
                                        The name of the input file is passed to
                                        the program through the recepp.s script
                                        as the 4th argument. 
                  RANDOM                Generate conformations from random sets
                                        of dihedral angles. In this case, MAXIT
                                        and SEED must be specified. 
                                        NOTE: The options of this keyword are 
                                        equivalent to keywords READ_CONF and 
                                        RAND_START in $ENERCALC and $VTF data groups. 

          MAXIT       = number          Maximum no. of randomly generated conformations.
                                        Used with MULT_CONF=RANDOM and SEED.

          SEED       =  number          Seed for random number generator. Used with
                                        MULT_CONF=RANDOM and MAXIT.
                                          
   REFERENCE                            Used to stop EDMC when the Zimmerman
                                        code of an accepted conformation 
                                        matches the one corresponding to the
                                        conformation provided as reference.
                                        It can now be specified in $EDMC.
                                        If used during energy evaluation (or 
                                        minimization) or VTF, it prints the 
                                        Zimmerman code of the conformations. This
                                        option is also available (recommended use) 
                                        in the data groups $ENERCALC or $VTF through
                                        the keyword ZIMMERMAN_CODE.
                                        

$BOUND_DEF  
This data group works in combination with runtyp= BOUNDS  (see $CNTRL keyword) 
and the data group $BOUNDS.

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------


     TYPE_INPUT =  
                  PDB_NO_ENDG        Default. Input file is PDB with 
                                     no end groups.
                  PDB_WITH_ENDG      Input file is PDB with end groups.

     DELT_R     =                   Upper and lower bounds can be obtained by:
                  PERCENTAGE        A- adding and subtracting a percentage 
                                     (PERCENT) of the actual distance (R)
                                     to the computed value of R, i.e. upper 
                                     bound = R + (PERCENT/100)*R. Default. 
                  FIXED             B- adding and subtracting a fixed value 
                                     (FIXVAL) to the actual distances.
                  
                  
     FIXVAL = number                See explanation for DELT_R.

     PERCENT = number               See explanation for DELT_R.

     WEIGHT = number                Weight associated to the constraints. 

     IGNORE_H                       Don't stop if H cannot be identified.

     MAXDIST = number               Is used to reduce the number of constraints.
                                    Only specified atoms separated by distances 
                                    smaller than MAXDIST will be used.
                                     (default is 100000.0).
        
     MINDIST = number               Is used to reduce the number of constraints.
                                    Only specified atoms separated by distances 
                                    greater than MINDIST will be used 
                                     (default is 0.0).
        
     FIRST_RESIDUE = number          This keyword allows a portion of a PDB
                                     file to be read and used for the
                                     generation of distance constraints. 
                                     FIRST_RESIDUE should correspond to the
                                     PDB number of the first residue in the
                                     sequence. Note: the sequence must be specified 
                                     sequentially and no residues should be
                                     missing.

     RESIDUE_GAP = number            Distances are computed only for residues
                                     separated in sequence by RESIDUE_GAP or more
                                     residues (default is 0).
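
To make the DELT_R, MAXDIST, MINDIST and RESIDUE_GAP keywords concrete, here
is a minimal Python sketch of how bounds could be derived for one atom pair.
It follows the formulas stated above but is not the ECEPPAK implementation:
the function names are hypothetical, and reading "separated by RESIDUE_GAP or
more residues" as a difference in sequence position is an assumption.

```python
import math


def distance_bounds(r, delt_r, amount):
    """Return (lower, upper) bounds for an actual distance r.

    amount is PERCENT for DELT_R=PERCENTAGE and FIXVAL for DELT_R=FIXED.
    """
    if delt_r == "PERCENTAGE":
        delta = (amount / 100.0) * r       # upper bound = R + (PERCENT/100)*R
    elif delt_r == "FIXED":
        delta = amount                     # upper bound = R + FIXVAL
    else:
        raise ValueError(f"unknown DELT_R option: {delt_r}")
    return r - delta, r + delta


def keep_pair(r, res_i, res_j, mindist=0.0, maxdist=100000.0, residue_gap=0):
    """Apply the MINDIST, MAXDIST and RESIDUE_GAP filters to one atom pair."""
    # Assumption: the gap is measured as the difference in sequence position.
    return mindist < r < maxdist and abs(res_i - res_j) >= residue_gap


# Two hypothetical CA atoms, in residues 2 and 5, placed 5 units apart.
ca_i = (0.0, 0.0, 0.0)
ca_j = (3.0, 4.0, 0.0)
r = math.dist(ca_i, ca_j)
if keep_pair(r, res_i=2, res_j=5, maxdist=8.0, residue_gap=2):
    print(distance_bounds(r, "PERCENTAGE", 10.0))
    print(distance_bounds(r, "FIXED", 0.5))
```

The default values of mindist, maxdist and residue_gap are the defaults quoted
in the keyword descriptions; no defaults are assumed for PERCENT or FIXVAL.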

$BOUNDS  
This data group works in combination with runtyp= BOUNDS  (see $CNTRL keyword) 
and the data group $BOUND_DEF. The group does not have specific keywords. It 
is used to enter the names of the atoms for which distance constraints are
requested and the weight assigned to each constraint.
Example: compute bounds between CA atoms and give them a weight of 10.0:

CA CA  10.0 

 
$BRIDGE  
This data group is used to define the linkage between bridging residues.
The data group requires the specification of pairs of numbers corresponding to
the positions in the sequence of the bridging residues. The program recognizes 
residues that form bridges. Consequently, there is no need to specify the
number of them.

$DIST_CONST  
This data group is used to define the set of distance constraints.
It works in combination with one of the following keywords:
   a- RUNTYP= VTF in the $CNTRL data group, or 
   b- CONSTR_MOV in the $EDMC or $FFIELD data groups.

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

     N1PAIR        = number      - Number of bounds read using atom numbers 
                                   as identification. A tedious procedure, 
                                   but needed from time to time.

     N2PAIR        = number      - Number of bounds read using specific 
                                   alphanumeric characters for the atoms 
                                   and the corresponding residues.

     RESN1_IS_ONE                  This flag is used to introduce distance constraints 
                                   associated with a sequence without end groups, i.e. 
                                   the first full residue is numbered 1 (the usual case 
                                   for constraints obtained from a typical PDB file). 
                                   ECEPP ALWAYS assumes that the chain has end groups.
                                   Consequently, the sequence numbering is usually shifted
                                   by one (+1) from the PDB numbering.  
                                   The flag should be omitted (default) if the residue 
                                   numbers in the distance-constraint file are the same
                                   as in ECEPP. (The sequence number is used to identify 
                                   the atoms in subroutine CLASS.)

     DIST_WEIGHT    = number      - A constant with units of kcal/mol/A that converts 
                                   the "Sum of Squares of Errors" into energy. (WEI) 

     ADAPT_WEI                    - This and the following keywords are used by the EDMC 
                                    method (experimental). ADAPT_WEI is used to 
                                    indicate that the weight assigned to the distance 
                                    energy term, EDIS, should be adapted during the 
                                    course of a conformational search.
                                    The goal is to control the value of the distance 
                                    energy term during a simulation. This keyword
                                    should be specified in combination with: 
                                    (a) PERCENT_WEI;  or
                                    (b) PERCENT_WEI,  DELTA_PERC_WEI, MAX_WEI and 
                                    MIN_WEI.
     

     PERCENT_WEI    = real_number - Defines the 'expected' ratio between EDIS and the 
                                    sum of the remaining energy terms. 
                                    If DELTA_PERC_WEI is omitted, the algorithm 
                                    will try to keep this ratio approximately constant
                                    during the run.

     DELTA_PERC_WEI = number        This flag is used to modulate the effect of the 
				    distance constraint energy term on the search.
                                    Works in the following manner:
                                    DELTA_PERC_WEI/MAXIT will be added or subtracted
                                    from the initial PERCENT_WEI during the course of 
				    the run. In this way the algorithm tries to enforce
				    the distance constraints (when DELTA_PERC_WEI is 
				    positive) while it proceeds toward lower energies. 
                                    The search will be directed toward constraint
                                    satisfaction.
                                    If DELTA_PERC_WEI is negative, on the other hand, 
				    the constraints will be less important as the run 
				    evolves and the search will be guided by the 
				    ECEPP/3 energy terms.

     MAX_WEI         = number       Maximum allowed value for DIST_WEIGHT; Works in
                                    conjunction with PERCENT_WEI

     MIN_WEI         = number       Minimum allowed value for DIST_WEIGHT; Works in
                                    conjunction with PERCENT_WEI

     SOFT_SWITCH    = number        Use a linear distance constraint function when
                                    the actual distance, d, is greater than the 
                                    upper bound plus the specified number.
                                    From Feng Ni (BRI, Montreal).
                                      

     SOFT_SLOPE     = number        Slope of the linear function.
                                    From Feng Ni (BRI, Montreal).

     NUMBER_OF_GROUPS = number      Indicates the number of groups (sets of protons)
                                    with overlapping resonances. This value, when 
                                    specified, should be greater than one (1).
                                    From Feng Ni (BRI, Montreal).
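
The interplay of PERCENT_WEI, DELTA_PERC_WEI, MAX_WEI and MIN_WEI can be
sketched as follows. This is only one reading of the descriptions above, not
the program's actual code: it assumes that the target ratio changes linearly
by DELTA_PERC_WEI/MAXIT per step, and that DIST_WEIGHT is simply clipped to
the range [MIN_WEI, MAX_WEI]. The function names are hypothetical.

```python
def target_ratio(percent_wei, delta_perc_wei, maxit, step):
    """Target ratio EDIS / (remaining energy terms) after `step` steps.

    Assumption: the ratio drifts linearly, by DELTA_PERC_WEI/MAXIT per
    step, so that it has changed by DELTA_PERC_WEI after MAXIT steps.
    """
    if maxit <= 0:
        raise ValueError("MAXIT must be positive")
    return percent_wei + delta_perc_wei * step / maxit


def clamp_weight(weight, min_wei, max_wei):
    """Keep a proposed DIST_WEIGHT inside [MIN_WEI, MAX_WEI] (assumed)."""
    return max(min_wei, min(max_wei, weight))
```

With PERCENT_WEI=0.1, DELTA_PERC_WEI=0.2 and MAXIT=100, this reading gives a
target ratio of 0.1 at the start of the run and 0.3 at the end; a negative
DELTA_PERC_WEI makes the ratio decrease instead.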

$EDMC  
This data group works in combination with runtyp= edmc  (see $CNTRL keyword).
This data group is used to define  parameters and different alternatives for
the Monte Carlo search.
The EDMC method is a procedure for searching the conformational space of a 
polypeptide. It is based on a Monte Carlo approach that combines minimization 
of the potential energy and a predictive algorithm that attempts to produce 
suitable rotations that lead to better energies.
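
For orientation, the acceptance step of a Monte Carlo search of this kind can
be sketched as below. This is a generic Metropolis test applied to minimized
energies, not code taken from the program. It assumes energies in kcal/mol
and TEMP in kelvin; the function name and the constant K_B are illustrative.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def metropolis_accept(e_old, e_new, temp, rng):
    """Generic Metropolis criterion on two (minimized) energies.

    A lower energy is always accepted; a higher one is accepted with
    probability exp(-(e_new - e_old) / (K_B * temp)).
    """
    if temp <= 0.0:
        raise ValueError("temperature must be positive")
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / (K_B * temp))


# Usage: a seeded generator makes a run reproducible.
rng = random.Random(12345)
accepted = metropolis_accept(-120.0, -121.5, 300.0, rng)
```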

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

     MCM                         - Carries out a Monte Carlo with energy 
				   Minimization search rather than the search 
				   available through the EDMC method. 
				   It is a special case of EDMC, in which all 
				   the perturbations are produced randomly.

     MOTION    =
		 CRANKSHAFT      - (the default) - backbone dihedral angles 
				   are associated in  rotatable pairs.
				   [ psi(i-1), phi(i)], (where i is the 
				   residue in the i-th position on the sequence)
                                   When a member of a given pair is selected
                                   for a change, say a rotation 'delta', an
                                   opposite rotation, '-delta', is added to
                                   the second dihedral angle. This type of
                                   movement tends to preserve the global
				   conformation of a folded polypeptide while 
				   changing the local conformation.
			         
                 PIELA             Varies one backbone angle at a time (makes 
				   large changes)

		 LAMBDA            Varies the angles of rotation of peptide 
				   groups about virtual bond (CA-CA) axes. 
                                   Does not change the backbone shape much, but
                                   rather optimizes the orientation of peptide
                                   groups.


     CONSTR_MOV                    Indicates that distance constraints should
				   be used. See $DIST_CONST keyword to find
				   out how to introduce  distance constraints.

     BACKUP     = number           Time interval in seconds in which restart 
				   information is punched.  (default 3600 s)

     RESTART                       Flag to indicate that the program should
				   continue a previous search. The program will
				   look automatically for a backup file.

     MAXIT     = number            Maximum number of steps (accepted 
                                   conformations) in MCM/EDMC

     RAND_START                    Start from a randomly-generated conformation.
                                   This keyword requires SEED to be defined.

     OMEGA_180                     Works with RAND_START. Keep the omega's at 180.


     RAND_TO_ELEC     = number      Pre-defined ratio of random to electrostatic
				    sampling; default 0.1. RAND_TO_ELEC=1.0 is 
				    equivalent to the flag MCM.

     MAX_REPM     = number        - Maximum number of repetitions of a 
				    conformation.

     MAX_RAND     = number        - Maximum number of random-prediction trials.

     MAX_EL       = number        - Maximum number of electrostatic-prediction 
				    trials within an iteration.

     MAX_THERMAL     = number     - Maximum number of thermal movements.


     EFINAL     = number          - Target Energy. This represents a way to
				    stop the search when EFINAL is reached.
				    default is a very large negative number.


     TEMP     = number            - Temperature used during normal stages
                                    of the search.
                                    By default, simulations run at constant
                                    temperature. There are two alternatives:
                                    THERMAL_SHOCK and ADAPT_TEMP.

     THERMAL_SHOCK                - Thermal shock Monte Carlo scheme. The 
				    system is suddenly "heated". Keywords that 
				    need to be specified are:

          T_LOW     = number      - lower bound of temperature.    

          T_UP     = number       - upper bound of temperature.
          
	  NTEMP     = number      - Number of steps in which the system is 
				    heated from T_LOW  to T_UP.

     ADAPT_TEMP                   - Adaptive temperature scheme. 
                                    If NHEAT=NCOOL=1, we have THERMAL_SHOCK.

        NHEAT     = number        - Number of heating steps.

        NCOOL     = number        - Number of cooling steps.

	T_LOW     = number        - lower bound of temperature.

	T_UP     = number         - upper bound of temperature.


     NPRINT_ELEC     = number     - printing of electrostatic diagnosis 
				    every NPRINT_ELEC accepted conformations.

     OMPROB     = number          - The a priori probability that the program
                                    tries to convert a cis peptide bond to a
                                    trans bond. The default is 5000, which means
                                    that the program will first attempt to make
                                    all the peptide bonds trans.

     HISP_CHANGE = number         - The probability that, in a given iteration,
                                    the program attempts to change the
                                    conformations of PRO and HIS in the sequence
                                    from PRO-UP to PRO-DOWN (or vice versa), or
                                    from HIE to HID (or vice versa) (default ??).

     CONST_SEQ                    - The program will not change the protonation 
				    form of histidine and the internal geometry 
				    of proline. 

     TYPE_BKTK       =            - Defines the set of dihedral angles altered
				    during backtracking (during heating of the 
				    system).

		       BACK       - Only backbone dihedral angles can be moved.

		       ALL        - All dihedral angles can be moved.

     MAX_VAR_BKTK = number        - Maximum number of variables that can be
				    changed simultaneously during backtrack.

     REGION_SAMP     =            - Use the set of sampling regions specified
                                    for specific amino acids.

		       UNIFORM    - Use uniform sampling through specified
				    regions

		       NONUNIFORM - Sample through specified regions using
				    provided weights.

     SEED     = number            - Seed for the random-number generator.
                                    The value can be any negative number.

     PRINT_SAMPLED                - Print "extra" information from sampling.

     NWIND     = number           - Number of "windows" containing the 
				    specifications of the "bombing ranges", i.e.
				    the ranges of the residues whose angles 
				    will be targeted by random/electrostatic 
				    sampling procedure. The angles of the other
				    residues will only change during minimiza-
				    tions; no changes will be made in them 
				    during sampling. This option is useful, if 
				    you made a point mutation in a large 
				    protein and want to establish quickly the 
				    effect of this mutation on conformation. 
				    In such a case it is good to "bomb" only 
				    the mutated residue, instead of wasting 
				    "munitions" on the whole protein. Default 
				    is to "bomb" the whole molecule.

     MAX_BCKB_REP = number          The maximum number of times that the same
                                    backbone conformation can be accepted. When
                                    this limit is reached, newly generated
                                    conformations with the same Zimmerman code
                                    are rejected unless they improve on the
                                    current global minimum. Default value is 20.

     PROMET                         The omegas of Pro and N-Met residues will be searched
                                    with similar probabilities as for PHIs and PSIs. 
 

     NPRINT_CONSTR   = number     - printing of information about distance constraints 
				    every NPRINT_CONSTR accepted conformations.

     REFERENCE                      Used to stop EDMC when the Zimmerman
                                    code of an accepted conformation
                                    matches that of the conformation provided
                                    as reference (initial conformation in
                                    file *.inp).

$ENERCALC
  This data group is used to request energy evaluation or energy minimization
  of one or more conformations.


The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------
    
     SINGLE_CONF                  - Carry out the procedure using as input the 
                                    conformation provided in data group $GEOM

     READ_CONF                    - Carry out the procedure starting
                                    from the set of conformations provided
                                    in a separate input file (outo format).

     RAND_START                   - Carry out the procedure starting
                                    from a set of randomly-generated
                                    conformations.

     OMEGA_180                     Works with RAND_START. Keep the omega's at 180.

     MAXIT       = number         - Maximum no. of randomly generated conformations.

     SEED   =  number             - Seed for the random number generator.

     REGION_SAMP     =            - Use the set of sampling regions specified
                                    for specific amino acids.

                       UNIFORM    - Use uniform sampling through specified
                                    regions

                       NONUNIFORM - Sample through specified regions using
                                    provided weights.

     BACKUP      =   number       - This keyword is intended to allow the
                                    procedure to be stopped cleanly.
                                    Not implemented yet.

     RESTART                      - This keyword is intended to allow the
                                    procedure to be restarted.
                                    Not implemented yet.

     NO_MINIMIZATION              - Used to check energy terms related to the
                                    distance constraints. No VTF minimization
                                    is carried out.

     CONSTR_MOV                   - This keyword is used to indicate that distance 
                                    constraints are  used in the calculation. The key
                                    can be included, optionally, in the $FFIELD data 
                                    group.

     ZIMMERMAN_CODE               - This option is used to print the Zimmerman Code 
                                     of the conformation(s). 

$FFIELD 
 - Specific information about the force field used.


The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

      FORCE_FIELD    =                     
					
			ECEPP              - ECEPP/3 force field (the default).
			SIMPLE_POTENTIAL   -  Max Vasquez's quartic potential 
					      for VDW distances.
			AMBER              - Not implemented yet.
			DISCOVER           - Not implemented yet.
			CHARMM             - Not implemented yet.

      SOLVATION      =                     - Compute solvation energy.
			                    (the default is NO solvation)

			SURFACE            -use surface-solvation models 
                                            developed by J. Vila and R. Williams. 

                        VOLUME             -use volume-solvation model developed by 
                                            Joe Augspurger (S_PAR_FILE=volume.set
                                            must be specified).

			ELECTROSTATIC      - Not implemented yet. Is intended 
					     to compute electrostatic solvation
					     using  the DELPHI program 
					     (B. Honig, Columbia Univ.).

			ALL                - SURFACE + ELECTROSTATIC 

          SURFACE_OUT                      - Print exposed surface for atoms. 
                                             The keyword SOLVATION= SURFACE must be
                                             specified.

      NO_SOLV_MIN                          - Used with SOLVATION to indicate
					     that solvation energy should
					     be added to the total energy after
					     energy-minimization of a 
					     conformation, but not used during 
					     the energy minimization process.


      RAD_FILE       = character_variable  - Input file with radius parameters
					     for different solvation types.

      S_PAR_FILE     = character_variable  - input file with solvation parameters 
					     for different solvation  types.
                                             SURFACE-HYDRATION FILES:srfopt.set (default),
                                             jrf.set,oons.set,solprmNW.nmr,optsl27.rall.
                                             VOLUME-HYDRATION FILE:volume.set.


      OM_TRANS                             - Impose a special one-fold potential
					     on all omega angles to keep them
                                             trans; this goes with the keyword
					     FORC.

            FORC           = number        - The torsional constant; 
					     the default value is 100

      NO_TORSIONALS                        - Omit torsional terms of the 
					     potential function. 

      THERMO

      TSTART         = number

      TEND           = number

      NSTEP          = number

      CONTACT_ENE    = number              - Defines the contact energy when
					     using the simplified potential.
                                             Used with FORCE_FIELD=SIMPLE_POTENTIAL


      PH             = number              - pH  value. Not used in the present version.

      RES_DBASE      = character_variable  - Used in some architectures (SUN) to define
                                             the residue data file, or to select a
                                             different file than the default "rsdata".
                                             Note: In general, the residue data file is
                                             specified in the script file recepp.s.

      CUTOFF         =                     - Used to define cutoff in the energy terms.

			 NONE                default.

                         BLOCK               Used when a set of dihedral angles is kept
					     fixed during the computations. In that case, 
					     the CUTOFF keyword can be used to omit 
					     the calculations of 1-4 and 1-5 interactions 
                                             that don't vary during energy minimization.

			 DISTANCE_CA         Not implemented, yet.


      OVER_CUTOFF    = number              - Used to pre-minimize a conformation
                                             using a simple potential function until
                                             every single term of the energy is lower 
                                             than the value specified by "number".  

      NON_OVERLAP_ENER                     - Logical flag that requests printing of the
                                             energy of a conformation after relief of
                                             atomic overlaps when the conformation is
                                             subjected to energy minimization using the
                                             simple-potential function.
                                             Should be specified with OVER_CUTOFF or
                                             FORCE_FIELD=SIMPLE_POTENTIAL.

      VARDIEL                              - Use a distance-dependent dielectric
					     constant. Implementation of Feng 
					     Ni (BRI, Montreal).

NOTE: The use of the following keywords in the $FFIELD data group is kept for
consistency with the previous version but is not recommended. They have been
incorporated into other data groups.

      CONSTR_MOV                           - This keyword is used to indicate
					     that distance constraints are 
					     used in the calculation. The
					     key can be included, optionally,
					     in the $EDMC data group.

$GEOM  
This is another essential data group, used to define the initial conformation
of the molecule. The program will not proceed if the data group is not found.
The data group should contain the LIST OF DIHEDRAL ANGLES IN A FORMATTED INPUT
(15f8.3). One line per residue (or end group) is necessary or the program will
terminate with an error. Blank lines are permitted. In this case, all dihedral
angles will be set to zero, except when random generation of the starting
conformation is requested.
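
Because the (15f8.3) layout is column-sensitive, it can help to generate the
$GEOM lines with a small script. The sketch below writes and reads one such
line: up to 15 fields, each 8 characters wide with 3 decimals. The function
names are hypothetical, and the reader skips blank fields, which is a
simplification of how Fortran formatted input treats blanks.

```python
FIELD_WIDTH = 8   # from the f8.3 edit descriptor
MAX_FIELDS = 15   # from the repeat count in 15f8.3


def format_geom_line(angles):
    """Format up to 15 dihedral angles (degrees) as one 15f8.3 record."""
    if len(angles) > MAX_FIELDS:
        raise ValueError("at most 15 angles fit on one line")
    fields = [f"{a:8.3f}" for a in angles]
    for field in fields:
        if len(field) != FIELD_WIDTH:
            raise ValueError(f"value does not fit in f8.3: {field.strip()}")
    return "".join(fields)


def parse_geom_line(line):
    """Split a record into 8-column fields and convert them to floats."""
    text = line.rstrip("\n")
    fields = [text[i:i + FIELD_WIDTH] for i in range(0, len(text), FIELD_WIDTH)]
    return [float(f) for f in fields if f.strip()]
```

For example, format_geom_line([-60.0, -40.0, 180.0]) yields three
right-justified 8-column fields that parse_geom_line reads back unchanged.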

$GRID  
This data group works in combination with RUNTYP= ENERGY or
RUNTYP= MINIMIZE. It generates an energy grid (a two-dimensional energy map).
If RUNTYP= MINIMIZE is specified, the program will carry out the following
procedure:
1. The dihedral angles you define for the Phi-Psi map are kept fixed
   during minimization.
2. The program minimizes the energy with respect to the remaining variable
   dihedral angles.

The program scans two dihedral angles (ANG1 and ANG2) starting from
the values specified in FROM1 and FROM2, respectively. There are two 
alternative possibilities for specifying the scanning:
a- Give the final values of the dihedral angles (TO1 and TO2) and the
   number of steps (N1 and N2).
b- Give the step size (STEP1 and STEP2) and the number of steps (N1 and N2).
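
The two ways of specifying a scan can be compared with a short sketch. It
assumes that N steps produce N+1 grid values, including the starting value;
the text above does not say whether the program counts steps or grid points,
so check the output of a small test run. The function name is hypothetical.

```python
def grid_values(start, n_steps, stop=None, step=None):
    """Scan values for one dihedral angle.

    start   -> FROM1 or FROM2
    n_steps -> N1 or N2
    stop    -> TO1 or TO2      (alternative a)
    step    -> STEP1 or STEP2  (alternative b)
    Assumption: n_steps steps give n_steps + 1 values.
    """
    if (stop is None) == (step is None):
        raise ValueError("give exactly one of stop (TO) or step (STEP)")
    if n_steps < 1:
        raise ValueError("the number of steps must be at least 1")
    if step is None:
        step = (stop - start) / n_steps
    return [start + k * step for k in range(n_steps + 1)]


# Both calls describe the same scan from -180 to 180 in 30-degree steps.
by_end_value = grid_values(-180.0, 12, stop=180.0)
by_step_size = grid_values(-180.0, 12, step=30.0)
```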

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

      ANG1      = character_variable       - Name used to describe the first
                                             dihedral angle. Allowed names
                                             are: PHI(n), PSI(n), OME(n), CHI(n),
                                             TAU(n), where n is the residue number.

      ANG2      = character_variable       - Name of the second dihedral angle.

      FROM1      = number                  - Initial value of first dihedral 
					      angle.

      FROM2      = number                  - Initial value of second dihedral 
					      angle.

      TO1        = number                  - Final value of first dihedral 
					      angle.

      TO2        = number                  - Final value of second dihedral 
					      angle.

      STEP1      = number                  - step size of first dihedral angle.

      STEP2      = number                  - step size of second dihedral angle.

      N1         = number                  - Number of steps for first dihedral 
					      angle.

      N2         = number                  - Number of steps for second 
					      dihedral angle.

      OMSCAN-OK                            - Used to confirm scanning over an
					     omega dihedral angle.

$MINIM  
This data group is used to modify a few parameters of the minimization programs
of Gay (SUMSL, SMSNO).

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

     MINIMIZER    = 
		    SUMSL                  - Use the unconstrained minimization
					     solver with analytical gradient.

		    SMSNO                  - Use the unconstrained minimization
                                             solver with numerical gradient.



     MAXFUN       = number                 - Maximum number of function 
					      evaluations allowed. 

     MAXIT        = number                 - Maximum number of iterations 
					     allowed. 

     MAXSTEP      = number                 - Maximum value for V(RADFAC).


     VTNER1       = number                 - Helps decide when to check for 
					     FALSE convergence [V(26)].

     ABSTOL       = number                 - The absolute function convergence 
					     tolerance [V(31)].

     RELTOL       = number                 - The relative function convergence 
					     tolerance [V(32)]

     DSCALE       =                        -  Not implemented, yet.

		    NONE                        

		    FIXED

		    VARIABLE

     DVALUE       = number                  -  Initialization value  of the
					       scale vector D.

     FULL_PRINT                             -  Controls SUMSL printing.

     PRINT_RES_XG                           -  Prints out values of X's, 
					       gradient and D's on  return.

     PRINT_STAT                             -  Prints out summary of statistics.

     PRINT_INITIAL_X                        -  Print initial X's and D's.



$OMCIS
This data group is used to define the residues for which the reference
conformation of the peptide bond is cis.
Format:
NRES     res1   res2  ...  resk
where NRES is the number of residues for which the cis conformation of the
peptide bond is taken as the reference, and res1, res2, ..., resk are the
positions of those residues in the sequence.
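
A consistency check on this record is easy to script before running the
program. The sketch below reads the record as whitespace-separated integers
and verifies that NRES matches the number of residues listed. The function
name is hypothetical, and the check itself is a convenience, not something
the program is documented to do.

```python
def parse_omcis(text):
    """Parse 'NRES res1 res2 ... resk' and return the residue positions."""
    tokens = [int(t) for t in text.split()]
    if not tokens:
        return []
    nres, residues = tokens[0], tokens[1:]
    if len(residues) != nres:
        raise ValueError(
            f"NRES is {nres} but {len(residues)} residues are listed"
        )
    return residues


# Usage: two cis peptide bonds, at residues 5 and 12.
cis_residues = parse_omcis("2   5   12")
```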



$OVERLAP_GRP
This data group is used in a version under development.
It is used within the VTF procedure to define sets of atoms with
overlapping resonances.
The format is as follows:
IGP GLB NG IR1 G1  IR2 G2  IR3 G3 .........IRn Gn 
  1 HX1  1  17 HN
  2 HX2  5   7 HM0   6 HM1   6 HM2  15 HM1  15 HM2
  3 HX3  2   6 HM0  15 HM0
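
The example lines above can be read with the sketch below. The meaning of the
columns is inferred from the example rather than stated in the text: IGP is
taken as the group index, GLB as the group label, NG as the number of
members, and each following pair as a residue number and an atom name. The
function name is hypothetical.

```python
def parse_overlap_group(line):
    """Parse one $OVERLAP_GRP line into (index, label, members).

    members is a list of (residue_number, atom_name) pairs; the column
    meanings are inferred from the example in the text.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError("expected at least IGP, GLB and NG")
    index, label, count = int(tokens[0]), tokens[1], int(tokens[2])
    rest = tokens[3:]
    if len(rest) != 2 * count:
        raise ValueError(f"NG is {count} but {len(rest) // 2} members found")
    members = [(int(rest[i]), rest[i + 1]) for i in range(0, len(rest), 2)]
    return index, label, members


group = parse_overlap_group("  3 HX3  2   6 HM0  15 HM0")
```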



$REGIONS  
This data group is used to define the sampling regions for amino acids in 
a Monte Carlo type of search.
The sampling can be UNIFORM or NONUNIFORM (these are arguments of the keyword
REGION_SAMP in data groups $VTF and $EDMC).
If the sampling is UNIFORM, the input is specified using the following format:
residue_no.   region1 region2 .....   regionM
If the sampling is NONUNIFORM, the input is specified as:
residue_no.   region1 weight1  region2  weight2 .....   regionM weightM

where:
- residue no. belongs to  { 2, inumrs}
- regionI is one of the 16 regions of the PHI-PSI map in Zimmerman's
  code (A, A*, ..., H*), or any of the four Popov regions: H-, H+ (HELIX) and
  S-, S+ (SHEET).
- weightI is an integer indicating the weight used to generate the sampling
  probability for the associated region.

Example of UNIFORM sampling:
3   A  A* C  C*
Example of NONUNIFORM sampling:
3   A 40  A* 10  C 30  C* 10

A continuation line should be indicated with the symbol '\'
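
To see what a set of weights implies, the sketch below parses one $REGIONS
line and converts the weights into sampling probabilities. It assumes that
the probability of a region is its weight divided by the sum of the weights
on the line; the text only says that the weights are "used to generate" the
probabilities. Continuation lines ('\') are not handled, and the function
name is hypothetical.

```python
def parse_region_line(line, nonuniform):
    """Return (residue_number, {region: probability}) for one line.

    Assumption: probability = weight / sum of weights (NONUNIFORM),
    or 1 / number of regions (UNIFORM).
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected a residue number and at least one region")
    residue, rest = int(tokens[0]), tokens[1:]
    if nonuniform:
        if len(rest) % 2 != 0:
            raise ValueError("each region needs a weight")
        regions = rest[0::2]
        weights = [int(w) for w in rest[1::2]]
    else:
        regions = rest
        weights = [1] * len(rest)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return residue, {r: w / total for r, w in zip(regions, weights)}


residue, probs = parse_region_line("3   A 40  A* 10  C 30  C* 10", True)
```

Under this assumption the NONUNIFORM example above samples region A with
probability 40/90 and region C with probability 30/90.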


$RMSFIT  
This data group is used for comparison of one or multiple conformations with 
a reference one. It works in combination with the keyword RUNTYP= RMS_FIT. 
This module calculates atomic rms deviations, rms distance deviations, and
radii of gyration, and can fit (superimpose) conformations.
The program reads different types of reference and input files. By default, 
it tries to read the reference conformation from a file named xray.NAME_REF 
(where NAME_REF is a name provided by the user. NAME_REF is passed through an 
argument of the script that runs the program). 
When this file does not exist, the keyword  GENERATE_REF should be used to 
generate it. As a default for generation of the reference conformation, the 
set of dihedral angles provided as input in the $GEOM data group is used.
If a conformation given in PDB format is going to be used as reference,
the user should use the keyword TYPE_REF with the appropriate argument to
indicate this.
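
Two of the quantities mentioned above do not require any superposition and
are easy to cross-check independently. The sketch below uses common textbook
definitions, which may differ in detail from those used by the program: an
unweighted (mass-independent) radius of gyration, and a distance rms taken
over all atom pairs. Coordinates are (x, y, z) tuples in Angstroms; the
function names are hypothetical.

```python
import math
from itertools import combinations


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    if n == 0:
        raise ValueError("no coordinates given")
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, center) ** 2 for c in coords) / n)


def distance_rms(coords, ref):
    """RMS difference of all interatomic distances between two structures.

    The atoms must be listed in the same order in both structures.
    """
    if len(coords) != len(ref) or len(coords) < 2:
        raise ValueError("need two structures with the same atoms (>= 2)")
    pairs = list(combinations(range(len(coords)), 2))
    total = sum(
        (math.dist(coords[i], coords[j]) - math.dist(ref[i], ref[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(total / len(pairs))
```

Because it depends only on internal distances, distance_rms is unchanged when
either structure is translated or rotated, unlike an atomic rms deviation,
which is meaningful only after fitting.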

The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

     GENERATE_REF                      - Used to indicate the generation of a 
					 reference  (or target) conformation
                                         in ECEPP format.

     TYPE_REF        =                 - Indicates the type of input format of 
					 the file containing the reference (or 
                                         target) conformation:

	               ECEPP             ECEPP dihedral angles provided with the 
					 $GEOM data group in *.inp file. Default.

	               PDB_NO_ENDG       Typical PDB where residue No.1 is
					 the first full residue. No end groups.

	               PDB_WITH_ENDG     Other files written in PDB format
					 where first and last residues are 
					 end groups.

     TYPE_INPUT      =                 - Used to indicate the type of format 
					 of the file to be used as input
					 (conformation(s) under study). 
					 Acceptable input formats are:

	               ECEPP             ECEPP dihedral angles using `outo'
					 format.

	               PDB_NO_ENDG       Classical PDB where residue No.1 is
					 the first full residue. No end groups.

	               PDB_WITH_ENDG     Other files written in PDB format
					 where first and last residues are 
					 end groups.

     IGNORE_H                          - Tells the program to ignore mismatches
                                         in the H atom names when reading PDB
                                         files.

     INIT_RES        = number          - Initial residue used in the calculation.
     IFIN_RES        = number          - Final residue used in the calculation.

     ALL_HVY_ATOMS                     - Calculate rms of all heavy atoms of 
					 the specified residues.

     ALPHA_CARBONS                     - Calculate rms of CA atoms of the
                                         specified residues.

     BACKBONE                          - Calculate rms of backbone atoms
                                         (including CB) of the specified
                                         residues.

     SIDE_CHAIN                        - Calculate rms of side-chain heavy atoms
                                         of the specified residues.

     DISTANCE_RMS                      - Produces an additional report of the 
					 distance rms deviations for the input 
					 conformation(s) with respect to the 
					 reference conformation.


     CA_TRACE                          - Works in conjunction with the keyword
                                         ALPHA_CARBONS. It is a flag to request
                                         the generation of a series of aligned
                                         pdb files with the CA traces.

     PDB_ALIGN_HVY                     - Write a pdb  file using alignment of 
					 all the heavy atoms.

     PDB_ALIGN_CA                      - Write a pdb  file using alignment of
					 the  CA atoms.

     PDB_ALIGN_BACK                    - Write a pdb  file using alignment of
					 the backbone atoms.

     PDB_ALIGN_SIDE                    - Write a pdb  file using alignment of
					 the side-chain heavy atoms.

     METHOD          =                 - Defines the type of algorithm used
					 to calculate RMS:
 
		       GOLUB           - Golub method. The default. 

		       KABSCH          - Kabsch method. This requires some 
					 IMSL routines. 

     FIRST_RESIDUE   = number          - This keyword is used to indicate that the 
					 first residue of the PDB reference file
					 is numbered as `number' instead of 1.

     ADOPT_REF_SEQ                     - Tells the program to adopt the sequence
                                         of the reference conformation when the
                                         conformations read have a different (or
                                         incompatible) sequence.


$SCAN 
Scan carries out a systematic search of a set of specified dihedral 
angles. Angles should be specified in the following way (free format).

 residue_no.  no_of_dih_angles  no_first_dih ... no_last_dih


$SELEC_PDB 
This  data group works in combination with the keywords 
PRINT_CART  OUTFORMAT= SEL_PDB   (in $CNTRL data group)
The data group is used to define the set of atoms included in the
output pdb file. A free format is used to enter atom numbers (integer). 


$SEQ 
This is the last ESSENTIAL data group. It is used to define the sequence 
of the molecule.  There are three different ways in which the sequence 
can be defined:
(a) through ECEPP residue numbers (LIST); (b) Using a three-letter code; 
or (c) using a one-letter code.
The keyword  RES_CODE (in $CNTRL data group) is used to specify the options 
described previously. If this keyword is omitted, the program will attempt 
to read the sequence as ECEPP residue numbers. 

Rules:
-----
(a) ECEPP residue numbers are read using free format (default). Numbers 
    are integers defined as ECEPP LIST numbers.  Check column 2 of Table I 
    for correct assignment. A blank space is required between numbers.

(b) Three-letter code. These are character variables defined in column 3 
    of Table I. A blank space is required between words.

(c) One-letter code. These are character variables defined in column 4 
    of Table I. No blank space is required between descriptors (letters,
    usually).
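
The three rules differ only in how the input is split into residues, which
the sketch below illustrates. The mode names "list", "three" and "one" are
placeholders chosen for this example, not the actual arguments of RES_CODE,
and no lookup in Table I is attempted.

```python
def split_sequence(text, mode):
    """Split a $SEQ entry into residue descriptors.

    mode is a placeholder name: "list" (rule a), "three" (rule b)
    or "one" (rule c).
    """
    if mode == "list":
        # Integers separated by blanks.
        return [int(token) for token in text.split()]
    if mode == "three":
        # Words separated by blanks.
        return text.split()
    if mode == "one":
        # One descriptor per character; blanks are optional and ignored.
        return [ch for ch in text if not ch.isspace()]
    raise ValueError(f"unknown mode: {mode}")


tripeptide = split_sequence("ALA GLY SER", "three")
```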


$SPEC 
This data group is used to specify the set of variable dihedral angles.
Its usage depends on the values of the keywords VAR_ANGLES and 
VAR_RES ($CNTRL data group). 
(a) When  VAR_ANGLES  = SPEC is specified in the $CNTRL data group, 
   1- VAR_RES should not be present in the $CNTRL data group. 
   2- The $SPEC data group is obligatory, and it must contain the following 
      specifications (in free format and one line per residue):

        res_num    num_var  num_1st_var ...  ... num_last_var 

      where: 
       res_num is the sequence number of the residue containing variable 
         dihedral angles;
       num_var is the number of variable dihedral angles in the residue;
       num_1st_var, ..., num_last_var is a list of numbers (integers) that 
         point to the specific variable dihedral angles in the residue.
         The list must contain `num_var' integers.
  
(b) When VAR_RES = number_of_residues  (number_of_residues is an integer)
    is specified in the $CNTRL data group, 
    1- VAR_ANGLES can be given ANY value (all, back, bksd, etc.)  with the 
       EXCEPTION of `SPEC'.
    2- The $SPEC data group is required, and it must contain the sequence numbers
       of the residues for which the set of dihedral angles will be defined
       as the variables. The residues should be given as a list of integers in
       free format.

(c) If  VAR_RES and VAR_ANGLES are both omitted in the $CNTRL data group,
    or VAR_RES is omitted and VAR_ANGLES is set to a value different from
    SPEC, then, the SPEC data group is not required.
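
For case (a), the requirement that the list contain exactly `num_var'
integers is a frequent source of input errors and can be checked in advance.
The sketch below validates one such line; the function name is hypothetical
and the check is not taken from the program.

```python
def parse_spec_line(line):
    """Parse 'res_num num_var num_1st_var ... num_last_var'.

    Returns (res_num, [indices]) and checks that the list has exactly
    num_var entries.
    """
    tokens = [int(t) for t in line.split()]
    if len(tokens) < 2:
        raise ValueError("expected at least res_num and num_var")
    res_num, num_var, indices = tokens[0], tokens[1], tokens[2:]
    if len(indices) != num_var:
        raise ValueError(
            f"residue {res_num}: num_var is {num_var} "
            f"but {len(indices)} indices are listed"
        )
    return res_num, indices


# Usage: residue 5 has three variable dihedral angles: 1, 2 and 4.
spec = parse_spec_line("5   3   1 2 4")
```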


$VTF 
   This data group is used to define the parameters for a Variable Target Function
   (VTF) calculation (for reference, see Vasquez and Scheraga, J. Biomol. Struct. &
   Dyn. 5(4), 757-784 (1988)). It works in combination with the keyword
   RUNTYP = VTF (data group $CNTRL) and the data group $DIST_CONST.


The specific keywords of this data group are:

   KEYWORD        ARGUMENT                DESCRIPTION
   ------         -------                 -----------

     SINGLE_CONF                  - Carry out the procedure using as input the 
                                    conformation provided in data group $GEOM

     READ_CONF                    - Carry out the procedure starting
                                    from the set of conformations provided
                                    in a separate input file (outo format).

     RAND_START                   - Carry out the procedure starting
                                    from a set of randomly generated
                                    conformations.

     OMEGA_180                    - Works with RAND_START. Keeps the omega
                                    angles at 180 degrees.

     CONST_SEQ                    - The program will not change the protonation
                                    form of histidine and the internal geometry
                                    of proline.

     MAXIT       = number         - Maximum no. of randomly generated conformations.

     SEED   =  number             - Seed for the random number generator.


     RANK_ORDER     = number      - Determines the way the distances
                                    are ordered for the minimization
                                    steps during an iteration (IORDER).
                                    There are three possibilities:
                                    RANK_ORDER= 0,  Order by range, 
                                      `a la Braun-Go', i.e. 
                                    rank one ==> distance between nearest-neighbor
                                    residues;
                                    rank two ==> distance between second nearest-neighbor
                                    residues; etc.;
                                    RANK_ORDER= 1, Keep same order as in 
                                      input distances;
                                    RANK_ORDER= 2, Order by growing from 
                                      N-terminus.

     MAX_RANK                     - Parameter used to control the VTF
                                    procedure. Usually, the procedure is
                                    carried out by starting from random
                                    conformations and introducing the
                                    distance constraints up to MAX_RANK
                                    (typically 10). From this run, a set
                                    of conformations is selected and a
                                    second run is carried out with the full
                                    set of distances (i.e. the final rank is
                                    equal to the number of residues in the
                                    chain).

     VTF_BY_RANK                  - Indicates that distances should be included using
                                    the established ranks. Otherwise, the procedure
                                    introduces a few distances per minimization.
                                    NOTE: It is generally recommended to use the
                                    keyword VTF_BY_RANK.
                                     

     STEP_RANK = [-]number        - STEP_RANK > 0 defines the increment of the rank
                                    for sequential minimizations within an iteration
                                    of the VTF procedure (IFLOV). Additionally,
                                    a STEP_RANK different from zero implies that
                                    torsional energy terms will NOT be included in the
                                    energy minimization and disulfide bridge (DSB)
                                    information will NOT be used.
                                    Closing the DSB at the beginning of the VTF
                                    procedure interferes greatly with the possibility
                                    of satisfying distance constraints for smaller
                                    ranks. Consequently, it is recommended to add the
                                    DSB as an extra set of distance constraints. Since
                                    ECEPP assigns very high weights to force the DSB,
                                    adding the DSB as additional distance constraints
                                    allows you to experiment with different values
                                    of the weights.
                                    If STEP_RANK < 0 is given, all the distance
                                    constraints will be included at once.
                                    With STEP_RANK=0 the VTF procedure INCLUDES
                                    torsional energy and DSB information, and proceeds
                                    by including distance constraints with a rank
                                    increment of 1.
                                  
     BIG_VIOLATION = number       - Used to determine which conformations are
                                    reasonable (i.e. should be saved) after the whole
                                    generation process is finished. If the maximum
                                    violation is greater than BIG_VIOLATION (+ 10%),
                                    the conformation is rejected. All the conformations
                                    that VTF produces should be reasonable.

     STEPS_ON_ERROR = number      - At every step of the VTF procedure, the program
                                    checks whether the maximum violation exceeds
                                    BIG_VIOLATION. After 'n' consecutive steps of the
                                    VTF procedure that exceed BIG_VIOLATION
                                    (with n=STEPS_ON_ERROR), the generation process is
                                    aborted and a new trial is started. The idea is to
                                    save the time spent in useless energy minimization.
                                    A conformation that does not satisfy the distance
                                    criteria after "n" steps has little chance of
                                    reorganizing later, when additional distances are
                                    added. STEPS_ON_ERROR should not be very small,
                                    since some conformations with distances that exceed
                                    BIG_VIOLATION at a certain stage of generation can
                                    reorganize later on, as additional distances are
                                    included in the minimization.
                                    (In my experience, values of STEPS_ON_ERROR from
                                    15 to 20 seem to work best.)

     REGION_SAMP     =            - Use the set of sampling regions specified
                                    for specific amino acids.

                       UNIFORM    - Use uniform sampling through the specified
                                    regions.

                       NONUNIFORM - Sample through the specified regions using
                                    the provided weights.

     BACKUP          =            - This keyword is intended to allow the
                                    procedure to be stopped cleanly.
                                    Not implemented yet.


     RESTART                      - This keyword is intended to allow the
                                    procedure to be restarted.
                                    Not implemented yet.

     NO_MINIMIZATION              - Use to check the energy terms related to the
                                    distance constraints. No VTF minimization
                                    is carried out.

     ZIMMERMAN_CODE               - This option is used to print the Zimmerman Code 
                                     of the conformation(s). 
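
How RANK_ORDER=0, STEPS_ON_ERROR and BIG_VIOLATION interact can be
illustrated with a short sketch. This is Python written for this manual, not
code taken from eceppak. The function names are hypothetical. Two points are
assumptions: that the rank of a distance is the sequence separation of the
two residues, and that "(+ 10%)" means a tolerance of 1.10 * BIG_VIOLATION.

```python
def rank(res_i, res_j):
    """Rank under RANK_ORDER=0 (assumed): sequence separation of the residues."""
    return abs(res_i - res_j)


def order_by_range(constraints):
    """Sort (res_i, res_j, ...) tuples by rank; input order is kept within a rank."""
    return sorted(constraints, key=lambda c: rank(c[0], c[1]))


def should_abort(max_violations, big_violation, steps_on_error):
    """True once `steps_on_error` CONSECUTIVE steps exceed `big_violation`."""
    consecutive = 0
    for violation in max_violations:
        # A step within the limit resets the count of consecutive failures.
        consecutive = consecutive + 1 if violation > big_violation else 0
        if consecutive >= steps_on_error:
            return True
    return False


def accept_final(max_violation, big_violation):
    """Final filter (assumed form): reject above BIG_VIOLATION plus 10%."""
    return max_violation <= 1.10 * big_violation


# Hypothetical residue pairs and per-step maximum violations.
print(order_by_range([(1, 5), (2, 3), (1, 3)]))
print(should_abort([1.0, 3.0, 3.0, 1.0, 3.0, 3.0, 3.0], 2.0, 3))
print(accept_final(2.1, 2.0))
```

The reset of the counter in should_abort reflects the word "consecutive" in
the description of STEPS_ON_ERROR: a trial is abandoned only after an
unbroken run of steps above BIG_VIOLATION.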


$WINDOWS 
This data group contains the ranges of residues whose dihedral angles will be
changed during sampling. Each range is given on its own non-empty line, which
contains the following two integers, read in free format:

iw1 (the first residue of the range); iw2 (the last residue of the range).
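
As an illustration of how these ranges translate into a set of residues, the
sketch below (Python, written for this manual and not part of eceppak)
expands each "iw1 iw2" line into the residues it covers. The function name
parse_windows and the sample ranges are hypothetical, and treating both ends
of a range as inclusive is an assumption.

```python
def parse_windows(text):
    """Expand the 'iw1 iw2' lines of a $WINDOWS group into a sorted residue list."""
    residues = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # empty lines carry no range
        iw1, iw2 = (int(tok) for tok in line.split()[:2])
        if iw1 > iw2:
            raise ValueError(f"range {iw1}-{iw2}: first residue after last residue")
        # Both ends of the range are taken as inclusive (assumption).
        residues.update(range(iw1, iw2 + 1))
    return sorted(residues)


# Hypothetical group with two ranges: residues 3-5 and 10-11.
print(parse_windows("3 5\n\n10 11\n"))
```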





Description of  the Data included for each Residue in rsdata
_____________________________________________________________

A few changes in the original ECEPP3 residue data file have been included to add
flexibility to the program. The goal is to minimize the coding of instructions 
that are residue-specific.
example:

ISOLEUCINE          ILE  I    0    0   0.000    F
   19    4-0.9437972 0.3305252-0.0993245 0.9950551
  -8   4
   1.35       3    1    4
  1.35        3    1    5
  1.35        3    1    6
  1.35        3    1    7
      0.350197-0.548499-0.759283       3  5  3         8  0   0
     -0.379217 0.310917-0.871508       5  9  3        11  1   1
      0.999962-0.005163-0.007059       5 10  3        14  0  -1
      0.350197-0.548498-0.759284      10 16  3        17  0   0
                                     N  14 22  -4.59   11    7 10
   -0.4226    0.9063    0.0          HN  2  2   2.27    7    4  6
    1.4530    0.0       0.0          CA  9  7   0.82   17   11 16
    1.7797   -0.4805    0.9222       HA  1  0   0.26   11    7 10
    1.9888   -0.8392   -1.1617       CB  9  7  -0.06    0   17 19
    1.9587    1.4440    0.0          C   7 14   5.80   11    8 10
    1.1648    2.3835    0.0          O  17 26  -4.95    8    0  0
    1.6625   -1.8692   -1.0179       HB  1  0   0.32   17   11 16
    1.4086   -0.3635   -2.4951       CG2 6  5  -0.96   17   14 16
    3.5188   -0.8471   -1.1725       CG1 6  6  -0.25    0   11 13
    0.3225   -0.4528   -2.4713       HG2 1  0   0.32   14    0  0
    1.6840    0.6781   -2.6602       HG2 1  0   0.32   14    0  0
    1.8059   -0.9770   -3.3037       HG2 1  0   0.32   14    0  0
    3.8906    0.1742   -1.2551       HG1 1  0   0.19    0   17 19
    3.8906   -1.2463   -0.2289       HG1 1  0   0.19    0   17 19
    4.0546   -1.6863   -2.3342       CD1 6  5  -0.96    0    0  0
    3.7005   -2.7124   -2.2348       HD1 1  0   0.32    0    0  0
    3.7005   -1.2695   -3.2771       HD1 1  0   0.32    0    0  0
    5.1444   -1.6749   -2.3183       HD1 1  0   0.32    0    0  0
    3.2771    1.5756    0.0

Description of first line:
  (TITL(L,I),L=1,4),ARES(I),ONE_LET(I),NFATO(I), QQQ_READ(I),PK0_READ(I), NMETR

- (TITL(L,I),L=1,4) Residue name.
- ARES(I)     Three-letter-code residue identifier, used for sequence definition.
- ONE_LET(I)  One-letter-code residue identifier, used for sequence definition.
- NFATO(I)    Indicates whether the 3 initial atoms (N, HN, and CA) of the first
              full residue should be generated using the data from the amino-end
              (NFATO=0), or the data from the residue (NFATO=1). In particular,
              this assignment affects the charges of these atoms.
- QQQ_READ(I) Net charge of the ionized residue (used in specific versions
              of the code).
- PK0_READ(I) pKa0 of the ionizable group (used in specific versions of the
              code).
- NMETHYL     Logic variable to indicate if this is an N-methylated residue.

Description of second line:
 NATOMS(I),NCHI(I),SNTH2(I),CSTH2(I),SDEL(I), CDEL(I)
Same as in ECEPP/3 manual. 

Description of third line:
 KNDRES(I),NT,NGEOM(I),NTOR(I)
- KNDRES(I) and NGEOM(I) same as in ECEPP/3 manual.
- NTOR(I) is the number of torsional terms that are associated with EXPLICIT  
  dihedral angles, while NT is the TOTAL number of the torsional terms associated 
  with a residue, i.e. including the possible angles of the bridge formed by 
  this residue.  The parameters of the IMPLICIT torsional angles (i.e. those which
  will be calculated from the Cartesian coordinates after a bridge is formed) are 
  stored in the arrays after the parameters of the explicit angles.

Description of 4th to 7th lines:
AR(J,I),NBB(J,I),NSS(J,I),NANG(J,I)
Same as in ECEPP/3 manual.

Description of 8th to 11th lines:
(CHIANG(L,J,I),L=1,3),NDPT1(J,I),NDPT2(J,I), NUM(J,I),LRT1(J,I), IBRNCH(J,KINDI),
ISHFK(J,KINDI)
- CHIANG, NDPT1, NDPT2, NUM and LRT1  same as in ECEPP/3 manual.  LRT1 is used in 
      this program (not in the original ECEPP/3).
- IBRNCH The program now handles more than one branch on the side chains.  If there
       is a branch, it is defined specifically (IBRNCH=1) for the bond that branches
       out. Also, to maintain compatibility with the IUPAC conventions (ECEPP reads
       the torsional angles following this convention), a variable ISHFK is defined
       for each bond to indicate whether there is a shift of the bond definition
       given in rsdata.  In some cases, like ILE, the organization of the rsdata
       file for generation purposes (in ECEPP/3) requires a different arrangement
       of the bond numbers. In the specific case of ILE, lines 9 and 10 indicate
       that the dihedral angle inputs for bonds 2 and 3 have to be exchanged.
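
The exchange described for ILE can be pictured with a small sketch. It is
written in Python for illustration only and is not taken from the program.
It ASSUMES that ISHFK(j) is an offset added to the bond number j to obtain
the bond under which the input angle is stored; the text above only states
that the inputs for bonds 2 and 3 are exchanged. The ISHFK values
(0, 1, -1, 0) are the last column of lines 8 to 11 of the ILE example; the
function name and the angle values are hypothetical.

```python
# Last column of lines 8 to 11 in the ILE entry shown above.
ISHFK_ILE = [0, 1, -1, 0]


def reorder_angles(input_angles, ishfk):
    """Place the angle read for bond j at position j + ISHFK(j) (assumed meaning)."""
    stored = [None] * len(input_angles)
    for j, (angle, shift) in enumerate(zip(input_angles, ishfk)):
        stored[j + shift] = angle
    return stored


# Hypothetical side-chain angles given in input (IUPAC) order.
print(reorder_angles([-60.0, 170.0, 60.0, 180.0], ISHFK_ILE))
```

With these shifts the second and third input angles swap places, while the
first and fourth stay where they are.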

Description of 12th to 31st lines:
 (XOORD(L,J-1,I),L=1,3),ALPHA(J,I),LTYPE(J,I), NTYPE(J,I),CHG(J,I),NSN15(J,I),
NSN14(J,I),NFN14(J,I)

- XOORD, ALPHA, LTYPE, CHG, NSN15, NSN14 and NFN14 same as in ECEPP/3 manual.
- NTYPE atom type for surface solvation models.



How to Build a File  with Distance and/or Dihedral Angle Constraint (bounds.*)
______________________________________________________________________________

A distance constraint energy term can be used in the calculations.
The algorithm used in this program represents a modification of the one
originally implemented in Max Vasquez's VTF (Vasquez, M. & Scheraga, H. A.
1988. "Variable-Target-Function and Build-up procedures for the calculation
of protein conformation. Application to bovine pancreatic trypsin inhibitor
using limited simulated nuclear magnetic resonance data."
J. Biomol. Struct. Dyn. vol. 5, 757-784).

The functional form is:
             Econs = WEI_ENE * Sum [ wei(j)*(| rj - R |)^2 ];
                            j in {pairs}

                        for rj > R with R an upper bound, or
                            rj < R with R a lower bound.
where
       rj is the actual interproton distance.
       R  is either an upper bound or a lower bound.
       wei is a factor or weight used to make the constraint more (or less)
           relevant with respect to others.
       WEI_ENE is a factor that weights the distance energy term with
       respect to other energy terms (electrostatic, torsional, etc.)
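
A direct transcription of this penalty may help when choosing weights. The
following is only a sketch in Python, not the routine used by the program.
It assumes that a pair contributes nothing while its distance lies between
the lower and the upper bound, and it does not treat the special case of a
lower bound of -1.000 (VDW contact). The function name and the sample values
are hypothetical.

```python
def constraint_energy(distances, constraints, wei_ene=1.0):
    """Econs = WEI_ENE * sum_j wei(j) * (rj - R)^2 over the violated bounds.

    distances   -- actual distances rj, one per constrained pair
    constraints -- (lowb, upb, wei) tuples in the same order
    """
    total = 0.0
    for r, (lowb, upb, wei) in zip(distances, constraints):
        if r > upb:          # upper bound violated: R is the upper bound
            total += wei * (r - upb) ** 2
        elif r < lowb:       # lower bound violated: R is the lower bound
            total += wei * (lowb - r) ** 2
        # otherwise the distance is within bounds and adds nothing (assumption)
    return wei_ene * total


# Three hypothetical pairs sharing the bounds [1.9, 5.0] and weight 10.0:
# one too long by 1.0, one within bounds, one too short by 0.9.
bounds = [(1.9, 5.0, 10.0)] * 3
print(constraint_energy([6.0, 3.0, 1.0], bounds))
```

Because WEI_ENE multiplies the whole sum while wei(j) multiplies a single
term, the first sets the balance against the other energy terms and the
second the relative importance of individual constraints.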

Distance constraints are included in the calculations in the following
manner.
1.- Use the $DIST_CONST data group (NOT the $BOUNDS) to specify the
number of constraints and set up other parameters.

2.- Generate a file (bounds.FILENAME) containing the information for each
constraint as in the following examples. There are two alternative ways to 
describe the constraints:

  a.- Using the ECEPP number for the specific atoms.  The information should
be written in one line per constraint (78 characters or less), and given in
free format as:
              mol1 iatm1      mol2 iatm2     lowb     upb      weight
where
     mol1   is the molecule containing the first atom (integer).
     iatm1  is the first atom defining the constraint (integer).
     mol2   is the molecule containing the second atom (integer).
     iatm2  is the second atom defining the constraint (integer).
     lowb   is the lower bound (real).
     upb    is the upper bound (real).
     weight is the weighting factor (real).

example:
               1     34      1     51      1.900       5.000        10.0


   b.- Using the residue number and the atom name for the specific atoms.
The information should be written in one line per constraint (the file is
a FORMATTED one) as:
          mol1  res1  atm    mol2  res2  atm    low-b    upp-b   weight
where
     mol1 is the molecule containing the first atom.
     res1 is the residue containing the first atom.
     atm  is the name of the atom.
     mol2, res2 and the second atm describe the second atom in the same way.
If the lower bound is -1.000, then VDW contact is assumed.

examples:
           1    1    HCA     1     1    HCB   -1.000   3.000     10.0
           1   1 CA      1  29 CA      7.186   7.942  10.000

Each line specifies the atoms defining the distance, the lower and upper
bounds, and a parameter (a weight) for the constraint.

A line starting with ! is considered a comment.
See the example file bounds.timbck.
You should enter the number of constraints used in $DIST_CONST as
N1PAIR= mmm and N2PAIR= nnn, where N1PAIR and N2PAIR are the numbers
of constraints specified using format (a) and format (b), respectively.

NOTE:  if a lower bound is -1.000, then VDW contact is assumed.
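
Before a run it can be useful to count the constraints in a bounds file, so
that N1PAIR and N2PAIR agree with its contents. The sketch below is Python
written for this manual, not part of eceppak. It ASSUMES that the two
formats can be told apart by the number of whitespace-separated fields
(7 for format (a), 9 for format (b)); the program itself reads format (b)
as a FORMATTED record, so this is a convenience check only. The function
name parse_bounds and the sample text are hypothetical.

```python
def parse_bounds(text):
    """Return (by_number, by_name): constraints in format (a) and format (b)."""
    by_number, by_name = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("!"):
            continue  # blank line or comment
        if stripped.upper().startswith("DIHEDRAL"):
            break     # dihedral angle constraints follow; not handled here
        tok = stripped.split()
        if len(tok) == 7:    # mol1 iatm1 mol2 iatm2 lowb upb weight
            by_number.append((int(tok[0]), int(tok[1]), int(tok[2]), int(tok[3]),
                              float(tok[4]), float(tok[5]), float(tok[6])))
        elif len(tok) == 9:  # mol1 res1 atm mol2 res2 atm lowb upb weight
            by_name.append((int(tok[0]), int(tok[1]), tok[2],
                            int(tok[3]), int(tok[4]), tok[5],
                            float(tok[6]), float(tok[7]), float(tok[8])))
        else:
            raise ValueError(f"unrecognized constraint line: {line!r}")
    return by_number, by_name


sample = """! hypothetical bounds file
               1     34      1     51      1.900       5.000        10.0
           1    1    HCA     1     1    HCB   -1.000   3.000     10.0
DIHEDRAL
100.00
"""
by_number, by_name = parse_bounds(sample)
print("N1PAIR =", len(by_number), " N2PAIR =", len(by_name))
```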

DIHEDRAL ANGLE CONSTRAINTS can also be included in the simulations.
The functional form for the penalty energy is the same one used
for the distance constraints (formula written above).
The dihedral angle constraints are included in the 'bounds.*' file
as follows:
i.  The word DIHEDRAL must come after the last distance constraint.
ii. The next line should contain a number (real) that represents the
    conversion factor (or penalty weight), WEIDIH (equivalent to
    WEI_ENE in the formula above).
iii. Each subsequent line contains a description of a dihedral angle
     constraint with the following information:
     residue number, dihedral angle number, expected mean value, maximum
     deviation, and specified weight value.
example:

DIHEDRAL
100.00
         2        1          -40      40     1000.00
         5        2          -60      20     1000.00

where:
  WEIDIH = 100.00.
  Two dihedral angle constraints are included:
  first: residue 2, dihedral angle 1 (phi) is forced to adopt a value of
  -40 deg. with an allowed deviation of 40 deg. (allowed values are those
  within the interval [-80, 0]);
  second: residue 5, dihedral angle 2 (psi) is forced to adopt values within
  the interval [-80, -40].
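
The intervals quoted in this example follow from mean - deviation and
mean + deviation. The sketch below (Python, for illustration only, not the
program's own routine) computes them and a penalty of the same quadratic
form as the distance term. Measuring the deviation along the shortest arc
(periodicity of 360 deg.) and expressing the penalty in squared degrees are
assumptions; the function names are hypothetical.

```python
def allowed_interval(mean, max_dev):
    """Interval of dihedral values that carry no penalty."""
    return (mean - max_dev, mean + max_dev)


def dihedral_penalty(angle, mean, max_dev, weight, weidih):
    """WEIDIH * weight * (excess beyond the allowed deviation)^2, in degrees."""
    # Signed difference folded into [-180, 180) so that, for example,
    # 170 and -170 are treated as 20 degrees apart (assumption).
    diff = (angle - mean + 180.0) % 360.0 - 180.0
    excess = max(0.0, abs(diff) - max_dev)
    return weidih * weight * excess ** 2


# The two constraints of the example above.
print(allowed_interval(-40, 40))
print(allowed_interval(-60, 20))
# phi = 10 deg. lies 10 deg. outside [-80, 0] for the first constraint.
print(dihedral_penalty(10.0, -40.0, 40.0, 1000.0, 100.0))
```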

Random Number Generators
------------------------
The program uses two random number generators. The serial version
uses the VRND program (Prof. Ken Wilson).

The parallel version uses PRNG (Prof. Mal Kalos).
PRNG (parallel random number generator) is freely available
by anonymous ftp.
It is easy to install on any 64-bit machine, such as the SGI PC.

CTC staff can get it without using ftp: copy

/afs/theory/archive/ftp/pub/utilities/prng.tar.Z to wherever you
want to build it.  There are also two Makefiles in the eceppak PRNG
directory, Makefile.DEC8400 and Makefile.IBMSP2.
Either of these makefiles should be changed appropriately for the specific
architecture on which the user intends to install the program.

If you are not CTC staff, here's how you can get the tar file:

ftp ftp.tc.cornell.edu
log in as user anonymous
give email address as password
cd pub/utilities
get prng.tar.Z

-----------------------------------------------------------
If the IMSL libraries are not available on your computer:

Edit the file orient1.F and comment out the lines:

#ifdef AIX
      CALL DEVCSF (3,RTR,3,EIGVAL,T,3)
      IJUMP=1
#endif

Also, remove "-limsl" from the "make" file. The line

LIBS = -L/usr/local/lib -limsl

should read:

LIBS = -L/usr/local/lib 

Finally, recompile the program.

Without IMSL, only the "GOLUB" option will work for calculations of rms deviations.