Grade2 outputs

Running the grade2 command, as described in the Usage and Examples chapters, produces output both to file(s) and to the terminal. This chapter is a guide to what to expect.

Terminal Output

grade2 writes information about the restraint generation process to the terminal as it runs. This output is intended to be intelligible and to indicate that restraint generation is proceeding normally for the ligand in question.

For example, generating a restraint dictionary for the PDB chemical component ID VIA (sildenafil) by running

$ grade2 --PDB_ligand VIA

produces an initial output giving copyright, authors and program version information, following the normal BUSTER package convention:

$ grade2 --PDB_ligand VIA
 set CSDHOME=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022 from $BDG_TOOL_MOGUL=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022/bin/mogul
 ############################################################################
 ##   [grade2] ligand restraint dictionary generation
 ############################################################################

      Copyright (C) 2019-2022 by Global Phasing Limited

                All rights reserved.

                This software is proprietary to and embodies the confidential
                technology of Global Phasing Limited (GPhL). Possession, use,
                duplication or dissemination of the software is authorised
                only pursuant to a valid written licence from GPhL.

   Version:   1.1.0 <2022-02-01>
   Authors:   Smart OS, Sharff A, Holstein J, Womack TO,  Flensburg C,
              Keller P, Paciorek W, Vonrhein C and Bricogne G

 -----------------------------------------------------------------------------

This is followed by output lines stating where the PDB chemical component definition for VIA was collected from and giving web URLs for further information about VIA (https://www.rcsb.org/ligand/VIA and https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA).

Collected PDB chemical components definition for PDB id VIA
from: ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem_v2/V/VIA/VIA.cif
Molecule name: "5-{2-ethoxy-5-[(4-methylpiperazin-1-yl)sulfonyl]phenyl}-1-methyl-3-propyl-1H,6H,7H-pyrazolo[4,3-D]pyrimidin-7-one"
For more information about "VIA" see:
---- https://www.rcsb.org/ligand/VIA
---- https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA

This is followed by a warning that the VIA molecule has a nitrogen atom that will normally be charged at neutral pH and that a proton has therefore been added to the molecule:

 WARNING: Charging groups likely to be charged at neutral pH.
 WARNING: ---- If you do not want this, rerun with the option: -N, --no_charging
         charging trialkylamine to trialkylammonium
         add new proton HN17 onto atom N17 (existing hydrogen atom_ids: None)
[16:42:04] WARNING: Proton(s) added/removed

If this charging is not wanted, use the --no_charging command-line option. For more details on the charging process see the Charging chapter.

A check is then made that the RDKit molecule generated for restraint production has an InChI that matches that from the input file (if this is available). InChI is short for International Chemical Identifier and provides a way to quickly check that the stereochemistry of two molecules matches. In this case, the output indicates that there is a match other than for the protonation layer, as would be expected given the charging:

RDKit molecule generated has the same InChIKey as the other than the last protonation character.
---- This indicates that the stereochemistry matches other than the change caused by charging.
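
The comparison reported above can be illustrated with a minimal sketch. It relies on the layout of a standard InChIKey, in which the final character records the protonation state, so two keys that differ only in that character describe the same structure with a different number of protons. The function name and the two keys are hypothetical and are not taken from Grade2.

```python
def compare_inchikeys(key_a: str, key_b: str) -> str:
    """Classify how two standard InChIKeys relate to each other."""
    if key_a == key_b:
        return "identical"
    # The last character of a standard InChIKey encodes (de)protonation;
    # everything before it encodes connectivity and stereochemistry.
    if key_a[:-1] == key_b[:-1]:
        return "same apart from protonation"
    return "different"


# Hypothetical keys, made up purely for illustration.
neutral = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
charged = "AAAAAAAAAAAAAA-BBBBBBBBSA-O"
print(compare_inchikeys(neutral, charged))
```

A result of "same apart from protonation" corresponds to the situation reported for the charged VIA molecule.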

For all input sources, a check is made comparing the InChIKey of the RDKit molecule generated for restraint production with those for known PDB components (from the wwPDB Chemical Component Dictionary https://www.wwpdb.org/data/ccd ). In this case, the CHECK produces the expected result: the molecule matches component VIA (apart from the charging):

CHECK: Check the molecule's InChiKey against known PDB components:
CHECK: Match to PDB chemical component(s) with a different number of protons:
CHECK:   VIA https://www.rcsb.org/ligand/VIA "5-{2-ethoxy-5-[(4-methylpiperazin-1-yl)sulfonyl]phenyl}-1-methyl-3-propyl-1H,6H,7H-pyrazolo[4,3-D]pyrimidin-7-one"

The check is most important when the input is a SMILES string or a file and the molecule turns out to match an existing PDB component (see example of this). If there is a match then it normally makes sense to use the restraint dictionary for the matching PDB component (see FAQ on matching components).

The checks are followed by information about the progress of restraint-generation including the force field used, the Mogul version and the final geometry optimization:

Minimization with MMFF94s reduces energy from 64.88 to 8.35 kcal/mol
Using CCDC Mogul-like geometry analysis.
Mogul version 2021.3.0, CSD version 543, csd-python-api 3.0.9
Geometry Optimize coordinates against restraints using gelly ....
---- gelly: Took 1472 steps, reducing the rms gradient to 0.05
---- gelly: and the rms bond deviation to 0.002 Angstroms.

The final part of the terminal output gives information about the output files produced and suggestions as to commands to view the results:

Have written CIF-format restraint dictionary to:   VIA.restraints.cif
Have written ideal coordinates to PDB-format file: VIA.xyz.pdb
Have written ideal coordinates to SDF-format file: VIA.xyz.sdf
Have written ideal coordinates in  MOL2-format to: VIA.xyz.mol2
Have written schematic 2D diagram SVG-format file: VIA.diagram.svg
Have written 2D diagram & atom_id labels to file:  VIA.diagram.atom_labels.svg
Suggestion: to view/edit the restraints, use one of the commands:
   coot -p VIA.xyz.pdb --dict VIA.restraints.cif
   EditREFMAC VIA.restraints.cif VIA.xyz.pdb VIA
Normal termination (7 secs)
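
If the terminal output has been captured as text, the list of files written can be recovered from the "Have written" lines. The following is a small sketch that assumes the message wording shown in the example above, which may differ between Grade2 versions. The function name written_files is hypothetical.

```python
import re

# Matches lines such as "Have written ideal coordinates to PDB-format file: VIA.xyz.pdb"
WRITTEN = re.compile(r"^\s*Have written (?P<what>.+?):\s+(?P<path>\S+)\s*$")


def written_files(terminal_output: str) -> dict:
    """Map each output file name to the description printed alongside it."""
    files = {}
    for line in terminal_output.splitlines():
        match = WRITTEN.match(line)
        if match:
            # Collapse runs of spaces used for column alignment.
            files[match.group("path")] = " ".join(match.group("what").split())
    return files


# Lines copied from the example terminal output above.
example_output = """\
Have written CIF-format restraint dictionary to:   VIA.restraints.cif
Have written ideal coordinates to PDB-format file: VIA.xyz.pdb
Have written ideal coordinates to SDF-format file: VIA.xyz.sdf
Have written ideal coordinates in  MOL2-format to: VIA.xyz.mol2
Have written schematic 2D diagram SVG-format file: VIA.diagram.svg
Have written 2D diagram & atom_id labels to file:  VIA.diagram.atom_labels.svg
Normal termination (7 secs)
"""

for path, description in written_files(example_output).items():
    print(f"{path}: {description}")
```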

grade2 follows standard Unix (and BUSTER) practice with normal output being written to STDOUT and errors to STDERR. This means that redirection or pipe/tee can be used to capture the output to a file (see How do I save terminal output to a file? for a guide to the many ways to do this).
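
For scripted use, the same separation of STDOUT and STDERR can be kept from Python. The sketch below uses only the standard library. The helper name run_and_log and the log file names in the commented example call are hypothetical, and the call itself assumes grade2 is installed and on the PATH.

```python
import subprocess


def run_and_log(command, stdout_path, stderr_path):
    """Run a command, saving STDOUT and STDERR to separate files.

    Returns the exit status of the command.
    """
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(command, stdout=out, stderr=err, text=True)
    return completed.returncode


# Example call (hypothetical log file names; requires grade2 on the PATH):
# status = run_and_log(["grade2", "--PDB_ligand", "VIA"],
#                      "VIA.grade2.log", "VIA.grade2.err")
```

Keeping the two streams in separate files makes it easy to see whether anything was reported as an error, since the error file would then be non-empty.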

CIF-format restraint dictionary

The CIF-format restraint dictionary file is the principal output of Grade2. The file lists the restraints generated as well as important run-related information. The CIF-format restraint dictionary produced by Grade2 can be used with BUSTER refine, Rhofit, Buster-report and the EditREFMAC restraint editor. In addition, it can be used with Coot and should work with other third-party refinement programs. Please let us know of any compatibility issues you find.

The CIF-format restraint dictionary standard used by Grade2 is currently rather loosely set by what is understood by REFMAC and Coot, and has many items not defined in the official PDBx/mmCIF Dictionary. Grade2-specific extensions are stored as data categories with names starting _gphl_, for instance _gphl_chem_comp_info. The command-line option --no_extra can be used to turn off Grade2-specific CIF categories and items.
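
To see which Grade2-specific categories a dictionary contains, the item names can simply be scanned for the _gphl_ prefix. The sketch below is a plain text scan rather than a full CIF parser. The function name is hypothetical, and the input is a cut-down fragment using category and item names that appear in this chapter. Note that it only finds whole _gphl_ categories, not Grade2-specific items added to standard categories.

```python
import re


def gphl_categories(cif_text: str) -> list:
    """Return the sorted names of categories whose name starts with _gphl_."""
    pattern = re.compile(r"^\s*(_gphl_[A-Za-z0-9_]+)\.", flags=re.MULTILINE)
    return sorted({match.group(1) for match in pattern.finditer(cif_text)})


# Cut-down fragment for illustration only (columns omitted).
fragment = """\
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
VIA C18
loop_
_gphl_chem_comp_database.comp_id
_gphl_chem_comp_database.id
_gphl_chem_comp_database.database
VIA VIA PDB
"""

print(gphl_categories(fragment))
```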

Atom information

A Grade2 CIF-format restraint dictionary will always contain a chem_comp_atom category that defines the atoms of the ligand. Take, for example, two atoms extracted from the restraint dictionary for the charged version of PDB component VIA (sildenafil):

loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
_chem_comp_atom.charge
_chem_comp_atom.x
_chem_comp_atom.y
_chem_comp_atom.z
VIA  C18 C  CH2  0.091 0  1.888  2.510 -4.235
VIA  N17 N  NT1 -0.335 1  0.943  3.489 -3.615

Note that some items in the category do not follow the official PDBx/mmCIF Dictionary chem_comp_atom definitions.

Each atom in the molecule is identified by an atom ID (aka atom name) assigned in the chem_comp_atom.atom_id item. Atom IDs must be unique within a particular ligand and are used to define the atoms in each of the restraints. Grade2 has a number of options to set atom IDs, as described in the Atom Naming chapter.

The chem_comp_atom.type_symbol item provides the atom's element symbol in upper case.

The chem_comp_atom.type_energy item is a widely-used extension to the official PDBx/mmCIF Dictionary giving an atom type as defined in the CCP4 suite file $CCP4/lib/data/monomers/ener_lib.cif. The type_energy is used by BUSTER to set up non-bonded contacts, allowing atoms that can form hydrogen bonds to approach each other more closely than normal non-bonded contacts.

Note that formal atomic charges are given as item _chem_comp_atom.charge, as these are important in unambiguously defining the chemistry of a ligand. In the example above, nitrogen atom N17 of VIA is assigned a formal charge of +1 after the piperazine is protonated (see the VIA example in the Charging chapter). The now-obsolete program Grade fails to provide formal atomic charge information, making it difficult to use Grade restraint dictionaries as an input to Grade2; see the FAQ on using Grade input.
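
As an illustration of how the chem_comp_atom items can be read, here is a minimal sketch that loads the two example rows and reports the atoms carrying a non-zero formal charge. The read_loop helper is hypothetical and handles only simple loops with one row per line. For real dictionaries a dedicated CIF library is a safer choice.

```python
import shlex

# The two-atom excerpt shown above.
ATOM_LOOP = """\
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
_chem_comp_atom.charge
_chem_comp_atom.x
_chem_comp_atom.y
_chem_comp_atom.z
VIA  C18 C  CH2  0.091 0  1.888  2.510 -4.235
VIA  N17 N  NT1 -0.335 1  0.943  3.489 -3.615
"""


def read_loop(text):
    """Read one simple CIF loop into a list of dicts keyed by item name."""
    tags, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            tags.append(line)
        else:
            # shlex.split keeps quoted values (such as names) together.
            rows.append(dict(zip(tags, shlex.split(line))))
    return rows


atoms = read_loop(ATOM_LOOP)
charged = {
    atom["_chem_comp_atom.atom_id"]: int(atom["_chem_comp_atom.charge"])
    for atom in atoms
    if int(atom["_chem_comp_atom.charge"]) != 0
}
print(charged)
```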

Partial atomic charges are given in addition to the formal charges. The partial charges are calculated by the Gasteiger and Marsili (1980) method as implemented in the RDKit ComputeGasteigerCharges function. Please note that there are many ways of calculating partial charges and so care needs to be taken that they are suitable before using them for any given application.

Cartesian coordinates for each atom are given in the CIF items chem_comp_atom.x, chem_comp_atom.y, and chem_comp_atom.z. These CIF items do not comply with the PDBx/mmCIF Dictionary chem_comp_atom standard but are widely used. The Cartesian coordinates are "ideal", as described below. In the Coot program, the conformation described by the coordinates can be retrieved by first importing the CIF dictionary, then by using either the File ... Get Monomer option or the Calculate ... Modelling >>> Monomer from Dictionary option. The ideal coordinates are also used by the Rhofit ligand fitting program.
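
Because the coordinates are ordinary Cartesian values in Å, distances can be computed from them directly. The short sketch below takes the coordinates of C18 and N17 from the two-atom excerpt above and assumes that the two atoms are bonded to each other, as the piperazine ring torsion listing later in this chapter suggests.

```python
import math

# Ideal coordinates (Å) copied from the chem_comp_atom excerpt above.
c18 = (1.888, 2.510, -4.235)
n17 = (0.943, 3.489, -3.615)

# Euclidean distance between the two atom positions.
distance = math.dist(c18, n17)
print(f"C18-N17 distance: {distance:.3f} Å")
```

The result is a little under 1.5 Å, a plausible length for a C-N single bond.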

Bond restraints

A Grade2 CIF-format restraint dictionary will contain a chem_comp_bond category giving information about each of the bonds that join the atoms of the ligand (except for ligands that are monoatomic). For example, the following defines the first two bonds extracted from the restraint dictionary for the charged version of PDB component VIA (sildenafil):

_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.aromatic
_chem_comp_bond.value_dist_nucleus
_chem_comp_bond.value_dist_nucleus_esd
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
_chem_comp_bond.source_value_dist_nucleus
_chem_comp_bond.source_value_dist_nucleus_esd
_chem_comp_bond.source_value_dist
_chem_comp_bond.source_value_dist_esd
VIA C34  C33 single n 1.513 0.033 1.513 0.033 Mogul_mean_1207_hits     Mogul_sd Mogul_mean_1207_hits  Mogul_sd
VIA C34 H341 single n 1.093 0.020 0.979 0.015  MMFF94s_equilibrium default_to_H               ecloud    ecloud
  • chem_comp_bond.comp_id lists the chemical component ID (aka residue name) of the ligand, in this case VIA.

  • chem_comp_bond.atom_id_1, & .atom_id_2 Each bond joins two atoms, identified by their atom IDs in items chem_comp_bond.atom_id_1 and chem_comp_bond.atom_id_2. The atom IDs must appear in the preceding chem_comp_atom table.

  • chem_comp_bond.type gives the bond order and is one of single, double or triple. Note that the bonds in aromatic groups are assigned alternating single and double types by the RDKit Kekulize method. The bond type is also used for 2D schematic pictures and will be displayed in Coot.

  • chem_comp_bond.aromatic the aromatic item is set to y or n depending on whether the bond is assigned to be aromatic by RDKit. The RDKit book section on aromaticity provides a description of the approach taken and starts with an instructive paragraph:

    "Aromaticity is one of those unpleasant topics that is simultaneously simple and impossibly complicated. Since neither experimental nor theoretical chemists can agree with each other about a definition, it’s necessary to pick something arbitrary and stick to it. This is the approach taken in the RDKit."

    For this reason it is important for downstream procedures not to rely on the aromatic item. It is likely to vary between different programs and there is no "correct" definition. The aromatic item is provided for consistency with older programs and it would have been better if it had not been adopted in the past.

  • chem_comp_bond.value_dist_nucleus, & value_dist The items value_dist_nucleus and value_dist both give an ideal length for the bond in Å. For bonds that do not involve a hydrogen atom they will have an identical value. For bonds to hydrogen atoms, the value_dist_nucleus gives the normal bond distance between the nuclei of the two atoms, whereas the value_dist gives the shorter bond length that is suitable for X-ray refinement (Stewart et al., 1965).

    The value_dist_nucleus is used to define a harmonic restraint for the bonds, as \({b_{ideal}}\) in the formula below.

    \[V_{bond} = W_{bond} \sum_{bonds} \left( \frac{b - b_{ideal}}{\sigma} \right)^2\]

    where \({b}\) is the actual bond length and \({\sigma}\) the estimated standard deviation (the next item).

  • chem_comp_bond.value_dist_nucleus_esd, & .value_dist_esd The items value_dist_nucleus_esd and value_dist_esd provide the "estimated standard deviation" of value_dist_nucleus and value_dist in Å. This parameter is also known as the "standard uncertainty" or the "sigma" (\({\sigma}\) in the equation above). Values are taken from the standard deviation of the bond length distribution found from small molecule crystal structures, when these are available.

  • chem_comp_bond.source_value_dist_nucleus, .source_value_dist_nucleus_esd, .source_value_dist, & .source_value_dist_esd These source items provide information about the source of each of the parameters defining the restraint. These items have been introduced by the Grade2 program as data provenance is important in any scientific study, and we think it is important to record where restraints come from. If the extra items are a problem, then use the command-line option --no_extra to turn off Grade2-specific CIF categories and items.

    When available, Grade2 will base ideal bond lengths on values from the Mogul tool analysis of CSD small molecule X-ray crystal structures. In these cases the source item will start with Mogul. For example, Mogul_mean_1207_hits shows that a parameter is taken from the mean of a distribution of 1207 values from relevant CSD structures.

    If Mogul cannot be used to obtain a value, for instance for all parameters involving hydrogen atoms, then a value will be obtained from a force field. For example, a source MMFF94s_equilibrium shows that the value is an equilibrium length obtained from the RDKit implementation of the MMFF force field (Tosco et al., 2014).
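
The harmonic bond term given in the formula above can be evaluated directly. The sketch below is only an illustration of that formula: the ideal values and sigmas are taken from the two example rows, the observed bond lengths are invented, and a weight of 1.0 is an assumption rather than the value used by any particular refinement program.

```python
def bond_penalty(bonds, weight=1.0):
    """Evaluate W_bond * sum(((b - b_ideal) / sigma) ** 2) over the bonds.

    bonds is an iterable of (observed, ideal, sigma) tuples in Å.
    """
    return weight * sum(((b - ideal) / sigma) ** 2 for b, ideal, sigma in bonds)


# Ideal values and sigmas come from the example rows; observed lengths are invented.
bonds = [
    (1.546, 1.513, 0.033),  # C34-C33, one sigma longer than ideal
    (1.093, 1.093, 0.020),  # C34-H341, exactly at the ideal nuclear distance
]
print(round(bond_penalty(bonds), 3))
```

A bond that deviates from its ideal length by one sigma contributes 1 to the sum, so a tighter sigma makes the same deviation in Å more costly.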

Bond angle restraints

The chem_comp_angle CIF category defines restraints on bond angles. For example, the following shows two bond angles extracted from the restraint dictionary for the charged version of PDB component VIA (sildenafil):

_chem_comp_angle.comp_id
_chem_comp_angle.atom_id_1
_chem_comp_angle.atom_id_2
_chem_comp_angle.atom_id_3
_chem_comp_angle.value_angle
_chem_comp_angle.value_angle_esd
_chem_comp_angle.source_value_angle
_chem_comp_angle.source_value_angle_esd
VIA H342 C34 H343 108.6 3.0 MMFF94s_optimised_coords default
VIA  C34 C33  C32 112.6 2.8      Mogul_mean_528_hits Mogul_sd
  • chem_comp_angle.comp_id lists the chemical component ID (aka residue name) of the ligand, in this case VIA.

  • chem_comp_angle.atom_id_1, .atom_id_2, & .atom_id_3 give the atom IDs for the 3 atoms that form the bond angle.

  • chem_comp_angle.value_angle is the ideal or target angle of the restraint in degrees.

  • chem_comp_angle.value_angle_esd is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in degrees.

  • chem_comp_angle.source_value_angle, & .source_value_angle_esd These source items provide information about the source of each of the parameters defining the restraint. Please see the chem_comp_bond.source* documentation above for details. When Mogul information is unavailable for a particular bond angle, the ideal angle is based on a force field. For instance, the bond angle above that involves hydrogen atoms has the source MMFF94s_optimised_coords. This means that the "ideal" angle is the angle found after the ligand has been energy minimised with the RDKit implementation of the MMFF force field (Tosco et al., 2014).
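
The source items lend themselves to a quick provenance summary, for instance counting how many parameters come from Mogul and how many from the force field. The sketch below classifies source strings by prefix and extracts the hit count where one is present. The grouping rules are an assumption based only on the example strings in this chapter, not a specification of every source value Grade2 can write.

```python
import re
from collections import Counter


def classify_source(source: str):
    """Return (origin, number_of_hits) for a Grade2 source string."""
    hits = re.search(r"_(\d+)_hits$", source)
    n_hits = int(hits.group(1)) if hits else None
    if source.startswith("Mogul"):
        origin = "Mogul"
    elif "MMFF" in source:
        origin = "MMFF"
    else:
        origin = "other"
    return origin, n_hits


# Source strings copied from the bond and angle examples in this chapter.
sources = [
    "Mogul_mean_1207_hits",
    "Mogul_sd",
    "MMFF94s_equilibrium",
    "MMFF94s_optimised_coords",
    "Mogul_mean_528_hits",
    "default",
]
counts = Counter(classify_source(source)[0] for source in sources)
print(dict(counts))
```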

Plane restraints

Planar restraints are specified in the chem_comp_plane_atom category. A separate line is used for each atom involved in the plane (which can involve many atoms). For example, here are 2 of the 18 planes that Grade2 produces for the charged version of PDB component VIA (sildenafil):

_chem_comp_plane_atom.comp_id
_chem_comp_plane_atom.plane_id
_chem_comp_plane_atom.atom_id
_chem_comp_plane_atom.dist_esd
_chem_comp_plane_atom.source
VIA  atom-C4  C9  0.02              Mogul_sum_angles_362
VIA  atom-C4  C4  0.02              Mogul_sum_angles_362
VIA  atom-C4  C5  0.02              Mogul_sum_angles_362
VIA  atom-C4  O3  0.02              Mogul_sum_angles_362
VIA   ring5A C30 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A N29 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A N28 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A C24 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A C25 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
  • chem_comp_plane_atom.comp_id lists the chemical component ID (aka residue name) of the ligand, in this case VIA.

  • chem_comp_plane_atom.plane_id provides an ID for the plane in question. For the two planes in the example the plane IDs are atom-C4 and ring5A. Grade2 uses descriptive plane IDs where possible. In the example, atom-C4 is a plane that holds atom C4 flat together with the three atoms to which it is bonded (C9, C5 and O3). Plane ring5A is a plane that holds a five-membered ring in the ligand flat (in this case the pyrazole ring in VIA). If there is a second five-membered ring in the ligand that is flat then that ring would be assigned the ID ring5B.

    A plane ID that starts 2fold- is used for torsion angles that Grade2 assigns to be flat but where there is no preference as to whether the torsion is predominantly 0º or 180º. Note that such a group will be held planar by a plane restraint rather than a 2-fold torsion angle, as some programs, such as Coot, do not activate torsion angle restraints by default (and there can be differences in handling non-bonded contacts between the atoms involved).

    A plane ID that starts trans- or cis- is used for a plane restraint on a torsion that Mogul indicates has a strong preference to be around 180º or 0º respectively. The plane restraint itself imposes no preference for either conformation, but a corresponding 1-fold torsion is defined that is normally inactive.

    If you ever edit a restraint dictionary to introduce your own plane definitions, it should be noted that some programs have an 8-character limit on the plane_id.

  • chem_comp_plane_atom.atom_id lists the atom ID for an atom within the plane (that is defined on multiple lines).

  • chem_comp_plane_atom.dist_esd is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in Ångstroms. The plane restraint provides a harmonic penalty forcing atoms towards the mean plane formed by the atoms. The dist_esd determines the stiffness of the restraint. The previous Grade program and many other restraint generation tools use a sigma of 0.02Å for all planes. Grade2 goes beyond this, assigning values of sigma depending on the tightness of distributions from Mogul + custom ring analysis; please see the Treatment of Planar Groups chapter for more information.

    In the example above, the plane ID atom-C4 that holds atom C4 is assigned the default sigma of 0.02Å. In contrast, the plane ring5A that holds the pyrazole ring flat is assigned a tighter sigma of 0.005Å.

  • chem_comp_plane_atom.source This is a Grade2 extra item that provides some source information as to why the plane was assigned. If the extra items are a problem, then use the command-line option --no_extra to turn off Grade2-specific CIF categories and items.

    In the example above, the plane ID atom-C4 has a source Mogul_sum_angles_362. Atom C4 is assigned to be planar because it is bonded to 3 other atoms and the sum of ideal angles for the 3 bond angle restraints with C4 as a central atom is 362º. All three bond angle restraints are from Mogul distributions. Currently, if the sum of angles from Mogul is above 356º then a plane restraint is added, with the default sigma of 0.02Å.

    For the pyrazole ring plane ID ring5A the source is Mogul+_ring_tors_rmsd_0.6_56_hits. The plane is assigned from Mogul + custom ring analysis from 56 Mogul hits. The ring torsions of the hits have a root mean squared deviation from zero of 0.6º. This means that the rings are very flat. The sigma for the plane restraint is set at the limit of 0.005Å.

    Planes holding atoms bonded to one or more hydrogen atoms are normally set from the RDKit implementation of the MMFF force field (Tosco et al., 2014). Such a plane will have a source item like MMFF_out_of_plane_koop_0.015, meaning that the atom is held planar on the basis of the MMFF out-of-plane term.
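
Since each plane is spread over several rows, it is often convenient to regroup the rows by plane_id. The sketch below does this for the example rows above and also flags any plane ID longer than 8 characters, the limit mentioned for some programs. The function name is hypothetical, and the parsing assumes whitespace-separated values with no quoting, as in the example.

```python
from collections import defaultdict

# Rows copied from the chem_comp_plane_atom example above.
PLANE_ROWS = """\
VIA  atom-C4  C9  0.02              Mogul_sum_angles_362
VIA  atom-C4  C4  0.02              Mogul_sum_angles_362
VIA  atom-C4  C5  0.02              Mogul_sum_angles_362
VIA  atom-C4  O3  0.02              Mogul_sum_angles_362
VIA   ring5A C30 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A N29 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A N28 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A C24 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA   ring5A C25 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
"""


def group_planes(rows_text):
    """Collect the atoms and sigma of each plane, keyed by plane_id."""
    planes = defaultdict(lambda: {"atoms": [], "esd": None})
    for line in rows_text.splitlines():
        if not line.strip():
            continue
        _comp_id, plane_id, atom_id, esd, _source = line.split()
        planes[plane_id]["atoms"].append(atom_id)
        planes[plane_id]["esd"] = float(esd)
    return dict(planes)


planes = group_planes(PLANE_ROWS)
for plane_id, info in planes.items():
    note = " (ID longer than 8 characters)" if len(plane_id) > 8 else ""
    print(f"{plane_id}: {len(info['atoms'])} atoms, sigma {info['esd']} Å{note}")
```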

Torsion angle restraints

Restraints on torsion angles are specified by the chem_comp_tor category.

As well as defining restraints for refinement, the chem_comp_tor records are used by Coot in its "Edit Chi Angles" side-menu option to allow adjustment of a ligand's conformation. For subtle changes in conformation, "Edit Chi Angles" is sometimes more useful than dragging the ligand about in real space refine; for example, it allows a saturated six-membered ring to be positioned without disrupting its chair conformation.

Screen grab of Coot "edit Chi angles" being used on VIA

Here are some of the 33 torsion restraints that Grade2 produces for the charged version of PDB component VIA (sildenafil):

_chem_comp_tor.comp_id
_chem_comp_tor.id
_chem_comp_tor.atom_id_1
_chem_comp_tor.atom_id_2
_chem_comp_tor.atom_id_3
_chem_comp_tor.atom_id_4
_chem_comp_tor.value_angle
_chem_comp_tor.value_angle_esd
_chem_comp_tor.period
_chem_comp_tor.source
VIA CONST_ring6B-6   C9  C8  C7   C6   0.0 1000000.0  0                                            planar_ring
VIA  puck_ring6C-1  C19 C18 N17  C16 -60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA  puck_ring6C-2  C18 N17 C16  C15  60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA  puck_ring6C-3  N17 C16 C15  N14 -60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA  puck_ring6C-4  C16 C15 N14  C19  60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA  puck_ring6C-5  C15 N14 C19  C18 -60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA  puck_ring6C-6  N14 C19 C18  N17  60.0      12.3  3                   Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA       3fold-23 H341 C34 C33  C32 180.0      15.0  3                     from_MMFF94s_optimised_coordinates
VIA       3fold-24  C34 C33 C32  C30 180.0      12.3  3               Mogul_3fold_92.2%_within_10degs_103_hits
VIA        free-25  C33 C32 C30  C25   0.0 1000000.0 10                                   unrestrained_default
VIA        free-26  C24 N28 C31 H311   0.0 1000000.0 10                                   unrestrained_default
VIA       2fold-27  N22 C21  C9   C8 180.0 1000000.0  2  Mogul_plane_rms_out_of_plane_torsion29.4_degs_46_hits
  • chem_comp_tor.comp_id lists the chemical component ID (aka residue name) of the ligand, in this case VIA.

  • chem_comp_tor.id provides an ID for the torsion angle.

    A torsion ID that starts CONST_ is used for torsions within planar rings. No active restraint is placed on such torsions.

    For six-membered saturated rings, such as the piperazine group in VIA, the chem_comp_tor.id of the six ring torsion angles will start with puck_ring6. These are 3-fold torsion restraints with minima at +60º and -60º (the 180º minimum is irrelevant because of the ring closure).

    If Grade2 judges that a torsion should have an active 3-fold torsion restraint, the chem_comp_tor.id will start 3fold (and chem_comp_tor.period will be set to 3).

    Grade2 does not impose active 2-fold or 1-fold torsion restraints, instead using a plane restraint to hold the atoms planar. In such a case the chem_comp_plane_atom.plane_id and chem_comp_tor.id will be consistent. 2-fold torsions will have an ID starting 2fold-. 1-fold torsions will have an ID starting trans- or cis-. In all these cases, the torsion restraint is inactivated by setting chem_comp_tor.value_angle_esd to a very large value (1,000,000º).

  • chem_comp_tor.atom_id_1, .atom_id_2, .atom_id_3, & .atom_id_4 give the atom IDs for the 4 atoms that form the torsion angle.

  • chem_comp_tor.value_angle is the ideal or target angle of the restraint in degrees. For 3-fold torsion angles there will be two additional minima at \({\pm}\) 120º from this value.

  • chem_comp_tor.value_angle_esd is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in degrees. Inactive restraints are produced by setting the sigma to a very large value (1,000,000º).

  • chem_comp_tor.period is the periodicity of the restraint, that is the number of minima in the 360º range of the angle.

  • chem_comp_tor.source provides information about the source of the restraint. For example, Mogul+_pucker_tors_rmsd_57.1_53_hits says that a six-membered ring is given restraints to maintain its pucker at \({\pm}\) 60º, as the 53 CSD hits from Mogul+ analysis have a root mean square deviation from 0º of 57.1º.
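
The relationship between value_angle, period and the set of minima can be made concrete with a short sketch. It lists the equivalent minima of a periodic torsion restraint in the range (-180º, 180º] and treats a restraint as inactive when its sigma is very large. The function names and the threshold used to decide "very large" are assumptions for illustration, as is the handling of a period of 0 (used in the CONST_ rows) as a single target value.

```python
def torsion_minima(value_angle, period):
    """Return the minima (degrees, in (-180, 180]) of a periodic torsion restraint."""
    if period <= 0:
        # Assumption: a period of 0 is treated as a single target value.
        return [value_angle]
    step = 360.0 / period
    minima = []
    for k in range(period):
        # Wrap into [-180, 180) and then move -180 to +180.
        angle = (value_angle + k * step + 180.0) % 360.0 - 180.0
        if angle == -180.0:
            angle = 180.0
        minima.append(angle)
    return sorted(minima)


def is_active(value_angle_esd, inactive_threshold=1.0e5):
    """Assumed rule: a sigma at or above the threshold marks an inactive restraint."""
    return value_angle_esd < inactive_threshold


# Values taken from the 3fold-24 and 2fold-27 rows in the example above.
print(torsion_minima(180.0, 3), is_active(12.3))
print(torsion_minima(180.0, 2), is_active(1000000.0))
```

For the 3-fold case this gives minima at -60º, 60º and 180º, which is the same set obtained for the puck_ring6 torsions with targets of \({\pm}\) 60º.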

Chiral centre restraints

Restraints controlling the configuration of chiral centres within a ligand are specified by the chem_comp_chir category. It should be noted that this category differs markedly from the official chem_comp_chir and reflects the de facto standard used by Libcheck and succeeding programs.

BUSTER reads chem_comp_chir but then controls chirality using a restraint on the improper torsion angle rather than a chiral volume as explained in the GELLY documentation Appendix E: CHIRAL Restraints in gelly.

Here are the 4 chem_comp_chir records that Grade2 produces for the PDB component RIB:

_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
_chem_comp_chir.source
RIB chir_01 C4 C5 O4 C3 negativ rdkit
RIB chir_02 C3 C4 O3 C2 negativ rdkit
RIB chir_03 C2 C3 O2 C1 negativ rdkit
RIB chir_04 C1 O4 C2 O1 negativ rdkit
  • chem_comp_chir.comp_id lists the chemical component ID (aka residue name) of the ligand, in this case RIB.

  • chem_comp_chir.id provides an ID for the chiral centre. Grade2 uses IDs starting chir_01.

  • chem_comp_chir.atom_id_centre provides the atom ID for the chiral atom.

  • chem_comp_chir.atom_id_1, .atom_id_2, & .atom_id_3 provide the atom IDs of three atoms that are bonded to the chiral atom.

  • chem_comp_chir.volume_sign specifies the chiral configuration of the centre. Possible values are positiv, negativ and both. When there is a chiral centre whose configuration has not been set in the input, for example a SMILES string that lacks stereo specification, the volume_sign is set to both.

  • chem_comp_chir.source provides information about the source of the assignment.
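
The meaning of positiv and negativ can be illustrated with the signed chiral volume, computed as the triple product of the vectors from the central atom to atoms 1, 2 and 3. The sketch below uses this common definition and assumes it matches the sign convention of the dictionary. The coordinates are invented, and, as noted above, BUSTER itself restrains an improper torsion angle rather than this volume.

```python
def chiral_volume(centre, atom_1, atom_2, atom_3):
    """Signed volume v1 . (v2 x v3), with v_i the vector from the centre to atom i."""
    v1 = [a - c for a, c in zip(atom_1, centre)]
    v2 = [a - c for a, c in zip(atom_2, centre)]
    v3 = [a - c for a, c in zip(atom_3, centre)]
    cross = (
        v2[1] * v3[2] - v2[2] * v3[1],
        v2[2] * v3[0] - v2[0] * v3[2],
        v2[0] * v3[1] - v2[1] * v3[0],
    )
    return sum(x * y for x, y in zip(v1, cross))


def volume_sign(centre, atom_1, atom_2, atom_3):
    """Return 'positiv' or 'negativ' for a non-planar centre."""
    volume = chiral_volume(centre, atom_1, atom_2, atom_3)
    if volume == 0:
        raise ValueError("centre and its three neighbours are coplanar")
    return "positiv" if volume > 0 else "negativ"


# Invented coordinates: three neighbours along the x, y and z axes.
centre = (0.0, 0.0, 0.0)
print(volume_sign(centre, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)))
print(volume_sign(centre, (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)))
```

Swapping any two of the three neighbouring atoms flips the sign, which is why the order of atom_id_1, atom_id_2 and atom_id_3 matters.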

Systematic names

If available, the output CIF-format restraint dictionary will contain information as to the systematic name of the ligand. The pdbx_chem_comp_identifier data category will be used. Systematic names for PDB ligands are automatically obtained from the input PDB chemical component definition (if the ligand is charged by Grade2 then " (CHARGED)" will be added). The --pubchem_names option can be used to do an online lookup of the systematic name for ligands that occur in PubChem. The --systematic option allows the systematic name to be manually set. Currently, we are not aware of any open source systematic chemical name programs, but commercial programs to produce systematic names are available from ACD/Labs, OpenEye and Chemaxon.

Database Information

From Grade2 version 1.3.0, information about entries for the ligand in chemical databases is included in the output CIF restraint dictionary. This information is held in the CIF data category gphl_chem_comp_database. For example, for PDB chemical component VIA the output restraint dictionary will automatically have the information:

 #
loop_
_gphl_chem_comp_database.comp_id
_gphl_chem_comp_database.id
_gphl_chem_comp_database.database
_gphl_chem_comp_database.url
_gphl_chem_comp_database.details
VIA VIA PDB                                   https://www.rcsb.org/ligand/VIA "RCSB PDB"
VIA VIA PDB https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA       PDBe
#

For PDB chemical components Grade2 automatically provides details to access RCSB PDB and PDBe pages. The --pubchem_names option also automatically sets appropriate gphl_chem_comp_database records if a match is found.

Users can provide information about in-house databases using the --database_id option.

"Ideal" coordinate files

At the end of the restraint generation process, a geometry optimization of the coordinates of the molecule is made with the gelly geometry-only minimizer. This produces a set of coordinates where the bond lengths, bond angles and other terms are adjusted to be as close as possible to the "ideal" values. These coordinates are then used to output files in a variety of formats. Please note that the "ideal" coordinates can be trapped at a local minimum.

PDB-format

PDB-format is a widely used exchange file format for proteins. Grade2 PDB-format ideal coordinates are written using RDKit routines, and so have CONECT records giving the bond order that are recognized by some molecular graphics programs (such as Jmol).

SDF-format

Please note that the SDF-format file will use Kekulé bonding (where aromatic bonds are marked with alternating single and double bonds) whereas the MOL2-format uses CSD conventions for aromatic bonds.

MOL2-format

The MOL2-format file uses aromatic bonding following the CSD convention. This makes it suitable for running additional Mogul geometry analysis.

Schematic 2D molecular diagrams

As well as writing a CIF restraint dictionary and "ideal" coordinates files Grade2 will produce schematic 2D molecular diagrams that can be useful. For instance, for the PDB ligand CFF caffeine, running grade2 --PDB_ligand CFF will write two SVG files that can be visualized using a web-browser (such as Chrome):

  1. CFF.diagram.svg

    CFF.diagram.svg converted to png
  2. CFF.diagram.atom_labels.svg the diagram with atoms labelled

    CFF.diagram.atom_labels.svg converted to png