Grade2 outputs¶
Running the grade2
command, as described in the Usage and
Examples chapters, will result in outputs both to file(s) and
to the terminal. This chapter gives a guide as to what to expect.
Terminal Output¶
grade2
writes out information about the restraint generation process
as it runs to the terminal. This output is intended to be intelligible
and to give an indication that the restraints generation process
is proceeding normally for the ligand in question.
For example, generating a restraint dictionary for
the PDB chemical component ID VIA
(Sildenafil) running
$ grade2 --PDB_ligand VIA
produces an initial output giving copyright, authors and program version information, following the normal BUSTER package convention:
$ grade2 --PDB_ligand VIA
set CSDHOME=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022 from $BDG_TOOL_MOGUL=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022/bin/mogul
############################################################################
## [grade2] ligand restraint dictionary generation
############################################################################
Copyright (C) 2019-2022 by Global Phasing Limited
All rights reserved.
This software is proprietary to and embodies the confidential
technology of Global Phasing Limited (GPhL). Possession, use,
duplication or dissemination of the software is authorised
only pursuant to a valid written licence from GPhL.
Version: 1.1.0 <2022-02-01>
Authors: Smart OS, Sharff A, Holstein J, Womack TO, Flensburg C,
Keller P, Paciorek W, Vonrhein C and Bricogne G
-----------------------------------------------------------------------------
This is followed by output lines stating where the PDB chemical components definition for VIA was collected from and giving web URLs for further information about VIA (https://www.rcsb.org/ligand/VIA and https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA).
Collected PDB chemical components definition for PDB id VIA
from: ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem_v2/V/VIA/VIA.cif
Molecule name: "5-{2-ethoxy-5-[(4-methylpiperazin-1-yl)sulfonyl]phenyl}-1-methyl-3-propyl-1H,6H,7H-pyrazolo[4,3-D]pyrimidin-7-one"
For more information about "VIA" see:
---- https://www.rcsb.org/ligand/VIA
---- https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA
This is followed by information that the VIA
molecule has a nitrogen atom
that will normally be charged at neutral pH and a proton has been added
to the molecule:
WARNING: Charging groups likely to be charged at neutral pH.
WARNING: ---- If you do not want this, rerun with the option: -N, --no_charging
charging trialkylamine to trialkylammonium
add new proton HN17 onto atom N17 (existing hydrogen atom_ids: None)
[16:42:04] WARNING: Proton(s) added/removed
If this charging is not wanted, then use the --no_charging command-line option. For more details on the charging process see the Charging chapter.
A check is then made that the RDKit molecule generated for restraint production has an InChI that matches that from the input file (if this is available). InChI is short for International Chemical Identifier and provides a way to quickly check that the stereochemistry of molecules matches. In this case, the output indicates that there is a match other than for the protonation layer, as would be expected given the charging:
RDKit molecule generated has the same InChIKey as the other than the last protonation character.
---- This indicates that the stereochemistry matches other than the change caused by charging.
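The comparison reported above can be illustrated with a minimal Python sketch. It relies on the standard InChIKey layout (14 characters, a hyphen, 10 characters, a hyphen, and a final single character that encodes protonation). The function name is hypothetical and the two keys are placeholders with the right shape, not real InChIKeys; this is not how Grade2 itself is implemented.

```python
def inchikeys_match_ignoring_protonation(key_a: str, key_b: str) -> bool:
    """Return True if two InChIKeys agree in everything except the
    final protonation character."""
    for key in (key_a, key_b):
        # A standard InChIKey is 27 characters: 14 + '-' + 10 + '-' + 1.
        if len(key) != 27 or key[14] != "-" or key[25] != "-":
            raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return key_a[:-1] == key_b[:-1]


# Placeholder keys (NOT real InChIKeys): identical apart from the last character.
neutral_key = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"
charged_key = "AAAAAAAAAAAAAA-BBBBBBBBSA-O"
print(inchikeys_match_ignoring_protonation(neutral_key, charged_key))
```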
For all input sources, a check is made comparing the InChIKey of the RDKit molecule
generated for restraint production with those for known PDB components (from
the wwPDB Chemical Component Dictionary https://www.wwpdb.org/data/ccd ).
In this case, the CHECK produces the expected result: that the molecule
matches component VIA (apart from the charging):
CHECK: Check the molecule's InChiKey against known PDB components:
CHECK: Match to PDB chemical component(s) with a different number of protons:
CHECK: VIA https://www.rcsb.org/ligand/VIA "5-{2-ethoxy-5-[(4-methylpiperazin-1-yl)sulfonyl]phenyl}-1-methyl-3-propyl-1H,6H,7H-pyrazolo[4,3-D]pyrimidin-7-one"
The check is most important when using a molecule from a SMILES string or file input that matches an existing PDB component (see example of this). If there is a match, then it normally makes sense to use the restraint dictionary for the matching PDB component (see FAQ on matching components).
The checks are followed by information about the progress of restraint-generation including the force field used, the Mogul version and the final geometry optimization:
Minimization with MMFF94s reduces energy from 64.88 to 8.35 kcal/mol
Using CCDC Mogul-like geometry analysis.
Mogul version 2021.3.0, CSD version 543, csd-python-api 3.0.9
Geometry Optimize coordinates against restraints using gelly ....
---- gelly: Took 1472 steps, reducing the rms gradient to 0.05
---- gelly: and the rms bond deviation to 0.002 Angstroms.
The final part of the terminal output gives information about the output files produced and suggestions as to commands to view the results:
Have written CIF-format restraint dictionary to: VIA.restraints.cif
Have written ideal coordinates to PDB-format file: VIA.xyz.pdb
Have written ideal coordinates to SDF-format file: VIA.xyz.sdf
Have written ideal coordinates in MOL2-format to: VIA.xyz.mol2
Have written schematic 2D diagram SVG-format file: VIA.diagram.svg
Have written 2D diagram & atom_id labels to file: VIA.diagram.atom_labels.svg
Suggestion: to view/edit the restraints, use one of the commands:
coot -p VIA.xyz.pdb --dict VIA.restraints.cif
EditREFMAC VIA.restraints.cif VIA.xyz.pdb VIA
Normal termination (7 secs)
grade2 follows standard Unix (and BUSTER) practice, with normal output being written
to STDOUT and errors to STDERR. This means that redirection or pipe/tee
can be used to capture the output to a file (see
How do I save terminal output to a file?
for a guide to the many ways to do this).
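For scripted use, the same separation of STDOUT and STDERR can be captured from Python. The sketch below is only an illustration: `run_and_log` is a hypothetical helper, and a stand-in child process is used so that the example runs without grade2 installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(command, stdout_path, stderr_path):
    """Run a command, writing its STDOUT and STDERR to separate files."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(command, stdout=out, stderr=err)
    return completed.returncode


# Stand-in child process so that the sketch runs without grade2 installed.
# With grade2 on the PATH the command would be ["grade2", "--PDB_ligand", "VIA"].
demo_command = [
    sys.executable,
    "-c",
    "import sys; print('normal output'); print('a problem', file=sys.stderr)",
]

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "run.stdout.log"
    err_file = Path(tmp) / "run.stderr.log"
    return_code = run_and_log(demo_command, out_file, err_file)
    captured_stdout = out_file.read_text().strip()
    captured_stderr = err_file.read_text().strip()

print(return_code, captured_stdout, captured_stderr, sep=" | ")
```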
CIF-format restraint dictionary¶
The CIF-format restraint dictionary file is the principal output of Grade2. The file lists the restraints generated as well as the important run-related information. The CIF-format restraint dictionary produced by Grade2 can be used with BUSTER refine, Rhofit, Buster-report and the EditREFMAC restraint editor. In addition, it can be used with Coot and should work with other third-party refinement programs. Please let us know of any compatibility issues you find.
The CIF-format restraint dictionary standard used by Grade2 is currently rather
loosely set by what is understood by REFMAC and Coot,
and has many items not set in the official PDBx/mmCIF Dictionary.
Grade2-specific extensions are stored as data categories with names starting _gphl_,
for instance _gphl_chem_comp_info. The
command-line option --no_extra can be used to turn
off Grade2-specific CIF categories and items.
Atom information¶
A Grade2 CIF-format restraint dictionary will always contain
a chem_comp_atom
category that defines the atoms of the ligand.
Take, for example, two atoms extracted from the restraint dictionary for
the charged-version of PDB component VIA
(sildenafil):
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
_chem_comp_atom.charge
_chem_comp_atom.x
_chem_comp_atom.y
_chem_comp_atom.z
VIA C18 C CH2 0.091 0 1.888 2.510 -4.235
VIA N17 N NT1 -0.335 1 0.943 3.489 -3.615
Note that some items in the category do not follow the official PDBx/mmCIF Dictionary chem_comp_atom definitions.
Each atom in the molecule is identified by an atom ID (aka atom name)
assigned in the chem_comp_atom.atom_id
item. Atom IDs must be unique
within a particular ligand and are used to define the atoms in each of the
restraints. Grade2 has a number of options to set atom IDs, as described
in the Atom Naming chapter.
The chem_comp_atom.type_symbol item provides an upper case version of the atom's element.
The chem_comp_atom.type_energy item is a widely-used extension to
the official PDBx/mmCIF Dictionary giving an atom type as defined
in the CCP4 suite file $CCP4/lib/data/monomers/ener_lib.cif.
The type_energy is used by BUSTER to set up non-bonded contacts,
allowing atoms that can form hydrogen bonds to get closer than
normal non-bonded contacts.
Note that formal atomic charges are given as item _chem_comp_atom.charge
as these are important in unambiguously defining the chemistry of a ligand.
In the example above, nitrogen atom N17 of VIA
is assigned a formal charge of +1 after the piperazine is protonated (see the
VIA example in the Charging chapter).
The now-obsolete program Grade fails to provide formal atomic charge information,
making it difficult to use Grade restraint dictionaries as an input to Grade2,
see FAQ: on using Grade input.
Partial atomic charges are given in addition to the formal charges. The partial charges are calculated by the Gasteiger and Marsili (1980) method as implemented in the RDKit ComputeGasteigerCharges function. Please note that there are many ways of calculating partial charges and so care needs to be taken that they are suitable before using them for any given application.
Cartesian coordinates for each atom are given in the CIF items
chem_comp_atom.x, chem_comp_atom.y, and chem_comp_atom.z.
These CIF items do not comply with the PDBx/mmCIF Dictionary
chem_comp_atom standard but are widely used. The Cartesian coordinates are "ideal",
as described below.
In the Coot program, the conformation described by the coordinates can be
retrieved by first importing the CIF dictionary, then by using either
the File ... Get Monomer option or the
Calculate ... Modelling >>> Monomer from Dictionary option.
The ideal coordinates are also used by the
Rhofit
ligand fitting program.
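To show how the chem_comp_atom items fit together, here is a minimal Python sketch that reads the two-atom excerpt shown earlier and sums the formal charges. The `parse_simple_loop` helper is hypothetical and deliberately limited: it handles a single whitespace-delimited loop and does not cope with multi-line values or atom IDs containing apostrophes. For real dictionaries a dedicated CIF library is the safer choice.

```python
import shlex

# The two-atom excerpt from the VIA restraint dictionary shown above.
ATOM_LOOP = """\
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
_chem_comp_atom.charge
_chem_comp_atom.x
_chem_comp_atom.y
_chem_comp_atom.z
VIA C18 C CH2 0.091 0 1.888 2.510 -4.235
VIA N17 N NT1 -0.335 1 0.943 3.489 -3.615
"""


def parse_simple_loop(text):
    """Parse one whitespace-delimited CIF loop into a list of dicts."""
    tags, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            tags.append(line)  # an item name
        else:
            # shlex keeps quoted values (e.g. "RCSB PDB") together
            rows.append(dict(zip(tags, shlex.split(line))))
    return rows


atoms = parse_simple_loop(ATOM_LOOP)
net_formal_charge = sum(int(atom["_chem_comp_atom.charge"]) for atom in atoms)
for atom in atoms:
    print(atom["_chem_comp_atom.atom_id"], atom["_chem_comp_atom.charge"])
print("net formal charge of excerpt:", net_formal_charge)
```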
Bond restraints¶
A Grade2 CIF-format restraint dictionary will contain
a chem_comp_bond
category giving information about each of the
bonds that join the atoms of the ligand (except for ligands that are
monoatomic). For example the following defines the first two bonds
extracted from the restraint dictionary for
the charged-version of PDB component VIA
(sildenafil):
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.aromatic
_chem_comp_bond.value_dist_nucleus
_chem_comp_bond.value_dist_nucleus_esd
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
_chem_comp_bond.source_value_dist_nucleus
_chem_comp_bond.source_value_dist_nucleus_esd
_chem_comp_bond.source_value_dist
_chem_comp_bond.source_value_dist_esd
VIA C34 C33 single n 1.513 0.033 1.513 0.033 Mogul_mean_1207_hits Mogul_sd Mogul_mean_1207_hits Mogul_sd
VIA C34 H341 single n 1.093 0.020 0.979 0.015 MMFF94s_equilibrium default_to_H ecloud ecloud
chem_comp_bond.comp_id
lists the chemical component ID (aka residue name) of the ligand, in this case VIA.
chem_comp_bond.atom_id_1, & .atom_id_2
each bond joins two atoms identified by their atom IDs in items chem_comp_bond.atom_id_1 and chem_comp_bond.atom_id_2. The atom IDs must appear in the preceding chem_comp_atom table.
chem_comp_bond.type
gives the order of the bond and is one of single, double or triple. Note that the bonds in aromatic groups are assigned alternating single and double types by the RDKit Kekulize method. The bond type is also used for 2D schematic pictures and will be displayed in Coot.
chem_comp_bond.aromatic
the aromatic item is set to y or n depending on whether the bond is assigned to be aromatic by RDKit. The RDKit book section on aromaticity provides a description of the approach taken and starts with an instructive paragraph:
"Aromaticity is one of those unpleasant topics that is simultaneously simple and impossibly complicated. Since neither experimental nor theoretical chemists can agree with each other about a definition, it’s necessary to pick something arbitrary and stick to it. This is the approach taken in the RDKit."
For this reason it is important for downstream procedures not to rely on the aromatic item. It is likely to vary between different programs and there is no "correct" definition. The aromatic item is provided for consistency with older programs and it would have been better if it had not been adopted in the past.
chem_comp_bond.value_dist_nucleus, & .value_dist
the items value_dist_nucleus and value_dist both give an ideal length for the bond in Å. For bonds that do not involve a hydrogen atom they will have an identical value. For bonds to hydrogen atoms, value_dist_nucleus gives the normal bond distance between the nuclei of the two atoms, whereas value_dist gives the shorter bond length that is suitable for X-ray refinement (Stewart et al., 1965).
The value_dist_nucleus is used to define a harmonic restraint for the bonds, as \({b_{ideal}}\) in the formula below.
\[V_{bond} = W_{bond} \sum_{bonds} \left( \frac{b - b_{ideal}}{\sigma} \right)^2\]
where \({b}\) is the actual bond length and \({\sigma}\) the estimated standard deviation (the next item).
chem_comp_bond.value_dist_nucleus_esd, & .value_dist_esd
the items value_dist_nucleus_esd and value_dist_esd provide the "estimated standard deviation" of value_dist_nucleus and value_dist in Å. This parameter is also known as the "standard uncertainty" or the "sigma" (\({\sigma}\) in the equation above). Values are taken from the standard deviation of the bond length distribution found from small molecule crystal structures, when these are available.
chem_comp_bond.source_value_dist_nucleus, .source_value_dist_nucleus_esd, .source_value_dist, & .source_value_dist_esd
These source items provide information about the source of each of the parameters defining the restraint. These items have been introduced by the Grade2 program as data provenance is important in any scientific study, and we think it is important to record where restraints come from. If the extra items are a problem, then use the command-line option --no_extra to turn off Grade2-specific CIF categories and items.
When available, Grade2 will base ideal bond lengths on values from the Mogul tool analysis of CSD small molecule X-ray crystal structures. In these cases the source item will start with Mogul. For example, Mogul_mean_1207_hits shows that a parameter is taken from the mean of a distribution of 1207 values from relevant CSD structures.
If Mogul cannot be used to obtain a value, for instance for all parameters involving hydrogen atoms, then a value will be obtained from a force field. For example, a source MMFF94s_equilibrium shows that the value is an equilibrium length obtained from the RDKit implementation of the MMFF force field (Tosco et al., 2014).
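The harmonic bond term can be evaluated directly from the dictionary values. The Python sketch below assumes a weight of 1.0 and uses a hypothetical observed bond length; only the ideal value and sigma (1.513 Å and 0.033 Å for the C34-C33 bond) come from the example table. It illustrates the formula and is not the BUSTER implementation.

```python
def bond_restraint_term(b, b_ideal, sigma):
    """Single-bond contribution ((b - b_ideal) / sigma)**2."""
    return ((b - b_ideal) / sigma) ** 2


def bond_energy(bonds, weight=1.0):
    """Weighted sum of harmonic terms over (b, b_ideal, sigma) tuples."""
    return weight * sum(bond_restraint_term(b, b_ideal, sigma)
                        for b, b_ideal, sigma in bonds)


# C34-C33 from the example: ideal 1.513 A, sigma 0.033 A.
# The "observed" length 1.546 A is a made-up value exactly one sigma too long.
c34_c33_term = bond_restraint_term(1.546, 1.513, 0.033)
print(round(c34_c33_term, 3))
```

A bond that deviates from its ideal length by one sigma contributes one unit to the sum, so tighter sigmas penalise the same deviation more heavily.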
Bond angle restraints¶
The chem_comp_angle
CIF category defines restraints on bond angles.
For example, the following defines two bond angles
extracted from the restraint dictionary for
the charged-version of PDB component VIA
(sildenafil):
_chem_comp_angle.comp_id
_chem_comp_angle.atom_id_1
_chem_comp_angle.atom_id_2
_chem_comp_angle.atom_id_3
_chem_comp_angle.value_angle
_chem_comp_angle.value_angle_esd
_chem_comp_angle.source_value_angle
_chem_comp_angle.source_value_angle_esd
VIA H342 C34 H343 108.6 3.0 MMFF94s_optimised_coords default
VIA C34 C33 C32 112.6 2.8 Mogul_mean_528_hits Mogul_sd
chem_comp_angle.comp_id
lists the chemical component ID (aka residue name) of the ligand, in this case VIA.
chem_comp_angle.atom_id_1, .atom_id_2, & .atom_id_3
give the atom IDs for the 3 atoms that form the bond angle.
chem_comp_angle.value_angle
is the ideal or target angle of the restraint in degrees.
chem_comp_angle.value_angle_esd
is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in degrees.
chem_comp_angle.source_value_angle
These source items provide information about the source of each of the parameters defining the restraint. Please see the chem_comp_bond.source* documentation above for details. When Mogul information is unavailable for a particular bond angle, then ideal angles are based on a force field. For instance, in the example above the bond angle involving hydrogen atoms has a source MMFF94s_optimised_coords; this means the "ideal" angle is based on the angle found after the ligand has been energy minimised with the RDKit implementation of the MMFF force field (Tosco et al., 2014).
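As an illustration of how an angle restraint might be checked against a model, the Python sketch below computes a bond angle from three Cartesian positions and expresses its deviation from the ideal value in units of sigma. The helper names are hypothetical, and the observed angle of 118.2º is a made-up value; the ideal angle and sigma (112.6º and 2.8º) are those of C34-C33-C32 in the example.

```python
import math


def bond_angle(p1, p2, p3):
    """Angle in degrees at p2 formed by the points p1-p2-p3."""
    v1 = [a - b for a, b in zip(p1, p2)]
    v2 = [a - b for a, b in zip(p3, p2)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(a * a for a in v2))
    cosine = max(-1.0, min(1.0, dot / norms))  # guard against rounding
    return math.degrees(math.acos(cosine))


def z_score(value, ideal, esd):
    """Deviation from the ideal value in units of sigma."""
    return (value - ideal) / esd


right_angle = bond_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
deviation = z_score(118.2, 112.6, 2.8)  # hypothetical observed angle
print(round(right_angle, 1), round(deviation, 2))
```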
Plane restraints¶
Planar restraints are specified in the chem_comp_plane_atom category. A separate line is used for each atom involved in the plane (which can involve many atoms). For example, here are 2 of the 18 planes that Grade2 produces for the charged-version of PDB component VIA (sildenafil):
_chem_comp_plane_atom.comp_id
_chem_comp_plane_atom.plane_id
_chem_comp_plane_atom.atom_id
_chem_comp_plane_atom.dist_esd
_chem_comp_plane_atom.source
VIA atom-C4 C9 0.02 Mogul_sum_angles_362
VIA atom-C4 C4 0.02 Mogul_sum_angles_362
VIA atom-C4 C5 0.02 Mogul_sum_angles_362
VIA atom-C4 O3 0.02 Mogul_sum_angles_362
VIA ring5A C30 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA ring5A N29 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA ring5A N28 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA ring5A C24 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
VIA ring5A C25 0.005 Mogul+_ring_tors_rmsd_0.6_56_hits
chem_comp_plane_atom.comp_id
lists the chemical component ID (aka residue name) of the ligand, in this case VIA.
chem_comp_plane_atom.plane_id
provides an ID for the plane in question. For the two planes in the example the plane IDs are atom-C4 and ring5A. Grade2 uses descriptive plane IDs where possible. In the example, atom-C4 is a plane that holds atom C4 flat, involving the three atoms to which it is bonded (C9, C5 and O3). Plane ring5A is a plane that holds a five-membered ring in the ligand flat (in this case the pyrazole ring in VIA). If there is a second five-membered ring in the ligand that is flat then that ring would be assigned the ID ring5B.
A plane ID that starts 2fold- is used for torsion angles that Grade2 assigns to be flat but where there is no preference as to whether the torsion is predominantly 0º or 180º. Note that such a group will be held planar by a plane restraint rather than a 2-fold torsion angle restraint, as some programs, such as Coot, do not activate torsion angle restraints by default (and there can be differences in handling non-bonded contacts between the atoms involved).
A plane ID that starts trans- or cis- is used for a torsion angle that Mogul indicates has a strong preference to be around 180º or 0º respectively. The plane restraint imposes no preference for either conformation, but a corresponding 1-fold torsion is defined that is normally inactive.
If you ever edit a restraint dictionary to introduce your own plane definitions, it should be noted that some programs have an 8-character limit on the plane_id.
chem_comp_plane_atom.atom_id
lists the atom ID for an atom within the plane (a plane is defined over multiple lines).
chem_comp_plane_atom.dist_esd
is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in Ångstroms. The plane restraint provides a harmonic penalty forcing atoms towards the mean plane formed by the atoms. The dist_esd determines the stiffness of the restraint. The previous Grade program and many other restraint generation tools use a sigma of 0.02Å for all planes. Grade2 goes beyond this, assigning values of sigma depending on the tightness of distributions from Mogul + custom ring analysis; please see the Treatment of Planar Groups chapter for more information.
In the example above, the plane atom-C4 that holds atom C4 flat is assigned the default sigma of 0.02Å. In contrast, the plane ring5A that holds the pyrazole ring flat is assigned a tighter sigma of 0.005Å.
chem_comp_plane_atom.source
This is a Grade2 extra item that provides some source information as to why the plane was assigned. If the extra items are a problem, then use the command-line option --no_extra to turn off Grade2-specific CIF categories and items.
In the example above, the plane ID atom-C4 has a source Mogul_sum_angles_362. Atom C4 is assigned to be planar because it is bonded to 3 other atoms and the sum of ideal angles for the 3 bond angle restraints with C4 as the central atom is 362º. All three bond angle restraints are from Mogul distributions. Currently, if the sum of angles from Mogul is above 356º then a plane restraint is added, with the default sigma of 0.02Å.
For the pyrazole ring plane ID ring5A the source is Mogul+_ring_tors_rmsd_0.6_56_hits. The plane is assigned from Mogul + custom ring analysis from 56 Mogul hits. The ring torsions of the hits have a root mean squared deviation from zero of 0.6º. This means that the rings are very flat. The sigma for the plane restraint is set at the limit of 0.005Å.
Planes holding atoms bonded to one or more hydrogen atoms are normally set from the RDKit implementation of the MMFF force field (Tosco et al., 2014). These will have a source item like MMFF_out_of_plane_koop_0.015, meaning that the atom is held planar on the basis of the MMFF out-of-plane term.
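The sum-of-angles rule for atoms with three bonded neighbours can be written as a short Python sketch. Only the 356º threshold comes from the text; the function name is hypothetical and the three individual angles are made-up values chosen to add up to the 362º reported for atom C4. This is an illustration of the rule, not Grade2 code.

```python
PLANAR_ANGLE_SUM_THRESHOLD = 356.0  # degrees, as quoted in the text


def planar_from_angle_sum(ideal_angles, threshold=PLANAR_ANGLE_SUM_THRESHOLD):
    """Return (is_planar, angle_sum) for an atom bonded to three others."""
    if len(ideal_angles) != 3:
        raise ValueError("expected the three bond angles around the central atom")
    total = sum(ideal_angles)
    return total > threshold, total


# Hypothetical ideal angles around a trigonal atom, summing to 362 degrees.
is_planar, angle_sum = planar_from_angle_sum([120.0, 121.0, 121.0])
print(is_planar, angle_sum)
```

By contrast, three tetrahedral angles of about 109.5º sum to roughly 328º and would not trigger a plane restraint under this rule.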
Torsion angle restraints¶
Restraints on torsion angles are specified by the chem_comp_tor category.
As well as the chem_comp_tor records defining restraints for refinement,
Coot uses the records in its "Edit Chi Angles" side-menu option
to allow adjustment of a ligand's conformation.
For subtle changes in conformation, "Edit Chi Angles" is sometimes
more useful than dragging the ligand about in real space refine;
for example, it allows a saturated six-membered ring to be positioned
without disruption of its chair conformation.
Here are some of the 33 torsion restraints that Grade2 produces for the charged-version of PDB component VIA (sildenafil):
_chem_comp_tor.comp_id
_chem_comp_tor.id
_chem_comp_tor.atom_id_1
_chem_comp_tor.atom_id_2
_chem_comp_tor.atom_id_3
_chem_comp_tor.atom_id_4
_chem_comp_tor.value_angle
_chem_comp_tor.value_angle_esd
_chem_comp_tor.period
_chem_comp_tor.source
VIA CONST_ring6B-6 C9 C8 C7 C6 0.0 1000000.0 0 planar_ring
VIA puck_ring6C-1 C19 C18 N17 C16 -60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA puck_ring6C-2 C18 N17 C16 C15 60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA puck_ring6C-3 N17 C16 C15 N14 -60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA puck_ring6C-4 C16 C15 N14 C19 60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA puck_ring6C-5 C15 N14 C19 C18 -60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA puck_ring6C-6 N14 C19 C18 N17 60.0 12.3 3 Mogul+_pucker_tors_rmsd_57.1_53_hits
VIA 3fold-23 H341 C34 C33 C32 180.0 15.0 3 from_MMFF94s_optimised_coordinates
VIA 3fold-24 C34 C33 C32 C30 180.0 12.3 3 Mogul_3fold_92.2%_within_10degs_103_hits
VIA free-25 C33 C32 C30 C25 0.0 1000000.0 10 unrestrained_default
VIA free-26 C24 N28 C31 H311 0.0 1000000.0 10 unrestrained_default
VIA 2fold-27 N22 C21 C9 C8 180.0 1000000.0 2 Mogul_plane_rms_out_of_plane_torsion29.4_degs_46_hits
chem_comp_tor.comp_id
lists the chemical component ID (aka residue name) of the ligand, in this case VIA.
chem_comp_tor.id
provides an ID for the torsion angle.
A torsion ID that starts CONST_ is used for torsions within planar rings. No active restraint is placed on such torsions.
For six-membered saturated rings, such as the piperazine group in VIA, the chem_comp_tor.id of the six ring torsion angles will start with puck_ring6. These are 3-fold torsion restraints with minima at +60º and -60º (the 180º minimum is irrelevant because of the ring closure).
If Grade2 judges that a torsion should have a 3-fold active torsion restraint, the chem_comp_tor.id will start 3fold (and chem_comp_tor.period will be set to 3).
Grade2 does not impose active 2-fold or 1-fold torsion restraints, instead using a plane restraint to hold the atoms planar. In such a case the chem_comp_plane_atom.plane_id and chem_comp_tor.id will be consistent. 2-fold torsions will have an ID starting 2fold-. 1-fold torsions will have an ID starting trans- or cis-. In all these cases, the torsion restraint is inactivated by setting chem_comp_tor.value_angle_esd to a very large value (1,000,000º).
chem_comp_tor.atom_id_1, .atom_id_2, .atom_id_3, & .atom_id_4
give the atom IDs for the 4 atoms that form the torsion angle.
chem_comp_tor.value_angle
is the ideal or target angle of the restraint in degrees. For 3-fold torsion angles there will be two additional minima at \({\pm}\) 120º from this value.
chem_comp_tor.value_angle_esd
is the estimated standard deviation, also known as standard uncertainty or sigma, for the restraint in degrees. Inactive restraints are produced by setting the sigma to a very large value (1,000,000º).
chem_comp_tor.period
is the periodicity of the restraint, that is the number of minima in the 360º range of the angle.
chem_comp_tor.source
provides information about the source of the restraint. For example, Mogul+_pucker_tors_rmsd_57.1_53_hits says that a six-membered ring is given restraints to maintain its pucker at \({\pm}\) 60º, as the 53 CSD hits from the Mogul+ analysis have a root mean square deviation from 0º of 57.1º.
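The effect of the period and sigma items can be sketched in Python. The function below measures how far a torsion angle is from the nearest minimum of a periodic restraint, with minima taken at the target angle plus multiples of 360º divided by the period. The helper names are hypothetical and the treatment of the large sigma as "inactive" simply mirrors the value seen in the example table; it is not taken from any refinement program.

```python
INACTIVE_ESD = 1000000.0  # sigma used in the example table for inactive torsions


def torsion_deviation(angle, target, period):
    """Signed deviation (degrees) of angle from the nearest restraint minimum."""
    if period < 1:
        raise ValueError("period must be at least 1 for a periodic restraint")
    step = 360.0 / period            # spacing between successive minima
    delta = (angle - target) % step  # wrap into [0, step)
    if delta > step / 2.0:
        delta -= step                # choose the nearer minimum
    return delta


def is_active(esd):
    """Treat a torsion restraint as active unless its sigma is the huge value."""
    return esd < INACTIVE_ESD


# A 3-fold restraint with target -60 also has minima at +60 and 180 degrees.
near_plus_60 = torsion_deviation(65.0, -60.0, 3)
print(near_plus_60, is_active(12.3), is_active(1000000.0))
```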
Chiral centre restraints¶
Restraints controlling the configuration of chiral centres within a ligand
are specified by the chem_comp_chir
category. It should be noted that
this category differs markedly from the official
chem_comp_chir
and reflects the de facto standard used by Libcheck and succeeding programs.
BUSTER reads chem_comp_chir
but then controls chirality using a restraint
on the improper torsion angle rather than a chiral volume as explained
in the GELLY documentation
Appendix E: CHIRAL Restraints in gelly.
Here are the 4 chem_comp_chir
records that Grade2 produces
for the PDB component RIB:
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
_chem_comp_chir.source
RIB chir_01 C4 C5 O4 C3 negativ rdkit
RIB chir_02 C3 C4 O3 C2 negativ rdkit
RIB chir_03 C2 C3 O2 C1 negativ rdkit
RIB chir_04 C1 O4 C2 O1 negativ rdkit
chem_comp_chir.comp_id
lists the chemical component ID (aka residue name) of the ligand, in this case RIB.
chem_comp_chir.id
provides an ID for the chiral centre. Grade2 uses IDs starting from chir_01.
chem_comp_chir.atom_id_centre
provides the atom ID for the chiral atom.
chem_comp_chir.atom_id_1, .atom_id_2, & .atom_id_3
provide the atom IDs of three atoms that are bonded to the chiral atom.
chem_comp_chir.volume_sign
specifies the chiral configuration of the centre. Possible values are positiv, negativ and both. When there is a chiral centre whose configuration has not been set in the input, for example a SMILES string that lacks stereo specification, the volume_sign is set to both.
chem_comp_chir.source
provides information about the source of the assignment.
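For readers unfamiliar with the volume_sign item, the Python sketch below computes a chiral volume as the scalar triple product of the vectors from the central atom to its three listed neighbours, and reports its sign. This is the convention commonly used with this dictionary format and is stated here as an assumption; as noted above, BUSTER itself uses an improper torsion restraint rather than a chiral volume. The function names and coordinates are illustrative only.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def chiral_volume(centre, atom_1, atom_2, atom_3):
    """Scalar triple product v1 . (v2 x v3) of the centre-to-neighbour vectors."""
    v1, v2, v3 = (_sub(atom, centre) for atom in (atom_1, atom_2, atom_3))
    return _dot(v1, _cross(v2, v3))


def volume_sign(volume):
    """Map a non-zero chiral volume to the dictionary keywords."""
    if volume == 0:
        raise ValueError("zero volume: the four atoms are coplanar")
    return "positiv" if volume > 0 else "negativ"


# Illustrative coordinates: neighbours along x, y and z from a centre at the origin.
origin = (0.0, 0.0, 0.0)
x, y, z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(volume_sign(chiral_volume(origin, x, y, z)),
      volume_sign(chiral_volume(origin, y, x, z)))
```

Swapping any two of the three neighbour atoms flips the sign, which is why the order of atom_id_1, atom_id_2 and atom_id_3 matters.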
Systematic names¶
If available, the output CIF-format restraint dictionary will contain
information as to the systematic name of the ligand.
The
pdbx_chem_comp_identifier
data category will be used. Systematic names for PDB ligands
are automatically obtained from the input PDB chemical component
definition (if the ligand is charged by
Grade2 then " (CHARGED)"
will be added).
The --pubchem_names option can be used
to do an online lookup of the systematic name for ligands that occur in PubChem.
The --systematic option allows the systematic
name to be manually set.
Currently, we are not aware of any open source systematic chemical name programs
but commercial programs to produce systematic names are available from
ACD/Labs,
OpenEye
and Chemaxon.
Database Information¶
From Grade2 version 1.3.0 information about entries
for the ligand in Chemical databases is included in the output CIF restraint
dictionary. This information is held in the CIF data category
gphl_chem_comp_database. For example, for PDB chemical component VIA
the output restraint dictionary will automatically have the information:
#
loop_
_gphl_chem_comp_database.comp_id
_gphl_chem_comp_database.id
_gphl_chem_comp_database.database
_gphl_chem_comp_database.url
_gphl_chem_comp_database.details
VIA VIA PDB https://www.rcsb.org/ligand/VIA "RCSB PDB"
VIA VIA PDB https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA PDBe
#
For PDB chemical components Grade2 automatically provides details
to access RCSB PDB and PDBe pages. The --pubchem_names option
also automatically sets appropriate gphl_chem_comp_database
records if a match is found.
Users can provide information about in-house databases using the --database_id option.
"Ideal" coordinate files¶
At the end of the restraint generation process, a geometry optimization of the coordinates of the molecule is made with the gelly geometry-only minimizer. This produces a set of coordinates where the bond lengths, bond angles and other terms are adjusted to be as close as possible to the "ideal" values. These coordinates are then used to output files in a variety of formats. Please note that the "ideal" coordinates can be trapped at a local minimum.
PDB-format¶
PDB-format is a widely used chemical exchange file format for proteins.
Grade2 PDB-format ideal coordinates are written using RDKit routines,
and so have CONECT records giving the bond order that are recognized
by some molecular graphics programs (such as Jmol).
SDF-format¶
Please note that the SDF-format file will use Kekulé bonding (where aromatic bonds are marked with alternating single and double bonds) whereas the MOL2-format uses CSD conventions for aromatic bonds.
MOL2-format¶
The MOL2-format file uses aromatic bonding following the CSD convention. This makes it suitable for running additional Mogul geometry analysis.
Schematic 2D molecular diagrams¶
As well as writing a CIF restraint dictionary and "ideal" coordinates files,
Grade2 will produce schematic 2D molecular diagrams that can be useful.
For instance, for the PDB ligand CFF (caffeine),
running grade2 --PDB_ligand CFF
will write two SVG files that
can be visualized using a web browser (such as Chrome):