Installation, Configuration & Testing¶
Installation¶
Grade2 is installed as part of the BUSTER distribution. For a full description of how to install BUSTER please see the installation documentation available at https://www.globalphasing.com/buster/manual/installation/index.html
Grade2 makes extensive use of the CSD Python API from the CCDC. Grade2 therefore requires an installation of the CSD-Core package, which provides the API together with the CSD and Mogul databases. For details on how to obtain CSD-Core please see https://www.ccdc.cam.ac.uk/solutions/csd-core/
Note
If no local installation of CSD-Core is available, we recommend using the Grade Web Server for non-confidential ligands.
Initial Configuration¶
To work, Grade2 needs to be able to locate the CSD Python API and the directory containing the databases (only the CSD and Mogul databases are required for Grade2). These locations are specified by setting environment variables. If you have followed the BUSTER Snapshot Installation Guide available at https://www.globalphasing.com/buster/manual/installation/index.html, it is likely that Grade2 will already work.
Note
The configuration procedure was altered in Grade2 release 1.6.0 in July 2024. If you are still using an older version, please refer to the instructions in $BDG_home/docs/grade2/html/installation.html, or better still update BUSTER to the latest release.
To check whether Grade2 can find the CSD installation, use the grade2 command-line option -checkdeps:
grade2 -checkdeps
If this results in a final line starting with SUCCESS then Grade2 has been set up successfully, for example:
$ grade2 -checkdeps
INFO: BDG_CSD_TOP_DIRECTORY set to /home/software/CCDC_2024.1
INFO: Running /home/software/CCDC_2024.1/ccdc-utilities/csd-location/c_linux-64/bin/csd_location.x \
/home/software/CCDC_2024.1/ccdc-data
INFO: ---- this writes to file ~/.config/CCDC/CSD.ini setting:
INFO: CSD data root folder = /home/software/CCDC_2024.1/ccdc-data
############################################################################
## [grade2] ligand restraint dictionary generation
... (output abbreviated) ...
Version: 1.6.0rc1 <2024-04-16>
... (output abbreviated) ...
-checkdeps option: verbose dependencies check for required external tools
with tests all tools work properly.
CSD installation found
Test using carbon dioxide from SMILES O=C=O bond angle:
RDKit generated molecule and coordinates from input SMILES: O=C=O
CHECK: Check the molecule's InChiKey against known PDB components:
CHECK: Exact match to PDB chemical component(s):
CHECK: CO2 https://www.rcsb.org/ligand/CO2 "carbon dioxide"
For help on checks against known PDB components, , see: ....
---- https://gphl.gitlab.io/grade2_docs/faqs.html#checkpdbmatch
Minimization with MMFF94s reduces energy from 30.80 to 0.00 kcal/mol
Using CCDC Mogul-like geometry analysis.
Mogul version 2024.1.0, CSD version 545, csd-python-api 3.1.0
Mogul Data Libraries: as545be_ASER
Geometry Optimization is turned off.
2D coordinates: generate new with RDKit Compute2DCoords
Result: O=C=O ideal bond angle 179.3 degs from Mogul_mean_35_hits
SUCCESS: grade2 -checkdeps indicates that everything needed to run grade2 works fine
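The check on the final line can also be scripted. The following is a minimal sketch, not part of Grade2: the helper name checkdeps_succeeded is hypothetical, and it relies only on the behaviour described above, namely that a working setup ends with a line starting with SUCCESS. The call to grade2 is guarded so the snippet does nothing on a machine where grade2 is not on the PATH.

```shell
# Hypothetical helper (not part of Grade2): read text on stdin and return 0
# if its last non-empty line starts with SUCCESS, otherwise return 1.
checkdeps_succeeded() {
    local last_line
    last_line=$(grep -v '^[[:space:]]*$' | tail -n 1)
    case $last_line in
        SUCCESS*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Only attempt the real check when grade2 is available.
if command -v grade2 >/dev/null 2>&1; then
    if grade2 -checkdeps 2>&1 | checkdeps_succeeded; then
        echo "Grade2 is configured"
    else
        echo "Grade2 needs configuring: check BDG_CSD_TOP_DIRECTORY" >&2
    fi
fi
```

Testing the last line rather than the exit status keeps the sketch tied to what is documented here; whether grade2 -checkdeps also signals failure through its exit status is not assumed.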
If instead the grade2 -checkdeps output has lines starting with ERROR, then you should set the environment variable BDG_CSD_TOP_DIRECTORY to the location of the top-level directory of the CSD installation on your system (see below). Please also make sure that the withdrawn environment variable BDG_TOOL_MOGUL has not been set.
If you are a sh, bash or dash user this can be achieved by commands like:
unset BDG_TOOL_MOGUL
export BDG_CSD_TOP_DIRECTORY=/home/software/CCDC_2024.1/
whereas if you are a tcsh or csh user you should use commands like:
unsetenv BDG_TOOL_MOGUL
setenv BDG_CSD_TOP_DIRECTORY /home/software/CCDC_2024.1/
You will need to change the path in these commands to the location of the CSD top directory on your system.
Once you have found the environment variable settings needed for grade2 -checkdeps to report SUCCESS, it is best to add them to the BUSTER setup_local.sh and/or setup_local.csh files, as explained in the BUSTER configure section of the BUSTER Snapshot Installation Guide. This means that the Grade2 configuration will be done together with BUSTER by the setup script setup.sh or setup.csh.
Note
The BUSTER CSD configuration procedure for Grade2 is shared with buster-report and the (old, deprecated) Grade program.
Testing Grade2¶
To test whether Grade2 has been configured correctly, use the grade2 command-line option -checkdeps:
grade2 -checkdeps
If this does not result in a final line that starts with SUCCESS then please follow the instructions in the Initial Configuration section above.
To test that all the components used by Grade2 work as expected on your system, run the command grade2_tests -n auto. For example:
$ grade2_tests -n auto
INFO: BDG_CSD_TOP_DIRECTORY set to /home/software/CCDC_2024.1
INFO: Running /home/software/CCDC_2024.1/ccdc-utilities/csd-location/c_linux-64/bin/csd_location.x \
/home/software/CCDC_2024.1/ccdc-data
INFO: ---- this writes to file ~/.config/CCDC/CSD.ini setting:
INFO: CSD data root folder = /home/software/CCDC_2024.1/ccdc-data
==================================================================================== test session starts ====================================================================================
platform linux -- Python 3.9.16, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/software/xtal/GPhL/BUSTER/nightly/trunk/20240423/.mc/linux64/3.9/lib/python3.9/site-packages/grade2/tests
configfile: pytest.ini
plugins: mock-3.11.1, cov-4.1.0, xdist-3.3.1
16 workers [583 items]
..................................................................................................................................................................................... [ 31%]
..........................................................................................ss......................................................................................... [ 62%]
.................................................................................................................................................................s................... [ 93%]
........s.........s................s.... [100%]
============================================================================== 577 passed, 8 skipped in 36.07s ==============================================================================
Grade2 version information:
grade2 1.6.0rc1 (2024-04-16), RDKit 2023.09.6, Mogul 2024.1.0, CSD 545, csd_python_api 3.1.0
Mogul Data Libraries: as545be_ASER
grade2 modules from: /home/software/xtal/GPhL/BUSTER/nightly/trunk/20240423/.mc/linux64/3.9/lib/python3.9/site-packages/grade2/*.cpython-39-x86_64-linux-gnu.so
ccdc modules from: /home/software/CCDC_2024.1/ccdc-software/csd-python-api/miniconda/lib/python3.9/site-packages/ccdc/*.py
csd_directory: /home/software/CCDC_2024.1/ccdc-data/csd
PDB components InChiKey store last modified date: 2024-04-05
grade2_tests will run over 500 unit, functional and integration tests written as part of the test-driven development used when developing Grade2. Any failure is serious, so please report it to us. Please also see the Examples section for how to run/test the grade2 command-line tool.
Note
The -n auto argument of grade2_tests specifies that the tests will use as many processes as your computer has physical CPU cores. To run the tests in a single process, do not specify the option.
Advanced Configuration¶
It is possible to configure site-specific features for Grade2 by using the environment variables set out below. Do not worry about these if you are a new user of Grade2.
Grade2 environment variables¶
In general, environment variables are used to configure an installation of Grade2 by specifying site-specific choices that can be set once and will not vary from run to run. In contrast, command-line arguments are used to specify things that will vary for individual grade2 runs.
It is best if Grade2 environment variables are added to the BUSTER setup_local.sh or setup_local.csh file, as explained in the BUSTER configure section of the BUSTER Snapshot Installation Guide, so that they are set up together with BUSTER. Users can of course set or alter any of the environment variables if they wish by using export or setenv (depending on the shell they use).
BDG_CSD_TOP_DIRECTORY¶
BDG_CSD_TOP_DIRECTORY is the environment variable that gives the location of the top-level directory of the CSD installation (which is required to run Grade2). For Grade2 to work, BDG_CSD_TOP_DIRECTORY must be set.
The CSD installation top-level directory must contain the subdirectory ccdc-software and the subdirectory ccdc-utilities. It will also normally contain the subdirectory ccdc-data, which Grade2 uses as the default location of the Mogul and CSD databases (unless this is overridden by setting BDG_CCDC_DATA).
Please see the Initial Configuration section above for more detail on setting BDG_CSD_TOP_DIRECTORY.
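The directory layout described above can be checked before running Grade2. This is a minimal sketch under stated assumptions: check_csd_top_directory is a hypothetical helper, not a Grade2 tool, and it tests only for the subdirectory names given in this section. It says nothing about whether the contents of those directories are valid.

```shell
# Hypothetical helper (not part of Grade2): check that a directory has the
# subdirectories this section says a CSD top-level directory must contain.
# Uses the first argument, or BDG_CSD_TOP_DIRECTORY when no argument is given.
check_csd_top_directory() {
    local top sub status
    top=${1:-$BDG_CSD_TOP_DIRECTORY}
    if [ -z "$top" ]; then
        echo "BDG_CSD_TOP_DIRECTORY is not set" >&2
        return 1
    fi
    status=0
    for sub in ccdc-software ccdc-utilities; do
        if [ ! -d "$top/$sub" ]; then
            echo "missing required subdirectory: $top/$sub" >&2
            status=1
        fi
    done
    # ccdc-data is only the default database location, so note rather than fail.
    if [ ! -d "$top/ccdc-data" ]; then
        echo "note: no $top/ccdc-data; see BDG_CCDC_DATA" >&2
    fi
    return $status
}

# Example use: check_csd_top_directory /home/software/CCDC_2024.1
```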
BDG_CCDC_DATA¶
The environment variable BDG_CCDC_DATA can be used to specify the location of the ccdc-data directory that will be used as the location of the Mogul and CSD databases. The BDG_CCDC_DATA directory must contain the subdirectories csd and mogul for Grade2 to work (it will also normally contain other subdirectories such as isostar).
If BDG_CCDC_DATA is not set, then Grade2 will by default use the subdirectory ccdc-data of BDG_CSD_TOP_DIRECTORY.
BDG_CCDC_DATA can be used if you are seeing long Grade2 runs because the CSD is installed on a slow networked disk, as explained in the Making a local copy of ccdc-data section below.
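The precedence between the two variables can be written out as a short sketch. The helper resolve_ccdc_data is hypothetical and simply mirrors the rule stated above (BDG_CCDC_DATA if set, otherwise ccdc-data under BDG_CSD_TOP_DIRECTORY); it is not how Grade2 itself is implemented.

```shell
# Hypothetical helper (not part of Grade2): print the ccdc-data directory
# that the rule above would select, and check that it contains the csd and
# mogul subdirectories.
resolve_ccdc_data() {
    local data
    if [ -n "$BDG_CCDC_DATA" ]; then
        data=$BDG_CCDC_DATA
    elif [ -n "$BDG_CSD_TOP_DIRECTORY" ]; then
        data=$BDG_CSD_TOP_DIRECTORY/ccdc-data
    else
        echo "neither BDG_CCDC_DATA nor BDG_CSD_TOP_DIRECTORY is set" >&2
        return 1
    fi
    data=${data%/}   # drop one trailing slash for tidier output
    if [ -d "$data/csd" ] && [ -d "$data/mogul" ]; then
        printf '%s\n' "$data"
    else
        echo "csd and/or mogul missing under $data" >&2
        return 1
    fi
}
```

Running resolve_ccdc_data in the shell used to launch Grade2 shows which database directory the current settings point at.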
BDG_GRADE2_PYTHON_VERSION¶
Grade2 is distributed with BUSTER within a miniconda environment. In order to work with the CSD Python API, which is loaded at run time from the CSD installation, the two Python versions must be compatible. To use Grade2 release 1.4.1 or above with a CSD installation that predates 2023.2.0 (released in July 2023), set the environment variable BDG_GRADE2_PYTHON_VERSION to 3.7.
If not set, BDG_GRADE2_PYTHON_VERSION currently defaults to 3.9.
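As an illustration of the rule above, the following sketch maps a CSD release number to the Python version to request. The helper grade2_python_for_csd is hypothetical, and the release string (for example 2023.1 or 2024.1.0) is assumed to be supplied by the user; the sketch does not query the CSD installation.

```shell
# Hypothetical helper (not part of Grade2): print the value suggested above
# for BDG_GRADE2_PYTHON_VERSION, given a CSD release such as "2023.1".
# Releases that predate 2023.2 map to 3.7, everything else to 3.9.
grade2_python_for_csd() {
    local release major minor
    release=$1
    major=${release%%.*}      # text before the first dot
    minor=${release#*.}       # text after the first dot ...
    minor=${minor%%.*}        # ... up to the next dot, if any
    if [ "$major" -lt 2023 ] || { [ "$major" -eq 2023 ] && [ "$minor" -lt 2 ]; }; then
        echo 3.7
    else
        echo 3.9
    fi
}

# Example use with an older CSD installation:
#   export BDG_GRADE2_PYTHON_VERSION=$(grade2_python_for_csd 2022.3)
```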
BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB¶
The grade2 command-line option --pubchem_names does an online lookup of the systematic (IUPAC) name for ligands that occur in PubChem. This online search involves uploading the SMILES string of the molecule to the PubChem server. For this reason, the --pubchem_names option should not be used for confidential ligands.
To be extra careful, the --pubchem_names option is deactivated by default until the environment variable is set. To enable the --pubchem_names option, set BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB to "yes".
Note
Apologies for the long name of the environment variable BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB, but it is chosen to show explicitly that you have accepted that the ligand's SMILES string is uploaded to a public web server.
BDG_GRADE2_LIGAND_LOOKUP¶
The --lookup option provides a mechanism whereby an external script is invoked to look up details of a ligand from a database. To use your own script, set the environment variable BDG_GRADE2_LIGAND_LOOKUP to the location of the script. Please see the --lookup option for more details.
BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE¶
Grade2 can use an additional in-house Mogul database, as described in the in-house Mogul databases chapter. Once you have prepared the database, please set BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE to the full path of the directory containing the in-house Mogul database.
BDG_GRADE2_SSL_DISABLE_VERIFICATION¶
When BUSTER is installed on some Ubuntu Linux systems, the --lookup ID option using the default pubchem_g2_lookup_script.py script distributed with BUSTER can fail, terminating with an error message that includes SSL: CERTIFICATE_VERIFY_FAILED. A similar error can occur when the --pubchem_names option is used. If the problem occurs, set the environment variable BDG_GRADE2_SSL_DISABLE_VERIFICATION to "yes". This will disable SSL verification for PubChem lookups and should mean the options work. As you should not be using these options for any confidential ligands, turning off verification should be fine.
BDG_GRADE2_CIF_LOOP_ALL¶
Set BDG_GRADE2_CIF_LOOP_ALL to "yes" to write all CIF categories as loops, even if they only contain a single item. Currently this only affects the category gphl_chem_comp_info, which by default is written using key-value pairs as this makes inspection easier. All other CIF categories are written as loops anyway.
BDG_GRADE2_TEST_WEB¶
By default, the grade2_tests tool does not run tests that use external online services, as these can be unavailable because of maintenance or network issues. Set BDG_GRADE2_TEST_WEB to "yes" to turn on the tests involving external online services. This will enable testing of the --pubchem_names and the --lookup ID options.
Withdrawn Environment Variables¶
Prior to release 1.6.0 in July 2024, the environment variable BDG_TOOL_MOGUL was used to configure Grade2, buster-report and Grade. Grade2 release 1.6.0 will continue to work using BDG_TOOL_MOGUL but will produce an initial warning message, for example:
$ grade2 -checkdeps
WARNING: The old configuration environment variable BDG_TOOL_MOGUL is set, instead
WARNING: please set the new configuration environment variable BDG_CSD_TOP_DIRECTORY
WARNING: (to the top directory of your CSD installation). For now, I am doing
WARNING: this for you by:
WARNING:
WARNING: unset BDG_TOOL_MOGUL
WARNING: export BDG_CSD_TOP_DIRECTORY=/home/software/xtal/CCDC/CSDS/2023.3/
WARNING:
WARNING: For more information please see:
WARNING:
WARNING: https://gphl.gitlab.io/grade2_docs/installation.html#withdrawn
WARNING:
INFO: BDG_CSD_TOP_DIRECTORY set to /home/software/xtal/CCDC/CSDS/2023.3/
INFO: Running /home/software/xtal/CCDC/CSDS/2023.3//ccdc-utilities/csd-location/c_linux-64/bin/csd_location.x \
/home/software/xtal/CCDC/CSDS/2023.3//ccdc-data
INFO: ---- this writes to file ~/.config/CCDC/CSD.ini setting:
INFO: CSD data root folder = /home/software/xtal/CCDC/CSDS/2023.3//ccdc-data
############################################################################
## [grade2] ligand restraint dictionary generation
############################################################################
... (output abbreviated) ...
It is best if you alter the BUSTER setup_local.sh or setup_local.csh file, as explained in the BUSTER configure section of the BUSTER Snapshot Installation Guide, setting the environment variable BDG_CSD_TOP_DIRECTORY rather than BDG_TOOL_MOGUL.
The environment variables CSD_HOME and BDG_TOOL_CSD_PYTHON_API have also been supported in the past and should no longer be set.
From the next release of Grade2, setting any of the obsolete environment variables BDG_TOOL_MOGUL, CSD_HOME or BDG_TOOL_CSD_PYTHON_API will result in Grade2 terminating with an error message.
Please note that the next release will not work with CSD releases that pre-date CSD 2023.2, and we advise updating to CSD 2024.1 or later.
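To see whether any of the withdrawn variables are still set in your shell, a check along the following lines can be used. This is a sketch only: find_withdrawn_grade2_vars is a hypothetical helper, and the three variable names are the ones listed in this section.

```shell
# Hypothetical helper (not part of Grade2): print the name of each withdrawn
# variable that is currently set (even if set to an empty value).
# Returns 0 if none are set and 1 otherwise.
find_withdrawn_grade2_vars() {
    local name is_set found
    found=0
    for name in BDG_TOOL_MOGUL CSD_HOME BDG_TOOL_CSD_PYTHON_API; do
        # ${VAR+x} expands to "x" only when VAR is set; eval is applied to
        # the fixed names above, never to user input.
        eval "is_set=\${$name+x}"
        if [ -n "$is_set" ]; then
            printf '%s\n' "$name"
            found=1
        fi
    done
    return $found
}
```

Any name printed should be removed from your shell start-up files and from setup_local.sh or setup_local.csh.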
The CSD.ini configuration file¶
A major revision and improvement of the directory structure and installation procedures was made in CSD-Core release 2023.1, which introduced a configuration file CSD.ini.
The directory structure was improved to separate out the databases from the software. Prior to this release there was a tight coupling between the software and the databases in the same CSD installation. Following the revision, CSD software can now be used with the databases from different releases and in different locations. This is a great improvement.
The location of the databases used is specified in CSD.ini. Normally this file is individual to each user, and for Linux and macOS is located in the user's home directory with the path: ~/.config/CCDC/CSD.ini
CSD-Core includes a program csd_location that, given a directory path containing the databases, updates the user's CSD.ini file.
To make sure that the latest databases are used following a CSD-Core update, Grade2 and buster-report have been altered from release 1.6.0 to invoke the csd_location tool on every run. By default, the ccdc-data location is set from the environment variable BDG_CSD_TOP_DIRECTORY, ensuring that the database and software are kept in sync when different versions of CSD-Core are used.
The BDG_CCDC_DATA environment variable can be used to set the database location to other locations: for instance, when making a local copy of ccdc-data.
When Grade2 and buster-report update the CSD.ini file, this is noted in the terminal log on lines starting INFO:, for example:
$ grade2 -P ATP
INFO: BDG_CSD_TOP_DIRECTORY set to /home/software/CCDC_2024.1
INFO: Running /home/software/CCDC_2024.1/ccdc-utilities/csd-location/c_linux-64/bin/csd_location.x \
/home/software/CCDC_2024.1/ccdc-data
INFO: ---- this writes to file ~/.config/CCDC/CSD.ini setting:
INFO: CSD data root folder = /home/software/CCDC_2024.1/ccdc-data
############################################################################
## [grade2] ligand restraint dictionary generation
...
Note
On every run, Grade2 and buster-report will overwrite the user's ~/.config/CCDC/CSD.ini configuration file.
Making a local copy of ccdc-data to speed up Grade2¶
If you are seeing long Grade2 runs because the CSD is installed on a slow networked disk, then it is possible for a user to speed things up by using copies of the ccdc-data subdirectories csd and mogul on a fast local filesystem.
Note
Although individual users can perform this procedure without root privilege and separately from the main CSD installation, you are advised to talk to your system manager and/or software installer before doing so. Filling up the /tmp disk could make you unpopular!
The procedure is to first check that there is sufficient free space on the disk that you want to use (it is best if this is an SSD). For example, if you want to use /tmp then use the command df -h /tmp, like this:
$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg0-tmp 49G 48M 47G 1% /tmp
In this case the disk has 47GB free. This can be compared to the size of the database directories that need to be copied:
$ du -hsx $BDG_CSD_TOP_DIRECTORY/ccdc-data/csd $BDG_CSD_TOP_DIRECTORY/ccdc-data/mogul
12G /home/software/CCDC_2024.1/ccdc-data/csd
6.9G /home/software/CCDC_2024.1/ccdc-data/mogul
So in this case things are fine, as around 19GB is required compared to 47GB of free space.
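The same comparison can be done in a script rather than by eye. The sketch below is an illustration only: fits_on_disk is a hypothetical helper that sums the sizes reported by du -sk and compares the total with the available space reported by df -Pk, both in KiB. It leaves no safety margin, so treat a narrow pass with caution.

```shell
# Hypothetical helper (not part of Grade2): succeed if the directories given
# as arguments 2..n would fit into the free space of the filesystem that
# holds the directory given as argument 1.
fits_on_disk() {
    local target needed_kb avail_kb
    target=$1
    shift
    # du -sk prints one "size<TAB>path" line per directory, sizes in KiB.
    needed_kb=$(du -sk "$@" | awk '{ total += $1 } END { print total + 0 }')
    # df -Pk prints a header line, then a line whose 4th field is the
    # available space in KiB.
    avail_kb=$(df -Pk "$target" | awk 'NR == 2 { print $4 }')
    echo "need ${needed_kb} KiB, ${avail_kb} KiB available on $target"
    [ "$needed_kb" -lt "$avail_kb" ]
}

# Example use:
#   fits_on_disk /tmp "$BDG_CSD_TOP_DIRECTORY/ccdc-data/csd" \
#                     "$BDG_CSD_TOP_DIRECTORY/ccdc-data/mogul"
```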
Make a directory for the local copy, for example /tmp/ccdc-data_2024.1_local:
$ mkdir /tmp/ccdc-data_2024.1_local
Then copy over the complete contents of the directories ccdc-data/csd and ccdc-data/mogul into your local copy. For example:
$ cp -a $BDG_CSD_TOP_DIRECTORY/ccdc-data/csd /tmp/ccdc-data_2024.1_local/
$ cp -a $BDG_CSD_TOP_DIRECTORY/ccdc-data/mogul /tmp/ccdc-data_2024.1_local/
This may take some time as this involves copying 19GB of data from a slow network disk. The local directory now has copies of the two required databases:
$ du -hsx /tmp/ccdc-data_2024.1_local/*
12G /tmp/ccdc-data_2024.1_local/csd
6.9G /tmp/ccdc-data_2024.1_local/mogul
To use the local directory, set the environment variable BDG_CCDC_DATA to its location. This is normally done by:
$ export BDG_CCDC_DATA=/tmp/ccdc-data_2024.1_local/
Once you have checked that this works, it is best that BDG_CCDC_DATA is added to the BUSTER setup_local.sh or setup_local.csh file, as explained in the Initial Configuration section above.
Making a local copy of ccdc-data should result in a speedup of Grade2 and buster-report (please see example timings).
Note
It is only necessary to copy over the csd and mogul subdirectories to run Grade2, buster-report and Grade.