Using an in-house Mogul database

Introduction

If you work for an organisation that has a large set of in-house small molecule structures, you can use them to build a Mogul database that can be searched in addition to the CSD structures. To do so you will need to be an established CCDC Research Partner. Please contact the CCDC for more details.

Note

Preparing an in-house Mogul database is only worth the effort if your set of in-house structures covers chemical groups that are not already present in the CSD but are represented in the ligands you are working on.

Preparing an in-house Mogul database

The process of preparing an in-house Mogul database has two major stages:

  1. Create an SQLite database of your structures from your set of small molecule CIF coordinate files. The CSD-Editor software should be used for this stage. Please note that the CSD-Editor program is available for Linux and Windows but is not supplied for macOS. The procedures for this stage are described in the csd-editor-industrial documentation that can be found in the CSD distribution:

    ~/CCDC/ccdc-software/csd-editor/docs/csd_editor_industrial/csd-editor-industrial.html
    
  2. Once you have prepared the CSD-format SQLite database with your structures and checked that it works with Mercury, you can use it to build a corresponding Mogul data library with the mogulbuilder.py script. The mogulbuilder.py script is distributed to established CCDC Research Partners separately from the main CSD installation. Please contact the CCDC for more details.

    Please note that the Python libraries necessary to run the script are now all included in the CSD Python API (there used to be a separate conda package). The mogulbuilder.py script can be run using the run_csd_python_api command, for instance to get the script's help message:

    ~/CCDC/ccdc-software/csd-python-api/run_csd_python_api mogulbuilder.py -h
    

    The script should be run using three positional arguments:

    ~/CCDC/ccdc-software/csd-python-api/run_csd_python_api mogulbuilder.py structure_database.sqlite output_directory name_db
    

    The Mogul database is written to the output_directory and will have the name name_db. Depending on the number of structures in the input database, the script can take several hours to run.

    Once the script has run, use an editor to check the file mogul.path in the output_directory. This should list the full path of the input structure SQLite database on the line that starts with CSD, but older versions of the script fail to record the path. To use the Mogul database with Grade2 it is essential for the full path to be listed on the CSD line.

    Now that you have prepared your in-house Mogul database, please check that it works with the standalone Mogul program as explained in the section "Configuring multiple data libraries in the Mogul GUI" in the documentation supplied with the script.

    For further details on the mogulbuilder.py script please refer to the documentation supplied with it.

Using an in-house Mogul database with Grade2

Note

Grade2 support for in-house Mogul databases requires recent versions of Grade2 (1.6.0 or later) and CSD-Core (2024.1 or later).

Before using an in-house Mogul database, please ensure that the mogul.path file in the database correctly provides the full path of the input structure SQLite database on the line that starts with CSD, as explained above. You should also ensure that the database works correctly with the standalone Mogul program.
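
As a quick alternative to opening mogul.path in an editor, the shell sketch below prints the CSD line and warns if it does not appear to contain a path. It is only a sketch under assumptions: the exact layout of mogul.path is not described here, so the check simply assumes that the path sits on the same line as the CSD keyword and contains a "/" character. The function name show_csd_line is hypothetical and not part of any CCDC or Grade2 tool.

```shell
# Hypothetical helper: print the CSD line of mogul.path in a database directory.
# Assumption: the full path is on the same line as the CSD keyword and
# contains at least one "/" character.
show_csd_line() {
    db_dir=$1
    # grep fails if the file is missing or has no line starting with CSD.
    csd_line=$(grep '^CSD' "$db_dir/mogul.path") || {
        echo "could not find a line starting with CSD in $db_dir/mogul.path" >&2
        return 1
    }
    echo "$csd_line"
    case $csd_line in
        */*) return 0 ;;
        *)   echo "warning: the CSD line does not appear to contain a full path" >&2
             return 2 ;;
    esac
}
```

Calling show_csd_line with the directory that holds mogul.path prints the line for inspection. A return status of 2 suggests the database was built with an older version of the script that did not record the path.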

To use the database with Grade2, set the environment variable BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE to the full path of the directory containing the database. For further details see the Grade2 environment variables section.
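
For example, in a bash or sh session the variable could be set as shown below. The path /data/in_house/mogul_db is a placeholder only; it is assumed here that the directory containing the database is the output_directory given to mogulbuilder.py.

```shell
# Placeholder path: replace with the full path of the directory containing
# your in-house Mogul database (assumed to be the mogulbuilder.py output_directory).
export BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE="/data/in_house/mogul_db"

# Confirm the value that Grade2 will see in this shell session.
echo "$BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE"
```

The setting only lasts for the current shell session. To make it persistent, the export line can be added to a shell start-up file such as ~/.bashrc.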

Once you have done this, you can check that Grade2 recognizes the in-house Mogul database by running Grade2 with the -checkdeps option:

$ grade2 -checkdeps

Look through the terminal output for a line listing the Mogul data libraries. This should include the name that you gave your in-house library in the mogulbuilder.py run. For example:

Mogul Data Libraries: as545be_ASER, in_house_library_name
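
Rather than reading through the whole output, the relevant line can be filtered with grep. The following is a minimal sketch, assuming the line carries the "Mogul Data Libraries:" label exactly as in the example above and that the -checkdeps report is written to standard output. The helper name has_mogul_library is hypothetical.

```shell
# Hypothetical helper: read "grade2 -checkdeps" output on standard input and
# succeed only if the named library appears on the "Mogul Data Libraries:" line.
has_mogul_library() {
    # -F: treat the name as a fixed string; -w: match it as a whole word.
    grep 'Mogul Data Libraries:' | grep -q -w -F -- "$1"
}

# Example use (the library name is the name_db given to mogulbuilder.py):
#   grade2 -checkdeps | has_mogul_library in_house_library_name \
#       && echo "in-house library found"
```

If the helper reports that the library is missing, check the value of BDG_GRADE2_MOGUL_IN_HOUSE_DATABASE and the CSD line in mogul.path before rebuilding the database.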