The CDK does not have an implementation of the CIP rules, not even of part of them. However, we recently started a collaboration with Dr Lars Carlsson in the Computational Toxicology, Global Safety Assessment group at AstraZeneca R&D Mölndal, headed by Dr Scott Boyer. Within this collaboration I have started a partial implementation of the CIP rules. The full set of rules is quite extensive, and some subrules are outside the scope of the collaboration. For example, we will likely not look at axial or helical stereochemistry within this collaboration. The kind of thing the patch can do is distinguish between these mirror images (yeah, I should use Jmol, but ChemPedia needs more plugging right now: click the images):
The current patch does not address the problem of which atom is chiral; that problem is quite complex in itself, and Tim is writing up a nice set of blog posts about that. Further, the current aim is limited to atoms of ligancy four; that is, carbons.
The CIP rules uniquely define the stereochemistry of such a carbon by uniquely ordering the ligands around the atom. The ligands are ordered using a sequence of rules, which define priority based on atomic number, mass number, etc. It is the recursion that makes things more interesting, but I will not delve into the details of the algorithm here (see the aforelinked Wikipedia page instead, or a cheminformatics book like the one shown on the right). Here, I want to introduce some of the API of the current patch for the CDK.
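To make the idea of priority rules a bit more concrete, here is a minimal sketch of the first two of them (atomic number, then mass number) as a plain Java comparator. It is not CDK code: the classes AtomSketch and FirstSpherePriority are hypothetical names made up for this illustration, the mass numbers are just example isotopes, and the recursion into further spheres of neighbours is left out entirely.

```java
import java.util.Comparator;

// Hypothetical stand-in for a ligand atom; this is not a CDK class.
final class AtomSketch {
    final String symbol;
    final int atomicNumber;
    final int massNumber;

    AtomSketch(String symbol, int atomicNumber, int massNumber) {
        this.symbol = symbol;
        this.atomicNumber = atomicNumber;
        this.massNumber = massNumber;
    }
}

final class FirstSpherePriority {
    // A higher atomic number means a higher priority;
    // ties fall through to the mass number.
    static final Comparator<AtomSketch> RULE =
        Comparator.comparingInt((AtomSketch a) -> a.atomicNumber)
                  .thenComparingInt(a -> a.massNumber);

    public static void main(String[] args) {
        AtomSketch bromine = new AtomSketch("Br", 35, 79);
        AtomSketch iodine = new AtomSketch("I", 53, 127);
        AtomSketch hydrogen = new AtomSketch("H", 1, 1);
        AtomSketch deuterium = new AtomSketch("D", 1, 2);

        // Bromine ranks below iodine, so the comparison is negative.
        System.out.println(RULE.compare(bromine, iodine) < 0);
        // Same element, heavier isotope: deuterium ranks above hydrogen.
        System.out.println(RULE.compare(deuterium, hydrogen) > 0);
    }
}
```

A negative result means the first argument has the lower priority, which is the same sign convention as a regular Java Comparator.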
Ligands and their Priorities
Core to the implementation are the CIP priority rules, which allow ordering of the ligands. So, we define a molecule and ligands:
    IMolecule molecule = parser.parseSmiles("IC(Br)(Cl)[H]");
    ILigand ligand1 = new Ligand(
        molecule, molecule.getAtom(1), molecule.getAtom(2)
    );
    ILigand ligand2 = new Ligand(
        molecule, molecule.getAtom(1), molecule.getAtom(0)
    );
    ISequenceSubRule<ILigand> rule = new CIPLigandRule();
    Assert.assertEquals(-1, rule.compare(ligand1, ligand2));
    Assert.assertEquals(1, rule.compare(ligand2, ligand1));

This JUnit test looks at the chiral compound given earlier, but without specifying the stereochemistry using the @@/@ SMILES syntax; we get to that later. Here, the example defines two ligands around atom 1 (which is the carbon; the index starts at 0). The first ligand is the bromine, the second ligand is the iodine. Because the latter takes priority according to the CIP rules, compare(ligand1, ligand2) returns -1, and the reversed comparison returns 1.
This CIPLigandRule is used in the CIPTool to provide more user-oriented methods. The goal, obviously, is this bit of code:
    IMolecule molecule = parser.parseSmiles("ClC(Br)(I)[H]");
    LigancyFourChirality chirality = CIPTool.defineLigancyFourChirality(
        molecule, 1, 4, 0, 2, 3, STEREO.CLOCK_WISE
    );
    Assert.assertEquals(
        CIP_CHIRALITY.R, CIPTool.getCIPChirality(chirality)
    );

Because we do not have 3D coordinates in our SMILES, we define the stereochemistry as CLOCK_WISE or ANTI_CLOCK_WISE. The former here means that, looking from the first ligand (atom 4, the hydrogen), the second, third, and fourth ligands (atoms 0, 2, and 3) follow each other in a clockwise turn. This uniquely defines the geometrical orientation, but the label flips between CLOCK_WISE and ANTI_CLOCK_WISE upon every exchange of two ligand atoms. Therefore, we uniquely prioritize the ligands, project, and translate the resulting CLOCK_WISE or ANTI_CLOCK_WISE into the appropriate R or S stereochemistry.
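The flipping of the label upon every exchange can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how CIPTool is implemented: the class ChiralitySketch and its enums are hypothetical, the four priorities are simply atomic numbers standing in for a full CIP ranking, and the sketch assumes all four priorities differ. The ligands are sorted into ascending priority with a bubble sort, and the stereo label is flipped on each swap. Once the lowest-priority ligand is the viewpoint and the other three follow in ascending priority, a clockwise turn seen from that viewpoint corresponds to R.

```java
final class ChiralitySketch {
    enum Stereo { CLOCK_WISE, ANTI_CLOCK_WISE }
    enum Descriptor { R, S }

    // priorities[0] belongs to the ligand we look from; priorities[1..3]
    // belong to the ligands met in turn, in the given stereo sense.
    // Assumes four distinct priorities (no ties).
    static Descriptor descriptor(int[] priorities, Stereo stereo) {
        int[] p = priorities.clone();
        boolean clockwise = (stereo == Stereo.CLOCK_WISE);
        // Bubble sort into ascending priority; every exchange of two
        // ligands inverts the clockwise/anticlockwise label.
        for (int i = 0; i < p.length; i++) {
            for (int j = 0; j + 1 < p.length - i; j++) {
                if (p[j] > p[j + 1]) {
                    int tmp = p[j];
                    p[j] = p[j + 1];
                    p[j + 1] = tmp;
                    clockwise = !clockwise;
                }
            }
        }
        // Seen from the lowest-priority ligand, ascending priorities in a
        // clockwise turn mean descending priorities turn clockwise when
        // seen from the opposite side, which is the R convention.
        return clockwise ? Descriptor.R : Descriptor.S;
    }

    public static void main(String[] args) {
        // Ligand order H, Cl, Br, I as in the CIPTool example above,
        // using atomic numbers as priorities.
        int[] priorities = {1, 17, 35, 53};
        System.out.println(descriptor(priorities, Stereo.CLOCK_WISE)); // prints R
    }
}
```

Listing the same ligands with two of them exchanged, for example as H, I, Br, Cl with CLOCK_WISE, describes the mirror image, and the sketch then returns S.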
That's all for now. Questions, ideas, and other feedback are most welcome in the comments!