Creating a New Substituent Type
AaronTools is packaged with a number of substituents in the built-in Substituent Library. Additional substituents may be easily added to your Personal Aaron Library. If you require more flexibility, substituents can also be built up from ones that are already in the library.
The AaronTools utility libadd_substituent extracts a substituent from a provided XYZ file and adds it to the personal substituent library ($HOME/Aaron_libs/Subs).
In the following sections, the target atom will refer to the atom on the substituent side of the bond connecting the substituent to the rest of the molecule. The avoid atom will refer to the atom on the molecule side of this bond.
To add a new Substituent:
- Determine the target and avoid atom indices in your XYZ file.
- Determine the number of conformers you want Aaron to consider and the rotation angle between each conformer.
- Run
libadd_substituent file.xyz -t <target> -a <avoid> -c <num-confs> <rot-angle> -n <name>
If the number of conformers or the rotation angle is not supplied, one conformer with zero degree rotation will be assumed. See the Format for Substituent XYZ Files section below if a change to the number of conformers or rotation angle is desired.
For example, to extract the phenyl group from the structure above, we would first identify the target atom as 31 and the avoid atom as 27. Due to the symmetry of the phenyl group, we will only request two conformers, 90 degrees apart. This would be executed via the command line as:
libadd_substituent example.xyz -t 31 -a 27 -c 2 90 -n phenyl
Now, our personal library contains $HOME/Aaron_libs/Subs/phenyl.xyz, which can be used for other tasks as you would the built-in substituents from the Substituent Library.
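When many substituents need to be added, it can help to assemble the command programmatically. The following is a minimal Python sketch that only builds the argument list using the flags shown above (-t, -a, -c, -n). The helper name libadd_substituent_command is hypothetical and is not part of AaronTools.

```python
import shlex


def libadd_substituent_command(xyz_file, target, avoid, name,
                               num_confs=None, rot_angle=None):
    """Build the libadd_substituent argument list (hypothetical helper).

    The -c option is only added when both the number of conformers and
    the rotation angle are given; otherwise the documented default of
    one conformer with zero degree rotation applies.
    """
    cmd = ["libadd_substituent", xyz_file, "-t", str(target), "-a", str(avoid)]
    if num_confs is not None and rot_angle is not None:
        cmd += ["-c", str(num_confs), str(rot_angle)]
    cmd += ["-n", name]
    return cmd


# Reproduce the phenyl example from the text
cmd = libadd_substituent_command("example.xyz", 31, 27, "phenyl", 2, 90)
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a machine where libadd_substituent is on the PATH.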
Note that in this example, the phenyl group is already a built-in substituent. Aaron will first search for named substituents in your personal library before searching in the built-in library:
$HOME/Aaron_libs/Subs
$QCHASM/AaronTools/Subs
Thus, substituents built using this method will be used instead of built-in ones of the same name.
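The search order can be illustrated with a short Python sketch. This is not AaronTools code: the function names are hypothetical, and it assumes each substituent is stored as <name>.xyz in the two directories listed above.

```python
import os
from pathlib import Path


def library_search_path(env=None):
    """Return the substituent directories, personal library first."""
    env = os.environ if env is None else env
    dirs = []
    if "HOME" in env:
        dirs.append(Path(env["HOME"]) / "Aaron_libs" / "Subs")
    if "QCHASM" in env:
        dirs.append(Path(env["QCHASM"]) / "AaronTools" / "Subs")
    return dirs


def find_substituent(name, env=None):
    """Return the first <name>.xyz found along the search path, or None.

    Assumes (for illustration) that files are named <name>.xyz.
    """
    for directory in library_search_path(env):
        candidate = directory / f"{name}.xyz"
        if candidate.is_file():
            return candidate
    return None
```

Because the personal directory is checked first, a phenyl.xyz placed there shadows a built-in file of the same name.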
Format for Substituent XYZ Files
Substituent files follow the format of a standard XYZ file, but with additional conformer information in the comment line.
The comment line should be formatted as CF:NumConfs,RotAngle, as seen in this phenyl example:
11
CF:2,90
C 1.408780 0.000000 0.000000
C 2.192902 -0.484731 1.059260
C 2.039680 0.608587 -1.100436
C 3.584822 -0.381150 0.994992
H 1.713119 -0.931768 1.921314
C 3.429329 0.718638 -1.148310
H 1.429491 0.990910 -1.916441
C 4.212647 0.216034 -0.103289
H 4.181388 -0.761232 1.821954
H 3.899334 1.192009 -2.007794
H 5.296754 0.294017 -0.141870
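To make the file layout concrete, here is a minimal Python sketch that reads a substituent file of this form: an atom count, a CF:NumConfs,RotAngle comment line, then one atom per line. The function names are hypothetical and this is not the parser AaronTools itself uses.

```python
def parse_conformer_comment(comment):
    """Parse 'CF:NumConfs,RotAngle' into (num_confs, rot_angle)."""
    text = comment.strip()
    if not text.startswith("CF:"):
        raise ValueError(f"not a conformer comment line: {comment!r}")
    num_confs, rot_angle = text[3:].split(",")
    return int(num_confs), float(rot_angle)


def read_substituent_xyz(text):
    """Return (num_confs, rot_angle, atoms) from substituent XYZ text."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    num_confs, rot_angle = parse_conformer_comment(lines[1])
    atoms = []
    for line in lines[2:2 + n_atoms]:
        element, x, y, z = line.split()
        atoms.append((element, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the first line")
    return num_confs, rot_angle, atoms


# The phenyl example from the text
PHENYL = """11
CF:2,90
C 1.408780 0.000000 0.000000
C 2.192902 -0.484731 1.059260
C 2.039680 0.608587 -1.100436
C 3.584822 -0.381150 0.994992
H 1.713119 -0.931768 1.921314
C 3.429329 0.718638 -1.148310
H 1.429491 0.990910 -1.916441
C 4.212647 0.216034 -0.103289
H 4.181388 -0.761232 1.821954
H 3.899334 1.192009 -2.007794
H 5.296754 0.294017 -0.141870
"""

num_confs, rot_angle, atoms = read_substituent_xyz(PHENYL)
print(num_confs, rot_angle, len(atoms))
```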
Substituents in the library can also be chained together to produce a new substituent. Unlike libadd_substituent, these substituents are not saved anywhere. They are created every time you need them in Aaron or AaronTools. As an example, you could use substitute to attach 3,4-dimethoxybenzyl to a methane molecule:
substitute methane.xyz -s 1=1-{34-OMe-Ph}Me -o out.xyz
AaronTools will attach the OMe substituent to the 3 and 4 positions of the Ph substituent. It will then attach the entire thing to the Me substituent. Finally, substitute will attach this to methane.xyz.
The positions on the substituent are determined by how many bonds a particular atom is from where the substituent attaches to the molecule. Hydrogen atoms and non-hydrogen atoms without a bond to hydrogen are not counted, with the exception of the atom attached to the molecule (e.g. atom 1 on tBu is a quaternary carbon, but atom 2 on Bn is the ortho carbon and not the ipso carbon of the ring). In the case of branching, the path with the shortest bond at the point of branching is considered first, followed by the next chain. While this may not produce the proper enumeration of atoms, it does assign each modifiable position a unique number. Example atom numberings for different substituent foundations are shown below.
[Figures: atom numbering diagrams for the Et, Ph, Mes, and COOH foundations]
The number of rotamers depends on the symmetry of the new substituent compared to the symmetry of the "foundation" of the substituent. Going back to the 1-{34-OMe-Ph}Me substituent, the foundation of the 34-OMe-Ph part of this substituent would be Ph. The Ph substituent has C2 symmetry with respect to the bond where it connects to the rest of the molecule. If a 180° rotation (determined by order of rotational symmetry) can be applied to this part of the substituent, and any substituents on the phenyl can be rotated so that this geometry is identical to the pre-rotation geometry, then AaronTools will only consider two possible rotations for the bond between the phenyl ring and whatever molecule it is attached to. The OMe in the 4 position can be rotated another 180° to line up with the OMe in the unrotated geometry. The OMe in the 3 position, however, makes the 34-OMe-Ph unable to line up with the rotated geometry, so AaronTools will consider four rotations for this part of the substituent moving forward. Likewise, the Me substituent is C3 symmetric, but the 34-OMe-Ph breaks this symmetry. Therefore, AaronTools determines that this substituent has three rotamers. If substituents were added symmetrically to Me (e.g. 111-Et-Me), only two rotamers would be considered. Both of the OMe substituents have two conformers each, which is specified in their coordinate files. Running make_conf reveals that AaronTools will consider 48 conformers for this substituent:
make_conf out.xyz -s 1=1-{34-OMe-Ph}Me | grep -c Cf
48
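The total of 48 is consistent with multiplying the individual counts given above: four rotations of the phenyl ring, three rotamers of the methyl, and two conformers for each OMe. The short Python check below assumes the counts combine as a simple product, which matches the reported number but is not taken from the AaronTools source.

```python
from math import prod

# Counts taken from the discussion of 1-{34-OMe-Ph}Me above
rotamer_counts = {
    "34-OMe-Ph ring (C2 symmetry broken by the 3-OMe)": 4,
    "Me (C3 symmetry broken by 34-OMe-Ph)": 3,
    "OMe in the 3 position": 2,
    "OMe in the 4 position": 2,
}

# Assumption: the individual counts are independent and multiply
total = prod(rotamer_counts.values())
print(total)  # 48
```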
Several common substituent abbreviations are recognized by AaronTools. A list is included below. The string that would generate this substituent is also given, as well as the total number of conformers for this substituent. Note that the proper number of conformers might not be scanned if you build substituents off of these (e.g. 2-Me-Bn will not have all of the conformers that 1-{2-Me-Ph}Me would, even though the generated structure is the same). There are cases where building on already built substituents will produce all the expected conformers, but this can only happen if symmetry is maintained (e.g. 4-Me-Bn).
abbreviation | name | string | conformers |
Bn | benzyl | 1-Ph-Me | 6 |
MePh2 | diphenylmethyl | 11-Ph-Me | 12 |
MePh3 | triphenylmethyl | 111-Ph-Me | 16 |
EtF5 | pentafluoroethyl | 1-CF3-11-F-Me | 6 |
iBu | iso-butyl | 1-iPr-Me | 9 |
nBu | n-butyl | 1-{1-Et-Me}Me | 27 |
Pr | propyl | 1-Et-Me | 9 |
Boc | t-butyloxycarbonyl | 2-tBu-COOH | 4 |
CBz | carboxybenzyl | 1-Bn-COOH | 12 |
MOM | methoxymethyl | 1-OMe-Me | 6 |
PMB | 4-methoxybenzyl | 4-OMe-Bn | 12 |
Troc | 2,2,2-trichloroethyl carbonate | 1-{2-{1-{111-Cl-Me}Me}COOH}OH | 24 |
BOM | benzyloxymethyl acetal | 1-{1-{1-Bn-OH}Me}OH | 72 |
TBDPS | t-butyldiphenylsilyl ether | 1-{1-tBu-11-Ph-SiH3}OH | 48 |
TBS | t-butyldimethylsilyl ether | 1-{1-tBu-11-Me-SiH3}OH | 12 |
TIPS | triisopropylsilyl ether | 1-{111-iPr-SiH3}OH | 108 |
TES | triethylsilyl ether | 1-{111-Et-SiH3}OH | 108 |
name | string |
o-tolyl | 2-Me-Ph |
trimethyl silyl | 111-Me-SiH3 |
4-methoxybenzyl ether | 1-{1-{4-OMe-Ph}Me}OH |
4-methoxybenzyl ether | 1-{1-{4-OMe-Ph}-Me}-OH |
2-(4-trifluoromethylphenyl)-2-(4-methylphenyl)ethyl | 2-{4-CF3-Ph}-2-{4-Me-Ph}Et |
the other stereoisomer of the previous substituent | 2-{4-Me-Ph}-2-{4-CF3-Ph}Et |
3,5-bis(trifluoromethyl)phenyl | 35-CF3-Ph |
3,5-bis(trifluoromethyl)phenyl | 3-5-CF3-Ph |
You can also use a more explicit notation when building substituents. This may be useful when your substituent has R/S enantiomers (e.g. sec-butyl) or if you want to replace something that is not a hydrogen atom. As an example of this notation, here's how you'd use substitute to attach 3,4-dimethoxybenzyl to methane, where 3,4-dimethoxybenzyl is built from other substituents:
substitute methane.xyz -s 1="foundation=Me positions=4 decorations={{foundation=Ph positions=8,9 decorations={{OMe},{OMe}}}}"
foundation specifies the substituent to which 'decorations' are added, and positions specifies where the respective decorations should be placed. One important difference between this more explicit notation and the more natural notation described above is the position numbering. Instead of looking for H atoms on the 3rd and 4th heavy atoms of Ph, the exact positions of these H's were specified. The 8th and 9th atoms in our Ph.xyz are the meta and para H's, as shown below.
When using this syntax, the decorations section must be enclosed in curly braces. Each individual decoration must also be enclosed in its own set of curly braces.
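Since the nested braces are easy to mistype, the string can be generated from a nested data structure. The Python sketch below is a hypothetical helper, not part of AaronTools. It follows the brace rules described above and reproduces the example string, but any syntax beyond what that example shows is an assumption.

```python
def render_substituent(spec):
    """Render a nested spec in the explicit notation (hypothetical helper).

    A spec is either a plain substituent name (str) or a dict with the
    keys 'foundation', 'positions', and 'decorations'.
    """
    if isinstance(spec, str):
        return spec
    positions = ",".join(str(p) for p in spec["positions"])
    # Each individual decoration gets its own set of curly braces
    decorations = ",".join(
        "{" + render_substituent(d) + "}" for d in spec["decorations"]
    )
    # The whole decorations section is enclosed in curly braces as well
    return (
        f"foundation={spec['foundation']} "
        f"positions={positions} "
        f"decorations={{{decorations}}}"
    )


# 3,4-dimethoxybenzyl, using the atom positions from the example above
dimethoxybenzyl = {
    "foundation": "Me",
    "positions": [4],
    "decorations": [
        {"foundation": "Ph", "positions": [8, 9], "decorations": ["OMe", "OMe"]},
    ],
}

print(render_substituent(dimethoxybenzyl))
```

The printed string is the value that follows 1= in the substitute command shown earlier, and should be quoted on the command line.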