Getting Started with VPT
I’ve set up this test environment so we can play with things more easily. This kind of environment can easily be set up on Mox or in a container too so that we can run stuffthere.
Setting up the Environment
The most basic thing we need to do is make sure we have a python environment we can work with. To do that we’ll set up what python calls a virtual environment which just means python makes a copy of its dependency directories so you can install things without worrying about global conflicts.
First, we need to install Python 3.9+, which just means going to the python downloads page and using the correct installer.
Once we have python installed, we can cd into this directory and create an
environment (which I’m calling mcenv) like
python3.9 -m venv mcenv
Next, we’ll activate this environment.
. mcenv/bin/activate
Finally, we need to install the wrapper package for the development
version of my VPT stuff, which I chose to call Psience.
pip install mccoygroup-psience
And that should be it!
When we need to update the package, we can do this with pip, too like
pip install --upgrade mccoygroup-psience
Using JupyterLab
Lately, I’ve been doing a lot of my work in JupyterLab, which is basically python’s version of a Mathematica-like interface. It’s a little bit clunky, but you get used to it. I set up a demo notebook so we can see some examples of running jobs. To use this, just run
jupyter lab
and then use the browser window to open VPT.ipynb
It’s worth checking out the JupyterLab documentation as well.
Running Scripts
Most of my calculations are run through python scripts.
Since the VPT2 package is just a part of the larger Psience library, it is straightforward to create arbitrarily complicated scripts.
This is how I tend to run my results for papers, when the number of
systems and options for the runs changes frequently.
Basic Examples
The simplest way to run jobs is through VPTRunner.run_simple.
The signature for that is
VPTRunner.run_simple(
system,
states,
target_property=None,
corrected_fundamental_frequencies=None,
calculate_intensities=True,
**opts
)
where the properties are as follows
system:list|str|Moleculethe system spec, either as a
Molecule, molecule spec (atoms, coords, opts) or a file to construct aMoleculestates:int|listthe states to get corrections for either an
int(up to that many quanta) or an explicit state listtarget_property:strthe target property to get corrections for (one of ‘frequencies’, ‘intensities’, ‘wavefunctions’)
corrected_fundamental_frequencies:Iterable[float]|Nonea set of fundamental frequencies to use to get new zero-order energies
calculate_intensities:bool default:Truewhether or not to calculate energies
opts:Anyoptions that work for a
VPTSystem,VPTStateSpace,VPTRuntimeOptions,VPTSolverOptions, orVPTHamiltonianOptionsobject which will be filtered automatically
This can be used quite simply if one has an fchk file containing a partial quartic expansion of a potential (e.g. from a freq=Anh job in Gaussian)
VPTRunner.run_simple(
"HOH_freq.fchk",
3 # up through three quanta of excitation
)
...
Harmonic Anharmonic
State Frequency Intensity Frequency Intensity
0 0 1 3937.52466 67.02051 3744.74223 64.17167
0 1 0 3803.29960 4.14283 3621.97931 3.11401
1 0 0 1622.30302 67.45626 1572.70734 68.32367
0 0 2 7875.04932 0.00000 7391.41648 0.01483
0 2 0 7606.59919 0.00000 7155.85397 0.31496
2 0 0 3244.60604 0.00000 3117.39090 0.55473
0 1 1 7740.82426 0.00000 7200.36337 2.20979
1 0 1 5559.82768 0.00000 5294.37886 3.76254
1 1 0 5425.60262 0.00000 5174.61359 0.06232
0 0 3 11812.57398 0.00000 10940.02275 0.04985
0 3 0 11409.89879 0.00000 10601.62396 0.00898
3 0 0 4866.90906 0.00000 4634.05068 0.00350
0 1 2 11678.34892 0.00000 10680.67944 0.00001
1 0 2 9497.35234 0.00000 8917.98240 0.00333
0 2 1 11544.12385 0.00000 10567.87984 0.08362
2 0 1 7182.13070 0.00000 6815.99171 0.16303
1 2 0 9228.90221 0.00000 8688.41518 0.00427
2 1 0 7047.90564 0.00000 6699.22408 0.00661
1 1 1 9363.12728 0.00000 8729.92693 0.09713
For more sophisticated calculations, many other flags are provided
(from VPTStateSpace)
degeneracy_specs:'auto' | list | dictA specification of degeneracies, either as polyads, explicit groups of states, or parameters to a method. (see Details for more info)
(from VPTHamiltonianOptions)
mode_selection:Iterable[int]|Nonethe set of the supplied normal modes to do perturbation theory on (can also be used to rearrange modes to put them in ordering from Herzberg notation) for example one can use
[4, 5, 6]to get the three highest-frequency modes in a 4 atom system or[4, 3, 2, 1, 0, 5]to put the modes of formaldehyde into Herzberg order (as the modes are initially ordered lowest to highest frequency)pseudopotential_terms:Iterable[np.ndarray]explicit values for the psuedopotential terms
coriolis_terms:Iterable[np.ndarray]explicit values for the Coriolis terms
kinetic_terms:Iterable[np.ndarray]explicit values for the kinetic terms (e.g. from analytic models), same format as for the potential
potential_terms:Iterable[np.ndarray]explicit values for the potential terms (e.g. from analytic models), should be a list of tensors starting with the Hessian with each axis of length
nmodesinclude_pseudopotential:boolwhether or not to include the pseudopotential/Watson term
include_coriolis_coupling:boolwhether or not to include Coriolis coupling in Cartesian normal mode calculation
backpropagate_internals:boolwhether or not to do Cartesian coordinate calculations with values backpropagated from internals
(from VPTSolverOptions)
check_overlap:bool default:Truewhether or not to ensure states are normalized in the VPT
zero_order_energy_corrections:dictenergies to use for the zero-order states instead of the diagonal of
H(0)low_frequency_mode_cutoff:float (default:500 cm-1)the energy below which to consider a mode to be “low frequency”
order:intthe order of perturbation theory to apply
expansion_order:int | dictthe order to go to in the expansions of the perturbations, this can be supplied for different properties independently, like
expansion_order = { 'potential':some_int, 'kinetic':some_int, 'dipole':some_int }zero_element_warning:boolwhether or not to warn if an element of the representations evaluated to zero (i.e. we wasted effort)
(from VPTRuntimeOptions)
results:str|Checkpointer|None default:Nonethe
Checkpointerto write corrections out tologger:str|Logger|bool|None default:Nonethe
Loggerobject to use when logging the status of the calculation (Truemeans log normally)nondeg_hamiltonian_precision:intthe precision with which to print out elements in the degenerate coupling Hamiltonians in the log file
matrix_element_threshold:float|None default:Nonethe minimum size of matrix element to keep
operator_chunk_size:int|None default:Nonethe number of representation matrix elements to calculate in at one time
Anne Input Helpers
I’ve also attached an object called AnneInputHelpers to VPTRunner object set up to make running VPT jobs a little bit simpler.
This object exists to enable people who aren’t comfortable with python and standard python data loading to supply their custom job input data as text files.
It exposes one core function run_anne_job, that will take a folder with the appropriate files and run a VPT job.
The signature of the function (as of when this is being written) looks like
VPTRunner.helpers.run_anne_job(
base_dir,
states=2,
order=None,
expansion_order=None,
atoms_file='atom.dat',
masses_file='mass.dat',
coords_file='cart_ref.dat',
modes_file='nm_int.dat',
zmat_file='z_mat.dat',
potential_files=('cub.dat', 'quart.dat', 'quintic.dat', 'sextic.dat'),
dipole_files=('lin_dip.dat', 'quad_dip.dat', "cub_dip.dat", "quart_dip.dat", 'quintic_dip.dat'),
results_file=None,
**opts
)
The arguments are as follows
base_dir(str): the directory where the files for running perturbation theory livestates(intorlist): the states to do PT for, anintmeans “do PT for all states with up through this many quantaorder(int): the order to which to do PT, inferred automatically from theexpansion_orderif not providedexpansion_order(intordict): the order to which to take each part of the expansion. If not supplied, the order will be inferred from the potential and dipole files supplied. If anint, all terms are taken out to that order. Otherwise supply adictlike
{
'potential':int,
'kinetic':int,
'dipole':int,
}
atoms_file(str): a file that contains the atomic numbers, like
Header Line
atom1 atom2 ... atomN
masses_file(str): a file that contains the atomic masses, like
mass1 mass2 ... massN
Note that only one of atoms and masses needs to be supplied.
If both are provided, masses will take precedence.
modes_file(str): a file that contains the frequencies of the modes as well as the transformation matrix from internals to normal modes (L) and the inverse transformation matrix (L^-1) like
freq_1 freq_2 freq_3 ... freq_m
L11 L12 L13 .. L21 L22 ... Lmn
K11 K12 K13 .. K21 K22 ... Knm
zmat_file(str): a file containing the molecular Z-matrix for internal coordinate inputscoords_file(str): a file containing the molecular Cartesian coordinates, looking like
x1 y1 z1
x2 y2 z2
. . .
. . .
. . .
xN yN zN
potential_files(list[str]): a list of files containing the potential tensors in dimensionless normal modes, starting with the Hessian and going up from theredipole_files(list[str]): a list of files containing the dipole tensors in dimensionless normal modes, starting with the linear dipole and going up from there
Tensor files are just lists of indices and corresponding values, like
i1_1 i1_2 i1_3 ... val1
i2_1 i2_2 i2_3 ... val2
. . . . .
. . . . .
. . . . .
iN_1 iN_2 iN_3 ... valN
For the dipole_files, the x/y/z axis index should be the left-most column
The job can of course be run without these files, by simply using vanilla numpy and python to set up the job, but this has proven convenient.
Extra Options
There are a very large number of knobs and levers one can turn when running a PT job. Here are some of the standard ones
internals: an internal coordinate spec, three forms of this are supportedNone: use Cartesians- Z-matrix as a 4-column array of
ints with each row being refs to[atom, bond, angle, dihedral](whereatomallows for rearrangement of the atoms) dict: a spec with a Z-matrix and/or coordinate transform like
{
'zmatrix': [[atom1, bond1, angle1, dihed1], [atom2, bond2, angle2, dihed2], ...] or None,
'conversion': 'a function to convert from Z-matrix coordinates to desired coordinates',
'inverse': 'the inverse conversion'
}
coordinate_transformation([conversion, inverse]): a shorthand inrun_anne_jobthat allows for just specifying a conversion function and inverse to apply to the Z-matrix coordinateslogger(str): a file to print the job log info toresults(str): a file to save results to (must have the extension.hdf5or.json)degeneracy_specs: The specification of degeneracies.
There are multiple possible values for this spec.
The simplest is to use the automatic approach, in which we supply a numeric type (int, float, etc.) to use as the WFC threshold.
The next simplest is to explicitly supply the groups we want, like
[
[ # the first resonant space
state_1,
state_2,
state_3
],
[ # the second
state_5, state_11, ...
],
...
]
We can also supply pairs of relations for determining resonances, like
[
[state_1, state_2], # A first relation
[state_3, state_4], # another relation
...
]
To allow for extra options, you can also supply a dict. If you wanted to have a different wfc_threshold and you wanted to do the secondary resonant space splitting step with a very large threshold, you could do that by supplying
{
'wfc_threshold':.1,
'energy_cutoff':1.0 # in Hartree
}
or you can explicitly add extra groups to the pairs of polyad rules by saying
{
'polyads':[
[state_1, state_2], # A first relation
[state_3, state_4], # another relation
...
],
'extra_groups': [
[ # the first resonant space
state_a,
state_b,
state_c
],
[ # the second
state_d, state_e, ...
],
...
]
}
This also allows us to define more resonance handling strategies.
The Martin Test is supported,
{
'martin_threshold':.1/219465, #in Hartree
}
As are total quanta vectors
{
'nT': [1, 1, 1, 0, 2, 2, 0] # e.g.
}
mode_selection(list[int]): the subset of normal modes to use in the calculation as alistofints corresponding to the desired modes (can also be used to rearrange from freq. ordering to Herzberg)basis_postfilters(list[dict]): a list of filters to apply sequentially to the basis of states used in the PT, each filter can look like one of the following- for excluding quanta
{
'max_quanta': [2, -1, 1, -1, ...] # the max number of quanta allowed in a given mode in the basis (-1 means infinity)
}
- for excluding transitions
{
'excluded_transitions': [[0, 0, 1, 0, ...], [1, 0, 0, 0, ...], ...] # a set of transitions that are forbidden on the input states
}
- for excluding transitions
{
'test': func # a function that takes the basis and tests if states should be allowed
}
operator_coefficient_threshold(floatorNone): the minimum coefficient size for a term in the expansion of the Hamiltonian (Nonemeans no threshold), jobs run faster with a larger threshold but the results are more approximate
See the docs for the full list of options available to the Hamiltonian, Solver, and Runtime.
Getting Expansions
We can also run a standard job with the results keyword to save the expansions (and wave functions and energies) to a file, which allows us to then extract the expansions.
Since it’s easiest to run this all in a single directory, I added a little helper function run_fchk_job. This has the file signature
VPTRunner.helpers.run_fchk_job(
base_dir,
states=2,
fchk_file='fchk.fchk',
zmat_file='z_mat.dat',
results_file='output.hdf5',
**opts
)
which if given the file fchk.fchk in base_dir will run the job using the appropriate internals from zmat_file and save the outputs to output.hdf5 and from there we can extract the potential and dipole expansions with the helpful functions extract_potential and extract_dipole_expansion like
VPTRunner.helpers.extract_dipole_expansion(base_dir)
For example, we can run a job like this
VPTRunner.helpers.run_fchk_job(
"h2co/r",
2,
fchk_file='h2co.fchk'
)
and pull the expansions like this
VPTRunner.helpers.extract_potential('h2co/r')
Defining Custom Coordinate Systems
To use custom coordinate systems, we define a conversion function and an inverse
def conv(r, t, f, **kwargs):
"""
Takes the bond lengths (`r`), angles `(t)` and dihedrals `(f)`,
and returns new coordinates that are functions of these coordinates
"""
# ... means skip all indices until the specified ones (look up Numpy Fancy Indexing)
cp1 = np.cos(f[..., 3]) # skip indices 0, 1, and 2 as these correspond to the embedding
ct1 = np.cos(t[..., 2]) # skip 0 and 1 for embedding
ct2 = np.cos(t[..., 3])
st1 = np.sin(t[..., 2])
st2 = np.sin(t[..., 3])
f[..., 3] = np.arccos(st1 * st2 * cp1 + ct1 * ct2)
return np.array([r, t, f])
def inv(r, t, f, **kwargs):
cp1 = np.cos(f[..., 3])
ct1 = np.cos(t[..., 2])
ct2 = np.cos(t[..., 3])
st1 = np.sin(t[..., 2])
st2 = np.sin(t[..., 3])
f[..., 3] = np.arccos((cp1 - ct1 * ct2) / (st1 * st2))
return np.array([r, t, f])
then we add the flag coordinate_transformation=[conv, inv] to get the runner to use the transformations and inverse
VPTRunner.helpers.run_anne_job(
'nh3_2/r',
2,
order=2,
degeneracy_specs='auto',
coordinate_transformation=[conv, inv]
)