RDF#
- class mbgdml.analysis.rdf.RDF(Z, entity_ids, comp_ids, cell_vectors, bin_width=0.05, rdf_range=(0.0, 15.0), inter_only=True, use_ray=False, ray_address='auto', n_workers=1)[source]#
Handles calculating the radial distribution function (RDF), \(g(r)\), of a constant volume simulation.
- Parameters:
Z (
numpy.ndarray, ndim:1) – Atomic numbers of all atoms in the system.entity_ids (
numpy.ndarray, ndim:1) – Integers that specify which fragment each atom belongs to for all structures.comp_ids (
numpy.ndarray, ndim:1) – Labels for eachentity_idused to determine the desired entity for RDF computations.cell_vectors (
numpy.ndarray) – The three cell vectors.inter_only (
bool, default:True) – Only intermolecular distances are allowed. IfTrue, atoms that have the sameentity_idare ignored.use_ray (
bool, default:False) – Use ray to parallelize computations.n_workers (
int, default:1) – Total number of workers available for ray. This is ignored ifuse_rayisFalse.ray_address (
str, default:"auto") – Ray cluster address to connect to.
- run(R, comp_id_pair, entity_idxs, step=1)[source]#
Perform the RDF computation.
- Parameters:
R (
numpy.ndarray, ndim:3) –Cartesian coordinates of all atoms in the system under periodic boundary conditions.
Tip
We recommend a memory-map for large
Rto reduce memory requirements.comp_id_pair (
tuple) – The component ID of the entities to consider. For example,('h2o', 'h2o')or('h2o', 'meoh').entity_idxs (
tuple, ndim:1or2) – The atom indices in each component to compute distances from.step (
int) – Number of structures/frames to skip between each analyzed frame.
- Returns:
numpy.ndarray–self.bins: \(r\) as the midpoint of each histogram bin.numpy.ndarray–self.results: \(g(r)\) with respect tobins.
Examples
Suppose we want to compute the \(O_{w}\)-\(O_{m}\) RDF where \(O_{w}\) and \(O_{m}\) are the oxygen atoms of water and methanol, respectively. We can define our system as such.
>>> import numpy as np >>> Z = np.array([8, 1, 1, 8, 1, 6, 1, 1, 1]*25) >>> entity_ids = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1]*25) >>> comp_ids = np.array(['h2o', 'meoh']*25) >>> cell_vectors = np.array([[16., 0., 0.], [0., 16., 0.], [0., 0., 16.]])
Note
The above information is an arbitrary system made up of 25 water and 25 methanol molecules. They are contained in a 16 Angstrom periodic box where the atoms are always in the same order: water, methanol, water, methanol, water, etc.
This information completely specifies our system to prepare for computing the RDF. We initialize our object with this information.
>>> from mbgdml.analysis.rdf import RDF >>> rdf = RDF(Z, entity_ids, comp_ids, cell_vectors)
From here we need to specify what RDF to compute with
rdf.run(). We assume the Cartesian coordinates forRare already loaded as a 3D array or memory-map.The last two pieces of information are
comp_id_pairandentity_idxs.comp_id_pairspecifies what components or species we want to compute our RDF with respect to. In this example, we want \(O_{w}\)-\(O_{m}\).entity_idxsspecifies which atom in each entity to use. Oxygen is the first atom in both water and methanol (i.e., index of0).>>> comp_id_pair = ('h2o', 'meoh') >>> entity_idxs = (0, 0)
We can then compute our RDF!
>>> bins, gr = rdf.run(R, comp_id_pair, entity_idxs)
Notes
inter_onlyonly comes into play when there is a chance of also computing intramolecular distances during the RDF calculation. Take the hydroxyl OH RDF in a pure methanol simulation for instance. Ourcomp_id_pairandentity_idxswould be('meoh', 'meoh')and(0, 1). The O-H intramolecular bond distance would be a perfectly valid atom pair. Usually we are interested in intermolecular distances.inter_onlycontrols whether intramolecular distances are included (inter_only = False) or not (inter_only = True).entity_idxscan specify one or more atoms for each component. For example, if you wanted to compute the OH RDF of pure water you could useentity_idxs = (0, (1, 2)).TODO: Support different cell sizes for each structure.