Reference: Other Functionality#
MPI parallelization#
- class pystella.DomainDecomposition(proc_shape, halo_shape, rank_shape=None, grid_shape=None)[source]#
Implements functions needed for the MPI domain decomposition of a 3D grid.
If mpi4py is not installed, then only single-rank operation is supported.
- Parameters:
proc_shape – A 3-tuple specifying the shape of the MPI processor grid.
Note
Currently, proc_shape[2] must be 1, i.e., only two-dimensional domain decompositions are supported.
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid (i.e., those used for grid synchronization or enforcing boundary conditions). May be a 3-tuple or an int (which is interpreted as that value for each axis). Note that zero is a valid value along any axis (in which case grid synchronization for such axes is skipped). For example, halo_shape=(2, 0, 1) specifies two additional layers at each of the beginning and end of the first axis, none for the second, and one for the third. For rank_shape=(Nx, Ny, Nz), an appropriately padded array would then have shape (Nx+4, Ny, Nz+2).
The following keyword arguments are recognized:
- Parameters:
rank_shape – A 3-tuple specifying the shape of the computational sub-grid on the calling process. Defaults to None, in which case the global size is not fixed (and will be inferred when, e.g., share_halos() is called, at a slight performance penalty).
grid_shape – A 3-tuple specifying the shape of the global computational grid. If passed, rank_shape is determined for each rank. Note that proc_shape need not evenly divide grid_shape, i.e., different ranks may have different rank_shapes.
- Raises:
NotImplementedError – if proc_shape[2] != 1.
ValueError – if the size of the processor grid, proc_shape[0] * proc_shape[1] * proc_shape[2], is not equal to the total number of ranks the application was launched with (i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size()).
ValueError – if both rank_shape and grid_shape are passed.
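The shape arithmetic for halo padding can be sketched in plain NumPy. This is only an illustration of the halo_shape=(2, 0, 1) example above, not library code:

```python
import numpy as np

# Shape arithmetic for halo padding: halo_shape=(2, 0, 1) pads each end of
# the three axes by 2, 0, and 1 layers, respectively.
halo_shape = (2, 0, 1)
rank_shape = (16, 16, 16)

padded_shape = tuple(n + 2 * h for n, h in zip(rank_shape, halo_shape))
padded = np.zeros(padded_shape)

print(padded.shape)  # (20, 16, 18), i.e., (Nx+4, Ny, Nz+2)
```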
- share_halos(queue, fx)[source]#
Communicates halo data across all axes, imposing periodic boundary conditions.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
fx – The pyopencl.array.Array whose halo elements are to be synchronized across ranks. The number of halo layers on each face of the grid is fixed by halo_shape, while the shape of fx (i.e., subtracting halo padding specified by halo_shape) is only fixed if rank_shape or grid_shape were passed at object creation.
- remove_halos(queue, in_array, out_array)[source]#
Removes the halo padding from an array.
The only restriction on the shapes of the three-dimensional input arrays is that the shape of in_array is larger than that of out_array by 2*halo_shape along each axis.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The subarray whose halos will be removed. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
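As a serial illustration, removing halo padding amounts to slicing off halo_shape layers from each end of each axis. The following NumPy sketch mirrors that logic (the real method runs on the device and writes into out_array):

```python
import numpy as np

def remove_halos_np(in_array, halo_shape):
    """Strip halo padding by slicing off halo_shape layers from each end
    of each axis (a serial NumPy sketch, not the library's implementation)."""
    interior = tuple(slice(h, n - h)
                     for h, n in zip(halo_shape, in_array.shape))
    return in_array[interior]

padded = np.arange(20 * 16 * 18).reshape(20, 16, 18)
interior = remove_halos_np(padded, (2, 0, 1))
print(interior.shape)  # (16, 16, 16)
```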
- gather_array(queue, in_array, out_array, root)[source]#
Gathers the subdomains of an array from each rank into a single array of the entire grid, removing halo padding from in_array.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The subarrays to be gathered. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array for the gathered grid. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
root – The rank to which in_array is gathered.
- restore_halos(queue, in_array, out_array)[source]#
Adds halo padding to an array.
The only restriction on the shapes of the three-dimensional input arrays is that the shape of out_array is larger than that of in_array by 2*halo_shape along each axis.
Note
Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after they are restored. Thus, halos must be shared manually after the fact (for now).
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The array to add halos to. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
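The inverse operation can likewise be sketched serially: embed the unpadded array into a zero-padded one (a NumPy illustration only, not the library's implementation; the halos still need to be filled by sharing afterward):

```python
import numpy as np

def restore_halos_np(in_array, halo_shape):
    """Embed an unpadded array into a zero-padded one (serial NumPy sketch).
    The halo layers are left as zeros and must be filled separately."""
    out_shape = tuple(n + 2 * h for n, h in zip(in_array.shape, halo_shape))
    out_array = np.zeros(out_shape, dtype=in_array.dtype)
    interior = tuple(slice(h, h + n)
                     for h, n in zip(halo_shape, in_array.shape))
    out_array[interior] = in_array
    return out_array

f = np.ones((16, 16, 16))
padded = restore_halos_np(f, (2, 0, 1))
print(padded.shape)  # (20, 16, 18)
```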
- scatter_array(queue, in_array, out_array, root)[source]#
Scatters the values of a single array of the entire grid to individual subdomains on each rank. The per-rank out_array is padded by halo_shape.
Note
Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after scattering. Thus, halos must be shared manually after the fact (for now).
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The full array to be scattered. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
out_array – The output array for the scattered arrays. May be either a pyopencl.array.Array or a numpy.ndarray.
root – The rank from which in_array is scattered.
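Ignoring MPI communication and halo padding, the indexing behind scattering a global grid into per-rank subdomains and gathering it back can be sketched serially in NumPy. The even division of grid_shape by proc_shape below is an assumption for brevity:

```python
import numpy as np

# Serial sketch of the scatter/gather index logic for an evenly divided grid
# (the actual methods communicate via MPI and handle uneven divisions).
proc_shape = (2, 2, 1)
grid_shape = (8, 8, 8)
rank_shape = tuple(g // p for g, p in zip(grid_shape, proc_shape))

grid = np.arange(np.prod(grid_shape)).reshape(grid_shape)

# "scatter": slice out each rank's subdomain from the global grid
subdomains = {}
for rx in range(proc_shape[0]):
    for ry in range(proc_shape[1]):
        for rz in range(proc_shape[2]):
            slc = tuple(slice(r * n, (r + 1) * n)
                        for r, n in zip((rx, ry, rz), rank_shape))
            subdomains[rx, ry, rz] = grid[slc].copy()

# "gather": reassemble the global grid from the subdomains
out = np.zeros(grid_shape, dtype=grid.dtype)
for (rx, ry, rz), sub in subdomains.items():
    slc = tuple(slice(r * n, (r + 1) * n)
                for r, n in zip((rx, ry, rz), rank_shape))
    out[slc] = sub

assert (out == grid).all()  # the round trip reproduces the global grid
```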
- rankID(rx, ry, rz)[source]#
- Returns:
The (integer) rank number corresponding to the processor-grid site (rx, ry, rz).
- bcast(x, root)[source]#
A wrapper to MPI.COMM_WORLD.bcast() which broadcasts an arbitrary Python object.
- Parameters:
x – The value to be broadcast. Must be defined on all ranks (i.e., set x = None on ranks other than root).
root – The rank whose value of x should be broadcast.
- Returns:
The broadcast value, on all ranks.
- allreduce(rank_reduction, op=None)[source]#
A wrapper to MPI.COMM_WORLD.allreduce() which reduces each rank's rank_reduction and broadcasts the result to all ranks.
- Parameters:
rank_reduction – The calling rank's individual value to be reduced.
op – The MPI reduction operation to perform. Defaults to MPI.SUM.
- Returns:
The reduced value, on all ranks.
- get_rank_shape_start(grid_shape, rank_tuple=None)[source]#
Computes the calling process's rank_shape and starting indices rank_start (as would be indexed into the global array) corresponding to a global array with shape grid_shape.
- Parameters:
grid_shape – A 3-tuple specifying the shape of the corresponding global array.
rank_tuple – A 3-tuple specifying the MPI rank in the processor grid for which to compute outputs. Defaults to that of the calling process.
- Returns:
(rank_shape, rank_start).
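The shape/offset computation can be sketched in plain Python. The convention below, in which the first grid_shape % proc_shape ranks along each axis receive one extra point, is an assumption for illustration; pystella's actual distribution may differ:

```python
def rank_shape_start(grid_shape, proc_shape, rank_tuple):
    """Per-rank subdomain shape and global starting indices when proc_shape
    need not evenly divide grid_shape (sketch; the split convention is an
    assumption, not necessarily pystella's)."""
    shape, start = [], []
    for g, p, r in zip(grid_shape, proc_shape, rank_tuple):
        base, extra = divmod(g, p)
        shape.append(base + (1 if r < extra else 0))
        start.append(r * base + min(r, extra))
    return tuple(shape), tuple(start)

# 10 points over 3 ranks along the first axis split as 4 + 3 + 3
print(rank_shape_start((10, 8, 8), (3, 2, 1), (0, 0, 0)))  # ((4, 4, 8), (0, 0, 0))
print(rank_shape_start((10, 8, 8), (3, 2, 1), (2, 1, 0)))  # ((3, 4, 8), (7, 4, 0))
```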
- rank#
The integer rank of the calling process, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_rank().
- nranks#
The total number of ranks, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size().
- proc_shape#
The shape of the MPI processor grid, as passed at object creation.
- rank_shape#
The shape of the calling process's computational sub-grid.
Expansion#
- class pystella.Expansion(energy, Stepper, mpl=1.0, dtype=<class 'numpy.float64'>)[source]#
Implements the time stepping of the scale factor evolution for conformal FLRW spacetimes with line element
\[\mathrm{d} s^2 = a(\tau)^2 \left( - \mathrm{d} \tau^2 + \delta_{ij} \mathrm{d} x^i \mathrm{d} x^j \right).\]
Below, the averaged energy density and pressure are
\[\begin{aligned}\bar{\rho} &\equiv - \left\langle T_{\hphantom{0}0}^0 \right\rangle\\\bar{P} &\equiv \frac{1}{3} \left\langle T_{\hphantom{i}i}^i \right\rangle.\end{aligned}\]
- Parameters:
energy – The initial energy density, used to initialize \(\partial a / \partial \tau\).
Stepper – A Stepper to use for time stepping.
mpl – The unreduced Planck mass, \(m_\mathrm{pl}^2 \equiv 1 / G_N\). Setting this value chooses the units of the system. For example, to work in units of the reduced Planck mass, \(M_\mathrm{pl}^2 \equiv (8 \pi G_N)^{-1}\), pass mpl=np.sqrt(8*np.pi). Defaults to 1.
dtype – The datatype of the input and output arrays. Defaults to float64.
- adot_friedmann_1(a, energy)[source]#
- Parameters:
a – The current scale factor, \(a\).
energy – The current energy density, \(\bar{\rho}\).
- Returns:
The value of \(\partial_\tau a\) as given by Friedmann’s first equation,
\[\mathcal{H}^2 \equiv \left( \frac{\partial_\tau a}{a} \right)^2 = \frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \bar{\rho}\]
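Solving this relation for \(\partial_\tau a\) can be sketched as a standalone NumPy function. Unlike the library's method, which stores mpl at construction, mpl is an explicit argument here:

```python
import numpy as np

def adot_friedmann_1(a, energy, mpl=1.0):
    # From H^2 = 8 pi a^2 rho / (3 mpl^2) with H = a'/a, so
    # a' = a^2 sqrt(8 pi rho / (3 mpl^2)).
    return a**2 * np.sqrt(8 * np.pi * energy / (3 * mpl**2))

# e.g., with a = 1 and rho = 3 mpl^2 / (8 pi), a' is (approximately) 1
print(adot_friedmann_1(1.0, 3 / (8 * np.pi)))
```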
- addot_friedmann_2(a, energy, pressure)[source]#
- Parameters:
a – The current scale factor, \(a\).
energy – The current energy density, \(\bar{\rho}\).
pressure – The current pressure, \(\bar{P}\).
- Returns:
The value of \(\partial_\tau^2 a\) as given by Friedmann’s second equation,
\[\partial_\tau \mathcal{H} + \mathcal{H}^2 = \frac{\partial_\tau^2 a}{a} = \frac{4 \pi a^2}{3 m_\mathrm{pl}^2} \left( \bar{\rho} - 3 \bar{P} \right)\]
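A corresponding standalone sketch of the second equation (again with mpl as an explicit argument, unlike the library's method):

```python
import numpy as np

def addot_friedmann_2(a, energy, pressure, mpl=1.0):
    # a'' = 4 pi a^3 (rho - 3 P) / (3 mpl^2)
    return 4 * np.pi * a**3 / (3 * mpl**2) * (energy - 3 * pressure)

# radiation domination (P = rho / 3) gives a'' = 0 in conformal time
print(addot_friedmann_2(1.0, 1.0, 1.0 / 3.0))  # 0.0
```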
- step(stage, energy, pressure, dt)[source]#
Executes one stage of the time stepper.
- Parameters:
stage – Which stage of the integrator to call.
energy – The current energy density, \(\bar{\rho}\).
pressure – The current pressure, \(\bar{P}\).
dt – The timestep to take.
- constraint(energy)[source]#
A dimensionless measure of the satisfaction of the first Friedmann equation (as a constraint on the evolution), equal to
\[\left\vert \frac{1}{\mathcal{H}} \sqrt{\frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \rho} - 1 \right\vert\]
where \(\mathcal{H}\) is the current conformal Hubble parameter, \(\partial_\tau a / a\).
- Parameters:
energy – The current energy density, \(\bar{\rho}\).
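This diagnostic can be sketched as a standalone function, with a, adot, and mpl passed explicitly (the library's method tracks the scale factor internally):

```python
import numpy as np

def constraint(a, adot, energy, mpl=1.0):
    # |(1/H) sqrt(8 pi a^2 rho / (3 mpl^2)) - 1|, with H = a'/a
    hubble = adot / a
    return abs(np.sqrt(8 * np.pi * a**2 * energy / (3 * mpl**2)) / hubble - 1)

# when adot satisfies Friedmann's first equation exactly, the constraint vanishes
a, energy = 1.0, 3 / (8 * np.pi)
adot = a**2 * np.sqrt(8 * np.pi * energy / 3)
print(constraint(a, adot, energy))  # 0.0
```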
Statistical analysis#
- class pystella.FieldStatistics(decomp, halo_shape, **kwargs)[source]#
A subclass of Reduction which computes the mean and variance of fields.
- Parameters:
decomp – An instance of DomainDecomposition.
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid. May either be an int, interpreted as a value to fix the parameter h to, or a tuple, interpreted as values for hx, hy, and hz. Defaults to None, in which case no such values are fixed at kernel creation.
The following keyword-only arguments are recognized:
- Parameters:
max_min – A bool determining whether to also compute the actual and absolute maxima and minima of fields. Defaults to False.
Any remaining keyword arguments are passed to Reduction().
- __call__(f, queue=None, allocator=None)[source]#
- Parameters:
f – The array whose statistics will be computed. If f has more than three axes, all the outer axes are looped over. As an example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data would have the same shape.
The following keyword arguments are recognized:
- Parameters:
queue – A pyopencl.CommandQueue. Defaults to fx.queue.
allocator – A pyopencl allocator used to allocate temporary arrays, most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().
- Returns:
A dict of means and variances, whose values are lists of the statistic (key) for each array in fields.
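The reduction performed here is equivalent to the following serial NumPy sketch, which averages over the inner three (grid) axes while preserving the outer axes (the real class reduces on-device and across MPI ranks):

```python
import numpy as np

# Statistics over the inner three (grid) axes, looping over any outer axes,
# as in the (2, 3, 130, 130, 130) example above (smaller grid here).
rng = np.random.default_rng(7)
f = rng.standard_normal((2, 3, 16, 16, 16))

mean = f.mean(axis=(-3, -2, -1))
variance = f.var(axis=(-3, -2, -1))

print(mean.shape, variance.shape)  # (2, 3) (2, 3)
```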
- class pystella.FieldHistogrammer(decomp, num_bins, dtype, **kwargs)[source]#
A subclass of Histogrammer which computes field histograms with both linear and logarithmic binning.
- Parameters:
decomp – An instance of DomainDecomposition.
num_bins – The number of bins of the computed histograms.
dtype – The datatype of the resulting histogram.
The following keyword-only arguments are recognized (in addition to those accepted by ElementWiseMap):
- Parameters:
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid. May either be an int, interpreted as a value to fix the parameter h to, or a tuple, interpreted as values for hx, hy, and hz. Defaults to 0, i.e., no padding.
New in version 2020.1.
Changed in version 2020.2: Positional argument rank_shape no longer required.
- __call__(f, queue=None, **kwargs)[source]#
- Parameters:
f – The array whose histograms will be computed. If f has more than three axes, all the outer axes are looped over. As an example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data would have the same shape.
The following keyword arguments are recognized:
- Parameters:
queue – A pyopencl.CommandQueue. Defaults to fx.queue.
allocator – A pyopencl allocator used to allocate temporary arrays, most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().
In addition, the keyword arguments min_f, max_f, min_log_f, and max_log_f are recognized and used to define binning. Each must have the same shape as the outer axes of f (e.g., (2, 3) in the example above). Unless values for each of these are passed, they will all be computed automatically.
Warning
This class prevents any out-of-bounds writes when calculating the bin number, ensuring that 0 <= bin < num_bins. When passing minimum and maximum values, the first and last bins may be incorrect if f in fact has values outside of the passed minimum/maximum values.
- Returns:
A dict with the following items:
"linear": the histogram(s) of f with linear binning
"linear_bins": the bins used for the linear histogram(s) of f
"log": the histogram(s) of f with logarithmic binning
"log_bins": the bins used for the logarithmic histogram(s) of f
Each numpy array has shape f.shape[:-3] + (num_bins,).
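For a single rank and a single 3-d array, the linear and logarithmic binning is analogous to this NumPy sketch (the real class additionally handles outer axes, device arrays, and MPI reduction):

```python
import numpy as np

# Linear and logarithmic binning of one 3-d field; positive values are
# assumed here so that the logarithm is defined.
rng = np.random.default_rng(7)
f = np.abs(rng.standard_normal((16, 16, 16))) + 1e-3
num_bins = 10

linear, linear_bins = np.histogram(f, bins=num_bins)
log, log_bins = np.histogram(np.log10(f), bins=num_bins)  # uniform in log10(f)

# every value of f lands in some bin for both binnings
print(linear.sum() == f.size, log.sum() == f.size)  # True True
```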
Utilities#
- class pystella.output.OutputFile(context=None, name=None, runfile=None, **kwargs)[source]#
A wrapper to h5py:File which collects and saves useful run information and provides functionality to append to datasets.
No arguments are required, but the following keyword arguments are recognized:
- Parameters:
context – A pyopencl.Context. If not None, information about the device, driver, and platform is saved to the attrs dictionary. Defaults to None.
name – The name of the .h5 file (sans the extension) to create. If None, a unique filename is chosen based on the current date and time. Defaults to None.
runfile – A file whose content will be saved as a string to attrs["runfile"], if not None. Useful for attaching the run file of a simulation to its output. Defaults to None.
Any remaining keyword arguments are saved to the attrs dictionary. If any value val is not of a valid type to be saved, the val.__name__ attribute is saved if the value is a type instance, or else str(val) is saved.
Versions and git revisions (when available) of pystella and its dependencies are saved as "versions" and "git_revs" h5py:Datasets. The hostname is recorded in the "hostname" key of the attrs dictionary.
- output(group, **kwargs)[source]#
Appends values to datasets within a h5py:Group named group. group is created if it does not exist, and the h5py:Datasets of this h5py:Group are determined by the keys of the keyword arguments. If group already exists, iterates over each h5py:Dataset and appends values from keyword arguments (matching h5py:Dataset names to keys).
- Parameters:
group – The h5py:Group to append h5py:Dataset values to.
If group already exists, a keyword argument for each h5py:Dataset in group must be provided.
- pystella.choose_device_and_make_context(platform_choice=None, device_choice=None)[source]#
A wrapper that chooses a device and creates a pyopencl.Context on it.
- Parameters:
platform_choice – An integer or string specifying which pyopencl.Platform to choose. Defaults to None, in which case the environment variables PYOPENCL_CTX or PYOPENCL_TEST are queried. If none of the above are specified, the first platform is chosen.
device_choice – An integer or string specifying which pyopencl.Device to run on. Defaults to None, in which case a device is chosen according to the node-local MPI rank. (Note that this requires initializing MPI, i.e., importing mpi4py.MPI.)
- Returns: