Reference: Other Functionality#
MPI parallelization#
- class pystella.DomainDecomposition(proc_shape, halo_shape, rank_shape=None, grid_shape=None)[source]#
Implements functions needed for the MPI domain decomposition of a 3D grid.
If mpi4py is not installed, then only single-rank operation is supported.
- Parameters:
proc_shape – A 3-tuple specifying the shape of the MPI processor grid.
Note
Currently, proc_shape[2] must be 1, i.e., only two-dimensional domain decompositions are supported.
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid (i.e., those used for grid synchronization or enforcing boundary conditions). May be a 3-tuple or an int (which is interpreted as that value for each axis). Note that zero is a valid value along any axis (in which case grid synchronization for such axes is skipped). For example, halo_shape=(2, 0, 1) specifies two additional layers at each of the beginning and end of the first axis, none for the second, and one for the third. For rank_shape=(Nx, Ny, Nz), an appropriately padded array would then have shape (Nx+4, Ny, Nz+2).
The following keyword arguments are recognized:
- Parameters:
rank_shape – A 3-tuple specifying the shape of the computational sub-grid on the calling process. Defaults to None, in which case the global size is not fixed (and will be inferred when, e.g., share_halos() is called, at a slight performance penalty).
grid_shape – A 3-tuple specifying the shape of the global computational grid. If passed, rank_shape is determined for each rank. Note that proc_shape need not evenly divide grid_shape, i.e., different ranks may have different rank_shapes.
- Raises:
NotImplementedError – if proc_shape[2] != 1.
ValueError – if the size of the processor grid, proc_shape[0] * proc_shape[1] * proc_shape[2], is not equal to the total number of ranks the application was launched with (i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size()).
ValueError – if both rank_shape and grid_shape are passed.
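The shape arithmetic for halo padding can be sketched in plain NumPy. This is only an illustration of the halo_shape=(2, 0, 1) example above, not library code:

```python
import numpy as np

# Shape arithmetic for halo padding: halo_shape=(2, 0, 1) pads each end of
# the three axes by 2, 0, and 1 layers, respectively.
halo_shape = (2, 0, 1)
rank_shape = (16, 16, 16)

padded_shape = tuple(n + 2 * h for n, h in zip(rank_shape, halo_shape))
padded = np.zeros(padded_shape)

print(padded.shape)  # (20, 16, 18), i.e., (Nx+4, Ny, Nz+2)
```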
- share_halos(queue, fx)[source]#
Communicates halo data across all axes, imposing periodic boundary conditions.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
fx – The pyopencl.array.Array whose halo elements are to be synchronized across ranks. The number of halo layers on each face of the grid is fixed by halo_shape, while the shape of fx (i.e., subtracting halo padding specified by halo_shape) is only fixed if rank_shape or grid_shape were passed at object creation.
- remove_halos(queue, in_array, out_array)[source]#
Removes the halo padding from an array.
The only restriction on the shapes of the three-dimensional input arrays is that the shape of in_array is larger than that of out_array by 2*halo_shape along each axis.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The subarray whose halos will be removed. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
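As a serial illustration, removing halo padding amounts to slicing off halo_shape layers from each end of each axis. The following NumPy sketch mirrors that logic (the real method runs on the device and writes into out_array):

```python
import numpy as np

def remove_halos_np(in_array, halo_shape):
    """Strip halo padding by slicing off halo_shape layers from each end
    of each axis (a serial NumPy sketch, not the library's implementation)."""
    interior = tuple(slice(h, n - h)
                     for h, n in zip(halo_shape, in_array.shape))
    return in_array[interior]

padded = np.arange(20 * 16 * 18).reshape(20, 16, 18)
interior = remove_halos_np(padded, (2, 0, 1))
print(interior.shape)  # (16, 16, 16)
```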
- gather_array(queue, in_array, out_array, root)[source]#
Gathers the subdomains of an array from each rank into a single array of the entire grid, removing halo padding from in_array.
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The subarrays to be gathered. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array for the gathered grid. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
root – The rank to which in_array is gathered.
- restore_halos(queue, in_array, out_array)[source]#
Adds halo padding to an array.
The only restriction on the shapes of the three-dimensional input arrays is that the shape of out_array is larger than that of in_array by 2*halo_shape along each axis.
Note
Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after they are restored. Thus, halos must be shared manually after the fact (for now).
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The array to add halos to. May be either a pyopencl.array.Array or a numpy.ndarray.
out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
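The inverse operation can likewise be sketched serially: embed the unpadded array into a zero-padded one (a NumPy illustration only, not the library's implementation; the halos still need to be filled by sharing afterward):

```python
import numpy as np

def restore_halos_np(in_array, halo_shape):
    """Embed an unpadded array into a zero-padded one (serial NumPy sketch).
    The halo layers are left as zeros and must be filled separately."""
    out_shape = tuple(n + 2 * h for n, h in zip(in_array.shape, halo_shape))
    out_array = np.zeros(out_shape, dtype=in_array.dtype)
    interior = tuple(slice(h, h + n)
                     for h, n in zip(halo_shape, in_array.shape))
    out_array[interior] = in_array
    return out_array

f = np.ones((16, 16, 16))
padded = restore_halos_np(f, (2, 0, 1))
print(padded.shape)  # (20, 16, 18)
```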
- scatter_array(queue, in_array, out_array, root)[source]#
Scatters the values of a single array of the entire grid to individual subdomains on each rank. The per-rank out_array is padded by halo_shape.
Note
Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after scattering. Thus, halos must be shared manually after the fact (for now).
- Parameters:
queue – The pyopencl.CommandQueue to enqueue kernels and copies.
in_array – The full array to be scattered. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
out_array – The output array for the scattered arrays. May be either a pyopencl.array.Array or a numpy.ndarray.
root – The rank from which in_array is scattered.
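Ignoring MPI communication and halo padding, the indexing behind scattering a global grid into per-rank subdomains and gathering it back can be sketched serially in NumPy. The even division of grid_shape by proc_shape below is an assumption for brevity:

```python
import numpy as np

# Serial sketch of the scatter/gather index logic for an evenly divided grid
# (the actual methods communicate via MPI and handle uneven divisions).
proc_shape = (2, 2, 1)
grid_shape = (8, 8, 8)
rank_shape = tuple(g // p for g, p in zip(grid_shape, proc_shape))

grid = np.arange(np.prod(grid_shape)).reshape(grid_shape)

# "scatter": slice out each rank's subdomain from the global grid
subdomains = {}
for rx in range(proc_shape[0]):
    for ry in range(proc_shape[1]):
        for rz in range(proc_shape[2]):
            slc = tuple(slice(r * n, (r + 1) * n)
                        for r, n in zip((rx, ry, rz), rank_shape))
            subdomains[rx, ry, rz] = grid[slc].copy()

# "gather": reassemble the global grid from the subdomains
out = np.zeros(grid_shape, dtype=grid.dtype)
for (rx, ry, rz), sub in subdomains.items():
    slc = tuple(slice(r * n, (r + 1) * n)
                for r, n in zip((rx, ry, rz), rank_shape))
    out[slc] = sub

assert (out == grid).all()  # the round trip reproduces the global grid
```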
- rankID(rx, ry, rz)[source]#
- Returns:
The (integer) rank number corresponding to the processor-grid site (rx, ry, rz).
- bcast(x, root)[source]#
A wrapper to MPI.COMM_WORLD.bcast() which broadcasts an arbitrary Python object.
- Parameters:
x – The value to be broadcast. Must be defined on all ranks (i.e., set x = None on ranks other than root).
root – The rank whose value of x should be broadcast.
- Returns:
The broadcast value, on all ranks.
- allreduce(rank_reduction, op=None)[source]#
A wrapper to MPI.COMM_WORLD.allreduce() which reduces each rank's rank_reduction and broadcasts the result to all ranks.
- Parameters:
rank_reduction – The calling rank's individual value to be reduced.
op – The MPI reduction operation to perform. Defaults to MPI.SUM.
- Returns:
The reduced value, on all ranks.
- get_rank_shape_start(grid_shape, rank_tuple=None)[source]#
Computes the calling process's rank_shape and starting indices rank_start (as would be indexed into the global array) corresponding to a global array with shape grid_shape.
- Parameters:
grid_shape – A 3-tuple specifying the shape of the corresponding global array.
rank_tuple – A 3-tuple specifying the MPI rank in the processor grid for which to compute outputs. Defaults to that of the calling process.
- Returns:
(rank_shape, rank_start).
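The shape/offset computation can be sketched in plain Python. The convention below, in which the first grid_shape % proc_shape ranks along each axis receive one extra point, is an assumption for illustration; pystella's actual distribution may differ:

```python
def rank_shape_start(grid_shape, proc_shape, rank_tuple):
    """Per-rank subdomain shape and global starting indices when proc_shape
    need not evenly divide grid_shape (sketch; the split convention is an
    assumption, not necessarily pystella's)."""
    shape, start = [], []
    for g, p, r in zip(grid_shape, proc_shape, rank_tuple):
        base, extra = divmod(g, p)
        shape.append(base + (1 if r < extra else 0))
        start.append(r * base + min(r, extra))
    return tuple(shape), tuple(start)

# 10 points over 3 ranks along the first axis split as 4 + 3 + 3
print(rank_shape_start((10, 8, 8), (3, 2, 1), (0, 0, 0)))  # ((4, 4, 8), (0, 0, 0))
print(rank_shape_start((10, 8, 8), (3, 2, 1), (2, 1, 0)))  # ((3, 4, 8), (7, 4, 0))
```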
- rank#
The integer rank of the calling process, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_rank().
- nranks#
The total number of ranks, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size().
- proc_shape#
The shape of the MPI processor grid, as passed at object creation.
- rank_shape#
The shape of the calling process's computational sub-grid.
Expansion#
- class pystella.Expansion(energy, Stepper, mpl=1.0, dtype=<class 'numpy.float64'>)[source]#
Implements the time stepping of the scale factor evolution for conformal FLRW spacetimes with line element
\[\mathrm{d} s^2 = a(\tau)^2 \left( - \mathrm{d} \tau^2 + \delta_{ij} \mathrm{d} x^i \mathrm{d} x^j \right).\]
Below, the averaged energy density and pressure are
\[\begin{aligned}\bar{\rho} &\equiv - \left\langle T_{\hphantom{0}0}^0 \right\rangle\\\bar{P} &\equiv \frac{1}{3} \left\langle T_{\hphantom{i}i}^i \right\rangle.\end{aligned}\]
- Parameters:
energy – The initial energy density, used to initialize \(\partial a / \partial \tau\).
Stepper – A Stepper to use for time stepping.
mpl – The unreduced Planck mass, \(m_\mathrm{pl}^2 \equiv 1 / G_N\). Setting this value chooses the units of the system. For example, to work in units of the reduced Planck mass, \(M_\mathrm{pl}^2 \equiv (8 \pi G_N)^{-1}\), pass mpl=np.sqrt(8*np.pi). Defaults to 1.
dtype – The datatype of the input and output arrays. Defaults to float64.
- adot_friedmann_1(a, energy)[source]#
- Parameters:
a – The current scale factor, \(a\).
energy – The current energy density, \(\bar{\rho}\).
- Returns:
The value of \(\partial_\tau a\) as given by Friedmann’s first equation,
\[\mathcal{H}^2 \equiv \left( \frac{\partial_\tau a}{a} \right)^2 = \frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \bar{\rho}\]
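Solving this relation for \(\partial_\tau a\) can be sketched as a standalone NumPy function. Unlike the library's method, which stores mpl at construction, mpl is an explicit argument here:

```python
import numpy as np

def adot_friedmann_1(a, energy, mpl=1.0):
    # From H^2 = 8 pi a^2 rho / (3 mpl^2) with H = a'/a, so
    # a' = a^2 sqrt(8 pi rho / (3 mpl^2)).
    return a**2 * np.sqrt(8 * np.pi * energy / (3 * mpl**2))

# e.g., with a = 1 and rho = 3 mpl^2 / (8 pi), a' is (approximately) 1
print(adot_friedmann_1(1.0, 3 / (8 * np.pi)))
```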
- addot_friedmann_2(a, energy, pressure)[source]#
- Parameters:
a – The current scale factor, \(a\).
energy – The current energy density, \(\bar{\rho}\).
pressure – The current pressure, \(\bar{P}\).
- Returns:
The value of \(\partial_\tau^2 a\) as given by Friedmann’s second equation,
\[\partial_\tau \mathcal{H} + \mathcal{H}^2 = \frac{\partial_\tau^2 a}{a} = \frac{4 \pi a^2}{3 m_\mathrm{pl}^2} \left( \bar{\rho} - 3 \bar{P} \right)\]
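A corresponding standalone sketch of the second equation (again with mpl as an explicit argument, unlike the library's method):

```python
import numpy as np

def addot_friedmann_2(a, energy, pressure, mpl=1.0):
    # a'' = 4 pi a^3 (rho - 3 P) / (3 mpl^2)
    return 4 * np.pi * a**3 / (3 * mpl**2) * (energy - 3 * pressure)

# radiation domination (P = rho / 3) gives a'' = 0 in conformal time
print(addot_friedmann_2(1.0, 1.0, 1.0 / 3.0))  # 0.0
```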
- step(stage, energy, pressure, dt)[source]#
Executes one stage of the time stepper.
- Parameters:
stage – Which stage of the integrator to call.
energy – The current energy density, \(\bar{\rho}\).
pressure – The current pressure, \(\bar{P}\).
dt – The timestep to take.
- constraint(energy)[source]#
A dimensionless measure of the satisfaction of the first Friedmann equation (as a constraint on the evolution), equal to
\[\left\vert \frac{1}{\mathcal{H}} \sqrt{\frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \rho} - 1 \right\vert\]
where \(\mathcal{H}\) is the current conformal Hubble parameter, \(\partial_\tau a / a\).
- Parameters:
energy – The current energy density, \(\bar{\rho}\).
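This diagnostic can be sketched as a standalone function, with a, adot, and mpl passed explicitly (the library's method tracks the scale factor internally):

```python
import numpy as np

def constraint(a, adot, energy, mpl=1.0):
    # |(1/H) sqrt(8 pi a^2 rho / (3 mpl^2)) - 1|, with H = a'/a
    hubble = adot / a
    return abs(np.sqrt(8 * np.pi * a**2 * energy / (3 * mpl**2)) / hubble - 1)

# when adot satisfies Friedmann's first equation exactly, the constraint vanishes
a, energy = 1.0, 3 / (8 * np.pi)
adot = a**2 * np.sqrt(8 * np.pi * energy / 3)
print(constraint(a, adot, energy))  # 0.0
```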
Statistical analysis#
- class pystella.FieldStatistics(decomp, halo_shape, **kwargs)[source]#
A subclass of Reduction which computes the mean and variance of fields.
- Parameters:
decomp – An instance of DomainDecomposition.
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid. May either be an int, interpreted as a value to fix the parameter h to, or a tuple, interpreted as values for hx, hy, and hz. Defaults to None, in which case no such values are fixed at kernel creation.
The following keyword-only arguments are recognized:
- Parameters:
max_min – A bool determining whether to also compute the actual and absolute maxima and minima of fields. Defaults to False.
Any remaining keyword arguments are passed to Reduction().
- __call__(f, queue=None, allocator=None)[source]#
- Parameters:
f – The array whose statistics will be computed. If f has more than three axes, all the outer axes are looped over. As an example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data would have the same shape.
The following keyword arguments are recognized:
- Parameters:
queue – A pyopencl.CommandQueue. Defaults to fx.queue.
allocator – A pyopencl allocator used to allocate temporary arrays, most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().
- Returns:
A dict of means and variances, whose values are lists of the statistic (key) for each array in fields.
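The reduction performed here is equivalent to the following serial NumPy sketch, which averages over the inner three (grid) axes while preserving the outer axes (the real class reduces on-device and across MPI ranks):

```python
import numpy as np

# Statistics over the inner three (grid) axes, looping over any outer axes,
# as in the (2, 3, 130, 130, 130) example above (smaller grid here).
rng = np.random.default_rng(7)
f = rng.standard_normal((2, 3, 16, 16, 16))

mean = f.mean(axis=(-3, -2, -1))
variance = f.var(axis=(-3, -2, -1))

print(mean.shape, variance.shape)  # (2, 3) (2, 3)
```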
- class pystella.FieldHistogrammer(decomp, num_bins, dtype, **kwargs)[source]#
A subclass of Histogrammer which computes field histograms with both linear and logarithmic binning.
- Parameters:
decomp – An instance of DomainDecomposition.
num_bins – The number of bins of the computed histograms.
dtype – The datatype of the resulting histogram.
The following keyword-only arguments are recognized (in addition to those accepted by ElementWiseMap):
- Parameters:
halo_shape – The number of halo layers on (both sides of) each axis of the computational grid. May either be an int, interpreted as a value to fix the parameter h to, or a tuple, interpreted as values for hx, hy, and hz. Defaults to 0, i.e., no padding.
New in version 2020.1.
Changed in version 2020.2: Positional argument rank_shape no longer required.
- __call__(f, queue=None, **kwargs)[source]#
- Parameters:
f – The array whose histograms will be computed. If f has more than three axes, all the outer axes are looped over. As an example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data would have the same shape.
The following keyword arguments are recognized:
- Parameters:
queue – A pyopencl.CommandQueue. Defaults to fx.queue.
allocator – A pyopencl allocator used to allocate temporary arrays, most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().
In addition, the keyword arguments min_f, max_f, min_log_f, and max_log_f are recognized and used to define binning. Each must have the same shape as the outer axes of f (e.g., (2, 3) in the example above). Unless values for each of these are passed, they will all be computed automatically.
Warning
This class prevents any out-of-bounds writes when calculating the bin number, ensuring that 0 <= bin < num_bins. When passing minimum and maximum values, the first and last bins may be incorrect if f in fact has values outside of the passed minimum/maximum values.
- Returns:
A dict with the following items:
"linear": the histogram(s) of f with linear binning
"linear_bins": the bins used for the linear histogram(s) of f
"log": the histogram(s) of f with logarithmic binning
"log_bins": the bins used for the logarithmic histogram(s) of f
Each numpy array has shape f.shape[:-3] + (num_bins,).
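For a single rank and a single 3-d array, the linear and logarithmic binning is analogous to this NumPy sketch (the real class additionally handles outer axes, device arrays, and MPI reduction):

```python
import numpy as np

# Linear and logarithmic binning of one 3-d field; positive values are
# assumed here so that the logarithm is defined.
rng = np.random.default_rng(7)
f = np.abs(rng.standard_normal((16, 16, 16))) + 1e-3
num_bins = 10

linear, linear_bins = np.histogram(f, bins=num_bins)
log, log_bins = np.histogram(np.log10(f), bins=num_bins)  # uniform in log10(f)

# every value of f lands in some bin for both binnings
print(linear.sum() == f.size, log.sum() == f.size)  # True True
```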
Utilities#
- class pystella.output.OutputFile(context=None, name=None, runfile=None, **kwargs)[source]#
A wrapper to h5py:File which collects and saves useful run information and provides functionality to append to datasets.
No arguments are required, but the following keyword arguments are recognized:
- Parameters:
context – A pyopencl.Context. If not None, information about the device, driver, and platform is saved to the attrs dictionary. Defaults to None.
name – The name of the .h5 file (sans the extension) to create. If None, a unique filename is chosen based on the current date and time. Defaults to None.
runfile – A file whose content will be saved as a string to attrs["runfile"], if not None. Useful for attaching the run file of a simulation to its output. Defaults to None.
Any remaining keyword arguments are saved to the attrs dictionary. If any value val is not of a valid type to be saved, the val.__name__ attribute is saved if the value is a type instance, or else str(val) is saved.
Versions and git revisions (when available) of pystella and its dependencies are saved as "versions" and "git_revs" h5py:Datasets. The hostname is recorded in the "hostname" key of the attrs dictionary.
- output(group, **kwargs)[source]#
Appends values to datasets within a h5py:Group named group. group is created if it does not exist, and the h5py:Datasets of this h5py:Group are determined by the keys of the keyword arguments. If group already exists, iterates over each h5py:Dataset and appends values from keyword arguments (matching h5py:Dataset names to keys).
- Parameters:
group – The h5py:Group to append h5py:Dataset values to.
If group already exists, a keyword argument for each h5py:Dataset in group must be provided.
- pystella.choose_device_and_make_context(platform_choice=None, device_choice=None)[source]#
A wrapper that chooses a device and creates a pyopencl.Context on it.
- Parameters:
platform_choice – An integer or string specifying which pyopencl.Platform to choose. Defaults to None, in which case the environment variables PYOPENCL_CTX or PYOPENCL_TEST are queried. If none of the above are specified, the first platform is chosen.
device_choice – An integer or string specifying which pyopencl.Device to run on. Defaults to None, in which case a device is chosen according to the node-local MPI rank. (Note that this requires initializing MPI, i.e., importing mpi4py.MPI.)
- Returns: