Reference: Other Functionality

MPI parallelization
class pystella.DomainDecomposition(proc_shape, halo_shape, rank_shape=None)

Implements functions needed for the MPI domain decomposition of a 3D grid. If mpi4py is not installed, only single-rank operation is supported.
__init__(proc_shape, halo_shape, rank_shape=None)

Parameters:
- proc_shape – A 3-tuple specifying the shape of the MPI processor grid.

  Note: Currently, proc_shape[2] must be 1, i.e., only two-dimensional domain decompositions are supported.

- halo_shape – The number of halo layers on (both sides of) each axis of the computational grid (i.e., those used for grid synchronization or for enforcing boundary conditions). May be a 3-tuple or an int (which is interpreted as that value for each axis). Zero is a valid value along any axis, in which case grid synchronization for that axis is skipped. For example, halo_shape=(2, 0, 1) specifies two additional layers at each of the beginning and end of the first axis, none for the second, and one for the third. For rank_shape=(Nx, Ny, Nz), an appropriately padded array would then have shape (Nx+4, Ny, Nz+2).

The following keyword arguments are recognized:

Parameters:
- rank_shape – A 3-tuple specifying the shape of the computational sub-grid on the calling process. Defaults to None, in which case the global size is not fixed (and will be inferred when, e.g., share_halos() is called, at a slight performance penalty).

Raises:
- NotImplementedError – if proc_shape[2] != 1.
- ValueError – if the size of the processor grid, proc_shape[0] * proc_shape[1] * proc_shape[2], is not equal to the total number of ranks the application was launched with (i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size()).
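The shape arithmetic for halo padding can be illustrated with plain numpy (this is not pystella API; the rank_shape value here is hypothetical):

```python
import numpy as np

# Illustration: how halo_shape pads a rank's grid.
# With halo_shape=(2, 0, 1), each axis gains 2*h layers (h on each side).
halo_shape = (2, 0, 1)
rank_shape = (16, 16, 16)  # hypothetical per-rank interior shape

padded_shape = tuple(n + 2 * h for n, h in zip(rank_shape, halo_shape))
print(padded_shape)  # (20, 16, 18)

# Allocate the padded array; the interior view excludes the halo layers.
fx = np.zeros(padded_shape)
interior = fx[2:-2, :, 1:-1]
print(interior.shape)  # (16, 16, 16)
```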
share_halos(queue, fx)

Communicates halo data across all axes, imposing periodic boundary conditions.

Parameters:
- queue – The pyopencl.CommandQueue to enqueue kernels and copies.
- fx – The pyopencl.array.Array whose halo elements are to be synchronized across ranks. The number of halo layers on each face of the grid is fixed by halo_shape, while the interior shape of fx (i.e., its shape minus the halo padding specified by halo_shape) is only fixed if rank_shape was passed to __init__().
remove_halos(queue, in_array, out_array)

Removes the halo padding from an array.

The only restriction on the shapes of the three-dimensional input arrays is that the shape of in_array is larger than that of out_array by 2*halo_shape along each axis.

Parameters:
- queue – The pyopencl.CommandQueue to enqueue kernels and copies.
- in_array – The subarray whose halos will be removed. May be either a pyopencl.array.Array or a numpy.ndarray.
- out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
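The shape logic of halo removal can be sketched with numpy slicing (the actual method enqueues kernels or copies; the halo_shape here is hypothetical):

```python
import numpy as np

# Sketch: removing halo padding by slicing h layers off both ends of each
# axis (a zero halo leaves that axis untouched).
halo_shape = (2, 0, 1)
in_array = np.arange(20 * 16 * 18).reshape(20, 16, 18)

slices = tuple(slice(h, -h) if h > 0 else slice(None) for h in halo_shape)
out_array = in_array[slices]
print(out_array.shape)  # (16, 16, 16)
```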
gather_array(queue, in_array, out_array, root)

Gathers the subdomains of an array from each rank into a single array of the entire grid, removing the halo padding from in_array.

Parameters:
- queue – The pyopencl.CommandQueue to enqueue kernels and copies.
- in_array – The subarrays to be gathered. May be either a pyopencl.array.Array or a numpy.ndarray.
- out_array – The output array for the gathered grid. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
- root – The rank to which in_array is gathered.
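The assembly this performs can be illustrated in a single process (the real method moves data between ranks with MPI; the proc_shape and subdomain values here are hypothetical):

```python
import numpy as np

# Conceptual sketch of gathering: each rank's (unpadded) subdomain is copied
# into its slot of the full grid. Here both "ranks" live in one process.
rank_shape = (4, 8, 8)
proc_shape = (2, 1, 1)  # two ranks along the first axis
subdomains = {0: np.zeros(rank_shape), 1: np.ones(rank_shape)}

full = np.empty((8, 8, 8))
for rx, sub in subdomains.items():
    full[rx * 4:(rx + 1) * 4] = sub

print(full[0, 0, 0], full[7, 0, 0])  # 0.0 1.0
```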
restore_halos(queue, in_array, out_array)

Adds halo padding to an array.

The only restriction on the shapes of the three-dimensional input arrays is that the shape of out_array is larger than that of in_array by 2*halo_shape along each axis.

Note: Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after they are restored. Thus, halos must be shared manually after the fact (for now).

Parameters:
- queue – The pyopencl.CommandQueue to enqueue kernels and copies.
- in_array – The array to add halos to. May be either a pyopencl.array.Array or a numpy.ndarray.
- out_array – The output array. May be either a pyopencl.array.Array or a numpy.ndarray.
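The shape logic of restoring halos can be sketched with numpy padding (the real method writes into a preallocated out_array, and halo values still need to be shared afterward, as noted above; the halo_shape is hypothetical):

```python
import numpy as np

# Sketch: adding h zero-valued layers to both ends of each axis.
halo_shape = (2, 0, 1)
in_array = np.ones((16, 16, 16))

pad = [(h, h) for h in halo_shape]
out_array = np.pad(in_array, pad, mode="constant")
print(out_array.shape)  # (20, 16, 18)
```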
scatter_array(queue, in_array, out_array, root)

Scatters the values of a single array of the entire grid to individual subdomains on each rank. The per-rank out_array is padded by halo_shape.

Note: Since share_halos() is not currently implemented for numpy.ndarrays, this method does not automatically share halos after scattering. Thus, halos must be shared manually after the fact (for now).

Parameters:
- queue – The pyopencl.CommandQueue to enqueue kernels and copies.
- in_array – The full array to be scattered. May be either a pyopencl.array.Array or a numpy.ndarray on rank root, and None otherwise.
- out_array – The output array for the scattered arrays. May be either a pyopencl.array.Array or a numpy.ndarray.
- root – The rank from which in_array is scattered.
rankID(rx, ry, rz)

Returns: The (integer) rank number corresponding to the processor-grid site (rx, ry, rz).
bcast(x, root)

A wrapper to MPI.COMM_WORLD.bcast() which broadcasts an arbitrary Python object.

Parameters:
- x – The value to be broadcast. Must be defined on all ranks (i.e., set x = None on ranks other than root).
- root – The rank whose value of x should be broadcast.

Returns: The broadcast value, on all ranks.
allreduce(rank_reduction, op=None)

A wrapper to MPI.COMM_WORLD.allreduce() which reduces each rank's value rank_reduction across all ranks and broadcasts the result.

Parameters:
- rank_reduction – The rank's individual value to be reduced.
- op – The MPI reduction operation to perform. Defaults to MPI.SUM.

Returns: The reduced value, on all ranks.
comm

mpi4py.MPI.COMM_WORLD if mpi4py is installed, else None.

rank

The integer rank of the calling process, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_rank().

nranks

The total number of ranks, i.e., that returned by mpi4py.MPI.COMM_WORLD.Get_size().

proc_shape

rank_shape
Expansion
class pystella.Expansion(energy, Stepper, mpl=1.0, dtype=<class 'numpy.float64'>)

Implements the time stepping of the scale-factor evolution for conformal FLRW spacetimes with line element

\[\mathrm{d} s^2 = a(\tau)^2 \left( - \mathrm{d} \tau^2 + \delta_{ij} \mathrm{d} x^i \mathrm{d} x^j \right).\]

Below, the averaged energy density and pressure are

\[\begin{aligned}\bar{\rho} &\equiv - \left\langle T_{\hphantom{0}0}^0 \right\rangle\\\bar{P} &\equiv \frac{1}{3} \left\langle T_{\hphantom{i}i}^i \right\rangle.\end{aligned}\]
__init__(energy, Stepper, mpl=1.0, dtype=<class 'numpy.float64'>)

Parameters:
- energy – The initial energy density, used to initialize \(\partial a / \partial \tau\).
- Stepper – A Stepper to use for time stepping.
- mpl – The unreduced Planck mass, \(m_\mathrm{pl}^2 \equiv 1 / G_N\). Setting this value chooses the units of the system. For example, to work in units of the reduced Planck mass, \(M_\mathrm{pl}^2 \equiv (8 \pi G_N)^{-1}\), pass mpl=np.sqrt(8*np.pi). Defaults to 1.
- dtype – The datatype of the input and output arrays. Defaults to float64.
adot_friedmann_1(a, energy)

Parameters:
- a – The current scale factor, \(a\).
- energy – The current energy density, \(\bar{\rho}\).

Returns: The value of \(\partial_\tau a\) as given by Friedmann's first equation,

\[\mathcal{H}^2 \equiv \left( \frac{\partial_\tau a}{a} \right)^2 = \frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \bar{\rho}\]
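Solving this relation for \(\partial_\tau a\) gives a one-line formula; a minimal numpy sketch (not the pystella implementation, with hypothetical input values):

```python
import numpy as np

# Friedmann 1: H^2 = (adot/a)^2 = 8*pi*a^2*rho / (3*mpl^2),
# so adot = a^2 * sqrt(8*pi*rho / (3*mpl^2)).
def adot_friedmann_1(a, energy, mpl=1.0):
    return a**2 * np.sqrt(8 * np.pi * energy / (3 * mpl**2))

a, rho = 1.0, 3 / (8 * np.pi)  # rho chosen so that H = adot/a = 1
print(adot_friedmann_1(a, rho))  # ≈ 1.0
```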
addot_friedmann_2(a, energy, pressure)

Parameters:
- a – The current scale factor, \(a\).
- energy – The current energy density, \(\bar{\rho}\).
- pressure – The current pressure, \(\bar{P}\).

Returns: The value of \(\partial_\tau^2 a\) as given by Friedmann's second equation,

\[\partial_\tau \mathcal{H} + \mathcal{H}^2 = \frac{\partial_\tau^2 a}{a} = \frac{4 \pi a^2}{3 m_\mathrm{pl}^2} \left( \bar{\rho} - 3 \bar{P} \right)\]
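The corresponding one-line formula for \(\partial_\tau^2 a\) can likewise be sketched in numpy (again not the pystella implementation; inputs are hypothetical):

```python
import numpy as np

# Friedmann 2: addot/a = 4*pi*a^2*(rho - 3P) / (3*mpl^2),
# so addot = 4*pi*a^3/(3*mpl^2) * (rho - 3*P).
def addot_friedmann_2(a, energy, pressure, mpl=1.0):
    return 4 * np.pi * a**3 / (3 * mpl**2) * (energy - 3 * pressure)

# A radiation-dominated universe (P = rho/3) has vanishing acceleration.
print(addot_friedmann_2(1.0, 3.0, 1.0))  # 0.0
```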
step(stage, energy, pressure, dt)

Executes one stage of the time stepper.

Parameters:
- stage – Which stage of the integrator to call.
- energy – The current energy density, \(\bar{\rho}\).
- pressure – The current pressure, \(\bar{P}\).
- dt – The timestep to take.
constraint(energy)

A dimensionless measure of how well the first Friedmann equation is satisfied (as a constraint on the evolution), equal to

\[\left\vert \frac{1}{\mathcal{H}} \sqrt{\frac{8 \pi a^2}{3 m_\mathrm{pl}^2} \bar{\rho}} - 1 \right\vert\]

where \(\mathcal{H}\) is the current conformal Hubble parameter, \(\partial_\tau a / a\).

Parameters:
- energy – The current energy density, \(\bar{\rho}\).
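The quantity being measured can be sketched as a standalone function; note that the actual method takes only energy and uses the internally tracked \(a\) and \(\partial_\tau a\), whereas this hypothetical sketch takes them explicitly:

```python
import numpy as np

# Constraint: |sqrt(8*pi*a^2*rho / (3*mpl^2)) / H - 1|, which vanishes when
# Friedmann's first equation holds exactly.
def constraint(a, adot, energy, mpl=1.0):
    hubble = adot / a  # conformal Hubble parameter
    return abs(np.sqrt(8 * np.pi * a**2 * energy / (3 * mpl**2)) / hubble - 1)

a, adot = 1.0, 1.0
rho = 3 / (8 * np.pi)  # exactly satisfies Friedmann 1 for these values
print(constraint(a, adot, rho))  # ~0.0
```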
Statistical analysis
class pystella.FieldStatistics(decomp, halo_shape, **kwargs)

A subclass of Reduction which computes the mean and variance of fields.
__init__(decomp, halo_shape, **kwargs)

Parameters:
- decomp – An instance of DomainDecomposition.
- halo_shape – The number of halo layers on (both sides of) each axis of the computational grid. May either be an int, interpreted as the value to fix the parameter h to, or a tuple, interpreted as values for hx, hy, and hz. Defaults to None, in which case no such values are fixed at kernel creation.

The following keyword-only arguments are recognized:

Parameters:
- max_min – A bool determining whether to also compute the actual and absolute maxima and minima of fields. Defaults to False.

Any remaining keyword arguments are passed to Reduction.__init__().
__call__(f, queue=None, allocator=None)

Parameters:
- f – The array whose statistics will be computed. If f has more than three axes, all the outer axes are looped over. For example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data has the same shape.

The following keyword arguments are recognized:

Parameters:
- queue – A pyopencl.CommandQueue. Defaults to f.queue.
- allocator – A pyopencl allocator used to allocate temporary arrays, i.e., most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().

Returns: A dict of means and variances, whose values are lists of the statistic (key) for each outer axis of f.
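The reduction being performed is equivalent to taking moments over the inner three (grid) axes while looping over any outer axes; a numpy sketch of the output structure (not the on-device implementation), with hypothetical shapes:

```python
import numpy as np

# Mean and variance over the last three axes; outer axes survive, so the
# outputs have shape f.shape[:-3].
rng = np.random.default_rng(0)
f = rng.standard_normal((2, 3, 16, 16, 16))

stats = {
    "mean": f.mean(axis=(-3, -2, -1)),
    "variance": f.var(axis=(-3, -2, -1)),
}
print(stats["mean"].shape)  # (2, 3)
```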
class pystella.FieldHistogrammer(decomp, num_bins, rank_shape, dtype, **kwargs)

A subclass of Histogrammer which computes field histograms with both linear and logarithmic binning.

__init__(decomp, num_bins, rank_shape, dtype, **kwargs)

Parameters:
- decomp – An instance of DomainDecomposition.
- num_bins – The number of bins of the computed histograms.
- rank_shape – A 3-tuple specifying the shape of the computational sub-grid on the calling process.
- dtype – The datatype of the resulting histogram.

The following keyword-only arguments are recognized (in addition to those accepted by ElementWiseMap):
__call__(f, queue=None, **kwargs)

Parameters:
- f – The array whose histograms will be computed. If f has more than three axes, all the outer axes are looped over. For example, if f has shape (2, 3, 130, 130, 130), this method loops over the outermost two axes with shape (2, 3), and the resulting output data has the same shape.

The following keyword arguments are recognized:

Parameters:
- queue – A pyopencl.CommandQueue. Defaults to f.queue.
- allocator – A pyopencl allocator used to allocate temporary arrays, i.e., most usefully a pyopencl.tools.MemoryPool. See the note in the documentation of SpectralCollocator().

In addition, the keyword arguments min_f, max_f, min_log_f, and max_log_f are recognized and used to define the binning. Each must have the same shape as the outer axes of f (e.g., (2, 3) in the example above). Unless values for all of these are passed, they will be computed automatically.

Warning: This class prevents any out-of-bounds writes when calculating the bin number, ensuring that 0 <= bin < num_bins. When passing minimum and maximum values, the first and last bins may be incorrect if f in fact has values outside of the passed minimum/maximum values.

Returns: A dict with the following items:
- 'linear': the histogram(s) of f with linear binning
- 'linear_bins': the bins used for the linear histogram(s) of f
- 'log': the histogram(s) of f with logarithmic binning
- 'log_bins': the bins used for the logarithmic histogram(s) of f

Each numpy array has shape f.shape[:-3] + (num_bins,).
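The structure of the returned dict can be mimicked with numpy.histogram; a sketch with hypothetical data and bin count (the real class bins on-device and honors the min/max keywords above):

```python
import numpy as np

# Linear and logarithmic binning of a (strictly positive) field.
num_bins = 8
rng = np.random.default_rng(1)
f = rng.uniform(0.1, 10.0, (16, 16, 16))

linear_bins = np.linspace(f.min(), f.max(), num_bins + 1)
log_bins = np.geomspace(f.min(), f.max(), num_bins + 1)

result = {
    "linear": np.histogram(f, bins=linear_bins)[0],
    "linear_bins": linear_bins,
    "log": np.histogram(f, bins=log_bins)[0],
    "log_bins": log_bins,
}
print(result["linear"].sum())  # 4096 (= 16**3, every value is binned)
```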
New in version 2020.1.
Utilities
class pystella.output.OutputFile(context=None, name=None, runfile=None, **kwargs)

A wrapper to h5py.File which collects and saves useful run information and provides functionality to append to datasets.

__init__(context=None, name=None, runfile=None, **kwargs)

No arguments are required, but the following keyword arguments are recognized:

Parameters:
- context – A pyopencl.Context. If not None, information about the device, driver, and platform is saved to the attrs dictionary. Defaults to None.
- name – The name of the .h5 file (sans the extension) to create. If None, a unique filename is chosen based on the current date and time. Defaults to None.
- runfile – A file whose content will be saved as a string to attrs['runfile'], if not None. Useful for attaching the run file of a simulation to its output. Defaults to None.

Any remaining keyword arguments are saved to the attrs dictionary. If any value val is not of a valid type to be saved, the val.__name__ attribute is saved if the value is a type instance, or else str(val) is saved.

Versions and git revisions (when available) of pystella and its dependencies are saved as 'versions' and 'git_revs' Datasets. The hostname is recorded in the 'hostname' key of the attrs dictionary.
output(group, **kwargs)

Appends values to datasets within a Group named group. group is created if it does not exist, and the Datasets of this Group are determined by the keys of the keyword arguments. If group already exists, this method iterates over each Dataset and appends values from the keyword arguments (matching Dataset names to keys).

If group already exists, a keyword argument must be provided for each Dataset in group.
pystella.choose_device_and_make_context(platform_choice=None, device_choice=None)

A wrapper which chooses a device and creates a pyopencl.Context on it.

Parameters:
- platform_choice – An integer specifying which element of the list returned by pyopencl.get_platforms() to choose. Defaults to None, in which case an NVIDIA platform is preferred; if one is not found, the first platform is chosen.
- device_choice – An integer specifying which device to run on. Defaults to None, in which case a device is chosen according to any available environment variable defining the local MPI rank (defaulting to 0). Currently only OpenMPI and MVAPICH environment variables are checked.

Returns: The created pyopencl.Context.