petsc-3.8.4 2018-03-24

PCTELESCOPE

Runs a KSP solver on a sub-group of processors. MPI processes not in the sub-communicator are idle during the solve.

Options Database

	-pc_telescope_reduction_factor <r>	- factor to use communicator size by. e.g. with 64 MPI processes and r=4, the new sub-communicator will have 64/4 = 16 ranks.
	-pc_telescope_ignore_dm	- flag to indicate whether an attached DM should be ignored

-pc_telescope_subcomm_type <interlaced,contiguous> -how to define the reduced communicator. see PetscSubcomm for more.

Notes

The preconditioner is deemed telescopic as it only calls KSPSolve() on a single sub-communicator, in contrast with PCREDUNDANT which calls KSPSolve() on N sub-communicators. This means there will be MPI processes which will be idle during the application of this preconditioner.

The default KSP is PREONLY. If a DM is attached to the PC, it is re-partitioned on the sub-communicator. Both the Bmat operator and the right hand side vector are permuted into the new DOF ordering defined by the re-partitioned DM. Currently only support for re-partitioning a DMDA is provided. Any nullspace attached to the original Bmat operator is extracted, re-partitioned and set on the repartitioned Bmat operator. KSPSetComputeOperators() is not propagated to the sub KSP. Currently there is no support for the flag -pc_use_amat

Assuming that the parent preconditioner (PC) is defined on a communicator c, this implementation creates a child sub-communicator (c') containing fewer MPI processes than the original parent preconditioner (PC).

Developer Notes

During PCSetup, the B operator is scattered onto c'. Within PCApply, the RHS vector (x) is scattered into a redundant vector, xred (defined on c'). Then, KSPSolve() is executed on the c' communicator.

The communicator used within the telescoping preconditioner is defined by a PetscSubcomm using the INTERLACED creation routine by default (this can be changed with -pc_telescope_subcomm_type). We run the sub KSP on only the ranks within the communicator which have a color equal to zero.

The telescoping preconditioner is aware of nullspaces and near nullspaces which are attached to the B operator. In the case where B has a (near) nullspace attached, the (near) nullspace vectors are extracted from B and mapped into a new (near) nullspace, defined on the sub-communicator, which is attached to B' (the B operator which was scattered to c')

The telescoping preconditioner is aware of an attached DM. In the event that the DM is of type DMDA (2D or 3D - 1D support for 1D DMDAs is not provided), a new DMDA is created on c' (e.g. it is re-partitioned), and this new DM is attached the sub KSPSolve(). The design of telescope is such that it should be possible to extend support for re-partitioning other to DM's (e.g. DMPLEX). The user can supply a flag to ignore attached DMs.

By default, B' is defined by simply fusing rows from different MPI processes

When a DMDA is attached to the parent preconditioner, B' is defined by: (i) performing a symmetric permutation of B into the ordering defined by the DMDA on c', (ii) extracting the local chunks via MatCreateSubMatrices(), (iii) fusing the locally (sequential) matrices defined on the ranks common to c and c' into B' using MatCreateMPIMatConcatenateSeqMat()

Limitations/improvements include the following. VecPlaceArray() could be used within PCApply() to improve efficiency and reduce memory usage.

The symmetric permutation used when a DMDA is encountered is performed via explicitly assmbleming a permutation matrix P, and performing P^T.A.P. Possibly it might be more efficient to use MatPermute(). We opted to use P^T.A.P as it appears VecPermute() does not supported for the use case required here. By computing P, one can permute both the operator and RHS in a consistent manner.

Mapping of vectors is performed in the following way. Suppose the parent comm size was 4, and we set a reduction factor of 2; this would give a comm size on c' of 2. Using the interlaced creation routine, the ranks in c with color = 0 will be rank 0 and 2. We perform the scatter to the sub-comm in the following way. [1] Given a vector x defined on comm c

rank(c) : _________ 0 ______ ________ 1 _______ ________ 2 _____________ ___________ 3 __________ x : [0, 1, 2, 3, 4, 5] [6, 7, 8, 9, 10, 11] [12, 13, 14, 15, 16, 17] [18, 19, 20, 21, 22, 23]

scatter to xtmp defined also on comm c so that we have the following values

rank(c) : ___________________ 0 ________________ _1_ ______________________ 2 _______________________ __3_ xtmp : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] [ ] [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23] [ ]

The entries on rank 1 and 3 (ranks which do not have a color = 0 in c') have no values

[2] Copy the value from rank 0, 2 (indices with respect to comm c) into the vector xred which is defined on communicator c'. Ranks 0 and 2 are the only ranks in the subcomm which have a color = 0.

rank(c') : ___________________ 0 _______________ ______________________ 1 _____________________ xred : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

Contributed by Dave May

Reference

Dave A. May, Patrick Sanan, Karl Rupp, Matthew G. Knepley, and Barry F. Smith, "Extreme-Scale Multigrid Components within PETSc". 2016. In Proceedings of the Platform for Advanced Scientific Computing Conference (PASC '16). DOI: 10.1145/2929908.2929913

PCTELESCOPE

Options Database

Notes

Developer Notes

Reference

See Also