MatMPISELLSetPreallocation#
Preallocates memory for a MATMPISELL
sparse parallel matrix in sell format. For good matrix assembly performance the user should preallocate the matrix storage by setting the parameters d_nz
(or d_nnz
) and o_nz
(or o_nnz
).
Synopsis#
#include "petscmat.h"
#include "petscmat.h"
PetscErrorCode MatMPISELLSetPreallocation(Mat B, PetscInt d_nz, const PetscInt d_nnz[], PetscInt o_nz, const PetscInt o_nnz[])
Collective
Input Parameters#
B - the matrix
d_nz - number of nonzeros per row in DIAGONAL portion of local submatrix (same value is used for all local rows)
d_nnz - array containing the number of nonzeros in the various rows of the DIAGONAL portion of the local submatrix (possibly different for each row) or NULL (
PETSC_NULL_INTEGER
in Fortran), ifd_nz
is used to specify the nonzero structure. The size of this array is equal to the number of local rows, i.e ‘m’. For matrices that will be factored, you must leave room for (and set) the diagonal entry even if it is zero.o_nz - number of nonzeros per row in the OFF-DIAGONAL portion of local submatrix (same value is used for all local rows).
o_nnz - array containing the number of nonzeros in the various rows of the OFF-DIAGONAL portion of the local submatrix (possibly different for each row) or NULL (
PETSC_NULL_INTEGER
in Fortran), ifo_nz
is used to specify the nonzero structure. The size of this array is equal to the number of local rows, i.e ‘m’.
Example usage#
Consider the following 8x8 matrix with 34 non-zero values, that is assembled across 3 processors. Lets assume that proc0 owns 3 rows, proc1 owns 3 rows, proc2 owns 2 rows. This division can be shown as follows
1 2 0 | 0 3 0 | 0 4
Proc0 0 5 6 | 7 0 0 | 8 0
9 0 10 | 11 0 0 | 12 0
-------------------------------------
13 0 14 | 15 16 17 | 0 0
Proc1 0 18 0 | 19 20 21 | 0 0
0 0 0 | 22 23 0 | 24 0
-------------------------------------
Proc2 25 26 27 | 0 0 28 | 29 0
30 0 0 | 31 32 33 | 0 34
This can be represented as a collection of submatrices as
A B C
D E F
G H I
Where the submatrices A,B,C are owned by proc0, D,E,F are owned by proc1, G,H,I are owned by proc2.
The ‘m’ parameters for proc0,proc1,proc2 are 3,3,2 respectively. The ‘n’ parameters for proc0,proc1,proc2 are 3,3,2 respectively. The ‘M’,’N’ parameters are 8,8, and have the same values on all procs.
The DIAGONAL submatrices corresponding to proc0,proc1,proc2 are
submatrices [A], [E], [I] respectively. The OFF-DIAGONAL submatrices
corresponding to proc0,proc1,proc2 are [BC], [DF], [GH] respectively.
Internally, each processor stores the DIAGONAL part, and the OFF-DIAGONAL
part as MATSEQSELL
matrices. For example, proc1 will store [E] as a MATSEQSELL
matrix, ans [DF] as another SeqSELL matrix.
When d_nz
, o_nz
parameters are specified, d_nz
storage elements are
allocated for every row of the local diagonal submatrix, and o_nz
storage locations are allocated for every row of the OFF-DIAGONAL submat.
One way to choose d_nz
and o_nz
is to use the max nonzerors per local
rows for each of the local DIAGONAL, and the OFF-DIAGONAL submatrices.
In this case, the values of d_nz,o_nz are
proc0 dnz = 2, o_nz = 2
proc1 dnz = 3, o_nz = 2
proc2 dnz = 1, o_nz = 4
We are allocating m*(d_nz+o_nz) storage locations for every proc. This translates to 3*(2+2)=12 for proc0, 3*(3+2)=15 for proc1, 2*(1+4)=10 for proc3. i.e we are using 12+15+10=37 storage locations to store 34 values.
When d_nnz
, o_nnz
parameters are specified, the storage is specified
for every row, corresponding to both DIAGONAL and OFF-DIAGONAL submatrices.
In the above case the values for d_nnz,o_nnz are
proc0 d_nnz = [2,2,2] and o_nnz = [2,2,2]
proc1 d_nnz = [3,3,2] and o_nnz = [2,1,1]
proc2 d_nnz = [1,1] and o_nnz = [4,4]
Here the space allocated is according to nz (or maximum values in the nnz if nnz is provided) for DIAGONAL and OFF-DIAGONAL submatrices, i.e (2+2+3+2)*3+(1+4)*2=37
Notes#
If the *_nnz parameter is given then the *_nz parameter is ignored
The stored row and column indices begin with zero.
The parallel matrix is partitioned such that the first m0 rows belong to process 0, the next m1 rows belong to process 1, the next m2 rows belong to process 2 etc.. where m0,m1,m2… are the input parameter ‘m’.
The DIAGONAL portion of the local submatrix of a processor can be defined as the submatrix which is obtained by extraction the part corresponding to the rows r1-r2 and columns c1-c2 of the global matrix, where r1 is the first row that belongs to the processor, r2 is the last row belonging to the this processor, and c1-c2 is range of indices of the local part of a vector suitable for applying the matrix to. This is an mxn matrix. In the common case of a square matrix, the row and column ranges are the same and the DIAGONAL part is also square. The remaining portion of the local submatrix (mxN) constitute the OFF-DIAGONAL portion.
If o_nnz
, d_nnz
are specified, then o_nz
, and d_nz
are ignored.
You can call MatGetInfo()
to get information on how effective the preallocation was;
for example the fields mallocs,nz_allocated,nz_used,nz_unneeded;
You can also run with the option -info and look for messages with the string
malloc in them to see if additional memory allocation was needed.
See Also#
Mat
, MatCreate()
, MatCreateSeqSELL()
, MatSetValues()
, MatCreatesell()
,
MATMPISELL
, MatGetInfo()
, PetscSplitOwnership()
, MATSELL
Level#
intermediate
Location#
Examples#
Implementations#
MatMPISELLSetPreallocation_MPISELL in src/mat/impls/sell/mpi/mpisell.c
Index of all Mat routines
Table of Contents for all manual pages
Index of all manual pages