Getting Started#
PETSc consists of a collection of classes, which are discussed in detail in later parts of the manual (The Solvers in PETSc/TAO and Additional Information). The important PETSc classes include:

- index sets (IS), for indexing into vectors, renumbering, permuting, etc.;
- vectors (Vec); see Vectors and Parallel Data;
- Krylov subspace methods (KSP); see KSP: Linear System Solvers;
- preconditioners, including multigrid, block solvers, patch solvers, and sparse direct solvers (PC);
- nonlinear solvers (SNES); see SNES: Nonlinear Solvers;
- timesteppers for solving time-dependent (nonlinear) PDEs, including support for differential algebraic equations and the computation of adjoints (sensitivities/gradients of the solutions) (TS); see TS: Scalable ODE and DAE Solvers;
- scalable optimization algorithms, including a rich set of gradient-based optimizers, Newton-based optimizers, and optimization with constraints (Tao); see TAO: Optimization Solvers;
- code for managing interactions between mesh data structures and vectors, matrices, and solvers (DM); see DM Basics.
Each class consists of an abstract interface (simply a set of calling sequences; an abstract base class in C++) and an implementation for each algorithm and data structure. This design enables easy comparison and use of different algorithms (for example, to experiment with different Krylov subspace methods, preconditioners, or truncated Newton methods). Hence, PETSc provides a rich environment for modeling scientific applications as well as for rapid algorithm design and prototyping.
The classes enable easy customization and extension of both algorithms and implementations. This approach promotes code reuse and flexibility, and also separates the issues of parallelism from the choice of algorithms. The PETSc infrastructure creates a foundation for building large-scale applications.
It is useful to consider the interrelationships among different pieces of PETSc. Numerical Libraries in PETSc is a diagram of some of these pieces. The figure illustrates the library’s hierarchical organization, which enables users to employ the solvers that are most appropriate for a particular problem.
Suggested Reading#
The manual is divided into four parts:
Introduction to PETSc describes the basic procedure for using the PETSc library and presents simple examples of solving linear systems with PETSc. This section conveys the typical style used throughout the library and enables the application programmer to begin using the software immediately.
The Solvers in PETSc/TAO explains in detail the use of the various PETSc algebraic objects, such as vectors, matrices, index sets and the PETSc solvers including linear and nonlinear solvers, time integrators, and optimization support. DM: Interfacing Between Solvers and Models/Discretizations details how a user’s models and discretizations can easily be interfaced with the solvers by using the DM construct. The Additional Information describes a variety of useful information, including profiling, the options database, viewers, error handling, and some details of PETSc design.
PETSc has evolved to become quite a comprehensive package, and therefore this manual can be rather intimidating for new users. Bear in mind that PETSc can be used efficiently before one understands all of the material presented here. Furthermore, the definitive reference for any PETSc function is always the online manual page. Manual pages for all PETSc functions can be accessed here. The manual pages provide hyperlinked indices (organized by both concept and routine name) to the tutorial examples and enable easy movement among related topics.
Visual Studio Code, Eclipse, Emacs, and Vim users may find their development environment's options for searching in the source code useful for exploring the PETSc source code. Details of these features are provided in Developer Environments.
The complete PETSc distribution, manual pages, and additional information are available via the PETSc home page. The PETSc home page also contains details regarding installation, new features and changes in recent versions of PETSc, machines that we currently support, and a frequently asked questions (FAQ) list.
Note to Fortran Programmers: In most of the manual, the examples and calling sequences are given for the C/C++ family of programming languages. However, Fortran programmers can use all of the functionality of PETSc from Fortran, with only minor differences in the user interface. PETSc for Fortran Users provides a discussion of the differences between using PETSc from Fortran and C, as well as several complete Fortran examples.
Note to Python Programmers: To program with PETSc in Python you need to enable the Python bindings (i.e., petsc4py) with the configure option --with-petsc4py=1. See the PETSc installation guide for more details.
Running PETSc Programs#
Before using PETSc, the user must first set the environment variable $PETSC_DIR, indicating the full path of the PETSc home directory. For example, under the Unix bash shell a command of the form
$ export PETSC_DIR=$HOME/petsc
can be placed in the user's .bashrc or other startup file. In addition, the user may need to set the environment variable $PETSC_ARCH to specify a particular configuration of the PETSc libraries. Note that $PETSC_ARCH is just a name selected by the installer to refer to the libraries compiled for a particular set of compiler options and machine type. Using different values of $PETSC_ARCH allows one to switch easily between several different sets of libraries (say, debug and optimized). To determine whether you need to set $PETSC_ARCH, look in the directory indicated by $PETSC_DIR; if there are subdirectories beginning with arch, then those subdirectory names give the possible values for $PETSC_ARCH.
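For example, both variables can be set together in a startup file (a sketch; the name arch-linux-c-debug is only illustrative and should be replaced by one of the arch-* subdirectory names found under $PETSC_DIR):
$ export PETSC_DIR=$HOME/petsc
$ export PETSC_ARCH=arch-linux-c-debug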
See Tutorials, by Mathematical Problem to immediately jump in and run PETSc code.
All PETSc programs use the MPI (Message Passing Interface) standard for message-passing communication [For94]. Thus, to execute PETSc programs, users must know the procedure for beginning MPI jobs on their selected computer system(s). For instance, when using the MPICH implementation of MPI and many others, the following command initiates a program that uses eight processors:
$ mpiexec -n 8 ./petsc_program_name petsc_options
PETSc also comes with a script that automatically uses the correct mpiexec for your configuration.
$ $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 8 ./petsc_program_name petsc_options
All PETSc-compliant programs support the use of the -help option as well as the -version option.

Certain options are supported by all PETSc programs. We list a few particularly useful ones below; a complete list can be obtained by running any PETSc program with the option -help.
- -log_view - summarize the program's performance (see Profiling)
- -fp_trap - stop on floating-point exceptions, for example divide by zero
- -malloc_dump - enable memory tracing; dump a list of unfreed memory at the conclusion of the run (see Detecting Memory Allocation Problems and Memory Usage)
- -malloc_debug - enable memory debugging (by default this is activated for the debugging version of PETSc); see Detecting Memory Allocation Problems and Memory Usage
- -start_in_debugger [noxterm,gdb,lldb] [-display name] - start all processes in the debugger; see Debugging for more information on debugging PETSc programs
- -on_error_attach_debugger [noxterm,gdb,lldb] [-display name] - start the debugger only on encountering an error
- -info - print a great deal of information about what the program is doing as it runs
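For example, one might run (the executable name app is a hypothetical placeholder) with profiling and residual monitoring enabled on four processes:
$ mpiexec -n 4 ./app -log_view -ksp_monitor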
Writing PETSc Programs#
Most PETSc programs begin with a call to
PetscInitialize(int *argc,char ***argv,char *file,char *help);
which initializes PETSc and MPI. The arguments argc and argv are the command line arguments delivered in all C and C++ programs. The argument file optionally indicates an alternative name for the PETSc options file, .petscrc, which resides by default in the user's home directory. Runtime Options provides details regarding this file and the PETSc options database, which can be used for runtime customization. The final argument, help, is an optional character string that will be printed if the program is run with the -help option. In Fortran the initialization command has the form
call PetscInitialize(character(*) file,integer ierr)
where the file argument is optional. PetscInitialize() automatically calls MPI_Init() if MPI has not been previously initialized. In certain circumstances in which MPI needs to be initialized directly (or is initialized by some other library), the user can first call MPI_Init() (or have the other library do it), and then call PetscInitialize(). By default, PetscInitialize() sets the PETSc "world" communicator PETSC_COMM_WORLD to MPI_COMM_WORLD.
For those not familiar with MPI, a communicator is a way of indicating a collection of processes that will be involved together in a calculation or communication. Communicators have the variable type MPI_Comm. In most cases users can employ the communicator PETSC_COMM_WORLD to indicate all processes in a given run and PETSC_COMM_SELF to indicate a single process.
MPI provides routines for generating new communicators consisting of subsets of processors, though most users rarely need to use these. The book Using MPI, by Gropp, Lusk, and Skjellum [GLS94], provides an excellent introduction to the concepts in MPI. See also the MPI homepage. Note that PETSc users need not program much message passing directly with MPI, but they must be familiar with the basic concepts of message passing and distributed memory computing.
All PETSc programs should call PetscFinalize() as their final (or nearly final) statement. This routine handles options to be processed at the conclusion of the program and calls MPI_Finalize() if PetscInitialize() began MPI. If MPI was initiated externally from PETSc (by either the user or another software package), the user is responsible for calling MPI_Finalize().
Error Checking#
Most PETSc functions return a PetscErrorCode, which is an integer indicating whether an error has occurred during the call. The error code is set to be nonzero if an error has been detected; otherwise, it is zero. For the C/C++ interface, the error variable is the routine's return value, while for the Fortran version, each PETSc routine has as its final argument an integer error variable. One should always check these return values, as shown below in the C/C++ and Fortran formats, respectively:
PetscCall(PetscFunction(Args));
or
! within the main program
PetscCallA(PetscFunction(Args,ierr))
! within any subroutine
PetscCall(PetscFunction(Args,ierr))
These macros check the returned error code and, if it is nonzero, call the PETSc error handler and then return from the function with the error code. PetscCallA() calls abort after calling the error handler because it is not possible to return from a Fortran main program. The above macros should be used in all subroutines to enable a complete error traceback. See Error Checking for more details on PETSc error handling.
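For instance, a user-written routine can propagate errors as follows (a sketch; the routine name ComputeResidualNorm and its arguments are hypothetical, while VecNorm(), PetscCall(), PetscFunctionBeginUser, and PetscFunctionReturn() are the standard PETSc constructs):
static PetscErrorCode ComputeResidualNorm(Vec r, PetscReal *norm)
{
  PetscFunctionBeginUser;
  /* if VecNorm() fails, PetscCall() invokes the error handler and returns the error code */
  PetscCall(VecNorm(r, NORM_2, norm));
  PetscFunctionReturn(PETSC_SUCCESS); /* PETSC_SUCCESS in recent PETSc releases; older versions return 0 */
}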
Simple PETSc Examples#
To help the user start using PETSc immediately, we begin with a simple uniprocessor example that solves the one-dimensional Laplacian problem with finite differences. This sequential code, which can be found in $PETSC_DIR/src/ksp/ksp/tutorials/ex1.c, illustrates the solution of a linear system with KSP, the interface to the preconditioners, Krylov subspace methods, and direct linear solvers of PETSc. Following the code we highlight a few of the most important parts of this example.
Listing: KSP Tutorial src/ksp/ksp/tutorials/ex1.c
static char help[] = "Solves a tridiagonal linear system with KSP.\n\n";
/*
Include "petscksp.h" so that we can use KSP solvers. Note that this file
automatically includes:
petscsys.h - base PETSc routines petscvec.h - vectors
petscmat.h - matrices petscpc.h - preconditioners
petscis.h - index sets
petscviewer.h - viewers
Note: The corresponding parallel example is ex23.c
*/
#include <petscksp.h>
int main(int argc, char **args)
{
Vec x, b, u; /* approx solution, RHS, exact solution */
Mat A; /* linear system matrix */
KSP ksp; /* linear solver context */
PC pc; /* preconditioner context */
PetscReal norm; /* norm of solution error */
PetscInt i, n = 10, col[3], its;
PetscMPIInt size;
PetscScalar value[3];
PetscFunctionBeginUser;
PetscCall(PetscInitialize(&argc, &args, (char *)0, help));
PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
PetscCheck(size == 1, PETSC_COMM_WORLD, PETSC_ERR_WRONG_MPI_SIZE, "This is a uniprocessor example only!");
PetscCall(PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Compute the matrix and right-hand-side vector that define
the linear system, Ax = b.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
Create vectors. Note that we form 1 vector from scratch and
then duplicate as needed.
*/
PetscCall(VecCreate(PETSC_COMM_SELF, &x));
PetscCall(PetscObjectSetName((PetscObject)x, "Solution"));
PetscCall(VecSetSizes(x, PETSC_DECIDE, n));
PetscCall(VecSetFromOptions(x));
PetscCall(VecDuplicate(x, &b));
PetscCall(VecDuplicate(x, &u));
/*
Create matrix. When using MatCreate(), the matrix format can
be specified at runtime.
Performance tuning note: For problems of substantial size,
preallocation of matrix memory is crucial for attaining good
performance. See the matrix chapter of the users manual for details.
*/
PetscCall(MatCreate(PETSC_COMM_SELF, &A));
PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
PetscCall(MatSetFromOptions(A));
PetscCall(MatSetUp(A));
/*
Assemble matrix
*/
value[0] = -1.0;
value[1] = 2.0;
value[2] = -1.0;
for (i = 1; i < n - 1; i++) {
col[0] = i - 1;
col[1] = i;
col[2] = i + 1;
PetscCall(MatSetValues(A, 1, &i, 3, col, value, INSERT_VALUES));
}
i = n - 1;
col[0] = n - 2;
col[1] = n - 1;
PetscCall(MatSetValues(A, 1, &i, 2, col, value, INSERT_VALUES));
i = 0;
col[0] = 0;
col[1] = 1;
value[0] = 2.0;
value[1] = -1.0;
PetscCall(MatSetValues(A, 1, &i, 2, col, value, INSERT_VALUES));
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
/*
Set exact solution; then compute right-hand-side vector.
*/
PetscCall(VecSet(u, 1.0));
PetscCall(MatMult(A, u, b));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Create the linear solver and set various options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(KSPCreate(PETSC_COMM_SELF, &ksp));
/*
Set operators. Here the matrix that defines the linear system
also serves as the matrix that defines the preconditioner.
*/
PetscCall(KSPSetOperators(ksp, A, A));
/*
Set linear solver defaults for this problem (optional).
- By extracting the KSP and PC contexts from the KSP context,
we can then directly call any KSP and PC routines to set
various options.
- The following four statements are optional; all of these
parameters could alternatively be specified at runtime via
KSPSetFromOptions();
*/
PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCSetType(pc, PCJACOBI));
PetscCall(KSPSetTolerances(ksp, 1.e-5, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT));
/*
Set runtime options, e.g.,
-ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
These options will override those specified above as long as
KSPSetFromOptions() is called _after_ any other customization
routines.
*/
PetscCall(KSPSetFromOptions(ksp));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Solve the linear system
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(KSPSolve(ksp, b, x));
/*
View solver info; we could instead use the option -ksp_view to
print this info to the screen at the conclusion of KSPSolve().
*/
PetscCall(KSPView(ksp, PETSC_VIEWER_STDOUT_SELF));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Check the solution and clean up
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(VecAXPY(x, -1.0, u));
PetscCall(VecNorm(x, NORM_2, &norm));
PetscCall(KSPGetIterationNumber(ksp, &its));
PetscCall(PetscPrintf(PETSC_COMM_SELF, "Norm of error %g, Iterations %" PetscInt_FMT "\n", (double)norm, its));
/* check that KSP automatically handles the fact that the new nonzero values in the matrix are propagated to the KSP solver */
PetscCall(MatShift(A, 2.0));
PetscCall(KSPSolve(ksp, b, x));
/*
Free work space. All PETSc objects should be destroyed when they
are no longer needed.
*/
PetscCall(VecDestroy(&x));
PetscCall(VecDestroy(&u));
PetscCall(VecDestroy(&b));
PetscCall(MatDestroy(&A));
PetscCall(KSPDestroy(&ksp));
/*
Always call PetscFinalize() before exiting a program. This routine
- finalizes the PETSc libraries as well as MPI
- provides summary and diagnostic information if certain runtime
options are chosen (e.g., -log_view).
*/
PetscCall(PetscFinalize());
return 0;
}
Include Files#
The C/C++ include files for PETSc should be used via statements such as
#include <petscksp.h>
where petscksp.h is the include file for the linear solver library. Each PETSc program must specify an include file that corresponds to the highest level PETSc objects needed within the program; all of the required lower level include files are automatically included within the higher level files. For example, petscksp.h includes petscmat.h (matrices), petscvec.h (vectors), and petscsys.h (base PETSc file). The PETSc include files are located in the directory $PETSC_DIR/include. See Modules and Include Files for a discussion of PETSc include files in Fortran programs.
The Options Database#
As shown in Simple PETSc Examples, the user can input control data at run time using the options database. In this example the command PetscOptionsGetInt(NULL,NULL,"-n",&n,&flg); checks whether the user has provided a command line option to set the value of n, the problem dimension. If so, the variable n is set accordingly; otherwise, n remains unchanged. A complete description of the options database may be found in Runtime Options.
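The same pattern works for other datatypes; a short sketch (the option names -n and -tol are arbitrary examples chosen here for illustration):
PetscInt  n   = 10;          /* default, possibly overridden by -n <int> */
PetscReal tol = 1.e-6;       /* default, possibly overridden by -tol <real> */
PetscBool set = PETSC_FALSE; /* set to PETSC_TRUE if the option was given */

PetscCall(PetscOptionsGetInt(NULL, NULL, "-n", &n, &set));
PetscCall(PetscOptionsGetReal(NULL, NULL, "-tol", &tol, NULL));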
Vectors#
One creates a new parallel or sequential vector, x, of global dimension M with the commands
VecCreate(MPI_Comm comm,Vec *x);
VecSetSizes(Vec x,PetscInt m,PetscInt M);
where comm denotes the MPI communicator and m is the optional local size, which may be PETSC_DECIDE. The type of storage for the vector may be set with calls to either VecSetType() or VecSetFromOptions(). Additional vectors of the same type can be formed with
VecDuplicate(Vec old,Vec *new);
The commands
VecSet(Vec x,PetscScalar value);
VecSetValues(Vec x,PetscInt n,PetscInt *indices,PetscScalar *values,INSERT_VALUES);
respectively set all the components of a vector to a particular scalar value and assign a different value to each component. More detailed information about PETSc vectors, including their basic operations, scattering/gathering, index sets, and distributed arrays, is discussed in Chapter Vectors and Parallel Data.
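For example, a small sketch that creates a sequential vector of length 10, zeroes it, and inserts two entries (the index and value arrays here are arbitrary illustrations):
Vec         x;
PetscInt    ix[2]   = {0, 3};
PetscScalar vals[2] = {1.0, 2.0};

PetscCall(VecCreate(PETSC_COMM_SELF, &x));
PetscCall(VecSetSizes(x, PETSC_DECIDE, 10));
PetscCall(VecSetFromOptions(x));
PetscCall(VecSet(x, 0.0));
PetscCall(VecSetValues(x, 2, ix, vals, INSERT_VALUES));
PetscCall(VecAssemblyBegin(x)); /* required after VecSetValues() before the vector is used */
PetscCall(VecAssemblyEnd(x));
PetscCall(VecDestroy(&x));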
Note the use of the PETSc variable type PetscScalar in this example. PetscScalar is simply defined to be double in C/C++ (or, correspondingly, double precision in Fortran) for versions of PETSc that have not been compiled for use with complex numbers. The PetscScalar data type enables identical code to be used when the PETSc libraries have been compiled for use with complex numbers. Numbers discusses the use of complex numbers in PETSc programs.
Matrices#
Usage of PETSc matrices and vectors is similar. The user can create a new parallel or sequential matrix, A, which has M global rows and N global columns, with the routines
MatCreate(MPI_Comm comm,Mat *A);
MatSetSizes(Mat A,PETSC_DECIDE,PETSC_DECIDE,PetscInt M,PetscInt N);
where the matrix format can be specified at runtime via the options database. The user could alternatively specify each process's number of local rows and columns using m and n. Generally one then sets the "type" of the matrix, with, for example,
MatSetType(A,MATAIJ);
This causes the matrix A to use the compressed sparse row storage format to store the matrix entries. See MatType for a list of all matrix types. Values can then be set with the command
MatSetValues(Mat A,PetscInt m,PetscInt *im,PetscInt n,PetscInt *in,PetscScalar *values,INSERT_VALUES);
After all elements have been inserted into the matrix, it must be processed with the pair of commands
MatAssemblyBegin(Mat A,MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(Mat A,MAT_FINAL_ASSEMBLY);
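As a compact illustration of this sequence, a sketch that assembles a small sequential diagonal matrix (the size n = 4 and the value 2.0 are arbitrary choices for illustration):
Mat         A;
PetscInt    i, n = 4;
PetscScalar d = 2.0;

PetscCall(MatCreate(PETSC_COMM_SELF, &A));
PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
PetscCall(MatSetType(A, MATAIJ));
PetscCall(MatSetUp(A));
for (i = 0; i < n; i++) PetscCall(MatSetValues(A, 1, &i, 1, &i, &d, INSERT_VALUES));
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatDestroy(&A));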
Matrices discusses various matrix formats as well as the details of some basic matrix manipulation routines.
Linear Solvers#
After creating the matrix and vectors that define a linear system Ax = b, the user can then use KSP to solve the system with the following sequence of commands:
KSPCreate(MPI_Comm comm,KSP *ksp);
KSPSetOperators(KSP ksp,Mat Amat,Mat Pmat);
KSPSetFromOptions(KSP ksp);
KSPSolve(KSP ksp,Vec b,Vec x);
KSPDestroy(KSP *ksp);
The user first creates the KSP context and sets the operators associated with the system (the matrix that defines the linear system, Amat, and the matrix from which the preconditioner is constructed, Pmat). The user then sets various options for customized solution, solves the linear system, and finally destroys the KSP context. We emphasize the command KSPSetFromOptions(), which enables the user to customize the linear solution method at runtime by using the options database, which is discussed in Runtime Options. Through this database, the user not only can select an iterative method and preconditioner, but also can prescribe the convergence tolerance, set various monitoring routines, etc. (see, e.g., Profiling Programs).
KSP: Linear System Solvers describes in detail the KSP package, including the PC and KSP packages for preconditioners and Krylov subspace methods.
Nonlinear Solvers#
Most PDE problems of interest are inherently nonlinear. PETSc provides an interface for solving nonlinear problems directly, called SNES. SNES: Nonlinear Solvers describes the nonlinear solvers in detail. We highly recommend that most PETSc users work directly with SNES, rather than using PETSc for the linear problem and writing their own nonlinear solver.
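The typical SNES calling sequence looks like the following sketch; here x, r, J, FormFunction, FormJacobian, and user are hypothetical application-provided objects and routines, while the SNES calls are the standard PETSc API:
SNES snes;
PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
PetscCall(SNESSetFunction(snes, r, FormFunction, &user));    /* residual evaluation callback */
PetscCall(SNESSetJacobian(snes, J, J, FormJacobian, &user)); /* Jacobian evaluation callback */
PetscCall(SNESSetFromOptions(snes));                         /* allow runtime customization, e.g. -snes_monitor */
PetscCall(SNESSolve(snes, NULL, x));                         /* solve F(x) = 0 */
PetscCall(SNESDestroy(&snes));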
Error Checking#
As noted above, PETSc functions return a PetscErrorCode, which is an integer indicating whether an error has occurred during the call. Below, we show a traceback generated by error detection within a sample PETSc program. The error occurred on line 3618 of the file $PETSC_DIR/src/mat/impls/aij/seq/aij.c and was caused by trying to allocate too large an array in memory. The routine was called in the program ex3.c on line 66. See Error Checking for details regarding error checking when using the PETSc Fortran interface.
$ cd $PETSC_DIR/src/ksp/ksp/tutorials
$ make ex3
$ mpiexec -n 1 ./ex3 -m 100000
[0]PETSC ERROR: --------------------- Error Message --------------------------------
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
[0]PETSC ERROR: Memory allocated 11282182704 Memory used by process 7075897344
[0]PETSC ERROR: Try running with -malloc_dump or -malloc_view for info.
[0]PETSC ERROR: Memory requested 18446744072169447424
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.1-224-g9c9a9c5 GIT Date: 2016-05-18 22:43:00 -0500
[0]PETSC ERROR: ./ex3 on a arch-darwin-double-debug named Patricks-MacBook-Pro-2.local by patrick Mon Jun 27 18:04:03 2016
[0]PETSC ERROR: Configure options PETSC_DIR=/Users/patrick/petsc PETSC_ARCH=arch-darwin-double-debug --download-mpich --download-f2cblaslapack --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-debugging=1 --with-precision=double --with-scalar-type=real --with-viennacl=0 --download-c2html -download-sowing
[0]PETSC ERROR: #1 MatSeqAIJSetPreallocation_SeqAIJ() line 3618 in /Users/patrick/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /Users/patrick/petsc/src/sys/memory/mtr.c
[0]PETSC ERROR: #3 MatSeqAIJSetPreallocation_SeqAIJ() line 3618 in /Users/patrick/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #4 MatSeqAIJSetPreallocation() line 3562 in /Users/patrick/petsc/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: #5 main() line 66 in /Users/patrick/petsc/src/ksp/ksp/tutorials/ex3.c
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -m 100000
[0]PETSC ERROR: ----------------End of Error Message ------- send entire error message to [email protected]
When running the debug version of the PETSc libraries, PETSc performs a great deal of checking for memory corruption (writing outside of array bounds, etc.). The macro CHKMEMQ can be called anywhere in the code to check the current status of the memory for corruption. By putting several (or many) of these macros into your code you can usually track down in which small segment of your code the corruption has occurred. One can also use Valgrind to track down memory errors; see the FAQ.

For complete error handling, calls to MPI functions should be made with PetscCallMPI(MPI_Function(Args)). In the main Fortran program the calls should be PetscCallMPIA(MPI_Function(Args)).
PETSc has a small number of C/C++-only macros that do not explicitly return error codes. These are used in the style
XXXBegin(Args);
other code
XXXEnd();
and include PetscOptionsBegin(), PetscOptionsEnd(), PetscObjectOptionsBegin(), PetscOptionsHeadBegin(), PetscOptionsHeadEnd(), PetscDrawCollectiveBegin(), PetscDrawCollectiveEnd(), MatPreallocateBegin(), and MatPreallocateEnd(). These should not be checked for error codes.
Another class of functions with the Begin() and End() paradigm, including PetscLogBegin(), PetscLogEnd(), MatAssemblyBegin(), and MatAssemblyEnd(), does return error codes that should be checked.
PETSc also has a set of C/C++-only macros that return an object, or NULL if an error has been detected. These include PETSC_VIEWER_STDOUT_WORLD, PETSC_VIEWER_DRAW_WORLD, PETSC_VIEWER_STDOUT_(MPI_Comm), and PETSC_VIEWER_DRAW_(MPI_Comm).

Finally, PetscObjectComm((PetscObject)x) returns the communicator associated with the object x, or MPI_COMM_NULL if an error was detected.
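For example, a sketch combining these, where A is assumed to be an existing Mat:
/* PetscObjectComm() returns the communicator directly (or MPI_COMM_NULL on error), so it is not wrapped in PetscCall() */
MPI_Comm comm = PetscObjectComm((PetscObject)A);
/* PETSC_VIEWER_STDOUT_(comm) returns a viewer, or NULL if an error was detected */
PetscCall(MatView(A, PETSC_VIEWER_STDOUT_(comm)));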
Parallel and GPU Programming#
Numerical computing today has multiple levels of parallelism (concurrency):

- low-level, single instruction multiple data (SIMD) parallelism or, somewhat similar, on-GPU parallelism;
- medium-level, multiple instruction shared memory parallelism (thread parallelism); and
- high-level, distributed memory parallelism.
Traditional CPUs support the lower two levels via, for example, Intel AVX-like instructions (CPU SIMD parallelism) and Unix threads, often managed by using OpenMP pragmas (CPU OpenMP parallelism), (or multiple processes). GPUs also support the lower two levels via kernel functions (GPU kernel parallelism) and streams (GPU stream parallelism). Distributed memory parallelism is created by combining multiple CPUs and/or GPUs and using MPI for communication (MPI Parallelism).
In addition there is also concurrency between computations (floating point operations) and data movement (from memory to caches and registers and via MPI between distinct memory nodes).
PETSc provides support for all these levels of parallelism but its strongest support is for MPI-based distributed memory parallelism.
MPI Parallelism#
Since PETSc uses the message-passing model for parallel programming and employs MPI for all interprocessor communication, the user is free to employ MPI routines as needed throughout an application code. However, by default the user is shielded from many of the details of message passing within PETSc, since these are hidden within parallel objects, such as vectors, matrices, and solvers. In addition, PETSc provides tools such as generalized vector scatters/gathers to assist in the management of parallel data.
Recall that the user must specify a communicator upon creation of any PETSc object (such as a vector, matrix, or solver) to indicate the processors over which the object is to be distributed. For example, as mentioned above, some commands for matrix, vector, and linear solver creation are:
MatCreate(MPI_Comm comm,Mat *A);
VecCreate(MPI_Comm comm,Vec *x);
KSPCreate(MPI_Comm comm,KSP *ksp);
The creation routines are collective over all processors in the communicator; thus, all processors in the communicator must call the creation routine. In addition, if a sequence of collective routines is being used, they must be called in the same order on each processor.
The next example, given below, illustrates the solution of a linear system in parallel. This code, corresponding to KSP Tutorial ex2, handles the two-dimensional Laplacian discretized with finite differences, where the linear system is again solved with KSP. The code performs the same tasks as the sequential version within Simple PETSc Examples. Note that the user interface for initiating the program, creating vectors and matrices, and solving the linear system is exactly the same for the uniprocessor and multiprocessor examples. The primary difference between the examples in Simple PETSc Examples and here is that each processor forms only its local part of the matrix and vectors in the parallel case.
Listing: KSP Tutorial src/ksp/ksp/tutorials/ex2.c
static char help[] = "Solves a linear system in parallel with KSP.\n\
Input parameters include:\n\
-view_exact_sol : write exact solution vector to stdout\n\
-m <mesh_x> : number of mesh points in x-direction\n\
-n <mesh_y> : number of mesh points in y-direction\n\n";
/*
Include "petscksp.h" so that we can use KSP solvers.
*/
#include <petscksp.h>
int main(int argc, char **args)
{
Vec x, b, u; /* approx solution, RHS, exact solution */
Mat A; /* linear system matrix */
KSP ksp; /* linear solver context */
PetscReal norm; /* norm of solution error */
PetscInt i, j, Ii, J, Istart, Iend, m = 8, n = 7, its;
PetscBool flg;
PetscScalar v;
PetscFunctionBeginUser;
PetscCall(PetscInitialize(&argc, &args, (char *)0, help));
PetscCall(PetscOptionsGetInt(NULL, NULL, "-m", &m, NULL));
PetscCall(PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Compute the matrix and right-hand-side vector that define
the linear system, Ax = b.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
Create parallel matrix, specifying only its global dimensions.
When using MatCreate(), the matrix format can be specified at
runtime. Also, the parallel partitioning of the matrix is
determined by PETSc at runtime.
Performance tuning note: For problems of substantial size,
preallocation of matrix memory is crucial for attaining good
performance. See the matrix chapter of the users manual for details.
*/
PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, m * n, m * n));
PetscCall(MatSetFromOptions(A));
PetscCall(MatMPIAIJSetPreallocation(A, 5, NULL, 5, NULL));
PetscCall(MatSeqAIJSetPreallocation(A, 5, NULL));
PetscCall(MatSeqSBAIJSetPreallocation(A, 1, 5, NULL));
PetscCall(MatMPISBAIJSetPreallocation(A, 1, 5, NULL, 5, NULL));
PetscCall(MatMPISELLSetPreallocation(A, 5, NULL, 5, NULL));
PetscCall(MatSeqSELLSetPreallocation(A, 5, NULL));
/*
Currently, all PETSc parallel matrix formats are partitioned by
contiguous chunks of rows across the processors. Determine which
rows of the matrix are locally owned.
*/
PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
/*
Set matrix elements for the 2-D, five-point stencil in parallel.
- Each processor needs to insert only elements that it owns
locally (but any non-local elements will be sent to the
appropriate processor during matrix assembly).
- Always specify global rows and columns of matrix entries.
Note: this uses the less common natural ordering that orders first
all the unknowns for x = h then for x = 2h etc; Hence you see J = Ii +- n
instead of J = I +- m as you might expect. The more standard ordering
would first do all variables for y = h, then y = 2h etc.
*/
for (Ii = Istart; Ii < Iend; Ii++) {
v = -1.0;
i = Ii / n;
j = Ii - i * n;
if (i > 0) {
J = Ii - n;
PetscCall(MatSetValues(A, 1, &Ii, 1, &J, &v, ADD_VALUES));
}
if (i < m - 1) {
J = Ii + n;
PetscCall(MatSetValues(A, 1, &Ii, 1, &J, &v, ADD_VALUES));
}
if (j > 0) {
J = Ii - 1;
PetscCall(MatSetValues(A, 1, &Ii, 1, &J, &v, ADD_VALUES));
}
if (j < n - 1) {
J = Ii + 1;
PetscCall(MatSetValues(A, 1, &Ii, 1, &J, &v, ADD_VALUES));
}
v = 4.0;
PetscCall(MatSetValues(A, 1, &Ii, 1, &Ii, &v, ADD_VALUES));
}
/*
Assemble matrix, using the 2-step process:
MatAssemblyBegin(), MatAssemblyEnd()
Computations can be done while messages are in transition
by placing code between these two statements.
*/
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
/* A is symmetric. Set symmetric flag to enable ICC/Cholesky preconditioner */
PetscCall(MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE));
/*
Create parallel vectors.
- We form 1 vector from scratch and then duplicate as needed.
- When using VecCreate(), VecSetSizes and VecSetFromOptions()
in this example, we specify only the
vector's global dimension; the parallel partitioning is determined
at runtime.
- When solving a linear system, the vectors and matrices MUST
be partitioned accordingly. PETSc automatically generates
appropriately partitioned matrices and vectors when MatCreate()
and VecCreate() are used with the same communicator.
- The user can alternatively specify the local vector and matrix
dimensions when more sophisticated partitioning is needed
(replacing the PETSC_DECIDE argument in the VecSetSizes() statement
below).
*/
PetscCall(VecCreate(PETSC_COMM_WORLD, &u));
PetscCall(VecSetSizes(u, PETSC_DECIDE, m * n));
PetscCall(VecSetFromOptions(u));
PetscCall(VecDuplicate(u, &b));
PetscCall(VecDuplicate(b, &x));
/*
Set exact solution; then compute right-hand-side vector.
By default we use an exact solution of a vector with all
elements of 1.0;
*/
PetscCall(VecSet(u, 1.0));
PetscCall(MatMult(A, u, b));
/*
View the exact solution vector if desired
*/
flg = PETSC_FALSE;
PetscCall(PetscOptionsGetBool(NULL, NULL, "-view_exact_sol", &flg, NULL));
if (flg) PetscCall(VecView(u, PETSC_VIEWER_STDOUT_WORLD));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Create the linear solver and set various options
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
/*
Set operators. Here the matrix that defines the linear system
also serves as the preconditioning matrix.
*/
PetscCall(KSPSetOperators(ksp, A, A));
/*
Set linear solver defaults for this problem (optional).
- By extracting the KSP and PC contexts from the KSP context,
we can then directly call any KSP and PC routines to set
various options.
- The following two statements are optional; all of these
parameters could alternatively be specified at runtime via
KSPSetFromOptions(). All of these defaults can be
overridden at runtime, as indicated below.
*/
PetscCall(KSPSetTolerances(ksp, 1.e-2 / ((m + 1) * (n + 1)), 1.e-50, PETSC_DEFAULT, PETSC_DEFAULT));
/*
Set runtime options, e.g.,
-ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
These options will override those specified above as long as
KSPSetFromOptions() is called _after_ any other customization
routines.
*/
PetscCall(KSPSetFromOptions(ksp));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Solve the linear system
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(KSPSolve(ksp, b, x));
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Check the solution and clean up
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
PetscCall(VecAXPY(x, -1.0, u));
PetscCall(VecNorm(x, NORM_2, &norm));
PetscCall(KSPGetIterationNumber(ksp, &its));
/*
Print convergence information. PetscPrintf() produces a single
print statement from all processes that share a communicator.
An alternative is PetscFPrintf(), which prints to a file.
*/
PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Norm of error %g iterations %" PetscInt_FMT "\n", (double)norm, its));
/*
Free work space. All PETSc objects should be destroyed when they
are no longer needed.
*/
PetscCall(KSPDestroy(&ksp));
PetscCall(VecDestroy(&u));
PetscCall(VecDestroy(&x));
PetscCall(VecDestroy(&b));
PetscCall(MatDestroy(&A));
/*
Always call PetscFinalize() before exiting a program. This routine
- finalizes the PETSc libraries as well as MPI
- provides summary and diagnostic information if certain runtime
options are chosen (e.g., -log_view).
*/
PetscCall(PetscFinalize());
return 0;
}
CPU SIMD parallelism#
SIMD parallelism occurs most commonly in the Intel advanced vector extensions (AVX) families of instructions (see https://en.wikipedia.org/wiki/Advanced_Vector_Extensions). It may be used automatically by the optimizing compiler, in low-level libraries that PETSc uses such as BLAS (see BLIS, https://github.com/flame/blis), or, rarely, directly in PETSc C/C++ code, as in MatMult_SeqSELL (https://petsc.org/main/src/mat/impls/sell/seq/sell.c.html#MatMult_SeqSELL).
CPU OpenMP parallelism#
OpenMP parallelism is thread parallelism. Multiple threads (independent streams of instructions) process data and perform computations on different parts of memory that is shared (accessible) to all of the threads. The OpenMP model is most-often based on inserting pragmas into code indicating that a series of instructions (often within a loop) can be run in parallel. This is also called a fork-join model of parallelism, since much of the code remains sequential and only the computationally expensive parts in the ‘parallel region’ are parallel. OpenMP thus makes it relatively easy to add some degree of parallelism to a conventional sequential code in a shared memory environment.
POSIX threads (pthreads) is a library that may be called from C/C++. The library contains routines to create, join, and remove threads plus manage communications and synchronizations between threads. Pthreads is rarely used directly in numerical libraries and applications. Sometimes OpenMP is implemented on top of pthreads.
If one adds OpenMP parallelism to an MPI code one must make sure not to over-subscribe the hardware resources. For example, if MPI already has one rank per hardware core then using four OpenMP threads per MPI rank will slow the code down since now one core will need to switch back and forth between four OpenMP threads. There are limited practical advantages to a combined MPI and OpenMP model in PETSc, but it is possible.
For application codes that use certain external packages, including BLAS/LAPACK, SuperLU_DIST, MUMPS, MKL, and SuiteSparse, one can build PETSc and these packages to take advantage of OpenMP by using the configure option --with-openmp. The number of OpenMP threads used in the application can be controlled with the PETSc command line option -omp_num_threads <num> or the environment variable OMP_NUM_THREADS. Running a PETSc program with -omp_view will display the number of threads being used. The default number is often absurdly high for the given hardware, so we recommend always setting it appropriately.

Users can also put OpenMP pragmas into their own code. However, since standard PETSc is not thread-safe, they should not, in general, call PETSc routines from inside parallel regions.
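For example, a sketch of a user-level OpenMP loop operating on the raw array of a PETSc vector, with all PETSc calls made outside the parallel region (x is assumed to be an existing Vec):
PetscScalar *a;
PetscInt     i, nlocal;

PetscCall(VecGetLocalSize(x, &nlocal));
PetscCall(VecGetArray(x, &a));            /* PETSc call made outside the parallel region */
#pragma omp parallel for
for (i = 0; i < nlocal; i++) a[i] *= 2.0; /* pure array work; no PETSc calls inside */
PetscCall(VecRestoreArray(x, &a));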
PETSc MPI-based linear solvers may be accessed from a sequential or OpenMP program with the PCMPI solver wrapper; see Using a MPI parallel linear solver from a non-MPI program.

There is an OpenMP thread-safe subset of PETSc that may be configured using --with-threadsafety [--with-openmp or --download-concurrencykit]. KSP Tutorial ex61f demonstrates how this may be used with OpenMP. In this mode one may have individual OpenMP threads that each manage their own (sequential) PETSc objects (each thread can interact only with its own objects). This is useful when one has many small systems (or sets of ODEs) that must be integrated in an "embarrassingly parallel" fashion on multicore systems.
See also: Edward A. Lee, The Problem with Threads, Technical Report No. UCB/EECS-2006-1, January 10, 2006 [DOI].
GPU kernel parallelism#
GPUs offer at least two levels of clearly defined parallelism. Kernel-level parallelism is much like SIMD parallelism applied to loops; many different "iterations" of the loop index run on different hardware but in "lock-step" at the same time. PETSc utilizes this parallelism with three similar, but slightly different, models:

- CUDA, which is provided by NVIDIA and runs on NVIDIA GPUs;
- HIP, provided by AMD, which can, in theory, run on both AMD and NVIDIA GPUs; and
- Kokkos, an open-source package that provides a slightly higher-level programming model to utilize GPU kernels.

To utilize this, one configures PETSc with either --with-cuda or --with-hip and, if planning to use Kokkos, also --with-kokkos --with-kokkos-kernels.
In the GPU programming model that PETSc uses the GPU memory is distinct from the CPU memory. This means that data that resides on the CPU memory must be copied to the GPU (often this copy is done automatically by the libraries and the user does not need to manage it) if one wishes to use the GPU computational power on it. This memory copy is slow compared to the GPU speed hence it is crucial to minimize these copies. This often translates to trying to do almost all the computation on the GPU and not constantly switching between computations on the CPU and the GPU on the same data.
PETSc utilizes GPUs by providing vector and matrix classes (Vec and Mat) that are specifically written to run fast on the GPU. However, since it is difficult to write an entire PETSc code that runs only on the GPU, one can also access and work with (for example, put entries into) the vectors and matrices on the CPU. These classes include the vector types VECCUDA, VECKOKKOS, and VECHIP and the matrix types MATAIJCUSPARSE and MATAIJKOKKOS (matrices are not yet supported from PETSc with HIP).
More details on using GPUs from PETSc will follow in this document.
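For example, a sketch of creating a CUDA vector (this assumes PETSc was configured with --with-cuda; using VecSetFromOptions() instead of VecSetType() would allow selecting the type at runtime with -vec_type cuda):
Vec x;
PetscCall(VecCreate(PETSC_COMM_WORLD, &x));
PetscCall(VecSetSizes(x, PETSC_DECIDE, 100));
PetscCall(VecSetType(x, VECCUDA)); /* vector data lives in GPU memory */
PetscCall(VecSet(x, 1.0));         /* executed on the GPU */
PetscCall(VecDestroy(&x));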
GPU stream parallelism#
Incomplete
Compiling and Running Programs#
The output below illustrates compiling and running a PETSc program using MPICH on a macOS laptop. Note that different machines will have compilation commands as determined by the configuration process. See Writing C/C++ or Fortran Applications for a discussion about how to compile your PETSc programs. Users who are experiencing difficulties linking PETSc programs should refer to the FAQ.
$ cd $PETSC_DIR/src/ksp/ksp/tutorials
$ make ex2
/Users/patrick/petsc/arch-darwin-double-debug/bin/mpicc -o ex2.o -c -g3 -I/Users/patrick/petsc/include -I/Users/patrick/petsc/arch-darwin-double-debug/include -I/opt/X11/include -I/opt/local/include `pwd`/ex2.c
/Users/patrick/petsc/arch-darwin-double-debug/bin/mpicc -g3 -o ex2 ex2.o -Wl,-rpath,/Users/patrick/petsc/arch-darwin-double-debug/lib -L/Users/patrick/petsc/arch-darwin-double-debug/lib -lpetsc -lf2clapack -lf2cblas -lmpifort -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -ldl -lmpi -lpmpi -lSystem
/bin/rm -f ex2.o
$ $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 1 ./ex2
Norm of error 0.000156044 iterations 6
$ $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 2 ./ex2
Norm of error 0.000411674 iterations 7
Profiling Programs#
The option -log_view activates printing of a performance summary, including times, floating point operation (flop) rates, and message-passing activity. Profiling provides details about profiling, including interpretation of the output data below. This particular example involves the solution of a linear system on one processor using GMRES and ILU. The low floating point operation (flop) rates in this example are due to the fact that the code solved a tiny system. We include this example merely to demonstrate the ease of extracting performance information.
$ $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 1 ./ex1 -n 1000 -pc_type ilu -ksp_type gmres -ksp_rtol 1.e-7 -log_view
...
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
VecMDot 1 1.0 3.2830e-06 1.0 2.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 609
VecNorm 3 1.0 4.4550e-06 1.0 6.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 14 0 0 0 1346
VecScale 2 1.0 4.0110e-06 1.0 2.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 499
VecCopy 1 1.0 3.2280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 11 1.0 2.5537e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
VecAXPY 2 1.0 2.0930e-06 1.0 4.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 1911
VecMAXPY 2 1.0 1.1280e-06 1.0 4.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 3546
VecNormalize 2 1.0 9.3970e-06 1.0 6.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 1 14 0 0 0 638
MatMult 2 1.0 1.1177e-05 1.0 9.99e+03 1.0 0.0e+00 0.0e+00 0.0e+00 1 24 0 0 0 1 24 0 0 0 894
MatSolve 2 1.0 1.9933e-05 1.0 9.99e+03 1.0 0.0e+00 0.0e+00 0.0e+00 1 24 0 0 0 1 24 0 0 0 501
MatLUFactorNum 1 1.0 3.5081e-05 1.0 4.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 2 10 0 0 0 2 10 0 0 0 114
MatILUFactorSym 1 1.0 4.4259e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
MatAssemblyBegin 1 1.0 8.2015e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 3.3536e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatGetRowIJ 1 1.0 1.5960e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 3.9791e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
MatView 2 1.0 6.7909e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
KSPGMRESOrthog 1 1.0 7.5970e-06 1.0 4.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 1 10 0 0 0 1 10 0 0 0 526
KSPSetUp 1 1.0 3.4424e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
KSPSolve 1 1.0 2.7264e-04 1.0 3.30e+04 1.0 0.0e+00 0.0e+00 0.0e+00 19 79 0 0 0 19 79 0 0 0 121
PCSetUp 1 1.0 1.5234e-04 1.0 4.00e+03 1.0 0.0e+00 0.0e+00 0.0e+00 11 10 0 0 0 11 10 0 0 0 26
PCApply 2 1.0 2.1022e-05 1.0 9.99e+03 1.0 0.0e+00 0.0e+00 0.0e+00 1 24 0 0 0 1 24 0 0 0 475
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 8 8 76224 0.
Matrix 2 2 134212 0.
Krylov Solver 1 1 18400 0.
Preconditioner 1 1 1032 0.
Index Set 3 3 10328 0.
Viewer 1 0 0 0.
========================================================================================================================
...
Writing C/C++ or Fortran Applications#
The examples throughout the library demonstrate the software usage and can serve as templates for developing custom applications. We suggest that new PETSc users examine programs in the directories $PETSC_DIR/src/<library>/tutorials, where <library> denotes any of the PETSc libraries (listed in the following section), such as SNES, KSP, or TS. The manual pages located at https://petsc.org/release/documentation/ provide links (organized by both routine names and concepts) to the tutorial examples.
To develop an application program that uses PETSc, we suggest the following:
For completely new applications:

1. Make a directory for your source code; for example, mkdir $HOME/application
2. Change to that directory; for example, cd $HOME/application
3. Copy an example in the directory that corresponds to the problems of interest into your directory, for example, cp $PETSC_DIR/src/snes/tutorials/ex19.c ex19.c
4. Select an application build process. The PETSC_DIR (and PETSC_ARCH, if the --prefix=directoryname option was not used when configuring PETSc) environment variable(s) must be set for any of these approaches.
   - make (recommended). Copy $PETSC_DIR/share/petsc/Makefile.user or $PETSC_DIR/share/petsc/Makefile.basic.user to your directory, for example, cp $PETSC_DIR/share/petsc/Makefile.user makefile. Examine the comments in your makefile. Makefile.user uses the pkg-config tool and is the recommended approach. Use make ex19 to compile your program.
   - CMake. Copy $PETSC_DIR/share/petsc/CMakeLists.txt to your directory, for example, cp $PETSC_DIR/share/petsc/CMakeLists.txt CMakeLists.txt. Edit CMakeLists.txt, read the comments on usage, and change the name of the application from ex1 to your application executable name.
5. Run the program, for example, ./ex19
6. Start to modify the program for developing your application.
For adding PETSc to an existing application:

1. Start with a working version of your code that you build and run to confirm that it works.
2. Upgrade your build process. The PETSC_DIR (and PETSC_ARCH, if the --prefix=directoryname option was not used when configuring PETSc) environment variable(s) must be set for any of these approaches.
   - Using make. Update the application makefile to add the appropriate PETSc include directories and libraries.
     - Recommended approach. Examine the comments in $PETSC_DIR/share/petsc/Makefile.user and transfer selected portions of that file to your makefile.
     - Minimalist. Add the line
       include ${PETSC_DIR}/lib/petsc/conf/variables
       to the bottom of your makefile. This will provide a set of PETSc-specific make variables you may use in your makefile. See the comments in the file $PETSC_DIR/share/petsc/Makefile.basic.user for details on the usage. (A minimal sketch of such a makefile appears after this list.)
     - Simple, but hands the build process over to PETSc's control. Add the lines
       include ${PETSC_DIR}/lib/petsc/conf/variables
       include ${PETSC_DIR}/lib/petsc/conf/rules
       to the bottom of your makefile. See the comments in the file $PETSC_DIR/share/petsc/Makefile.basic.user for details on the usage. Since PETSc's rules now control the build process, you will likely need to simplify and remove much of your makefile.
     - Not recommended, since you must change your makefile for each new configuration/computing system. This approach does not require that the environment variable PETSC_DIR be set when building your application, since the information will be hardwired in your makefile. Run the following command in the PETSc root directory to get the information needed by your makefile:
       $ make getlinklibs getincludedirs getcflags getcxxflags getfortranflags getccompiler getfortrancompiler getcxxcompiler
       All the libraries listed need to be linked into your executable and the include directories and flags need to be passed to the compiler(s). Usually this is done by setting LDFLAGS=<list of library flags and libraries>, CFLAGS=<list of -I and other flags>, FFLAGS=<list of -I and other flags>, etc. in your makefile.
   - Using CMake. Update the application CMakeLists.txt by examining the code and comments in $PETSC_DIR/share/petsc/CMakeLists.txt
3. Rebuild your application and ensure it still runs correctly.
4. Add a PetscInitialize() near the beginning of your code and PetscFinalize() near the end, with appropriate include commands (and use commands in Fortran).
5. Rebuild your application and ensure it still runs correctly.
6. Slowly start utilizing PETSc functionality in your code, ensuring that your code continues to build and run correctly.
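As referenced in the Minimalist item above, a sketch of a tiny application makefile, under the assumption that conf/variables provides the usual PETSc make variables (PETSC_CC_INCLUDES, PETSC_LIB, CC); the names app and app.c are placeholders, and make requires a tab to indent the recipe line:
include ${PETSC_DIR}/lib/petsc/conf/variables

app: app.c
	${CC} -o app app.c ${PETSC_CC_INCLUDES} ${PETSC_LIB}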
PETSc’s Object-Oriented Design#
Though PETSc has a large API, conceptually it is rather simple. There are three abstract basic data objects (classes): index sets (IS), vectors (Vec), and matrices (Mat), plus a larger number of abstract algorithm objects (classes), starting with preconditioners (PC), Krylov solvers (KSP), and so forth.

Let Object represent any of these objects. Objects are created with
Object obj;
ObjectCreate(MPI_Comm, &obj);
The object is empty and little can be done with it. A particular implementation of the class is associated with the object by setting the object's "type", where the type is merely a string name of an implementation class, using
Object obj;
ObjectSetType(obj,"Name");
Some objects support subclasses which are specializations of the type. These are set with
Object obj;
ObjectNameSetType(obj,"SubName");
For example, within TS
one may do
TS obj;
TSCreate(PETSC_COMM_WORLD,&obj);
TSSetType(obj,TSARKIMEX);
TSARKIMEXSetType(obj,TSARKIMEX3);
The abstract class TS can embody any ODE/DAE integrator scheme. This example creates an additive Runge-Kutta ODE/DAE IMEX integrator, whose type name is TSARKIMEX, using a third-order scheme with an L-stable implicit part, whose subtype name is TSARKIMEX3.
In order to allow PETSc objects to be runtime configurable, PETSc objects provide a universal way of selecting types (classes) and subtypes at runtime, from what is referred to as the “options database”. The code above can be replaced with
TS obj;
TSCreate(PETSC_COMM_WORLD,&obj);
TSSetFromOptions(obj);
Now both the type and subtype can be conveniently set from the command line:
$ ./app -ts_type arkimex -ts_arkimex_type 3
The object's type (implementation class) or subclass can also be changed at any time simply by calling TSSetType() again (though in order to override command line options the call to TSSetType() must be made _after_ TSSetFromOptions()). For example:
// (if set) command line options "override" TSSetType()
TSSetType(ts, TSGLLE);
TSSetFromOptions(ts);
// TSSetType() overrides command line options
TSSetFromOptions(ts);
TSSetType(ts, TSGLLE);
Since the later call always overrides the earlier call, the second form shown is rarely, if ever, used, as it is less flexible than configuring via command line settings.
The standard methods on an object are of the general form
ObjectSetXXX(obj,...);
ObjectGetXXX(obj,...);
ObjectYYY(obj,...);
For example
TSSetRHSFunction(obj,...)
Particular types and subtypes of objects may have their own methods, which are given in the form
ObjectNameSetXXX(obj,...);
ObjectNameGetXXX(obj,...);
ObjectNameYYY(obj,...);
and
ObjectNameSubNameSetXXX(obj,...);
ObjectNameSubNameGetXXX(obj,...);
ObjectNameSubNameYYY(obj,...);
where Name and SubName are the type and subtype names (for example, as above, TSARKIMEX and 3). Most "set" operations have options database versions with the same names in lower case, separated by underscores, and with the "Set" removed. For example,
KSPGMRESSetRestart(obj,30); // ignored if the type is not KSPGMRES
can be set at the command line with
$ ./app -ksp_gmres_restart 30
There are a special subset of type-specific methods that are ignored if the type does not match the function name. These are usually setter functions that control some aspect specific to the subtype. For example,
KSPGMRESSetRestart(obj,30); // ignored if the type is not KSPGMRES
These allow cleaner code, since it does not need a multitude of if statements to avoid calling inactive methods. That is, one does not need to write code like
if (type == KSPGMRES) { // unneeded clutter
KSPGMRESSetRestart(obj,30);
}
There are many “get” routines that give one temporary access to the internal data of an object. They are used in the style
XXX xxx;
ObjectGetXXX(obj,&xxx);
// use xxx
ObjectRestoreXXX(obj,&xxx);
Objects obtained with a “get” routine should be returned with a “restore” routine, generally within the same function. Objects obtained with a “create” routine should be freed with a “destroy” routine.
There may be variants of the “get” routines that give more limited access to the obtained object. For example,
const PetscScalar *x;
// specialized variant of VecGetArray()
VecGetArrayRead(vec, &x);
// one can read but not write with x[]
PetscReal y = 2*x[0];
// don't forget to restore x after you are done with it
VecRestoreArrayRead(vec, &x);
Objects can be displayed (in a large number of ways) with
ObjectView(obj,PetscViewer viewer);
ObjectViewFromOptions(obj,...);
where PetscViewer is an abstract object that can represent standard output, an ASCII or binary file, a graphical window, etc. The second variant allows the user to delay until runtime the decision of which viewer and format to use to view the object, or whether to view the object at all.
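For example, a sketch using the second variant for a matrix A (the option name -my_mat_view is an arbitrary choice):
/* nothing is printed unless the program is run with, e.g., -my_mat_view or -my_mat_view binary:A.dat */
PetscCall(MatViewFromOptions(A, NULL, "-my_mat_view"));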
Objects are destroyed with
ObjectDestroy(&obj)
User Callbacks#
In many situations the user may also wish to override or provide custom functionality. This is handled via callbacks which the library will call at the appropriate time. The most general callback is provided by
ObjectSetCallback(obj, callbackfunction(), void *ctx, callbackdestroy(void *ctx));
where callbackfunction() is what is used by the library, ctx is an optional data structure (array, struct, PETSc object) that is used by callbackfunction(), and callbackdestroy(void *ctx) is an optional function that will be called when obj is destroyed. The use of callbackdestroy() allows users to "set and forget" data structures that will not be needed elsewhere but still need to be cleaned up when no longer needed. Here is an example of the use of a full-fledged callback:
TS ts;
TSMonitorLGCtx *ctx;
TSMonitorLGCtxCreate(..., &ctx)
TSMonitorSet(ts, TSMonitorLGTimeStep, ctx, (PetscErrorCode(*)(void **))TSMonitorLGCtxDestroy);
TSSolve(ts);
Occasionally routines to set callback functions take additional data objects that will be used by the object but are not context data for the function. For example,
SNESSetFunction(SNES snes,Vec r,PetscErrorCode (*f)(SNES,Vec,Vec,void*),void *ctx);
The r vector is an optional argument provided by the user which will be used as work-space by SNES. Note that this callback does not provide a way for the user to have the ctx destroyed when the SNES object is destroyed; the user must ensure that it is freed at an appropriate time. There is no consistent logic to the various ways PETSc accepts callback functions in different places in the code.
See Tao use of PETSc and callbacks for a cartoon on the use of callbacks in Tao.
Directory Structure#
We conclude this introduction with an overview of the organization of the PETSc software. The root directory of PETSc contains the following directories:
- doc (only in the tarball distribution of PETSc; not the git repository) - All documentation for PETSc. The file manual.pdf contains the hyperlinked users manual, suitable for printing or on-screen viewing. Includes the subdirectory manualpages (on-line manual pages).
- lib/petsc/conf - Base PETSc configuration files that define the standard make variables and rules used by PETSc.
- include - All include files for PETSc that are visible to the user.
- include/petsc/finclude - PETSc Fortran include files.
- include/petsc/private - Private PETSc include files that should not need to be used by application programmers.
- share - Some small test matrices in data files.
- src - The source code for all PETSc libraries, which currently includes
  - vec - vectors,
    - is - index sets,
  - mat - matrices,
  - ksp - complete linear equations solvers,
    - ksp - Krylov subspace accelerators,
    - pc - preconditioners,
  - snes - nonlinear solvers,
  - ts - ODE solvers and timestepping,
  - dm - data management between meshes and solvers, vectors, and matrices,
  - sys - general system-related routines,
    - logging - PETSc logging and profiling routines,
    - classes - low-level classes
      - draw - simple graphics,
      - viewer - mechanism for printing and visualizing PETSc objects,
      - bag - mechanism for saving and loading from disk user data stored in C structs,
      - random - random number generators.
Each PETSc source code library directory has the following subdirectories:
- tutorials - Programs designed to teach users about PETSc. These codes can serve as templates for the design of custom applications.
- tests - Programs designed for thorough testing of PETSc. As such, these codes are not intended for examination by users.
- interface - Provides the abstract base classes for the objects. Code here does not know about particular implementations and does not actually perform operations on the underlying numerical data.
- impls - Source code for one or more implementations of the class for particular data structures or algorithms.
- utils - Utility routines. Source here may know about the implementations, but ideally will not know about implementations for other components.