The PETSc Kernel#
PETSc provides a variety of basic services for writing scalable,
component-based libraries; these are referred to as the PETSc kernel [BGMS98].
The source code for the kernel is in src/sys
. It contains systematic
support for
managing PETSc types,
error handling,
memory management,
profiling,
object management,
Fortran interfaces (see [BBK+15])
mechanism for generating appropriate citations for algorithms and software used in PETSc (see [KBMS13])
file I/O,
an options database, and
objects and code for viewing, drawing, and displaying data and solver objects.
Each of these is discussed in a section below.
PETSc Types#
For maximum flexibility, the basic data types int
, double
, and
so on are not used in PETSc source code. Rather, it has
PetscInt
can be set using configure
to be either int
(32
bit, the default) or long long
(64-bit, with
configure –with-64-bit-indices
) to allow indexing into very large
arrays. PetscMPIInt
is used for integers passed to MPI as counts and
sizes. These are always int
since that is what the MPI standard
uses. Similarly, PetscBLASInt
is for counts, and so on passed to
BLAS and LAPACK routines. These are almost always int
unless one is
using a special “64-bit integer” BLAS/LAPACK (this is available, for
example, with Intel’s MKL and OpenBLAS).
In addition, there are special types:
These are currently always int
, but their use clarifies the code.
Implementation of Error Handling#
PETSc uses a “call error handler; then (depending on result) return
error code” model when problems are detected in the running code. The
public include file for error handling is
include/petscerror.h,
and the source code for the PETSc error handling is in
src/sys/error/
.
Simplified Interface#
The simplified macro-based interface consists of the following two calls:
The macro SETERRQ()
is given by
It calls the error handler with the current function name and location:
line number, and file, plus an error code and an error message. Normally
comm
is PETSC_COMM_SELF
; it can be another communicator only if
one is absolutely sure the same error will be generated on all processes
in the communicator. This feature is to prevent the same error message
from being printed by many processes.
The macro PetscCall()
is defined by
#define PetscCall(...) \
The message passed to SETERRQ()
is treated as a printf()
-style
format string, with all additional parameters passed after the string as
its arguments. For example:
SETERRQ(comm,PETSC_ERR,"Iteration overflow: its %" PetscInt_FMT " norm %g",its,(double)norm);
Error Handlers#
The error-handling function PetscError()
calls the “current” error
handler with the code
PetscErrorCode PetscError(MPI_Comm comm, int line, const char *func, const char *file, PetscErrorCode n, PetscErrorType p, const char *mess, ...)
{
va_list Argp;
size_t fullLength;
char buf[2048], *lbuf = NULL;
PetscBool ismain;
PetscErrorCode ierr;
if (!PetscErrorHandlingInitialized) return n;
if (comm == MPI_COMM_NULL) comm = PETSC_COMM_SELF;
/* Compose the message evaluating the print format */
if (mess) {
va_start(Argp, mess);
ierr = PetscVSNPrintf(buf, 2048, mess, &fullLength, Argp);
va_end(Argp);
lbuf = buf;
if (p == PETSC_ERROR_INITIAL) ierr = PetscStrncpy(PetscErrorBaseMessage, lbuf, sizeof(PetscErrorBaseMessage));
}
if (p == PETSC_ERROR_INITIAL && n != PETSC_ERR_MEMC) ierr = PetscMallocValidate(__LINE__, PETSC_FUNCTION_NAME, __FILE__);
if (!eh) ierr = PetscTraceBackErrorHandler(comm, line, func, file, n, p, lbuf, NULL);
else ierr = (*eh->handler)(comm, line, func, file, n, p, lbuf, eh->ctx);
PetscStackClearTop;
/*
If this is called from the main() routine we abort the program.
We cannot just return because them some MPI processes may continue to attempt to run
while this process simply exits.
*/
if (func) {
PetscErrorCode cmp_ierr = PetscStrncmp(func, "main", 4, &ismain);
if (ismain) {
if (petscwaitonerrorflg) cmp_ierr = PetscSleep(1000);
(void)cmp_ierr;
PETSCABORT(comm, ierr);
}
}
#if defined(PETSC_CLANGUAGE_CXX)
if (p == PETSC_ERROR_IN_CXX) PetscCxxErrorThrow();
#endif
return ierr;
}
/*@C
PetscIntView - Prints an array of integers; useful for debugging.
Collective
Input Parameters:
+ N - number of integers in array
. idx - array of integers
- viewer - location to print array, `PETSC_VIEWER_STDOUT_WORLD`, `PETSC_VIEWER_STDOUT_SELF` or 0
Level: intermediate
Note:
This may be called from within the debugger
Developer Note:
`idx` cannot be const because may be passed to binary viewer where temporary byte swapping may be done
.seealso: `PetscViewer`, `PetscRealView()`
@*/
PetscErrorCode PetscIntView(PetscInt N, const PetscInt idx[], PetscViewer viewer)
{
PetscMPIInt rank, size;
PetscInt j, i, n = N / 20, p = N % 20;
PetscBool iascii, isbinary;
MPI_Comm comm;
PetscFunctionBegin;
if (!viewer) viewer = PETSC_VIEWER_STDOUT_SELF;
PetscValidHeaderSpecific(viewer, PETSC_VIEWER_CLASSID, 3);
if (N) PetscAssertPointer(idx, 2);
PetscCall(PetscObjectGetComm((PetscObject)viewer, &comm));
PetscCallMPI(MPI_Comm_size(comm, &size));
PetscCallMPI(MPI_Comm_rank(comm, &rank));
PetscCall(PetscObjectTypeCompare((PetscObject)viewer, PETSCVIEWERASCII, &iascii));
PetscCall(PetscObjectTypeCompare((PetscObject)viewer, PETSCVIEWERBINARY, &isbinary));
if (iascii) {
PetscCall(PetscViewerASCIIPushSynchronized(viewer));
for (i = 0; i < n; i++) {
if (size > 1) {
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "[%d] %" PetscInt_FMT ":", rank, 20 * i));
} else {
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "%" PetscInt_FMT ":", 20 * i));
}
for (j = 0; j < 20; j++) PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, " %" PetscInt_FMT, idx[i * 20 + j]));
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "\n"));
}
if (p) {
if (size > 1) {
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "[%d] %" PetscInt_FMT ":", rank, 20 * n));
} else {
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "%" PetscInt_FMT ":", 20 * n));
}
for (i = 0; i < p; i++) PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, " %" PetscInt_FMT, idx[20 * n + i]));
PetscCall(PetscViewerASCIISynchronizedPrintf(viewer, "\n"));
}
PetscCall(PetscViewerFlush(viewer));
PetscCall(PetscViewerASCIIPopSynchronized(viewer));
} else if (isbinary) {
PetscMPIInt *sizes, Ntotal, *displs, NN;
PetscInt *array;
PetscCall(PetscMPIIntCast(N, &NN));
if (size > 1) {
if (rank) {
PetscCallMPI(MPI_Gather(&NN, 1, MPI_INT, NULL, 0, MPI_INT, 0, comm));
PetscCallMPI(MPI_Gatherv((void *)idx, NN, MPIU_INT, NULL, NULL, NULL, MPIU_INT, 0, comm));
} else {
PetscCall(PetscMalloc1(size, &sizes));
PetscCallMPI(MPI_Gather(&NN, 1, MPI_INT, sizes, 1, MPI_INT, 0, comm));
Ntotal = sizes[0];
PetscCall(PetscMalloc1(size, &displs));
displs[0] = 0;
for (i = 1; i < size; i++) {
Ntotal += sizes[i];
displs[i] = displs[i - 1] + sizes[i - 1];
}
PetscCall(PetscMalloc1(Ntotal, &array));
PetscCallMPI(MPI_Gatherv((void *)idx, NN, MPIU_INT, array, sizes, displs, MPIU_INT, 0, comm));
PetscCall(PetscViewerBinaryWrite(viewer, array, Ntotal, PETSC_INT));
PetscCall(PetscFree(sizes));
PetscCall(PetscFree(displs));
PetscCall(PetscFree(array));
}
} else {
PetscCall(PetscViewerBinaryWrite(viewer, idx, N, PETSC_INT));
}
} else {
const char *tname;
PetscCall(PetscObjectGetName((PetscObject)viewer, &tname));
SETERRQ(PETSC_COMM_SELF, PETSC_ERR_SUP, "Cannot handle that PetscViewer of type %s", tname);
}
PetscFunctionReturn(PETSC_SUCCESS);
}
You can set a new error handler with the command
PetscPushErrorHandler()
, which maintains a linked list of error
handlers. The most recent error handler is removed via
PetscPopErrorHandler()
.
PETSc provides several default error handlers:
PetscTraceBackErrorHandler()
, the default;PetscAbortErrorHandler()
, called with-onerrorabort
, useful when running in the debugger;PetscReturnErrorHandler()
, which returns up the stack without printing error messages;PetscMPIAbortErrorHandler()
, which callsMPIAbort()
after printing the error message; andPetscAttachDebuggerErrorHandler()
, called with-onerrorattachdebugger
.
Error Codes#
The PETSc error handler takes an error code. The generic error codes are
defined in
include/petscerror.h.
The same error code is used many times in the libraries. For example,
the error code PETSCERRMEM
is used whenever a requested memory
allocation is not available.
Detailed Error Messages#
In a modern parallel component-oriented application code, it does not always make sense to simply print error messages to the terminal (and more than likely there is no “terminal”, for example, with Microsoft Windows or Apple iPad applications). PETSc provides the replaceable function pointer
(*PetscErrorPrintf)("Format",...);
which, by default, prints to standard out. Thus, error messages should
not be printed with printf()
or fprintf()
. Rather, they should
be printed with (*PetscErrorPrintf)()
. You can direct all error
messages to stderr
, instead of the default stdout
, with the
command line option -erroroutputstderr
.
C++ Exceptions#
In PETSc code, when one calls C++ functions that do not return with an error code but might
instead throw C++ exceptions, one can use CHKERRCXX(func)
, which catches the exceptions
in func and then calls SETERRQ()
. The macro CHKERRCXX(func)
is given by
#define CHKERRCXX(...) PetscCallCXX(__VA_ARGS__)
Memory Management#
PETSc provides simple wrappers for the system malloc(), calloc()
,
and free()
routines. The public interface for these is provided in
petscsys.h
, while the implementation code is in src/sys/memory
.
The most basic interfaces are
#define PetscMalloc(a, b) ((*PetscTrMalloc)((a), PETSC_FALSE, __LINE__, PETSC_FUNCTION_NAME, __FILE__, (void **)(b)))
#define PetscFree(a) ((PetscErrorCode)((*PetscTrFree)((void *)(a), __LINE__, PETSC_FUNCTION_NAME, __FILE__) || ((a) = PETSC_NULLPTR, PETSC_SUCCESS)))
PetscErrorCode PetscMallocA(int n, PetscBool clear, int lineno, const char *function, const char *filename, size_t bytes0, void *ptr0, ...)
{
va_list Argp;
size_t bytes[8], sumbytes;
void **ptr[8];
int i;
PetscFunctionBegin;
PetscCheck(n <= 8, PETSC_COMM_SELF, PETSC_ERR_ARG_OUTOFRANGE, "Attempt to allocate %d objects but only 8 supported", n);
bytes[0] = bytes0;
ptr[0] = (void **)ptr0;
sumbytes = (bytes0 + PETSC_MEMALIGN - 1) & ~(PETSC_MEMALIGN - 1);
va_start(Argp, ptr0);
for (i = 1; i < n; i++) {
bytes[i] = va_arg(Argp, size_t);
ptr[i] = va_arg(Argp, void **);
sumbytes += (bytes[i] + PETSC_MEMALIGN - 1) & ~(PETSC_MEMALIGN - 1);
}
va_end(Argp);
if (petscmalloccoalesce) {
char *p;
PetscCall((*PetscTrMalloc)(sumbytes, clear, lineno, function, filename, (void **)&p));
if (p == NULL) {
for (i = 0; i < n; i++) *ptr[i] = NULL;
} else {
for (i = 0; i < n; i++) {
*ptr[i] = bytes[i] ? p : NULL;
p = (char *)PetscAddrAlign(p + bytes[i]);
}
}
} else {
for (i = 0; i < n; i++) PetscCall((*PetscTrMalloc)(bytes[i], clear, lineno, function, filename, (void **)ptr[i]));
}
PetscFunctionReturn(PETSC_SUCCESS);
}
PetscErrorCode PetscFreeA(int n, int lineno, const char *function, const char *filename, void *ptr0, ...)
{
va_list Argp;
void **ptr[8];
int i;
PetscFunctionBegin;
PetscCheck((n >= 1) && (n <= 8), PETSC_COMM_SELF, PETSC_ERR_ARG_OUTOFRANGE, "Attempt to allocate %d objects but only up to 8 supported", n);
ptr[0] = (void **)ptr0;
va_start(Argp, ptr0);
for (i = 1; i < n; i++) ptr[i] = va_arg(Argp, void **);
va_end(Argp);
if (petscmalloccoalesce) {
for (i = 0; i < n; i++) { /* Find first nonempty allocation */
if (*ptr[i]) break;
}
while (--n > i) *ptr[n] = NULL;
PetscCall((*PetscTrFree)(*ptr[n], lineno, function, filename));
*ptr[n] = NULL;
} else {
while (--n >= 0) {
PetscCall((*PetscTrFree)(*ptr[n], lineno, function, filename));
*ptr[n] = NULL;
}
}
PetscFunctionReturn(PETSC_SUCCESS);
}
which allow the use of any number of profiling and error-checking
wrappers for malloc(), calloc()
, and free()
. Both
PetscMallocA()
and PetscFreeA()
call the function pointer values
(*PetscTrMalloc)
and (*PetscTrFree)
. PetscMallocSet()
is
used to set these function pointers. The functions are guaranteed to
support requests for zero bytes of memory correctly. Freeing memory
locations also sets the pointer value to zero, preventing later code
from accidentally using memory that has been freed. All PETSc memory
allocation calls are memory aligned on at least double-precision
boundaries; the macro generated by configure PETSCMEMALIGN
indicates
in bytes what alignment all allocations have. This can be controlled at
configure time with the option -with-memalign=<4,8,16,32,64>
.
PetscMallocA()
supports a request for up to 7 distinct memory
locations of possibly different types. This serves two purposes: it
reduces the number of system malloc()
calls, thus potentially
increasing performance, and it clarifies in the code related memory
allocations that should be freed together.
The following macros are the preferred way to obtain and release memory
in the PETSc source code. They automatically manage calling
PetscMallocA()
and PetscFreeA()
with the appropriate location
information.
#define PetscMalloc1(m1, r1) PetscMallocA(1, PETSC_FALSE, __LINE__, PETSC_FUNCTION_NAME, __FILE__, ((size_t)((size_t)m1) * sizeof(**(r1))), (r1))
#define PetscMalloc2(m1, r1, m2, r2) PetscMallocA(2, PETSC_FALSE, __LINE__, PETSC_FUNCTION_NAME, __FILE__, ((size_t)((size_t)m1) * sizeof(**(r1))), (r1), ((size_t)((size_t)m2) * sizeof(**(r2))), (r2))
…
#define PetscMalloc7(m1, r1, m2, r2, m3, r3, m4, r4, m5, r5, m6, r6, m7, r7) \
#define PetscFree(a) ((PetscErrorCode)((*PetscTrFree)((void *)(a), __LINE__, PETSC_FUNCTION_NAME, __FILE__) || ((a) = PETSC_NULLPTR, PETSC_SUCCESS)))
#define PetscFree2(m1, m2) PetscFreeA(2, __LINE__, PETSC_FUNCTION_NAME, __FILE__, &(m1), &(m2))
…
#define PetscFree7(m1, m2, m3, m4, m5, m6, m7) PetscFreeA(7, __LINE__, PETSC_FUNCTION_NAME, __FILE__, &(m1), &(m2), &(m3), &(m4), &(m5), &(m6), &(m7))
Similar routines, PetscCalloc1()
to PetscCalloc7()
, provide
memory initialized to zero. The size requests for these macros are in
number of data items requested, not in bytes. This decreases the number
of errors in the code since the compiler determines their sizes from the
object type instead of requiring the user to provide the correct value
with sizeof()
.
The routines PetscTrMallocDefault()
and PetscTrFreeDefault()
,
which are set with the routine PetscSetUseTrMallocPrivate()
(and are
used by default for the debug version of PETSc), provide simple logging
and error checking versions of memory allocation.
Implementation of Profiling#
This section provides details about the implementation of event logging
and profiling within the PETSc kernel. The interface for profiling in
PETSc is contained in the file
include/petsclog.h.
The source code for the profile logging is in src/sys/plog/
.
Profiling Object Creation and Destruction#
The creation of objects is profiled with the command
PetscLogObjectCreate()
which logs the creation of any PETSc object. Just before an object is
destroyed, it should be logged with PetscLogObjectDestroy()
These are called automatically by PetscHeaderCreate()
and
PetscHeaderDestroy()
, which are used in creating all objects
inherited from the basic object. Thus, these logging routines need never
be called directly.
If an object has a clearly defined parent object (for instance, when a
work vector is generated for use in a Krylov solver), this information
is logged with the command PetscLogObjectParent()
.
PetscLogObjectParent(PetscObject parent,PetscObject child);
It is also useful to log information about the state of an object, as can be done with the command
PetscLogObjectState(PetscObject h,const char *format,...);
For example, for sparse matrices we usually log the matrix dimensions and number of nonzeros.
Profiling Events#
Events are logged by using the pair PetscLogEventBegin()
and PetscLogEventEnd()
.
This logging is usually done in the abstract interface file for the operations, for example, src/mat/interface/matrix.c.
Controlling Profiling#
Routines that control the default profiling available in PETSc include the following
PetscLogDump(const char *filename);
These routines are normally called by the PetscInitialize()
and
PetscFinalize()
routines when the option -logview
is given.
References#
- BBK+15
Satish Balay, Jed Brown, Matthew G. Knepley, Lois McInnes, and Barry Smith. Software Engineering for Science, chapter Providing Mixed Language and Legacy Support within a Library. Taylor & Francis, 2015.
- BGMS98
Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith. A microkernel design for component-based parallel numerical software systems. In Proceedings of the SIAM Workshop on Object Oriented Methods for Inter-operable Scientific and Engineering Computing, 58–67. SIAM, 1998. URL: ftp://info.mcs.anl.gov/pub/tech_reports/reports/P727.ps.Z.
- KBMS13
Matthew G. Knepley, Jed Brown, Lois Curfman McInnes, and Barry F. Smith. Accurately citing software and algorithms used in publications. In First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE), held at SC13. 2013. doi:10.6084/m9.figshare.785731.