Please test for Covid before attending the meeting and
wear a mask while traveling to the meeting.
Abstracts
Using Kokkos Ecosystem with PETSc on modern architectures
Luc Berger-Vergiat
Sandia National Laboratories
Supercomputers increasingly rely on GPUs to achieve high
throughput while maintaining a reasonable power consumption. Consequently,
scientific applications are adapting to this new environment, and new
algorithms are designed to leverage the high concurrency of GPUs. In this
presentation, I will show how the Kokkos Ecosystem can help alleviate some
of the difficulties associated with support for multiple CPU/GPU
architectures. I will also show some results using the Kokkos and Kokkos
Kernels libraries with PETSc on modern architectures.
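As a point of reference (a minimal petsc4py sketch, assuming a PETSc build configured with Kokkos and Kokkos Kernels), the Kokkos backend is selected purely through runtime options, so application code can stay unchanged across CPU/GPU architectures:

    # Minimal sketch: select Kokkos-backed Vec/Mat types at runtime.
    # Assumes PETSc was configured with Kokkos and Kokkos Kernels support.
    from petsc4py import PETSc

    opts = PETSc.Options()
    opts["vec_type"] = "kokkos"      # VECKOKKOS: vector data lives in the Kokkos memory space
    opts["mat_type"] = "aijkokkos"   # MATAIJKOKKOS: AIJ matrix using Kokkos Kernels

    x = PETSc.Vec().create()
    x.setSizes(100)
    x.setFromOptions()               # picks up -vec_type kokkos

    A = PETSc.Mat().create()
    A.setSizes([100, 100])
    A.setFromOptions()               # picks up -mat_type aijkokkos
    A.setUp()

The same options can equivalently be passed on the command line (-vec_type kokkos -mat_type aijkokkos).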
From the trenches: porting mef90
Blaise Bourdin
McMaster University
mef90 is a distributed three-dimensional unstructured finite-element
implementation of various phase-field models of fracture. In this talk,
I will share the experience gained while porting mef90 from PETSc 3.3 to 3.18.
A new non-hydrostatic capability for MPAS-Ocean
Sara Calandrini, Darren Engwirda, Luke Van Roekel
Los Alamos National Laboratory
The Model for Prediction Across Scales-Ocean (MPAS-Ocean) is an
open-source, global ocean model and is one component within the Department
of Energy’s E3SM framework, which includes atmosphere, sea ice, and
land-ice models. In this work, a new formulation for the ocean model is
presented that solves the non-hydrostatic, incompressible Boussinesq
equations on unstructured meshes. The introduction of this non-hydrostatic
capability is necessary for the representation of fine-scale dynamical
processes, including resolution of internal wave dynamics and large eddy
simulations. Compared to the standard hydrostatic formulation,
a non-hydrostatic pressure solver and a vertical momentum equation are
added, where the PETSc (Portable Extensible Toolkit for Scientific
Computation) library is used for the inversion of a large sparse system for
the non-hydrostatic pressure. Numerical results comparing the solutions of
the hydrostatic and non-hydrostatic models are presented, and the parallel
efficiency and accuracy of the time-stepper are evaluated.
AMD GPU benchmarking, documentation, and roadmap
This talk comprises three parts. First, we present an overview of some
relatively new training documentation, such as the “AMD lab notes”, to help
current and potential users of AMD GPUs get the best performance
out of their applications and algorithms. Second, we briefly discuss
implementation details of the PETSc HIP backend introduced into the
PETSc library late last year and present performance benchmarking data
on AMD hardware. Lastly, we give a preview of the upcoming
MI300 series APU and how software developers can prepare to leverage this
new type of accelerator.
An Immersed Boundary method for Elastic Bodies Using PETSc
Mohamad Ibrahim Cheikh, Konstantin Doubrovinski
Doubrovinski Lab, The University of Texas Southwestern Medical Center
This study presents a parallel implementation of an immersed boundary
method code using the PETSc distributed memory module. This work aims to simulate a complex developmental process that occurs in the
early stages of embryonic development, which involves the transformation of
the embryo into a multilayered and multidimensional structure. To
accomplish this, the researchers used the PETSc parallel module to solve
a linear system for the Eulerian fluid dynamics while simultaneously
coupling it with a deforming Lagrangian elastic body to model the
deformable embryonic tissue. This approach allows for a detailed simulation
of the interaction between the fluid and the tissue, which is critical for
accurately modeling the developmental process. Overall, this work
highlights the potential of the immersed boundary method and parallel
computing techniques for simulating complex physical phenomena.
Transparent Asynchronous Compute Made Easy With PETSc
Jacob Faibussowitch
Argonne National Laboratory
Asynchronous GPU computing has historically been difficult to integrate scalably at the library level. We provide an update on recent work
implementing a fully asynchronous framework in PETSc. We give detailed
performance comparisons and provide a demo to showcase the proposed model’s effectiveness
and ease of use.
PETSc-PIC: A Structure-Preserving Particle-In-Cell Method for Electrostatic Solves
Daniel Finn
University at Buffalo
Numerical solutions to the Vlasov-Poisson equations have important
applications in the fields of plasma physics, solar physics, and cosmology.
The goal of this research is to develop a structure-preserving,
electrostatic and gravitational Vlasov-Poisson(-Landau) model using the
Portable, Extensible Toolkit for Scientific Computation (PETSc) and study
the presence of Landau damping in a variety of systems, such as
thermonuclear fusion reactors and galactic dynamics. The PETSc
Particle-In-Cell (PETSc-PIC) model is a highly scalable,
structure-preserving PIC method with multigrid capabilities. In the PIC
method, a hybrid discretization is constructed with a grid of finitely
supported basis functions to represent the electric, magnetic, and/or
gravitational fields, and a distribution of delta functions to represent
the particle field. Collisions are added to the formulation using
a particle-basis Landau collision operator recently added to the PETSc
library.
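In standard particle-in-cell notation (textbook form, not taken from the abstract), the hybrid discretization mentioned above represents each species' distribution function as a sum of weighted delta functions,

    f_s(x, v, t) \approx \sum_{p=1}^{N_p} w_p \, \delta(x - x_p(t)) \, \delta(v - v_p(t)),

while the potential (and hence the fields) is expanded in the finitely supported basis, \phi(x, t) = \sum_i \phi_i(t) \, \psi_i(x).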
Multiscale, Multiphysics Simulation Through Application Composition Using MOOSE
Derek Gaston
Idaho National Laboratory
Eight years ago, at the PETSc 20 meeting, I introduced the idea of
“Simplifying Multiphysics Through Application Composition” – the idea
that physics applications can be built in such a way that they can
instantly be combined to tackle complicated multiphysics problems.
This talk will serve as an update on those plans. I will detail the
evolution of that idea, how we’re using it in practice, how well it’s
working, and where we’re going next. Motivating examples will be drawn
from nuclear engineering, and practical aspects, such as testing, will
be explored.
High-order FEM implementation in AMReX using PETSc
Alex Grant, Karthik Chockalingam, Xiaohu Guo
Science and Technology Facilities Council (STFC), UK
AMReX is a C++ block-structured framework for adaptive mesh refinement,
typically used for finite difference or finite volume codes. We describe
a first attempt at a finite element implementation in AMReX using PETSc.
AMReX splits the domain of uniform elements into rectangular boxes at each
refinement level, with higher levels overlapping rather than replacing
lower levels and with each level solved independently. AMReX boxes can be
cell-centered or nodal; we use cell-centered boxes to represent the geometry
and mesh and nodal boxes to identify nodes to constrain and store results
for visualization. We convert AMReX’s independent spatial indices into
a single global index, then use MATMPIAIJ to assemble the system matrix per
refinement level. In an unstructured grid, isoparametric mapping is
required for each element; using a structured grid avoids both this
and indirect addressing, which provides significant potential performance
advantages. We have solved time-dependent parabolic equations and seen
performance gains compared to unstructured finite elements. Further
developments will include arbitrary higher-order schemes and
multi-level hp refinement with arbitrary hanging nodes. PETSc uses the AMReX
domain decomposition to partition the matrix and right-hand-side vectors. For
each higher level, not all of the domain will be refined, but AMReX’s
indices cover the whole space; this poses an indexing challenge and can
lead to over-allocation of memory. It is still to be explored whether DM
data structures would provide a benefit over MATMPIAIJ.
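As a rough illustration of the assembly pattern described above (a petsc4py sketch with assumed sizes and a 5-point stencil standing in for the actual finite element stencil, not the authors' code):

    # Map a structured box's (i, j) cell indices to a single global row index and
    # assemble a parallel AIJ (MATMPIAIJ) matrix for one refinement level.
    from petsc4py import PETSc

    nx, ny = 64, 64                          # cells in this refinement level (example only)

    def glob(i, j):                          # (i, j) -> single global index
        return i * ny + j

    A = PETSc.Mat().createAIJ([nx * ny, nx * ny], nnz=5)
    A.setOption(PETSc.Mat.Option.NEW_NONZERO_ALLOCATION_ERR, False)  # tolerate loose preallocation
    rstart, rend = A.getOwnershipRange()
    for row in range(rstart, rend):
        i, j = divmod(row, ny)
        A[row, row] = 4.0
        for ii, jj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ii < nx and 0 <= jj < ny:
                A[row, glob(ii, jj)] = -1.0
    A.assemble()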
Scalable Riemann Solvers with the Discontinuous Galerkin Method for Hyperbolic Network Simulation
Aidan Hamilton, Jing-Mei Qiu, Hong Zhang
University of Delaware
We develop highly efficient and effective computational algorithms
and simulation tools for fluid simulations on a network. The mathematical
models are a set of hyperbolic conservation laws on the edges of a network, as
well as coupling conditions on junctions of a network. For example, the
shallow water system, together with flux balance and continuity conditions
at river intersections, models water flow on a river network. Accurate
and robust discontinuous Galerkin methods, coupled with explicit
strong-stability-preserving Runge-Kutta methods, are implemented for
simulations on network edges, while linear and nonlinear scalable Riemann
solvers are developed and implemented at network vertices. These network
simulations result in tools, built on the PETSc and DMNetwork libraries,
for the broader scientific community. Simulations of a shallow water system
on a Mississippi river network with over one billion network variables are
performed on an extreme-scale computer using up to 8,192 processors with
optimal parallel efficiency. Further potential applications include traffic
flow simulations on a highway network and blood flow simulations on an
arterial network, among many others.
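For reference, the one-dimensional shallow water system on a single network edge takes the standard conservation-law form (textbook notation, not taken from the abstract),

    \partial_t h + \partial_x (h u) = 0,
    \partial_t (h u) + \partial_x ( h u^2 + \tfrac{1}{2} g h^2 ) = -g h \, \partial_x b,

where h is the water height, u the velocity, g the gravitational acceleration, and b the bottom topography; flux balance and continuity conditions are then imposed at the junctions.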
A mimetic finite difference based quasi-static magnetohydrodynamic solver for force-free plasmas in tokamak disruptions
Zakariae Jorti, Qi Tang, Konstantin Lipnikov, Xianzhu Tang
Los Alamos National Laboratory
Force-free plasmas are a good approximation in the low-beta case, where the
plasma pressure is tiny compared with the magnetic pressure. On time scales
long compared with the transit time of Alfvén waves, the evolution of
a force-free plasma is most efficiently described by a quasi-static
magnetohydrodynamic (MHD) model, which ignores the plasma inertia. In this
work, we consider a regularized quasi-static MHD model for force-free
plasmas in tokamak disruptions and propose a mimetic finite difference
(MFD) algorithm, which is targeted at applications such as the cold
vertical displacement event (VDE) of a major disruption in an ITER-like
tokamak reactor. In the case of whole device modeling, we further consider
the two sub-domains of the plasma region and wall region and their coupling
through an interface condition. We develop a parallel, fully implicit, and
scalable MFD solver based on PETSc and its DMStag data structure to discretize the five-field quasi-static perpendicular plasma dynamics
model on a 3D structured mesh. The MFD spatial discretization is coupled
with a fully implicit DIRK scheme. The full algorithm exactly preserves the
divergence-free condition of the magnetic field under a generalized Ohm’s
law. The preconditioner employed is a four-level fieldsplit preconditioner,
created by combining separate preconditioners for the individual fields,
which applies multigrid or direct solvers to the sub-blocks or exact
factorizations of the separate fields. The numerical results confirm that the
divergence-free constraint is strongly satisfied and demonstrate the
performance of the fieldsplit preconditioner and overall algorithm. The
simulation of ITER VDE cases over the actual plasma current diffusion time
is also presented.
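For readers less familiar with PETSc's composable solvers, a fieldsplit preconditioner of this kind is typically assembled from runtime options; the sketch below (field names and sub-solver choices are illustrative assumptions, not the authors' exact configuration) shows the general pattern in petsc4py.

    # Illustrative sketch: a fieldsplit preconditioner configured from options.
    # The field names and sub-solver choices here are assumptions for illustration only.
    from petsc4py import PETSc

    opts = PETSc.Options()
    opts["ksp_type"] = "fgmres"
    opts["pc_type"] = "fieldsplit"
    opts["pc_fieldsplit_type"] = "multiplicative"
    opts["fieldsplit_b_pc_type"] = "gamg"   # multigrid on one field block
    opts["fieldsplit_e_pc_type"] = "gamg"
    opts["fieldsplit_u_pc_type"] = "gamg"
    opts["fieldsplit_p_pc_type"] = "lu"     # exact factorization on another block

    ksp = PETSc.KSP().create()
    ksp.setFromOptions()  # the operator and the index sets defining each field are
                          # attached by the application (e.g., through a DMStag)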
PERMON library for quadratic programming
Jakub Kruzik, Marek Pecha, David Horak
VSB - Technical University of Ostrava, Czechia
PERMON (Parallel, Efficient, Robust, Modular, Object-oriented, Numerical)
is a library based on PETSc for solving quadratic programming (QP)
problems. We will demonstrate the use of PERMON through our implementation of the FETI
(finite element tearing and interconnecting) method. This FETI
implementation involves a chain of QP transformations, such as
dualization, which simplifies a given QP. We will also discuss some useful
options, like viewing Karush-Kuhn-Tucker (optimality) conditions for each
QP in the chain. Finally, we will showcase some QP applications solved by
PERMON, such as the solution of contact problems for hydro-mechanical
problems with discrete fracture networks or the solution of support vector
machines using the PermonSVM module.
Towards enabling digital twins capabilities for a cloud chamber
Vanessa Lopez-Marrero, Kwangmin Yu, Tao Zhang, Mohammad Atif, Abdullah Al Muti Sharfuddin, Fan Yang, Yangang Liu, Meifeng Lin, Foluso Ladeinde, Lingda Li
Brookhaven National Laboratory
Particle-resolved direct numerical simulations (PR-DNS), which not only
resolve the smallest turbulent eddies but also track the development and
motion of individual particles, are an essential tool for studying
aerosol-cloud-turbulence interactions. For instance, PR-DNS may complement
experimental facilities designed to study key physical processes in
a controlled environment and therefore serve as digital twins for such
cloud chambers. In this talk, we will present our ongoing work aimed at
enabling the use of PR-DNS for this purpose. We will describe the physical
model used, which consists of a set of fluid dynamics equations for
air velocity, temperature, and humidity coupled with a set of equations
for particle (i.e., droplet) growth/tracing. The numerical method used to
solve the model, which employs PETSc solvers in its implementation, will be
discussed, as well as our current efforts to assess performance and
scalability of the numerical solver.
PETSc ROCKS
David May
University of California, San Diego
The field of Geodynamics is concerned with understanding
the deformation history of the solid Earth over time scales of millions to
billions of years. The infeasibility of extracting a spatially and
temporally complete geological record based on rocks that are currently
exposed at the surface of the Earth compels many geodynamists to employ
computational simulations of geological processes.
In this presentation I will discuss several geodynamic software packages
which utilize PETSc. I intend to highlight how PETSc has played an
important role in enabling and advancing the state of the art in geodynamic
software. I will also summarize my own experiences and observations of how
geodynamic-specific functionality has driven the
development of new general-purpose PETSc functionality.
PETSc Newton Trust-Region for Simulating Large-scale Engineered Subsurface Systems with PFLOTRAN
Heeho Park, Glenn Hammond, Albert Valocchi
Sandia National Laboratories
Modeling large-scale engineered subsurface systems entails significant
additional numerical challenges. For a nuclear waste repository, the
challenges arise from: (a) the need to accurately represent both the waste
form processes and the shafts, tunnels, and barriers at the small spatial scale
and the large-scale transport processes throughout geological formations;
(b) the strong contrast in material properties such as porosity and
permeability, and the nonlinear constitutive relations for multiphase flow;
(c) the decay of high-level nuclear waste causes nearby water to boil off
into steam, leading to dry-out. These can lead to an ill-conditioned
Jacobian matrix and non-convergence with Newton’s method due to
discontinuous nonlinearity in constitutive models.
We apply the open-source simulator PFLOTRAN, which employs a finite-volume
discretization and uses the PETSc parallel framework. We implement within
PETSc the general-purpose nonlinear solvers Newton trust-region dogleg
Cauchy (NTRDC) and Newton trust-region (NTR) to demonstrate the
effectiveness of these advanced solvers. The results demonstrate speed-ups
compared to the default PETSc solvers and complete simulations that could
never be completed with them.
SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525.
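For context, trust-region Newton variants are selectable SNES types in recent PETSc releases; a minimal sketch (not PFLOTRAN, and assuming the NTRDC type is available in the installed PETSc version) is:

    # Minimal sketch: selecting a trust-region Newton solver by type name.
    from petsc4py import PETSc

    snes = PETSc.SNES().create()
    snes.setType("newtontrdc")   # Newton trust-region dogleg Cauchy; "newtontr" for classic NTR
    snes.setFromOptions()        # can be overridden on the command line, e.g. -snes_type newtontr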
Scalable cloud-native thermo-mechanical solvers using PETSc
Ashish Patel, Jeremy Theler, Francesc Levrero-Florencio, Nabil Abboud, Mohammad Sarraf Joshaghani, Scott McClennan
Ansys, Inc.
This talk presents how the Ansys OnScale team uses PETSc to
develop finite element-based thermo-mechanical solvers for scalable
nonlinear simulations on the cloud. We will first provide an overview of
features available in the solver and then discuss how some of the PETSc
objects, like DMPlex and TS, have helped us speed up our development
process. We will also talk about the workarounds we have incorporated to
address the current limitations of some of the functions from DMPlex for
our use cases involving multi-point constraints and curved elements.
Finally, we demonstrate how PETSc’s linear solvers scale on multi-node
cloud instances.
Intel oneAPI Math Kernel Library: what’s new and what’s next?
Spencer Patty
Intel Corporation
This talk provides an overview of the Intel® oneAPI Math Kernel Library
(oneMKL), which offers optimized math routines for both Intel
CPUs and GPUs. Given that PETSc already utilizes several BLAS/LAPACK/Sparse
BLAS routines from oneMKL on Intel CPUs, and as part of the Aurora project
with Argonne, we discuss the use of OpenMP offload APIs for Intel GPUs.
We explore software and hardware improvements for better sparse linear
algebra performance and have an informal discussion of how to further
support the PETSc community.
Distributed Machine Learning for Natural Hazard Applications Using PERMON
Marek Pecha, David Horak, Richard Tran Mills, Zachary Langford
VSB – Technical University of Ostrava, Czechia
We will present a software solution for distributed machine learning
supporting computation on multiple GPUs, running on top of the PETSc
framework, which we will demonstrate in applications related to natural
hazard localization and detection employing supervised uncertainty
modeling. It is called PERMON and is designed for convex optimization
using quadratic programming, and its extension PermonSVM implements
maximal-margin classifier approaches associated with support vector
machines (SVMs). Although deep learning (DL) has become popular in recent
years, SVMs are still applicable. However, unlike DL, the SVM approach requires
additional feature engineering or feature selection. We will present our
workflow and show how to achieve reasonable models for the application
related to wildfire localization in Alaska.
Landau Collisions in the Particle Basis with PETSc-PIC
Joseph Pusztay, Matt Knepley, Mark Adams
University at Buffalo
The kinetic description of plasma encompasses the fine-scale interactions of
the various bodies of which it is composed, and applies to a litany of
systems ranging from laboratory magnetically confined fusion
plasmas to the scale of the solar corona. Of great import to these
descriptions are collisions in the grazing limit, which transfer momentum
between components of the plasma. Until recently, these have best been
described conservatively by finite element discretizations of the Landau
collision integral. In recent years a particle discretization has been
proven to preserve the appropriate eigenfunctions of the system, as well as
physically relevant quantities. I present here the recent work on a purely
particle discretized Landau collision operator which preserves mass,
momentum, and energy, with associated accuracy benchmarks in PETSc.
Experiences in solving nonlinear eigenvalue problems with SLEPc
Jose E. Roman
Universitat Politècnica de València
One of the unique features of SLEPc is the module for the general nonlinear
eigenvalue problem (NEP), where we want to compute a few eigenvalues and
corresponding eigenvectors of a large-scale parameter-dependent matrix
T(lambda). In this talk, we will illustrate the use of NEP in the context
of two applications, one of them coming from the characterization of
resonances in nanophotonic devices, and the other one from a problem in
aeroacoustics.
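Stated concretely (standard formulation, not specific to the talk), the problem handled by SLEPc's NEP module is: given a matrix-valued function T(\lambda), find eigenpairs (\lambda, x) such that

    T(\lambda) \, x = 0, \qquad x \neq 0,

where T depends nonlinearly on the scalar parameter \lambda, in contrast to the linear case T(\lambda) = A - \lambda B.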
Some thoughts on the future of PETSc
Barry Smith
Flatiron Institute
How will PETSc evolve and grow in the future? How can PETSc algorithms and
simulations be integrated into the emerging world of machine learning and
deep neural networks? I will provide an informal discussion of these topics
and my thoughts.
Software Development and Deployment Including PETSc
Tim Steinhoff, Volker Jacht
Gesellschaft für Anlagen- und Reaktorsicherheit (GRS), Germany
Once it is decided that PETSc shall handle certain numerical subtasks in
your software, the question arises of how to smoothly incorporate PETSc
into the overall software development and deployment processes. In this
talk, we present our approach to handling such a situation for the code
family AC2, which is developed and distributed by GRS. AC2 is used to
simulate the behavior of nuclear reactors during operation, transients,
design-basis and beyond-design-basis accidents, up to radioactive releases
to the environment. The talk addresses our experiences, what challenges had
to be overcome, and how we make use of GitLab, CMake, and Docker
to establish a clean incorporation of PETSc into our software development
cycle.
TaoADMM
Hansol Suh
Argonne National Laboratory
In this tutorial, we will give an introduction to the ADMM algorithm in
TAO. It will include a walkthrough of the ADMM algorithm with a real-life
example and tips on setting up the framework to solve problems with ADMM in PETSc/TAO.
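For reference, the standard (scaled-form) ADMM iteration for minimizing f(x) + g(z) subject to A x + B z = c alternates the updates

    x^{k+1} = \mathrm{argmin}_x \; f(x) + \tfrac{\rho}{2} \| A x + B z^k - c + u^k \|_2^2,
    z^{k+1} = \mathrm{argmin}_z \; g(z) + \tfrac{\rho}{2} \| A x^{k+1} + B z - c + u^k \|_2^2,
    u^{k+1} = u^k + A x^{k+1} + B z^{k+1} - c,

where u is the scaled dual variable and \rho > 0 is the penalty parameter; this is the textbook form rather than the specifics of the TAO implementation.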
Numerical upscaling of network models using PETSc
Maria Vasilyeva
Texas A&M University-Corpus Christi
Multiphysics models on large networks are used in many applications, for
example, pore network models in reservoir simulation, epidemiological
models of disease spread, ecological models of multispecies interaction,
medical applications such as multiscale multidimensional simulations of
blood flow, etc. This work presents the construction of a numerical
upscaling and multiscale method for network models. An accurate
coarse-scale approximation is generated by solving local problems in
sub-networks. Numerical implementation of the network model is performed
based on the PETSc DMNetwork framework. Results are presented for square
and random heterogeneous networks generated by OpenPNM.
MultiFlow: A coupled balanced-force framework to solve multiphase flows in arbitrary domains
Berend van Wachem, Fabien Evrard
University of Magdeburg, Germany
Since 2000, we have been working on a finite-volume numerical framework
“MultiFlow” to predict multiphase flows in arbitrary domains by solving
various flavors of the incompressible and compressible Navier-Stokes
equations using PETSc. This framework enables the simulation of creeping,
laminar and turbulent flows with droplets and/or particles at various
scales. It relies on a collocated arrangement of the unknown
variables and momentum-weighted interpolation to determine the fluxes at
the cell faces to couple velocity and pressure. To maximize robustness, the
governing flow equations are solved in a coupled fashion, i.e., as part of
a single equation system involving all flow variables. Various modules are
available within the code in addition to its core flow solver, allowing it to
model interfacial and particulate flows at various flow regimes and scales.
The framework heavily relies on the PETSc library not only to solve the
system of governing equations but also for the handling of unknown
variables, parallelization of the computational domain, and exchange of
data over processor boundaries. We are now in the third generation of our
code, currently using a combination of DMDA and DMPlex with the DMForest/p4est
frameworks to allow for adaptive octree refinement of the
computational mesh. In this contribution, we will present the details of
the discretization and the parallel implementation of our framework and
describe its interconnection with the PETSc library. We will then present
some applications of our framework, simulating multiphase flows at various
scales, flow regimes, and resolutions. During this contribution, we will
also discuss our framework’s challenges and future objectives.
PETSc in the Ionosphere
Matt Young
University of New Hampshire
A planet’s ionosphere is the region of its atmosphere where a fraction
of the constituent atoms or molecules have separated into positive ions and
electrons. Earth’s ionosphere extends from roughly 85 km during the day
(higher at night) to the edge of space. This partially ionized regime
exhibits collective behavior and supports electromagnetic phenomena that do
not exist in the neutral (i.e., un-ionized) atmosphere. Furthermore, the
abundance of neutral atoms and molecules leads to phenomena that do not
exist in the fully ionized space environment. In a relatively narrow
altitude range of Earth’s ionosphere called the “E region”, electrons
behave as typical charged particles – moving in response to combined
electric and magnetic fields – while ions collide too frequently with
neutral molecules to respond to the magnetic field. This difference leads
to the Farley-Buneman instability when the local electric field is strong
enough. The Farley-Buneman instability regularly produces irregularities in
the charged-particle densities that are strong enough to reflect radio
signals. Recent research suggests that fully developed turbulent
structures can disrupt GPS communication.
The Electrostatic Parallel Particle-in-Cell (EPPIC) numerical simulation
self-consistently models instability growth and evolution in the E-region
ionosphere. The simulation includes a hybrid mode that treats electrons as
a fluid and treats ions as particles. The particular fluid electron model
requires the solution of an elliptic partial differential equation for the
electrostatic potential at each time step, which we represent as a linear
system that the simulation solves with PETSc. This presentation will
describe the original development of the 2D hybrid simulation, previous
results, recent efforts to extend to 3D, and implications for modeling GPS
scintillation.
XGCm: An Unstructured Mesh Gyrokinetic Particle-in-cell Code for Exascale Fusion Plasma Simulations
Chonglin Zhang, Cameron W. Smith, Mark S. Shephard
Rensselaer Polytechnic Institute (RPI)
We report the development of XGCm, a new distributed unstructured mesh
gyrokinetic particle-in-cell (PIC) code; the name is short for X-point included
gyrokinetic code, mesh-based. The code adopts the physical algorithms of the
well-established XGC code. It is intended as a testbed for experimenting
with new numerical and computational algorithms, which can eventually be
adopted in XGC and other PIC codes. XGCm is developed on top of several
open-source libraries, including Kokkos, PETSc, Omega_h, and PUMIPic. Omega_h
and PUMIPic rely on Kokkos to interact with the GPU accelerator, while
PETSc solves the gyrokinetic Poisson equation on either CPU or GPU. We
first discuss the numerical algorithms of our mesh-centric approach for
performing PIC calculations. We then present a code validation study using
the cyclone base case with ion temperature gradient turbulence (case 5 from
Burckel et al., Journal of Physics: Conference Series 260, 2010, 012006).
Finally, we discuss the performance of XGCm and present weak scaling
results using up to the full system (27,648 GPUs) of the Oak Ridge National
Laboratory’s Summit supercomputer. Overall, XGCm executes all PIC
operations on the GPU accelerators and exhibits good performance and
portability.
PETSc DMNetwork: A Library for Scalable Network PDE-Based Multiphysics Simulation
Hong Zhang (Ms.)
Argonne National Laboratory, Illinois Institute of Technology
We present DMNetwork, a high-level set of routines included in the PETSc
library for the simulation of multiphysics phenomena over large-scale
networked systems. The library aims at applications with networked
structures like those in electrical, water, and traffic
distribution systems. DMNetwork provides data and topology management,
parallelization for multiphysics systems over a network, and hierarchical
and composable solvers to exploit the problem structure. DMNetwork eases
the simulation development cycle by providing the necessary infrastructure
to define and query the network components through simple abstractions.
MPI Multiply Threads
Hui Zhou
Argonne National Laboratory
In the traditional MPI+Thread programming paradigm, MPI and OpenMP each
form their own parallelization. MPI is unaware of the thread
context. The requirement of thread safety and message ordering forces the MPI
library to blindly add critical sections, unnecessarily serializing the
code. On the other hand, OpenMP cannot use MPI for inter-thread
communication. Developers often need to hand-roll algorithms for
collective operations and non-blocking synchronizations.
MPICH recently added a few extensions to address the root issues in
MPI+Thread. The first extension, MPIX stream, allows applications to
explicitly pass the thread context into MPI. The second extension, thread
communicator, allows individual threads in an OpenMP parallel region to use
MPI for inter-thread communications. In particular, this allows an OpenMP
program to use PETSc within a parallel region.
Instead of MPI+Thread, we refer to this new pattern as MPI x Thread.
PETSc on the GPU
Junchao Zhang
Argonne National Laboratory
In this mini-tutorial, we will briefly introduce the GPU backends of PETSc and how to configure, build, run,
and profile PETSc on GPUs. We will also talk about how to port your PETSc code to GPUs.
PETSc and PyTorch Interoperability
Hong Zhang (Mr.)
Argonne National Laboratory
In this mini-tutorial, we will introduce how to convert between PETSc vectors/matrices and PyTorch tensors,
how to generate the Jacobian or the action of the Jacobian with PyTorch and use it in PETSc, and how to use
PETSc and PyTorch for solving ODEs and training neural ODEs.
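As a small taste of the first item (a minimal sketch; the zero-copy behavior shown holds for CPU-resident data, since Vec.getArray exposes the underlying buffer as a NumPy array):

    # Minimal sketch: sharing data between a PETSc Vec and a PyTorch tensor on the CPU.
    import torch
    from petsc4py import PETSc

    v = PETSc.Vec().createSeq(8)                # a small sequential vector
    v.set(1.0)

    t = torch.from_numpy(v.getArray())          # wraps the Vec's buffer; no copy on CPU
    t *= 2.0                                    # changes are visible to PETSc

    w = PETSc.Vec().createWithArray(t.numpy())  # wrap a tensor's buffer as a Vec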
petsc4py
Stefano Zampini
King Abdullah University of Science and Technology (KAUST)
In this mini-tutorial, we will introduce the Python binding of PETSc.
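To give a flavor of the binding (a minimal, self-contained sketch assembling a small 1D Laplacian and solving it with KSP; option handling mirrors the C API):

    # Minimal petsc4py example: assemble a tridiagonal matrix and solve A x = b.
    from petsc4py import PETSc

    n = 10
    A = PETSc.Mat().createAIJ([n, n], nnz=3)
    rstart, rend = A.getOwnershipRange()
    for i in range(rstart, rend):
        if i > 0:
            A[i, i - 1] = -1.0
        A[i, i] = 2.0
        if i < n - 1:
            A[i, i + 1] = -1.0
    A.assemble()

    b = A.createVecRight()
    b.set(1.0)
    x = A.createVecLeft()

    ksp = PETSc.KSP().create()
    ksp.setOperators(A)
    ksp.setFromOptions()   # e.g. run with -ksp_type cg -pc_type gamg -ksp_monitor
    ksp.solve(b, x)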
DMPlex
Matt Knepley
University at Buffalo
In this mini-tutorial, we will introduce the DMPlex class in PETSc.
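For orientation, creating and distributing a small unstructured mesh with DMPlex from petsc4py looks roughly like this (a minimal sketch; a 2D triangular box mesh stands in for an application mesh):

    # Minimal sketch: build a 2D simplicial box mesh and distribute it across MPI ranks.
    from petsc4py import PETSc

    dm = PETSc.DMPlex().createBoxMesh([8, 8], simplex=True)
    dm.distribute()   # partition the mesh in parallel (a no-op on a single rank)
    dm.view()         # print a summary of the mesh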
DMSwarm
Joseph Pusztay
University at Buffalo
In this mini-tutorial, we will introduce the DMSwarm class in PETSc.