Feel++

Algebraic solutions

Definitions

Matrices

Matrix A matrix is a linear transformation between finite-dimensional vector spaces.

Assembling a matrix Assembling a matrix means defining its action as entries stored in a sparse or dense format. For example, in the finite element context, the storage format is sparse to take advantage of the many zero entries.

Symmetric matrix

\$A = A^T\$

Positive definite (resp. positive semi-definite) matrix

All eigenvalues are

  1. \$>0\$ (resp. \$\geq 0\$) or, equivalently,

  2. \$x^T A x > 0,\ \forall\ x \neq 0\$ (resp. \$x^T A x \geq 0,\ \forall\ x\$)

Negative definite (resp. negative semi-definite) matrix

All eigenvalues are

  1. \$<0\$ (resp. \$\leq 0\$) or, equivalently,

  2. \$x^T A x < 0,\ \forall\ x \neq 0\$ (resp. \$x^T A x \leq 0,\ \forall\ x\$)

Indefinite matrix

There exist

  1. positive and negative eigenvalues (e.g. Stokes, Helmholtz) or

  2. there exists \$x,y\$ such that \$x^TAx > 0 > y^T A y\$
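
For instance, \$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\$ is indefinite: its eigenvalues are \$1\$ and \$-1\$, and taking \$x = (1,0)^T\$, \$y = (0,1)^T\$ gives \$x^T A x = 1 > 0 > -1 = y^T A y\$.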

Preconditioners

Definition

Let \$A\$ be an \$\mathbb{R}^{n\times n}\$ matrix and \$x\$, \$b\$ be \$\mathbb{R}^n\$ vectors; we wish to solve \$A x = b\$.

Definition

A preconditioner \$\mathcal{P}\$ is a method for constructing a matrix (just a linear function, not assembled!) \$P^{-1} = \mathcal{P}(A,A_p)\$ using a matrix \$A\$ and extra information \$A_p\$, such that the spectrum of \$P^{-1}A\$ (left preconditioning) or \$A P^{-1}\$ (right preconditioning) is well-behaved. Applying the preconditioner improves the conditioning of the linear system.

Left preconditioning: We solve \$(P^{-1} A) x = P^{-1} b\$ and we build the Krylov space \$\{ P^{-1} b, (P^{-1}A) P^{-1} b, (P^{-1}A)^2 P^{-1} b, \dots\}\$

Right preconditioning: We solve \$(A P^{-1}) P x = b\$ and we build the Krylov space \$\{ b, (A P^{-1}) b, (A P^{-1})^2 b, \dots\}\$

Note that the product \$P^{-1}A\$ or \$A P^{-1}\$ is never assembled.
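
To make the point concrete, here is a minimal C++ sketch (generic code, not the Feel++ API) of a left-preconditioned operator: the composed action \$v \mapsto P^{-1}(A v)\$ is all a Krylov method needs, so the product matrix is never formed.

#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Op  = std::function<Vec( Vec const& )>;

// Compose the actions of A and P^{-1}: a Krylov method only ever needs
// v -> P^{-1}(A v), so no product matrix is assembled.
Op leftPreconditioned( Op applyA, Op applyPinv )
{
    return [=]( Vec const& v ) { return applyPinv( applyA( v ) ); };
}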

Properties

Let us now describe some properties of preconditioners:

  • \$P^{-1}\$ is dense; \$P\$ is often not available and is not needed

  • \$A\$ is rarely used by \$\mathcal{P}\$, but \$A_p = A\$ is common

  • \$A_p\$ is often a sparse matrix, the preconditioning matrix

Here are some numerical methods to solve the system \$A x = b\$:

  • Matrix-based: Jacobi, Gauss-Seidel, SOR, ILU(k), LU

  • Parallel: Block-Jacobi, Schwarz, Multigrid, FETI-DP, BDDC

  • Indefinite: Schur-complement, Domain Decomposition, Multigrid

Preconditioner strategies

Relaxation

Split into lower, diagonal, upper parts: \$A = L + D + U\$.

Jacobi

Cheapest preconditioner: \$P^{-1}=D^{-1}\$.

# sequential
pc-type=jacobi
# parallel
pc-type=block_jacobi
Successive over-relaxation (SOR)
\$\left(L + \frac 1 \omega D\right) x_{n+1} = \left[\left(\frac 1 \omega-1\right)D - U\right] x_n + b\$, and \$P^{-1}\$ is defined by \$k\$ iterations starting with \$x_0=0\$.
  • Implemented as a sweep.

  • \$\omega = 1\$ corresponds to Gauss-Seidel.

  • Very effective at removing high-frequency components of the residual.

# sequential
pc-type=sor
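
To illustrate the sweep, here is a minimal C++ sketch (generic code, not the Feel++ API) of one forward SOR pass on a dense matrix; applying the preconditioner amounts to running \$k\$ such sweeps starting from \$x_0=0\$.

#include <vector>

// One forward SOR sweep: the component-wise form of
// (L + D/omega) x_{n+1} = [(1/omega - 1) D - U] x_n + b.
void sorSweep( std::vector<std::vector<double>> const& A,
               std::vector<double> const& b,
               std::vector<double>& x, double omega )
{
    for ( std::size_t i = 0; i < A.size(); ++i )
    {
        double sigma = 0.;
        for ( std::size_t j = 0; j < x.size(); ++j )
            if ( j != i )
                sigma += A[i][j] * x[j]; // x[j] is already updated for j < i
        x[i] += omega * ( ( b[i] - sigma ) / A[i][i] - x[i] );
    }
}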
Factorization

Two phases

  • symbolic factorization: find where fill occurs, only uses sparsity pattern.

  • numeric factorization: compute factors.

    LU decomposition
    • Exact preconditioner.

    • Expensive: for an \$m\times m\$ sparse matrix with bandwidth \$b\$, it traditionally requires \$\mathcal{O}(mb^2)\$ time and \$\mathcal{O}(mb)\$ space.

      • Bandwidth scales as \$m^{\frac{d-1}{d}}\$ in \$d\$ dimensions.

        • Optimal in 2D: \$\mathcal{O}(m \cdot \log m)\$ space, \$\mathcal{O}(m^{3/2})\$ time.

        • Optimal in 3D: \$\mathcal{O}(m^{4/3})\$ space, \$\mathcal{O}(m^2)\$ time.

    • Symbolic factorization is problematic in parallel.

Incomplete LU
  • Allow a limited number of levels of fill: ILU(\$k\$).

  • Only allow fill for entries that exceed threshold: ILUT.

  • Usually poor scaling in parallel.

  • No guarantees.

1-level Domain decomposition

Domain size \$L\$, subdomain size \$H\$, element size \$h\$

  • Overlapping/Schwarz

    • Solve Dirichlet problems on overlapping subdomains.

    • No overlap: \$\textit{its} \in \mathcal{O}\big( \frac{L}{\sqrt{Hh}} \big)\$.

    • Overlap \$\delta\$: \$\textit{its} \in \mathcal{O}\big( \frac L {\sqrt{H\delta}} \big)\$.

pc-type=gasm # has a coarse grid preconditioner
pc-type=asm
  • Neumann-Neumann

    • Solve Neumann problems on non-overlapping subdomains.

    • \$\textit{its} \in \mathcal{O}\big( \frac{L}{H}(1+\log\frac H h) \big)\$.

    • Tricky null space issues (floating subdomains).

    • Need subdomain matrices, not globally assembled matrix.

Note: multilevel variants knock off the leading \$\frac L H\$; both overlapping and nonoverlapping methods achieve this bound.

  • BDDC and FETI-DP

    • Neumann problems on subdomains with coarse grid correction.

    • \$\textit{its} \in \mathcal{O}\big(1 + \log\frac H h \big)\$.

Multigrid

Hierarchy: Interpolation and restriction operators \$\Pi^\uparrow : X_{\text{coarse}} \to X_{\text{fine}}\$ and \$\Pi^\downarrow : X_{\text{fine}} \to X_{\text{coarse}}\$

  • Geometric: define problem on multiple levels, use grid to compute hierarchy.

  • Algebraic: define problem only on finest level, use matrix structure to build hierarchy.

Galerkin approximation

Assemble this matrix: \$A_{\text{coarse}} = \Pi^\downarrow A_{\text{fine}} \Pi^\uparrow\$

Application of multigrid preconditioner (\$V\$-cycle)

  • Apply the pre-smoother on the fine level (any preconditioner).

  • Restrict the residual to the coarse level with \$\Pi^\downarrow\$.

  • Solve on the coarse level \$A_{\text{coarse}} x = r\$.

  • Interpolate the result back to the fine level with \$\Pi^\uparrow\$.

  • Apply the post-smoother on the fine level (any preconditioner).
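
The steps above translate almost line for line into code. The following two-grid C++ sketch is purely pedagogical (not the Feel++/PETSc implementation): it uses dense matrices for readability, weighted Jacobi as pre- and post-smoother, and a caller-supplied coarse solver.

#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Dense matrix-vector product y = M v.
Vec apply( Mat const& M, Vec const& v )
{
    Vec y( M.size(), 0. );
    for ( std::size_t i = 0; i < M.size(); ++i )
        for ( std::size_t j = 0; j < v.size(); ++j )
            y[i] += M[i][j] * v[j];
    return y;
}

// Weighted Jacobi smoother: x <- x + omega D^{-1} (b - A x).
void smooth( Mat const& A, Vec const& b, Vec& x, double omega, int sweeps )
{
    for ( int s = 0; s < sweeps; ++s )
    {
        Vec Ax = apply( A, x );
        for ( std::size_t i = 0; i < x.size(); ++i )
            x[i] += omega * ( b[i] - Ax[i] ) / A[i][i];
    }
}

// Two-grid V-cycle: pre-smooth, restrict, coarse solve, correct, post-smooth.
// R plays the role of Pi^down, P the role of Pi^up.
void vCycle( Mat const& A, Mat const& R, Mat const& P,
             std::function<Vec( Vec const& )> coarseSolve,
             Vec const& b, Vec& x )
{
    smooth( A, b, x, 2. / 3., 2 );            // pre-smoother
    Vec r = apply( A, x );
    for ( std::size_t i = 0; i < r.size(); ++i )
        r[i] = b[i] - r[i];                   // fine-level residual
    Vec rc = apply( R, r );                   // restrict to the coarse level
    Vec ec = coarseSolve( rc );               // solve A_coarse e = r
    Vec ef = apply( P, ec );                  // interpolate the correction
    for ( std::size_t i = 0; i < x.size(); ++i )
        x[i] += ef[i];
    smooth( A, b, x, 2. / 3., 2 );            // post-smoother
}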

Multigrid convergence properties
  • Textbook: \$P^{-1}A\$ is spectrally equivalent to the identity

    • Constant number of iterations to converge up to discretization error.

  • Most theory applies to SPD systems

    • variable coefficients (e.g. discontinuous): low energy interpolants.

    • mesh- and/or physics-induced anisotropy: semi-coarsening/line smoothers.

    • complex geometry: difficult to have meaningful coarse levels.

  • Deeper algorithmic difficulties

    • nonsymmetric (e.g. advection, shallow water, Euler).

    • indefinite (e.g. incompressible flow, Helmholtz).

  • Performance considerations

    • Aggressive coarsening is critical in parallel.

    • Most theory assumes SOR smoothers; ILU is often more robust.

    • Coarsest level usually solved semi-redundantly with direct solver.

  • Multilevel Schwarz is essentially the same with different language

    • assume strong smoothers, emphasize aggressive coarsening.

List of PETSc Preconditioners

See this PETSc page for a complete list.

Table 1. Table of Preconditioners as of PETSc 3.7

PETSc      | Description                                                   | Parallel
none       | No preconditioner                                             | yes
jacobi     | diagonal preconditioner                                       | yes
bjacobi    | block diagonal preconditioner                                 | yes
sor        | SOR preconditioner                                            | yes
lu         | Direct solver as preconditioner                               | depends on the factorization package (e.g. mumps, pastix: OK)
shell      | User defined preconditioner                                   | depends on the user preconditioner
mg         | multigrid preconditioner                                      | yes
ilu        | incomplete LU                                                 | no
icc        | incomplete Cholesky                                           | no
cholesky   | Cholesky factorisation                                        | yes
asm        | Additive Schwarz Method                                       | yes
gasm       | Scalable Additive Schwarz Method                              | yes
ksp        | Krylov subspace preconditioner                                | yes
fieldsplit | block preconditioner framework                                | yes
lsc        | Least Squares Commutator                                      | yes
gamg       | Scalable Algebraic Multigrid                                  | yes
hypre      | Hypre framework (multigrid, …)                                | yes
bddc       | balancing domain decomposition by constraints preconditioner  | yes

Principles

Feel++ abstracts the PETSc library and provides a subset (sufficient in most cases) of the PETSc features. It interfaces with the following PETSc components: Mat, Vec, KSP, PC, SNES.

  • Vec Vector handling library

  • Mat Matrix handling library

  • KSP Krylov SubSpace library implements various iterative solvers

  • PC Preconditioner library implements various preconditioning strategies

  • SNES Nonlinear solver library implements various nonlinear solve strategies

All linear algebra is encapsulated within backends, selected using the command line option --backend=<backend> or the config file option backend=<backend>, which provide an interface to several libraries:

Library | Format | Backend
PETSc   | sparse | petsc
Eigen   | sparse | eigen
Eigen   | dense  | eigen_dense

The default backend is petsc.

Some generic examples

The configuration files .cfg allow for a wide range of options to solve a linear or non-linear system.

We now consider the following example [import:"marker1"](../../codes/mylaplacian.cpp)

To execute this example:

# sequential
./feelpp_tut_laplacian
# parallel on 4 cores
mpirun -np 4 ./feelpp_tut_laplacian

As described in section

Direct solver

Cholesky and LU factorisations are available. However, the parallel implementation depends on the availability of MUMPS. The configuration is very simple.

# no iterative solver
ksp-type=preonly
#
pc-type=cholesky

Using the PETSc backend allows choosing between different packages to compute the factorization.

Table 2. Table of factorization packages

Package | Description                                          | Parallel
petsc   | PETSc own implementation                             | yes
mumps   | MUltifrontal Massively Parallel sparse direct Solver | yes
umfpack | Unsymmetric MultiFrontal package                     | no
pastix  | Parallel Sparse matriX package                       | yes

To choose between these factorization packages:

# choose mumps
pc-factor-mat-solver-package=mumps
# choose umfpack (sequential)
pc-factor-mat-solver-package=umfpack

To perform a Cholesky-type factorisation, the underlying matrix must be declared SPD.

// matrix
auto A = backend->newMatrix(_test=...,_trial=...,_properties=SPD);
// bilinear form
auto a = form2( _test=..., _trial=..., _properties=SPD );

Using iterative solvers

Using CG and ICC(3)

with a relative tolerance of 1e-12:

ksp-rtol=1.e-12
ksp-type=cg
pc-type=icc
pc-factor-levels=3
Using GMRES and ILU(3)

with a relative tolerance of 1e-12 and a restart of 300:

ksp-rtol=1.e-12
ksp-type=gmres
ksp-gmres-restart=300
pc-type=ilu
pc-factor-levels=3
Using GMRES and Jacobi

With a relative tolerance of 1e-12 and a restart of 100:

ksp-rtol=1.e-12
ksp-type=gmres
ksp-gmres-restart=100
pc-type=jacobi

Monitoring linear, non-linear and eigenvalue problem solver residuals

# linear
ksp-monitor=1
# non-linear
snes-monitor=1
# eigen value problem
eps-monitor=1

Solving the Laplace problem

We start with the quickstart Laplacian example. Recall that we wish, given a domain \$\Omega\$, to find \$u\$ such that

\$-\nabla \cdot (k \nabla u) = f \mbox{ in } \Omega \subset \mathbb{R}^{2},\\ u = g \mbox{ on } \partial \Omega\$
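
For reference, here is a condensed Feel++ sketch of this problem with \$k=1\$, \$f=1\$ and \$g=0\$; it mirrors the quickstart code, and the mesh type and polynomial order are illustrative choices.

auto mesh = loadMesh( _mesh=new Mesh<Simplex<2>> );
auto Vh = Pch<1>( mesh );
auto u = Vh->element();
auto v = Vh->element();

// bilinear form a(u,v) = int_Omega grad(u) . grad(v)   (k = 1)
auto a = form2( _test=Vh, _trial=Vh );
a = integrate( _range=elements( mesh ),
               _expr=gradt( u ) * trans( grad( v ) ) );

// linear form l(v) = int_Omega f v   (f = 1)
auto l = form1( _test=Vh );
l = integrate( _range=elements( mesh ), _expr=id( v ) );

// Dirichlet condition u = g on the whole boundary (g = 0)
a += on( _range=boundaryfaces( mesh ), _rhs=l, _element=u, _expr=cst( 0. ) );

// solve with the solver/preconditioner selected on the command line
a.solve( _rhs=l, _solution=u );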

Monitoring KSP solvers
feelpp_qs_laplacian --ksp-monitor=true
Viewing KSP solvers
shell> mpirun -np 2 feelpp_qs_laplacian --ksp-monitor=1  --ksp-view=1
  0 KSP Residual norm 8.953261456448e-01
  1 KSP Residual norm 7.204431786960e-16
KSP Object: 2 MPI processes
  type: gmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
     Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=1000
  tolerances:  relative=1e-13, absolute=1e-50, divergence=100000
  left preconditioning
  using nonzero initial guess
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: shell
    Shell:
  linear system matrix = precond matrix:
  Matrix Object:   2 MPI processes
    type: mpiaij
    rows=525, cols=525
    total: nonzeros=5727, allocated nonzeros=5727
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines
Solvers and preconditioners

You can now change the Krylov subspace solver using the --ksp-type option and the preconditioner using the --pc-type option.

For example,

  • to solve using the conjugate gradient solver, cg, and the default preconditioner, use the following

./feelpp_qs_laplacian --ksp-type=cg --ksp-view=1 --ksp-monitor=1
  • to solve using the algebraic multigrid preconditioner, gamg, with cg as the solver, use the following

./feelpp_qs_laplacian --ksp-type=cg --ksp-view=1 --ksp-monitor=1 --pc-type=gamg

Block factorisation

Stokes

We now turn to the quickstart Stokes example. Recall that we wish, given a domain \$\Omega\$, to find \$(\mathbf{u},p)\$ such that

\$ -\Delta \mathbf{u} + \nabla p = \mathbf{ f} \mbox{ in } \Omega,\\ \nabla \cdot \mathbf{u} = 0 \mbox{ in } \Omega,\\ \mathbf{u} = \mathbf{g} \mbox{ on } \partial \Omega\$

This problem is indefinite. Possible solution strategies are

  • Uzawa,

  • penalty (a technique from optimisation),

  • augmented Lagrangian approach (Glowinski, Le Tallec).

Note that the inf-sup condition must be satisfied. In particular, for a multigrid strategy the smoother needs to preserve it.

General approach for saddle point problems

The Krylov subspace solvers for indefinite problems are MINRES and GMRES. As to preconditioning, we first look at the saddle point matrix \$M\$ and its block factorization \$M = LDL^T\$; indeed we have:

\$M = \begin{pmatrix} A & B \\ B^T & 0 \end{pmatrix} = \begin{pmatrix} I & 0\\ B^T A^{-1} & I \end{pmatrix} \begin{pmatrix} A & 0\\ 0 & - B^T A^{-1} B \end{pmatrix} \begin{pmatrix} I & A^{-1} B\\ 0 & I \end{pmatrix}\$
  • Elman, Silvester and Wathen propose 3 preconditioners:

\$P_1 = \begin{pmatrix} \tilde{A}^{-1} & B\\ B^T & 0 \end{pmatrix}, \quad P_2 = \begin{pmatrix} \tilde{A}^{-1} & 0\\ 0 & \tilde{S} \end{pmatrix},\quad P_3 = \begin{pmatrix} \tilde{A}^{-1} & B\\ 0 & \tilde{S} \end{pmatrix}\$

where \$\tilde{S} \approx S^{-1}\$ with \$S = B^T A^{-1} B\$ the Schur complement, and \$\tilde{A}^{-1} \approx A^{-1}\$.
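
In practice such block preconditioners are accessible through the fieldsplit framework of Table 1. A plausible configuration for a Schur-complement factorization is sketched below; the option names mirror PETSc's fieldsplit options and their exact spelling should be checked against your Feel++ version.

ksp-type=gmres
pc-type=fieldsplit
fieldsplit-type=schur
# diagonal, lower, upper or full factorization of the Schur complement
fieldsplit-schur-fact-type=upper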


Algebra Backends

(Non)linear algebra backends are crucial components of Feel++. They provide a uniform interface between Feel++ data structures and the underlying linear algebra libraries.

Libraries

Feel++ interfaces the following libraries:

  • PETSc : Portable, Extensible Toolkit for Scientific Computation

  • SLEPc : Eigenvalue solver framework based on PETSc

  • Eigen3

Backend

Backend is a template class parametrized by the numerical type, providing access to

  • vector

  • matrix : dense and sparse

  • algorithms : solvers, preconditioners, …​

PETSc provides sequential and parallel data structures whereas Eigen3 is only sequential.

To create a Backend, use the free function backend(…​) which has the following interface:

backend(_name="name_of_backend",
        _rebuild=...,    // true|false
        _kind=...,       // kind of backend
        _worldcomm=...   // communicator
        )

All these parameters are optional, which means that the simplest call reads, for example:

auto b = backend();

They take default values as described in the following table:

parameter  | type      | default value
_name      | string    | "" (empty string)
_rebuild   | boolean   | false
_kind      | string    | "petsc"
_worldcomm | WorldComm | Environment::worldComm()

_name

Backends are stored in a Backend factory and handled by a manager that keeps track of allocated backends. They are registered with respect to their name and kind. The default name is the empty string (""), which names the default Backend. The _name parameter is important because it also provides the name of the command line/config file option section associated with the Backend.

Only the command line/config file options for the default backend are registered. Developers have to register the options for each new Backend they define; otherwise run-time failures are to be expected whenever a Backend command line option is accessed.

Consider that you create a Backend named projection in your code like this

auto b = backend(_name="projection");

To register the command line options of this Backend, write:

Environment env( _argc=argc, _argv=argv,
                 _desc=backend_options("projection") );
_kind

Feel++ supports three kinds of Backends:

  • petsc : PETSC Backend

  • eigen_dense

  • eigen_sparse

SLEPc uses the PETSc Backend since it is based on PETSc.

The kind of Backend can be changed from the command line or configuration file thanks to the "backend" option.

auto b = backend(_name="name",
                 _kind=soption(_prefix="name",_name="backend"))

and in the config file

[name]
backend=petsc
# or, for example
backend=eigen_sparse
_rebuild

If you want to reuse a Backend rather than allocate a new one (and register the corresponding command line/configuration file options), you can rebuild it: rebuilding deletes the data structures already associated with the Backend, in particular the preconditioner. This is mandatory when you first solve a linear system of dimension \$m\times m\$ and then another one of different dimension \$n \times n\$; otherwise the Backend will throw an error saying that the dimensions are incompatible. To avoid that, rebuild the Backend.

auto b = backend(_name="mybackend");
// solve A x = f
b->solve(_solution=x,_matrix=A,_rhs=f);
// rebuild: clean up the internal Backend data structure
b = backend(_name="mybackend",_rebuild=true);
// solve A1 x1 = f1
b->solve(_solution=x1,_matrix=A1,_rhs=f1);
Although this feature might appear useful, you have to make sure that the solving strategy applies to all problems, because you won’t be able to customize the solver/preconditioner for each problem. If you have different problems to solve and need custom solvers/preconditioners, it is best to use different Backends.
_worldComm

One of the strengths of Feel++ is the ability to change the communicator, which in Feel++ is the WorldComm. This allows for example to

  • solve sequential problems

  • solve a problem on a subset of MPI processes

For example, passing a sequential WorldComm to backend() makes subsequent uses of the Backend generate sequential data structures (e.g. IndexSet, Vector and Matrix) and algorithms (e.g. Krylov solvers).

// create a sequential Backend
auto b = backend(_name="seq",
                 _worldcomm=Environment::worldCommSeq());
auto A = b->newMatrix(); // sequential Matrix
auto f = b->newVector(); // sequential Vector

Info: The default WorldComm is provided by Environment::worldComm() and corresponds to the default MPI communicator MPI_COMM_WORLD.

Eigen Problem

To solve standard and generalized eigenvalue problems, Feel++ interfaces SLEPc. SLEPc is a library which extends PETSc to provide the functionality necessary for the solution of eigenvalue problems. It comes with many strategies for both standard and generalized problems, Hermitian or not.

We want to find \$(\lambda_i,x_i)\$ such that \$Ax = \lambda x\$. To do that, most eigensolvers project the problem onto a low-dimensional subspace; this is called a Rayleigh-Ritz projection. Let \$V_j=[v_1,v_2,\ldots,v_j]\$ be an orthogonal basis of this subspace; then the projected problem reads:

Find \$(\theta_i,s_i)\$ for \$i=1,\ldots,j\$ such that \$B_j s_i=\theta_i s_i\$ where \$B_j=V_j^T A V_j\$.

Then the approximate eigenpairs \$(\lambda_i,x_i)\$ of the original problem are obtained as: \$\lambda_i=\theta_i\$ and \$x_i=V_j s_i\$.

The eigensolvers differ from each other in the way the subspace is built.

Code

In Feel++, there are two functions that can be used to solve this type of problem: eigs and veigs.

Here is an example of how to use veigs.

auto Vh = Pch<Order>( mesh );
auto a = form2( _test=Vh, _trial=Vh );
// fill a
auto b = form2( _test=Vh, _trial=Vh );
// fill b
auto eigenmodes = veigs( _formA=a, _formB=b );

where eigenmodes is a std::vector<std::pair<value_type, element_type> > with value_type the type of the eigenvalue, and element_type the type of the eigenvector, a function of the space Vh.
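
The returned vector can be traversed directly; for example, this minimal sketch (assuming C++17 structured bindings) prints the computed eigenvalues:

int i = 0;
for ( auto const& [eigenvalue, eigenvector] : eigenmodes )
{
    // eigenvalue has type value_type, eigenvector is an element of Vh
    std::cout << "mode " << i++ << ": lambda = " << eigenvalue << std::endl;
}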

The eigs function does not take the bilinear forms but two matrices. Also, unlike veigs, the solver, the problem type, the position of the spectrum and the spectral transformation are not read automatically from the options: they have to be passed explicitly, as done below using soption.

auto Vh = Pch<Order>( mesh );
auto a = form2( _test=Vh, _trial=Vh );
// fill a
auto matA = a.matrixPtr();
auto b = form2( _test=Vh, _trial=Vh );
// fill b
auto matB = b.matrixPtr();
auto eigenmodes = eigs( _matrixA=matA,
                        _matrixB=matB,
                        _solver=(EigenSolverType)EigenMap[soption("solvereigen.solver")],
                        _problem=(EigenProblemType)EigenMap[soption("solvereigen.problem")],
                        _transform=(SpectralTransformType)EigenMap[soption("solvereigen.transform")],
                        _spectrum=(PositionOfSpectrum)EigenMap[soption("solvereigen.spectrum")]
                         );
auto femodes = std::vector<decltype(Vh->element())>( eigenmodes.size(), Vh->element() );
int i = 0;
for( auto const& mode : eigenmodes )
    femodes[i++] = *mode.second.get<2>();

where eigenmodes is a std::map<real_type, eigenmodes_type> with real_type the magnitude of the eigenvalue. eigenmodes_type is a boost::tuple<real_type, real_type, vector_ptrtype>, where the first real_type represents the real part of the eigenvalue, the second real_type the imaginary part, and vector_ptrtype a vector which is not an element of a function space.

The two functions take a parameter _nev that tells how many eigenpairs to compute. This can be set from the command line option --solvereigen.nev. Another important parameter is _ncv, which is the size of the subspace, \$j\$ above. This parameter should always be greater than nev; SLEPc recommends setting it to max(nev+15, 2*nev). This can be set from the command line option --solvereigen.ncv.
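
For instance, a configuration file asking for 10 eigenpairs with the recommended subspace size could contain (the values are illustrative):

solvereigen.nev=10
# ncv >= max(nev+15, 2*nev) = 25 here
solvereigen.ncv=25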

Problem type

The standard formulation reads:

Find \$\lambda\in \mathbb{R}\$ such that \$Ax = \lambda x\$

where \$\lambda\$ is an eigenvalue and \$x\$ an eigenvector.

But in the case of the finite element method, we will deal with the generalized form :

Find \$\lambda\in\mathbb{R}\$ such that \$Ax = \lambda Bx\$

A standard problem is Hermitian if the matrix \$A\$ is Hermitian (\$A=A^*\$). A generalized problem is Hermitian if the matrices \$A\$ and \$B\$ are Hermitian and if \$B\$ is positive definite. If the problem is Hermitian, then the eigenvalues are real. A special case of the generalized problem is when the matrices are not Hermitian but \$B\$ is positive definite.

The type of the problem can be specified using the EigenProblemType, or at run time with the command line option --solvereigen.problem and the following values:

Table 4. Table of problem type

Problem type                                | EigenProblemType | command line key
Standard Hermitian                          | HEP              | "hep"
Standard non-Hermitian                      | NHEP             | "nhep"
Generalized Hermitian                       | GHEP             | "ghep"
Generalized non-Hermitian                   | GNHEP            | "gnhep"
Positive definite Generalized non-Hermitian | PGNHEP           | "pgnhep"

Position of spectrum

You can choose which eigenpairs are computed. This can be set programmatically with PositionOfSpectrum or at run time with the command line option --solvereigen.spectrum and the following values:

Table 5. Table of position of spectrum

Position of spectrum | PositionOfSpectrum | command line key
Largest magnitude    | LARGEST_MAGNITUDE  | "largest_magnitude"
Smallest magnitude   | SMALLEST_MAGNITUDE | "smallest_magnitude"
Largest real         | LARGEST_REAL       | "largest_real"
Smallest real        | SMALLEST_REAL      | "smallest_real"
Largest imaginary    | LARGEST_IMAGINARY  | "largest_imaginary"
Smallest imaginary   | SMALLEST_IMAGINARY | "smallest_imaginary"

Spectral transformation

It is observed that the algorithms used to solve eigenvalue problems find solutions at the extremities of the spectrum. To improve convergence, one needs to compute the eigenpairs of a transformed operator. These spectral transformations make it possible to compute solutions that are not at the extremities of the spectrum.

There are 3 types of spectral transformation:

Shift

\$A-\sigma I\$ or \$B^{-1}A-\sigma I\$

Shift and invert

\$(A-\sigma I)^{-1}\$ or \$(A-\sigma B)^{-1}B\$

Cayley

\$(A-\sigma I)^{-1}(A+\nu I)\$ or \$(A-\sigma B)^{-1}(A+\nu B)\$

By default, shift and invert is used. You can change it with --solvereigen.transform.

Table 6. Table of spectral transformation

Spectral transformation | SpectralTransformationType | command line key
Shift                   | SHIFT                      | shift
Shift and invert        | SINVERT                    | shift_invert
Cayley                  | CAYLEY                     | cayley
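
Putting the previous tables together, a typical configuration for a generalized Hermitian problem targeting the smallest eigenvalues with shift and invert could read (the values are illustrative):

solvereigen.solver=krylovschur
solvereigen.problem=ghep
solvereigen.spectrum=smallest_magnitude
solvereigen.transform=shift_invert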

Eigensolvers

The details of the implementation of the different solvers can be found in the SLEPc Technical Reports.

The default solver is Krylov-Schur, but can be modified using EigenSolverType or the option --solvereigen.solver.

Table 7. Table of eigensolver

Solver       | EigenSolverType | command line key
Power        | POWER           | power
Lapack       | LAPACK          | lapack
Subspace     | SUBSPACE        | subspace
Arnoldi      | ARNOLDI         | arnoldi
Lanczos      | LANCZOS         | lanczos
Krylov-Schur | KRYLOVSCHUR     | krylovschur
Arpack       | ARPACK          | arpack

Be careful: not all solvers can compute all problem types and positions of the spectrum. The possibilities are summarized in the following table.

Table 8. Supported problem type for the eigensolvers

Solver       | Position of spectrum | Problem type
Power        | Largest magnitude    | any
Lapack       | any                  | any
Subspace     | Largest magnitude    | any
Arnoldi      | any                  | any
Lanczos      | any                  | standard and generalized Hermitian
Krylov-Schur | any                  | any
Arpack       | any                  | any

Special cases of spectrum

Computing a large portion of the spectrum

If you want to compute a large number of eigenpairs, the rule for ncv implies that a huge amount of memory would be used. To improve performance, you can set the mpd parameter, which limits the dimension of the projected problem.

You can set it via the command line with --solvereigen.mpd <mpd>.

Computing all the eigenpairs in an interval

If you want to compute all the eigenpairs in a given interval, you need to use the option --solvereigen.interval-a to set the beginning of the interval and --solvereigen.interval-b to set the end.

In this case, be aware that the problem needs to be generalized and Hermitian. The solver will be set to Krylov-Schur and the transformation to shift and invert. Besides, you will need a linear solver that computes the inertia of the matrix; this is set to Cholesky, with MUMPS if it is available. For now, this method is only implemented in the eigs function.
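
As a minimal sketch, computing all eigenpairs with eigenvalues in the interval \$[0,100]\$ could be configured as follows (the bounds are illustrative):

# compute every eigenpair whose eigenvalue lies in [0,100]
solvereigen.interval-a=0
solvereigen.interval-b=100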