LA: allow to solve block linear systems efficiently #466

Merged
merged 44 commits into master from LA.split.system.solver on Jun 7, 2024

Conversation

andrea-iob
Member

Block matrices represent an important class of problems in numerical linear algebra and offer the possibility of far more efficient iterative solvers than just treating the entire matrix as a black box. Following the common linear algebra definition, a block matrix is a matrix divided into a small, problem-size-independent number of very large blocks (two, three, or so). These blocks arise naturally from the underlying physics or discretization of the problem; for example, they may be associated with different variables of a physical problem. Under a certain numbering of the unknowns the matrix can be written as

                         [ A00  A01  A02 ]
                         [ A10  A11  A12 ]
                         [ A20  A21  A22 ]

where each A_ij is an entire block (see the paragraph "Solving Block Matrices" in the PETSc manual).

When assembling the matrix, a monolithic matrix should be provided. For example, assuming the elements of the matrix are grouped in five-by-five groups (here, five is the so-called block size of the matrix; it usually arises when coupling variables with different meanings, for example pressure, the three components of the velocity, and temperature), the assembler will provide the following system:

    [           |           |          |           ]   [           ]     [           ]
    [  (5 x 5)  |  (5 x 5)  |    ...   |  (5 x 5)  ]   [  (5 x 5)  ]     [  (5 x 5)  ]
    [           |           |          |           ]   [           ]     [           ]
    [----------------------------------------------]   [-----------]     [-----------]
    [           |           |          |           ]   [           ]     [           ]
    [    ...    |    ...    |    ...   |  (5 x 5)  ]   [    ...    ]  =  [    ...    ]
    [           |           |          |           ]   [           ]     [           ]
    [----------------------------------------------]   [-----------]     [-----------]
    [           |           |          |           ]   [           ]     [           ]
    [  (5 x 5)  |  (5 x 5)  |    ...   |  (5 x 5)  ]   [  (5 x 5)  ]     [  (5 x 5)  ]
    [           |           |          |           ]   [           ]     [           ]
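
As a point of reference, the following is a minimal PETSc sketch of how a monolithic matrix with block size 5 could be set up. This is not the bitpit assembler interface introduced by this PR: the function name and sizes are placeholders, and a recent PETSc providing the PetscCall() macro is assumed.

```cpp
#include <petscmat.h>

// Hypothetical helper (not part of this PR): set up a monolithic matrix whose
// elements are grouped in 5x5 blocks, as the assembler is expected to provide.
PetscErrorCode createMonolithicMatrix(MPI_Comm comm, PetscInt nLocalElements, Mat *A)
{
    PetscFunctionBeginUser;
    PetscCall(MatCreate(comm, A));
    PetscCall(MatSetSizes(*A, 5 * nLocalElements, 5 * nLocalElements, PETSC_DETERMINE, PETSC_DETERMINE));
    PetscCall(MatSetType(*A, MATBAIJ));   // blocked sparse storage
    PetscCall(MatSetBlockSize(*A, 5));    // five coupled unknowns per element
    PetscCall(MatSetUp(*A));
    // ... MatSetValuesBlocked() would be called here to fill one 5x5 group at a time ...
    PetscCall(MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY));
    PetscFunctionReturn(PETSC_SUCCESS);
}
```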

Internally, the monolithic matrix will be split into blocks. For example, considering two splits, the first one grouping together the first four variables and the second one holding the last variable (i.e., split sizes equal to [4, 1]), the internal split system will be:

      [                     |                     ]  [           ]     [           ]
      [  (4 * N) x (4 * M)  |  (4 * N) x (1 * M)  ]  [  (4 * M)  ]     [  (4 * N)  ]
      [                     |                     ]  [           ]     [           ]
      [-------------------------------------------]  [-----------]  =  [-----------]
      [                     |                     ]  [           ]     [           ]
      [  (1 * N) x (4 * M)  |  (1 * N) x (1 * M)  ]  [  (1 * M)  ]     [  (1 * N)  ]
      [                     |                     ]  [           ]     [           ]

where N and M are the number of rows and columns of the monolithic matrix, respectively, counted in groups of the block size (i.e., N block rows and M block columns).
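
In plain PETSc terms, split sizes of [4, 1] on a matrix with block size 5 could be expressed through PCFIELDSPLIT as in the sketch below. This is only an illustration, not the exact calls made by the new solver; the split names "0" and "1" are chosen arbitrarily.

```cpp
#include <petscksp.h>

// Hypothetical sketch: map split sizes [4, 1] onto PCFIELDSPLIT. With a matrix
// block size of 5, the first split collects components 0-3 of every 5x5 group,
// the second split collects component 4.
PetscErrorCode setupSplits(KSP ksp)
{
    PC       pc;
    PetscInt firstSplit[]  = {0, 1, 2, 3};
    PetscInt secondSplit[] = {4};

    PetscFunctionBeginUser;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));
    PetscCall(PCFieldSplitSetBlockSize(pc, 5));
    PetscCall(PCFieldSplitSetFields(pc, "0", 4, firstSplit, firstSplit));
    PetscCall(PCFieldSplitSetFields(pc, "1", 1, secondSplit, secondSplit));
    PetscFunctionReturn(PETSC_SUCCESS);
}
```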

The PETSc PCFIELDSPLIT preconditioner is used to solve the split system. There are different split strategies available:

  1. DIAGONAL: this strategy assumes that the only non-zero blocks are the diagonal ones.

Considering a two-by-two block matrix

                           [ A00    0  ]
                           [  0    A11 ],

the preconditioned problem will look like

          [ KSPSolve(A00)        0         ]  [ A00    0  ]
          [      0           KSPSolve(A11) ]  [  0    A11 ],

in other words the preconditioner is:

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [  0    A11 ] ).

The system is solved efficiently by solving each block independently from the others.

Blocks are solved using a flexible GMRES iterative method. If the system is partitioned, each block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
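
A rough PETSc analogue of this setup is sketched below, under assumed defaults; it is not necessarily the exact set of options configured internally. The same sub-solver options would be repeated for the other splits with their respective prefixes.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the DIAGONAL strategy: additive field split, each
// split solved by FGMRES preconditioned with (restricted) ASM + ILU in
// parallel, or plain ILU in serial.
PetscErrorCode configureDiagonalStrategy(PC pc, PetscMPIInt commSize)
{
    PetscFunctionBeginUser;
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_ADDITIVE));
    PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_ksp_type", "fgmres"));
    if (commSize > 1) {
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_type", "asm"));      // restricted ASM is the PETSc default
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_sub_pc_type", "ilu"));  // ILU on each ASM block
    } else {
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_type", "ilu"));
    }
    PetscFunctionReturn(PETSC_SUCCESS);
}
```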

  2. LOWER: this strategy assumes that the only non-zero blocks are the ones on and below the diagonal.

Considering a two-by-two block matrix

                           [ A00    0  ]
                           [ A10   A11 ],

the preconditioner is

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [ A10   A11 ] ).

The system is solved efficiently by first solving with A00, then applying A10 to that solution, subtracting it from the right-hand side of the second block, and finally solving with A11; in other words

              [ I      0   ]  [    I    0 ]  [  A00^-1  0 ]
              [ 0   A11^-1 ]  [ - A10   I ]  [   0      I ].

This strategy can be seen as a "block" Gauss-Seidel with the blocks being the splits.

Blocks are solved using a flexible GMRES iterative method. If the system is partitioned, each block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
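
In PETSc terms this roughly corresponds to the multiplicative field split type; again a sketch, not necessarily the exact internal calls.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the LOWER strategy: the multiplicative field split
// applies the preconditioner built from the lower block triangular part,
// i.e. one block Gauss-Seidel sweep over the splits.
PetscErrorCode configureLowerStrategy(PC pc)
{
    PetscFunctionBeginUser;
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_MULTIPLICATIVE));
    // The per-split sub-solvers (FGMRES with ASM/ILU, or ILU when the system
    // is not partitioned) are configured as in the DIAGONAL sketch above.
    PetscFunctionReturn(PETSC_SUCCESS);
}
```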

  3. FULL: this strategy makes no assumptions about the structure of the blocks.

Considering a two-by-two block matrix

                           [ A00   A01 ]
                           [ A10   A11 ],

the preconditioned problem will look like

          [ KSPSolve(A00)        0         ]  [ A00   A01 ]
          [      0           KSPSolve(A11) ]  [ A10   A11 ],

in other words the preconditioner is:

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [  0    A11 ] ).

The preconditioner is built considering only the diagonal blocks, and the full system is then solved with it.

The system is solved using a flexible GMRES iterative method. If the system is partitioned, each diagonal block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
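
A rough PETSc analogue of the FULL strategy is sketched below: an outer flexible GMRES iterates on the fully coupled matrix while the field split preconditioner is assembled from the diagonal blocks only. This is an illustration under assumed defaults, not the exact internal configuration.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the FULL strategy: the outer FGMRES sees the whole
// matrix, including the off-diagonal blocks, while the additive field split
// preconditioner only uses the diagonal blocks.
PetscErrorCode configureFullStrategy(KSP ksp)
{
    PC pc;

    PetscFunctionBeginUser;
    PetscCall(KSPSetType(ksp, KSPFGMRES));                      // outer Krylov solver on the full system
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_ADDITIVE));  // preconditioner from diagonal blocks only
    PetscFunctionReturn(PETSC_SUCCESS);
}
```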

andrea-iob added 22 commits May 21, 2024 10:15
There is not much added value in supporting the case where the
assembler block size is greater than one but the matrix block
size is one. Matrix update can be simplified by enforcing that
the assembler and the matrix have the same block size.
Otherwise PETSc may use a different partitioning when restoring
a matrix/vector.
…tations

If the system is partitioned, each process can reorder only
its local part of the matrix. This is a limitation of the
VecPermute PETSc function, which does not support parallel
index sets with non-local permutations.
@andrea-iob andrea-iob merged commit 3302bc7 into master Jun 7, 2024
10 checks passed
@andrea-iob andrea-iob deleted the LA.split.system.solver branch June 7, 2024 09:06