LA: allow to solve block linear systems efficiently #466

Merged
merged 44 commits into master from LA.split.system.solver on Jun 7, 2024

Conversation

andrea-iob
Member

Block matrices represent an important class of problems in numerical linear algebra and offer the possibility of far more efficient iterative solvers than just treating the entire matrix as a black box. Following the common linear algebra definition, a block matrix is a matrix divided into a small, problem-size-independent number of very large blocks (two, three, or so). These blocks arise naturally from the underlying physics or discretization of the problem; for example, they may be associated with different variables of a physical problem. Under a certain numbering of the unknowns the matrix can be written as

                         [ A00  A01  A02 ]
                         [ A10  A11  A12 ]
                         [ A20  A21  A22 ]

where each A_ij is an entire block (see the paragraph "Solving Block Matrices" in the PETSc manual).

When assembling the matrix, a monolithic matrix should be provided. For example, assuming the elements of the matrix are grouped in five-by-five groups (here, five is the so-called block size of the matrix; it usually arises when coupling variables with different meanings, for example pressure, the three components of the velocity, and temperature), the assembler will provide the following system:

    [           |           |          |           ]   [           ]     [           ]
    [  (5 x 5)  |  (5 x 5)  |    ...   |  (5 x 5)  ]   [  (5 x 5)  ]     [  (5 x 5)  ]
    [           |           |          |           ]   [           ]     [           ]
    [----------------------------------------------]   [-----------]     [-----------]
    [           |           |          |           ]   [           ]     [           ]
    [    ...    |    ...    |    ...   |  (5 x 5)  ]   [    ...    ]  =  [    ...    ]
    [           |           |          |           ]   [           ]     [           ]
    [----------------------------------------------]   [-----------]     [-----------]
    [           |           |          |           ]   [           ]     [           ]
    [  (5 x 5)  |  (5 x 5)  |    ...   |  (5 x 5)  ]   [  (5 x 5)  ]     [  (5 x 5)  ]
    [           |           |          |           ]   [           ]     [           ]
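
As a point of reference, the following is a minimal PETSc sketch of how a monolithic matrix with block size 5 could be set up. This is not the bitpit assembler interface introduced by this PR: the function name and sizes are placeholders, and a recent PETSc providing the PetscCall() macro is assumed.

```cpp
#include <petscmat.h>

// Hypothetical helper (not part of this PR): set up a monolithic matrix whose
// elements are grouped in 5x5 blocks, as the assembler is expected to provide.
PetscErrorCode createMonolithicMatrix(MPI_Comm comm, PetscInt nLocalElements, Mat *A)
{
    PetscFunctionBeginUser;
    PetscCall(MatCreate(comm, A));
    PetscCall(MatSetSizes(*A, 5 * nLocalElements, 5 * nLocalElements, PETSC_DETERMINE, PETSC_DETERMINE));
    PetscCall(MatSetType(*A, MATBAIJ));   // blocked sparse storage
    PetscCall(MatSetBlockSize(*A, 5));    // five coupled unknowns per element
    PetscCall(MatSetUp(*A));
    // ... MatSetValuesBlocked() would be called here to fill one 5x5 group at a time ...
    PetscCall(MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY));
    PetscFunctionReturn(PETSC_SUCCESS);
}
```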

Internally, the monolithic matrix will be split into blocks. For example, considering two splits, the first one grouping together the first four variables and the second one holding the last variable (i.e., split sizes equal to [4, 1]), the internal split system will be:

      [                     |                     ]  [           ]     [           ]
      [  (4 * N) x (4 * M)  |  (4 * N) x (1 * M)  ]  [  (4 * M)  ]     [  (4 * N)  ]
      [                     |                     ]  [           ]     [           ]
      [-------------------------------------------]  [-----------]  =  [-----------]
      [                     |                     ]  [           ]     [           ]
      [  (1 * N) x (4 * M)  |  (1 * N) x (1 * M)  ]  [  (1 * M)  ]     [  (1 * N)  ]
      [                     |                     ]  [           ]     [           ]

where N and M are the number of rows and columns of the monolithic matrix, respectively, counted in groups of the block size (i.e., N block rows and M block columns).
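
In plain PETSc terms, split sizes of [4, 1] on a matrix with block size 5 could be expressed through PCFIELDSPLIT as in the sketch below. This is only an illustration, not the exact calls made by the new solver; the split names "0" and "1" are chosen arbitrarily.

```cpp
#include <petscksp.h>

// Hypothetical sketch: map split sizes [4, 1] onto PCFIELDSPLIT. With a matrix
// block size of 5, the first split collects components 0-3 of every 5x5 group,
// the second split collects component 4.
PetscErrorCode setupSplits(KSP ksp)
{
    PC       pc;
    PetscInt firstSplit[]  = {0, 1, 2, 3};
    PetscInt secondSplit[] = {4};

    PetscFunctionBeginUser;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));
    PetscCall(PCFieldSplitSetBlockSize(pc, 5));
    PetscCall(PCFieldSplitSetFields(pc, "0", 4, firstSplit, firstSplit));
    PetscCall(PCFieldSplitSetFields(pc, "1", 1, secondSplit, secondSplit));
    PetscFunctionReturn(PETSC_SUCCESS);
}
```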

The PETSc PCFIELDSPLIT preconditioner is used to solve the split system. There are different split strategies available:

  1. DIAGONAL: this strategy assumes that the only non-zero blocks are the diagonal ones.

Considering a two-by-two block matrix

                           [ A00    0  ]
                           [  0    A11 ],

the preconditioned problem will look like

          [ KSPSolve(A00)        0         ]  [ A00    0  ]
          [      0           KSPSolve(A11) ]  [  0    A11 ],

in other words the preconditioner is:

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [  0    A11 ] ).

The system is solved efficiently by solving each block independently from the others.

Blocks are solved using a flexible GMRES iterative method. If the system is partitioned, each block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
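
A rough PETSc analogue of this setup is sketched below, under assumed defaults; it is not necessarily the exact set of options configured internally. The same sub-solver options would be repeated for the other splits with their respective prefixes.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the DIAGONAL strategy: additive field split, each
// split solved by FGMRES preconditioned with (restricted) ASM + ILU in
// parallel, or plain ILU in serial.
PetscErrorCode configureDiagonalStrategy(PC pc, PetscMPIInt commSize)
{
    PetscFunctionBeginUser;
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_ADDITIVE));
    PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_ksp_type", "fgmres"));
    if (commSize > 1) {
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_type", "asm"));      // restricted ASM is the PETSc default
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_sub_pc_type", "ilu"));  // ILU on each ASM block
    } else {
        PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_type", "ilu"));
    }
    PetscFunctionReturn(PETSC_SUCCESS);
}
```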

  2. LOWER: this strategy assumes that the only non-zero blocks are the ones on and below the diagonal.

Considering a two-by-two block matrix

                           [ A00    0  ]
                           [ A10   A11 ],

the preconditioner is

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [ A10   A11 ] ).

The system is solved efficiently by first solving with A00, then applying A10 to that solution, subtracting it from the right-hand side of the second block, and finally solving with A11; in other words

              [ I      0   ]  [    I    0 ]  [  A00^-1  0 ]
              [ 0   A11^-1 ]  [ - A10   I ]  [   0      I ].

This strategy can be seen as a "block" Gauss-Seidel with the blocks being the splits.

Blocks are solved using a flexible GMRES iterative method. If the system is partitioned, each block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
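
In PETSc terms this roughly corresponds to the multiplicative field split type; again a sketch, not necessarily the exact internal calls.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the LOWER strategy: the multiplicative field split
// applies the preconditioner built from the lower block triangular part,
// i.e. one block Gauss-Seidel sweep over the splits.
PetscErrorCode configureLowerStrategy(PC pc)
{
    PetscFunctionBeginUser;
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_MULTIPLICATIVE));
    // The per-split sub-solvers (FGMRES with ASM/ILU, or ILU when the system
    // is not partitioned) are configured as in the DIAGONAL sketch above.
    PetscFunctionReturn(PETSC_SUCCESS);
}
```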

  3. FULL: this strategy makes no assumptions about the structure of the blocks.

Considering a two-by-two block matrix

                           [ A00   A01 ]
                           [ A10   A11 ],

the preconditioned problem will look like

          [ KSPSolve(A00)        0         ]  [ A00   A01 ]
          [      0           KSPSolve(A11) ]  [ A10   A11 ],

in other words the preconditioner is:

                                 ( [ A00    0  ] )
             approximate inverse (               )
                                 ( [  0    A11 ] ).

The preconditioner is built considering only the diagonal blocks, and the full system is then solved with it.

The system is solved using a flexible GMRES iterative method. If the system is partitioned, each diagonal block is preconditioned using the (restricted) additive Schwarz method (ASM). On each block of the ASM preconditioner an incomplete LU factorization (ILU) is used. There is one ASM block per process. If the system is not partitioned, it is preconditioned using an incomplete LU factorization (ILU).
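
A rough PETSc analogue of the FULL strategy is sketched below: an outer flexible GMRES iterates on the fully coupled matrix while the field split preconditioner is assembled from the diagonal blocks only. This is an illustration under assumed defaults, not the exact internal configuration.

```cpp
#include <petscksp.h>

// Hypothetical sketch of the FULL strategy: the outer FGMRES sees the whole
// matrix, including the off-diagonal blocks, while the additive field split
// preconditioner only uses the diagonal blocks.
PetscErrorCode configureFullStrategy(KSP ksp)
{
    PC pc;

    PetscFunctionBeginUser;
    PetscCall(KSPSetType(ksp, KSPFGMRES));                      // outer Krylov solver on the full system
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCFIELDSPLIT));
    PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_ADDITIVE));  // preconditioner from diagonal blocks only
    PetscFunctionReturn(PETSC_SUCCESS);
}
```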

andrea-iob added 22 commits May 21, 2024 10:15
There is not much added value in supporting the case where the
assembler block size is greater than one but the matrix block
size is one. Matrix update can be simplified by enforcing that
the assembler and the matrix have the same block size.
Otherwise PETSc may use a different partitioning when restoring
a matrix/vector.
…tations

If the system is partitioned, each process can reorder only
its local part of the matrix. This is a limitation of the
VecPermute PETSc function, which does not support parallel
index sets with non-local permutations.
@andrea-iob andrea-iob merged commit 3302bc7 into master Jun 7, 2024
10 checks passed
@andrea-iob andrea-iob deleted the LA.split.system.solver branch June 7, 2024 09:06