SNES for bordered block matrix

Hi everyone,

I am working on solving a nonlinear problem using FreeFEM with PETSc/SNES. The linearized system solved at each Newton step results in a matrix with a bordered block structure that looks like this:

Mat A = [[J, 0, jl],
         [H, J, hl],
         [0, q', 0]];

where jl, hl, and q are real[int] and J and H are square Mat objects.

I have created functions to evaluate the residual and Jacobian associated with the block system for SNESSolve, but I am unsure how to set the KSP parameters to make the fieldsplit work properly. This problem should be efficiently solvable with an exact block LU factorization. I have implemented this block factorization by hand in PETSc using only KSP objects, but I would like to “upgrade” to the more robust solvers within SNES.
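
For reference, the factorization I mean is the standard bordering algorithm. Writing the system with $K = \begin{bmatrix} J & 0 \\ H & J \end{bmatrix}$ and border vectors $c = \begin{bmatrix} jl \\ hl \end{bmatrix}$, $r = \begin{bmatrix} 0 \\ q \end{bmatrix}$, a Schur complement of size 1 gives

$$
\begin{bmatrix} K & c \\ r^{T} & 0 \end{bmatrix}
\begin{bmatrix} x \\ \lambda \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}
\quad\Longrightarrow\quad
\lambda = \frac{r^{T} K^{-1} f - g}{r^{T} K^{-1} c}, \qquad
x = K^{-1} f - \lambda\, K^{-1} c,
$$

i.e., two solves with $K$, each of which only requires solves with $J$ since $K$ is block-lower-triangular.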

I’ve tried the following set of parameters:

set(A, sparams = "-pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point -ksp_view");

but the solver crashes before PETSc delivers any information about the matrix. I am curious if there is an obvious parameter set that I should be using for such a problem, or if there is a helpful command I can use to ease the debugging process. I would also be interested in any examples that use bordered matrices with SNES.

Cheers,
Chris

I’m not sure why this would not work. Would it be possible for you to share a minimal working reproducer?

Hi Pierre, thank you for the quick response!
Perhaps I am doing something else wrong then. 🙂
Here is an MWE that gives me the same error. I do not expect SNES to converge in this example, but the block matrix structure is the same.
foldcomputeSNES_MWE.edp (3.2 KB)

OK, so, there are multiple reasons why it's not working. First, when you initially feed A to SNESSolve(), it's not assembled, so PETSc cannot look for the zero terms on the diagonal needed by -pc_fieldsplit_detect_saddle_point. That's why you get:

[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: Not for unassembled matrix
[...]
[0]PETSC ERROR: #1 MatFindZeroDiagonals() at petsc/src/mat/interface/matrix.c:10350
[0]PETSC ERROR: #2 PCFieldSplitSetDefaults() at petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:477
[0]PETSC ERROR: #3 PCSetUp_FieldSplit() at petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:587

An easy way to bypass the issue is to simply assemble a dummy Jacobian with the proper structure before entering SNESSolve().
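
For instance, something like this (a sketch reusing the names from your original post; u, funcResidual, and funcJacobian stand for whatever your script actually defines):

// Sketch: evaluate the Jacobian once at the initial guess so that J, H,
// and the border vectors are assembled before SNESSolve() first sees A.
funcJacobian(u);             // fills J, H, jl, hl, and q
Mat A = [[J, 0, jl],
         [H, J, hl],
         [0, q', 0]];        // every block of the nest is now assembled
int ret = SNESSolve(A, funcJacobian, funcResidual, u);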
Second, that's not all, sadly. When updating the Jacobian in funcJacobian, you write A = [[J,0,Jl],[H,J,Hl],[0,qT',0]];, which effectively destroys the previous A and creates a new one, so that information such as the attached KSP is also lost. This won't work, because SNESSolve() expects the input Mat to remain at the same place in memory (you are of course free to update the values, but not to delete the Mat and reallocate it elsewhere – that's a limitation of the FreeFEM interface).
To fix this, you'll want to have a Mat for all the blocks, and then update each block instead of the global Mat. For J and H, that's easy: it's what you already do.
For the vectors, it's slightly more technical and will require some bookkeeping for now, sorry, but at least we will be able to check that my intuition is correct (and that the root of the problem is indeed what I described above). Here is one way to convert a column vector to a PETSc Mat, using the notation from your script.

{
      real[int] JlP, Jl = vJl(0, XMh, tgv = -10);
      ChangeNumbering(J, Jl, JlP); // FreeFEM to PETSc
      real[int, int] JlPmd(JlP.n, 1); // vector to dense array
      JlPmd(:, 0) = JlP; // assigning the one and only column of the dense array
      matrix JlPms = JlPmd; // dense array to sparse matrix
      Mat JlPM(JlPms.n, mpirank == 0 ? 1 : 0); // PETSc Mat with a single column, row-distribution is the same as J
      JlPM = JlPms; // assemble PETSc Mat with FreeFEM matrix
      ObjectView(JlPM, format = "info"); // making sure the dimensions are OK
}

Do you see where I’m going with this? Could you please try to expand on your MWE to try to use only Mat in the global nest?

Thanks for the quick response, Pierre. Here is a modified script that tries to implement what you've suggested. I am still getting the same error, but I'm not convinced that I've done everything correctly. The changes from the previous file are at lines 64-71 and inside funcJacobian. Does anything you see here jump out as incorrect?
foldcomputeSNES_MWE2.edp (5.2 KB)

Looks good. I don’t know why -pc_fieldsplit_detect_saddle_point is still generating incoherent index sets.
I'd advise you to use this instead:

real[int] list(A.n);
list = 2.0;            // field 2: the scalar border row/column
list(0:2*J.n-1) = 1.0; // field 1: the [[J, 0], [H, J]] part
set(A, sparams = "-pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point false -pc_fieldsplit_schur_precondition self -ksp_view -ksp_monitor", fields = list);

Also, when updating a Mat using a matrix, the notation PETSc = FreeFEM is convenient, but then we do not know that the Mat belongs to a nest. You need to switch to the lower-level API ChangeOperator(PETSc, FreeFEM, parent = A);. This ensures that the Jacobian gets properly assembled once all sub-blocks have been updated.
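
Inside funcJacobian, the pattern looks something like this (a sketch; Jms and Hms are hypothetical names for the reassembled FreeFEM matrices, JlPM/JlPms are from the snippet above):

// Sketch: update the nested blocks in place with ChangeOperator() instead
// of rebuilding A, so that the Mat (and its attached KSP) stays put.
func int funcJacobian(real[int]& in) {
    // ... recover the current state from `in` and reassemble the FreeFEM
    //     matrices Jms, Hms, JlPms, ... as before ...
    ChangeOperator(J, Jms, parent = A);      // square blocks
    ChangeOperator(H, Hms, parent = A);
    ChangeOperator(JlPM, JlPms, parent = A); // border column
    return 0;
}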

Additionally, you may need this fix to get the code running.

diff --git a/plugin/mpi/PETSc-code.hpp b/plugin/mpi/PETSc-code.hpp
index 398a46b0..4ca71eaa 100644
--- a/plugin/mpi/PETSc-code.hpp
+++ b/plugin/mpi/PETSc-code.hpp
@@ -775,3 +775,3 @@ namespace PETSc {
                   }
-                  if(i < std::min(N, M)) {
+                  if(assembled && i < std::min(N, M)) {
                       PC parent;

Great! Thank you so much for the help. That seems to solve the indexing issue. Here is an MWE that fails for a different reason. Any tips are always welcome, but I'm more confident that I'll be able to figure this one out on my own. 🙂

Much appreciated, Pierre!

foldcomputeSNES_MWE3.edp (5.4 KB)

I don't get a failure here (be sure to apply my above patch to the PETSc plugin and recompile PETSc.cpp), though the solver is probably not behaving as it should:

  0 SNES Function norm 1.000000000000e+00 
  0 KSP Residual norm 1.000000000000e+00 
  1 KSP Residual norm            nan 

At the end of the Jacobian update, if I add:

      Mat C;
      MatConvert(A, C); // convert the MatNest into a single monolithic Mat
      ObjectView(C);    // dump all entries to inspect the assembled system

I can see that the last row (row 19106) is empty:

[...]
row 19105: (9538, 0.)  (9539, 0.)  (9540, 0.)  (9541, 0.)  (9542, 0.)  (9543, 0.)  (9544, 0.)  (9545, 0.)  (9546, 0.)  (9547, 0.)  (9548, 0.)  (9549, 0.)  (9550, 0.)  (9551, 0.)  (9552, 0.)  (19091, 0.)  (19092, 0.)  (19093, 0.)  (19094, 0.)  (19095, 0.)  (19096, 0.)  (19097, 0.)  (19098, 0.)  (19099, 0.)  (19100, 0.)  (19101, 0.)  (19102, 0.)  (19103, 0.)  (19104, 0.)  (19105, 1.) 
row 19106:
  0 KSP Residual norm 1.000000000000e+00 
[...]

Probably not correct?

Thanks. Yes, I think this may partly be a problem with the example I picked for the MWE, since this system doesn't have a nearby saddle-node bifurcation.

However, using -ksp_view, I did notice that the blocks are being solved with GMRES even though I set(J, sparams = "-ksp_type preonly -pc_type lu");.

I think this is just a problem of my understanding of how to handle fieldsplit/matnest and not a problem specific to this case. I will have a look through the examples and see if I find any clues…

By the way, I think the problem in the last row was related to the fact that I didn’t initialize um[]. Here’s a version where the last row makes sense:
foldcomputeSNES_MWE4.edp (4.8 KB)

Setting options on J does not make sense. You are using a Schur fieldsplit, so there are only two blocks. J is not one of those blocks; it only sits on the diagonal of the (0,0) block. You need to prefix your options with -fieldsplit_0_ and set them directly in set(A, ...).
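
For example (a sketch; the inner solver choices are just an illustration, and fields = list is the array from earlier):

// exact LU on the whole (0,0) block; GMRES shown explicitly for the Schur complement
set(A, sparams = "-pc_type fieldsplit -pc_fieldsplit_type schur "
    + "-pc_fieldsplit_schur_precondition self "
    + "-fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu "
    + "-fieldsplit_1_ksp_type gmres -ksp_monitor", fields = list);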

I see. Is there a way to solve a matnest with an exact block LU factorization? In this case, the (0,0) block system [[J, 0],[H, J]] is block-lower-triangular, so it can be solved efficiently using only a factorization of J, without factoring the whole (0,0) block.
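
Concretely, with the right-hand side split as $(b_1, b_2)$, forward substitution gives

$$
\begin{bmatrix} J & 0 \\ H & J \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\quad\Longleftrightarrow\quad
x_1 = J^{-1} b_1, \qquad x_2 = J^{-1} \left( b_2 - H x_1 \right),
$$

so one factorization of J and two back-solves per application.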

Yes, it’s possible, but you’ll probably want to use a shell preconditioner (-fieldsplit_0_pc_type shell or something similar) where you implement this “optimization” yourself, because PETSc can’t do this for you.
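
Something along these lines for the apply callback (a rough sketch, assuming set() accepts a user-defined precon function as in the FreeFEM shell-preconditioner examples, e.g. set(A, sparams = "-pc_type shell", precon = applyL); the function name and the index splitting are illustrative only):

// Rough sketch of the forward substitution above, reusing a single
// factorization of J (set beforehand with -ksp_type preonly -pc_type lu).
func real[int] applyL(real[int]& in) {
    real[int] b1 = in(0:J.n-1), b2 = in(J.n:in.n-1);
    real[int] x1(b1.n), x2(b2.n), t(b2.n);
    KSPSolve(J, b1, x1); // x1 = J^{-1} b1
    MatMult(H, x1, t);   // t = H x1
    b2 -= t;
    KSPSolve(J, b2, x2); // x2 = J^{-1} (b2 - H x1)
    real[int] out(in.n);
    out(0:J.n-1) = x1;
    out(J.n:in.n-1) = x2;
    return out;
}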

Once you've got the SNESSolve() figured out, if you want to optimize this, get in touch with me in private; there will be some nontrivial stuff to add, but all this is definitely doable 🙂

Ok, I’ve taken a step in a different (hopefully simpler) direction to try to work this out. I’m trying to write a (crude) Moore-Penrose numerical continuation scheme that uses SNESSolve. The crux of this problem leads to a bordered matrix of the form:

A = [[J, Jl],[yq',yl]];

where again J is a sparse matrix, Jl and yq are vectors, and yl is a scalar (all are Mat objects).

Since this is only a 2x2 block system, I wanted to solve this directly using the Schur factorization. However, I am having trouble telling PETSc how to handle the scalar yl in the fieldsplit. Any suggestions?
continueSNES_MWE.edp (6.1 KB)

You really need to start using prefixes. The options for the KSP of A are being overwritten by those of J and vice versa; it's very difficult to debug. Here is a script that seems to be running.
continueSNES_MWE.edp (6.0 KB)

My apologies for the mess. That is a nice trick, thanks for showing me.

I think I am getting closer. I’ve done a few things to simplify the solution process here. The block matrix is now defined with a constant (2,2) block and the variations are rescaled into the other part of the last row.

With this implementation, there is some weird behavior on the first iteration, but the solver converges with 1 MPI process. It does not converge with 2 or more MPI processes.
continueSNES_MWE2.edp (5.4 KB)

I've done a little bit of cleaning in your code. You were setting the solve options for J twice, so I removed one set(), and I also changed the prefix for J from J_ to fieldsplit_0_. When FreeFEM converts a full matrix to a sparse matrix, it does not insert zeros. This causes problems down the road: if such an entry later becomes nonzero, PETSc complains about inserting a value at a new position. I've bypassed this using a small trick that you should have no problem figuring out. Now, the only thing I can't figure out is why you do stuff like:

  real nu0;
  if(mpirank == 0) nu0 = q(q.n-1);
  mpiReduce(nu0, nu, processor(0, mpiCommWorld), mpiSUM);

Don’t you mean to use a broadcast here? The result of this operation is undefined since nu0 is not initialized on processes other than #0.
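
I.e., something like:

real nu;
if(mpirank == 0) nu = q(q.n - 1);          // the value is only known on rank 0
broadcast(processor(0, mpiCommWorld), nu); // make it available everywhere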
continueSNES_MWE2.edp (5.6 KB)

Perfect. Yes, that fixed my issues. And yes, I did mean broadcast here. Thank you for your help!

Sorry to bring this back up, but I have found some “random” behavior in this code (I also found this in the code you sent). If I execute the exact same code several times, I get different behavior on certain runs, where some of the SNES iterations diverge. Do you have any ideas about what might be causing this?

EDIT: This issue does not arise with 1 MPI process. Only in parallel.

continueSNES_MWE3.edp (5.2 KB)