Error in a nonlinear simulation in parallel, using domain decomposition

I am solving a nonlinear unsteady hygro-thermal model with non-homogeneous boundary conditions using the “macro_ddm.idp” framework.
The PDEs are equations (22) and (23) of “Exploring Historical Perspectives in Building Hygrothermal Models: A Comprehensive Review”.

I have no problems when running a sequential simulation with “MUMPS_seq”; however, when I run a parallel simulation, I get an error message when constructing my matrix after a number of iterations.
Depending on the mesh and the number of processes, the error appears more or less quickly. I have tried both MUMPS and PETSc; the only difference is that I get an error message with MUMPS, whereas with PETSc the simulation just hangs.

The solution does not seem to have diverged, but it seems that at some point it is no longer possible to build the matrices. I get no error when running the MPI code on a single process, so I assume the problem comes from the domain decomposition.

I understand that nonlinear problems can be harder to solve with domain decomposition, but is there a way to make the simulation more robust, maybe with a specific preconditioner? I am not a specialist, so I don’t know whether this information is easy to find.

I just took my sequential code and adapted it following the tutorial and examples on Pierre Jolivet’s webpage.

Here is the error message with MUMPS:

Assertion failed in file src/mpi/comm/contextid.c at line 239: mask[idx] & (1U << bitpos)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -3

Could you share a minimal working example?

I will try to simplify the test case so that I can share the code.

I may have found a solution to my problem while trying to simplify it. I noticed that on some processes the solution was actually “diverging”, so instead of computing the difference between two iterations on the local domain, I now compute it in the global space, and it seems to be fine.

I don’t know if there is a way to avoid the reconstruction over the full domain. The attached file contains a simplified version of the code (the simulation gets stuck after time step no. 187).
Simplified_Test1.edp (13.7 KB)

Your “simplified” code is still more than 300 lines long, so I can’t give you the most accurate answer, but you definitely do not need to recompute the full solution.

Here is a further simplified version of the model.
I kept the same form of the nonlinear terms and boundary conditions.
With this simplified version, the code does not seem to diverge.
Simplified_Testv2.edp (4.9 KB)

I removed the reconstruction of the solution over the whole domain, since it may not be necessary, and simplified all the physics.
I only kept what I thought was essential, i.e. the nonlinear terms as well as the non-classical boundary condition, but in a much simpler form.
However, since the physics has been simplified so much, I cannot reproduce the bug that I had with the more “complex” example.
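
To give an idea of the structure without the real physics, the kind of problem I mean looks roughly like the sketch below. This is not an excerpt of the attached file: the coefficients, boundary labels, and names are made up, and it is written sequentially, just to show the shape of the nonlinear term and of the flux-type boundary condition.

// Rough sequential sketch (made-up coefficients, not Simplified_Testv2.edp):
// unsteady diffusion with a solution-dependent coefficient and a Robin/flux-type
// boundary condition, linearized with a fixed-point (Picard) iteration.
mesh Th = square(40, 40);
fespace Vh(Th, P1);
Vh u, v, uPrev;                      // current iterate, test function, previous iterate
Vh uOld = 0.5;                       // previous time step
real dt = 10., alpha = 1.e-2, uExt = 0.8, tol = 1.e-8;
macro k(w) (1.e-3*(1. + w*w)) //     solution-dependent diffusivity (made up)
problem heat(u, v)                   // linearized problem: k frozen at the previous iterate
    = int2d(Th)(u*v/dt + k(uPrev)*(dx(u)*dx(v) + dy(u)*dy(v)))
    + int1d(Th, 2)(alpha*u*v)        // flux-type (Robin) boundary term on label 2
    - int2d(Th)(uOld*v/dt)
    - int1d(Th, 2)(alpha*uExt*v);
for (int n = 0; n < 5; n++) {        // a few time steps
    uPrev = uOld;
    for (int it = 0; it < 20; it++) {// fixed-point loop on the nonlinear coefficient
        heat;                        // reassemble and solve with k(uPrev)
        Vh d;
        d[] = u[] - uPrev[];         // difference between two successive iterates
        uPrev = u;
        if (d[].linfty < tol) break;
    }
    uOld = u;
}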

Well, the computation of your stopping criterion is wrong: it should be a global value, obtained with, e.g., an mpiAllReduce().

So do I actually have to reconstruct the solution over the whole domain at each iteration to compute the norm, or can I just compute local norms and then take the maximum value over all subdomains as the stopping criterion, since I am computing an l-infinity norm?
In that case, is the correct syntax
mpiAllReduce(MaxLocal, MaxGlobal, mpiCommWorld, mpiMAX);
in order to get the maximum value over all subdomains in “MaxGlobal”, if MaxLocal holds the local l-infinity norm?

Do I have to compute the local norm over the whole subdomain (including the overlapping region)?
I am not sure I really understand what is in the DOFs within the overlapping region.

You do not need to reconstruct the global solution just to get its norm.
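
A minimal sketch of what I mean (dummy data and placeholder names, nothing taken from your script): each process computes the l-infinity norm of its local array, and a single mpiAllReduce with mpiMAX gives the global value, which is the same on every rank. For a max norm, the overlapping DOFs are harmless as long as the local vectors agree there, since they only duplicate values already present in a neighbouring subdomain.

// Run with, e.g.: ff-mpirun -np 4 stopping.edp
real[int] diffLocal(10);             // stands for the local difference between two iterates
diffLocal = 1. + mpirank;            // dummy values, just so each rank holds something different
real MaxLocal = diffLocal.linfty;    // local l-infinity norm on this subdomain
real MaxGlobal;
mpiAllReduce(MaxLocal, MaxGlobal, mpiCommWorld, mpiMAX);
if (mpirank == 0)
    cout << "global l-infinity value = " << MaxGlobal << endl;
// In the fixed-point loop, test MaxGlobal (not MaxLocal) against the tolerance, so that
// every process takes the same decision; if some ranks leave the loop earlier than the
// others, the next collective operation (assembly, solve) can hang or fail.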