Stall in some multi-node parallel calculations

No, I applied them yesterday and didn’t change anything after that. And stokes-2d-PETSc.edp does contain the line include "macro_ddm.idp". Could it be that, in the stokes-2d-PETSc.edp run, the for loop or the if statement in the true neighbor / false neighbor part is simply never reached?

Did you edit the .edp as I told you (createMat instead of buildMat)? There is a cout in both the if and the else branch: if nothing is printed, it means there are no intersections between subdomains. But the PETSc plugin output says there are intersections, so that’s not possible.

Yes, the modified .edp is the following:

//  run with MPI:  ff-mpirun -np 4 script.edp
// NBPROC 4

load "PETSc"                        // PETSc plugin
macro dimension()2// EOM            // 2D or 3D
include "macro_ddm.idp"             // additional DDM functions

macro def(i)[i, i#B, i#C]// EOM     // vector field definition
macro init(i)[i, i, i]// EOM        // vector field initialization
macro grad(u)[dx(u), dy(u)]// EOM   // two-dimensional gradient
real Sqrt = sqrt(2.);
macro epsilon(u)[dx(u), dy(u#B), (dy(u) + dx(u#B)) / Sqrt]// EOM
macro div(u)(dx(u) + dy(u#B))// EOM
func Pk = [P2, P2, P1];             // finite element space

mesh Th;
{
    mesh ThGlobal = square(getARGV("-global", 40), getARGV("-global", 40), [x, y]);    // global mesh
    ThGlobal = trunc(ThGlobal, (x < 0.5) || (y < 0.5), label = 5);
    Th = movemesh(ThGlobal, [-x, y]);
    Th = ThGlobal + Th;
}
Mat A;
//buildMat(Th, getARGV("-split", 1), A, Pk, mpiCommWorld) // previous call
createMat(Th, A, Pk);                                      // replacement, as suggested

fespace Wh(Th, Pk);                 // local finite element space
varf vPb([u, uB, p], [v, vB, q]) =
    int2d(Th)(grad(u)' * grad(v) + grad(uB)' * grad(vB) - div(u) * q - div(v) * p + 1e-10 * p * q)
    + on(1, 3, 5, u = 0, uB = 0)
    + on(2, u = y*(0.5-y), uB = 0);
real[int] rhs = vPb(0, Wh);

set(A, sparams = "-pc_type lu -pc_factor_mat_solver_type mumps");
Wh<real> def(u);

A = vPb(Wh, Wh);
u[] = A^-1 * rhs;

macro def2(u)[u, u#B]// EOM
macro def1(u)u// EOM
plotMPI(Th, def2(u), [P2, P2], def2, real, cmm = "Global velocity")
plotMPI(Th, uC, P1, def1, real, cmm = "Global pressure")

Let me retry interactively, just to make sure.
By the way, does it matter whether or not one puts a final semicolon after createMat(Th, A, Pk)?

No, it’s a macro, so that’s not a big deal. You can put a random cout in macro_ddm.idp near the mpiWaitAny that you changed; my gut tells me it will never be displayed, because you probably have another macro_ddm.idp file on your system that is being used instead.
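For instance, something as simple as the following, on its own line right next to the mpiWaitAny that you modified, would do (the exact message is arbitrary):

cout << "modified macro_ddm.idp reached by rank " << mpirank << endl;

If that message never shows up in the output, the file you edited is not the one being included.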

Ok, for some reason line 251 is for(...) mpiWaitAll(rq);, but it should be for(...) mpiWaitAny(rq);, is that correct?

Yes, that is correct.

Ok, my mistake, I misread this part:

So, coming back to where we were yesterday evening, here is the output for my code on 2x2 processes (runs to completion, but hangs on 2x8 processes):

-- FreeFem++ v4.9 (Fri Jun 18 14:45:02 CEST 2021 - git v4.9)
 Load: lg_fem lg_mesh lg_mesh3 eigenvalue parallelempi 
 load: init metis (v  5 )
 (already loaded: msh3) sizestack + 1024 =9128  ( 8104 )

0 true neighbor is 1 (1)
0 true neighbor is 2 (2)
0 false neighbor bis is 3 (2), eps = 0.502167, epsTab = 0
0: 
2	
	  1	  2	
2747	2047	
rank 0 sending/receiving 2747 to 1
1 true neighbor is 0 (1)
1 true neighbor is 2 (2)
1: 
2	
	  0	  2	
2747	2995	
rank 1 sending/receiving 2747 to 0
rank 0 sending/receiving 2047 to 2
3 false neighbor bis is 0 (0), eps = 0.445833, epsTab = 0
3 true neighbor is 2 (1)
3: 
1	
	  2	
2950	
rank 3 sending/receiving 2950 to 2
rank 1 sending/receiving 2995 to 2
rank 0 received from 1 (1) with tag 0 and count 2747
rank 1 received from 0 (0) with tag 0 and count 2747
rank 0 received from 2 (2) with tag 0 and count 2047
rank 1 received from 2 (2) with tag 0 and count 2995
Done
Done
2 true neighbor is 0 (1)
2 true neighbor is 1 (2)
2 true neighbor is 3 (3)
2: 
3	
	  0	  1	  3	
2047	2995	2950	
rank 2 sending/receiving 2047 to 0
rank 2 sending/receiving 2995 to 1
rank 2 sending/receiving 2950 to 3
rank 3 received from 2 (2) with tag 0 and count 2950
rank 2 received from 3 (3) with tag 0 and count 2950
rank 2 received from 0 (0) with tag 0 and count 2047
rank 2 received from 1 (1) with tag 0 and count 2995
Done
Done

And here is the output for stokes-2d-PETSc.edp (modified with createMat instead of buildMat) on 2x2 processes (hangs):

-- FreeFem++ v4.9 (Fri Jun 18 14:45:02 CEST 2021 - git v4.9)
 Load: lg_fem lg_mesh lg_mesh3 eigenvalue parallelempi 
 load: init metis (v  5 )
 sizestack + 1024 =10800  ( 9776 )

  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
rank 2 sending/receiving 426 to 3
rank 1 sending/receiving 480 to 0
rank 1 sending/receiving 964 to 3
 --- global mesh of 4800 elements (prior to refinement) partitioned with metis  --metisA: 4-way Edge-Cut:       3, Balance:  1.01 Nodal=0/Dual 1
 (in 4.355907e-03)
 --- partition of unity built (in 2.466662e-01)
rank 0 sending/receiving 480 to 1
rank 1 received from 0 (0) with tag 0 and count 480
rank 0 received from 1 (1) with tag 0 and count 480
rank 1 received from 3 (3) with tag 0 and count 964
rank 3 sending/receiving 964 to 1
rank 3 sending/receiving 426 to 2
rank 2 received from 3 (3) with tag 0 and count 426
rank 3 received from 2 (2) with tag 0 and count 426
rank 3 received from 1 (1) with tag 0 and count 964
 --- global numbering created (in 5.788803e-03)
 --- global CSR created (in 7.145405e-04)
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.

Putting a few cout statements in macro_ddm.idp suggests that my code picks up the modified macro_ddm.idp, but that stokes-2d-PETSc.edp does not… What can we do?

As I told you already, you have another macro_ddm.idp somewhere on your machine that you need to locate and modify as well. Probably in ../../idp/macro_ddm.idp (relative to stokes-2d-PETSc.edp).

Output of find ./ -name "macro_ddm.idp":

./lib/ff++/4.9/idp/macro_ddm.idp
./FreeFem-sources/idp/macro_ddm.idp

After successively applying the two patches in FreeFem-sources/idp/, and changing line 251 of macro_ddm.idp to for(int debugI = 0; debugI < 2*intersection[0].n; ++debugI) mpiWaitAny(rq);, the output for stokes-2d-PETSc.edp is now:

-- FreeFem++ v4.9 (Fri Jun 18 14:45:02 CEST 2021 - git v4.9)
 Load: lg_fem lg_mesh lg_mesh3 eigenvalue parallelempi 
 load: init metis (v  5 )
 The Identifier intersection does not exist 

 Error line number 207, in file ../../idp/macro_ddm.idp, before  token intersection

  current line = 207 mpirank 0 / 4
Compile error : 
	line number :207, intersection
  current line = 207 mpirank 2 / 4
error Compile error : 
	line number :207, intersection
 code = 1 mpirank: 0
  current line = 207 mpirank 1 / 4
  current line = 207 mpirank 3 / 4

You’ve messed up the file; it runs fine on my machine.

Thank you for this constructive comment. So what can I do? Should the file in FreeFem-sources/idp/ be the same as in lib/ff++/4.9/idp/?

Note that diff lib/ff++/4.9/idp/macro_ddm.idp FreeFem-sources/idp/macro_ddm.idp returns:

1d0
< cout << "toto" << endl;
252d250
<         cout << "test 1" << endl;
254d251
<         cout << "test 2" << endl;
256d252
<             cout << "test 3" << endl;
260d255
<                 cout << "test 4" << endl;
268c263
<             } else cout << mpirank << " false neighbor bis is " << intersection[0][i] << " (" << numberIntersection << "), eps = " << eps << ", epsTab = " << epsTab[i] << "\n";
---
>             } else cout << mpirank << " false neighbor bis is " << intersection[0][i] << " (" << numberIntersection << ")\n";

So while it is true that I forgot to modify one cout statement, I don’t think I messed up the file.

So what can I do?

You can remove the -ns flag (so that the source lines are echoed as they are parsed and you can see what the parser is actually reading) and figure out why the variable intersection is not defined.

Now, although I have not modified macro_ddm.idp further, I can’t reproduce the error about intersection… The output is the following with the -ns flag:

-- FreeFem++ v4.9 (Fri Jun 18 14:45:02 CEST 2021 - git v4.9)
 Load: lg_fem lg_mesh lg_mesh3 eigenvalue parallelempi 
 load: init metis (v  5 )
 sizestack + 1024 =10824  ( 9800 )

  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
2 true neighbor is 1 (1)
2 true neighbor is 3 (2)
2: 
2	
	  1	  3	
190	426	
rank 2 sending/receiving 190 to 1
 --- global mesh of 4800 elements (prior to refinement) partitioned with metis  --metisA: 4-way Edge-Cut:       3, Balance:  1.01 Nodal=0/Dual 1
 (in 4.319191e-03)
0 true neighbor is 1 (1)
 --- partition of unity built (in 2.465439e-01)
0: 
1	
	  1	
480	
rank 0 sending/receiving 480 to 1
1 true neighbor is 0 (1)
1 false neighbor bis is 2 (1)
1 true neighbor is 3 (2)
1: 
2	
	  0	  3	
480	964	
rank 1 sending/receiving 480 to 0
rank 1 sending/receiving 964 to 3
3 true neighbor is 1 (1)
3 true neighbor is 2 (2)
3: 
2	
	  1	  2	
964	426	
rank 3 sending/receiving 964 to 1
rank 3 sending/receiving 426 to 2
rank 0 received from 1 (1) with tag 0 and count 480
rank 2 sending/receiving 426 to 3
rank 1 received from 0 (0) with tag 0 and count 480
rank 2 received from 3 (3) with tag 0 and count 426
rank 1 received from 3 (3) with tag 0 and count 964
rank 3 received from 2 (2) with tag 0 and count 426
rank 3 received from 1 (1) with tag 0 and count 964

and without the -ns flag:

(...)
   41 :  sizestack + 1024 =10824  ( 9800 )

  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
 --- global mesh of 4800 elements (prior to refinement) partitioned with metis  --metisA: 4-way Edge-Cut:       3, Balance:  1.01 Nodal=0/Dual 1
 (in 4.315138e-03)
0 true neighbor is 1 (1)
 --- partition of unity built (in 2.509263e-01)
0: 
1	
	  1	
480	
rank 0 sending/receiving 480 to 1
1 true neighbor is 0 (1)
1 false neighbor bis is 2 (1)
1 true neighbor is 3 (2)
1: 
2	
	  0	  3	
480	964	
rank 1 sending/receiving 480 to 0
rank 1 sending/receiving 964 to 3
rank 0 received from 1 (1) with tag 0 and count 480
2 true neighbor is 1 (1)
2 true neighbor is 3 (2)
2: 
2	
	  1	  3	
190	426	
rank 2 sending/receiving 190 to 1
rank 2 sending/receiving 426 to 3
rank 1 received from 0 (0) with tag 0 and count 480
rank 1 received from 3 (3) with tag 0 and count 964
3 true neighbor is 1 (1)
3 true neighbor is 2 (2)
3: 
2	
	  1	  2	
964	426	
rank 3 sending/receiving 964 to 1
rank 3 sending/receiving 426 to 2
rank 2 received from 3 (3) with tag 0 and count 426
rank 3 received from 2 (2) with tag 0 and count 426
rank 3 received from 1 (1) with tag 0 and count 964

Let me get this straight. In this post, FreeFEM returned a syntax error, The Identifier intersection does not exist. Then, in that post, without changing anything, the error magically disappeared? That’s not possible. The FreeFEM parser is 100% reproducible: nothing random involved, nothing in parallel. By the way, I can reproduce the same error you had, i.e.:

The Identifier intersection does not exist

 Error line number 207, in file ../../idp/macro_ddm.idp, before  token intersection

  current line = 207 mpirank 0 / 4
Compile error :
	line number :207, intersection

by messing up macro_ddm.idp in the following way:

diff --git a/idp/macro_ddm.idp b/idp/macro_ddm.idp
index a29dacf0..2a930fb3 100644
--- a/idp/macro_ddm.idp
+++ b/idp/macro_ddm.idp
@@ -250,3 +250,4 @@ ENDIFMACRO
         n2oNeighbor.resize(0);
-        mpiWaitAll(rq);
+        // mpiWaitAll(rq);
+        for(int debugI = 0; debugI < 2*intersection[0].n; ++debugI) mpiWaitAny(rq);
         for(int i = 0; i < intersection[0].n; ++i) {

Notice the commented-out line: it is not OK to put // comments inside a macro definition.
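A toy sketch of the mechanism (not taken from macro_ddm.idp): a FreeFEM macro definition ends at the first // comment it encounters, the trailing // EOM being only a convention, so everything after a commented-out line is no longer part of the macro and gets parsed as ordinary code, where the macro arguments are unknown identifiers.

macro hello(msg)
cout << "first " << msg << endl;
// cout << "debug " << msg << endl;
cout << "second " << msg << endl;
// EOM

Here the macro body stops at the // cout line, the second cout is parsed outside the macro, and the parser complains that the identifier msg does not exist, which is the same kind of error as the one above about intersection. The right way to disable a line inside a macro is to delete it, which is what the clean patch does.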

Ok, so perhaps instead of modifying a line I had commented it out, and then I guess I deleted it before doing diff ....

Now, what about the output of stokes-2d-PETSc.edp?

Thank you for this informative comment.

Are you using mpiWaitAll or a loop of mpiWaitAny in macro_ddm.idp? If you are using a loop of mpiWaitAny and it is still not working, try also the following.

diff --git a/idp/macro_ddm.idp b/idp/macro_ddm.idp
index a29dacf0..dd01df6b 100644
--- a/idp/macro_ddm.idp
+++ b/idp/macro_ddm.idp
@@ -250,3 +250,4 @@ ENDIFMACRO
         n2oNeighbor.resize(0);
-        mpiWaitAll(rq);
+        for(int debugI = 0; debugI < 2*intersection[0].n; ++debugI) mpiWaitAny(rq);
+        for(int debugI = 0; debugI < intersection[0].n; ++debugI) epsTab[debugI] = 1.0;
         for(int i = 0; i < intersection[0].n; ++i) {

:-)

I was using a loop of mpiWaitAny. I modified the file, and now the output is:

-- FreeFem++ v4.9 (Fri Jun 18 14:45:02 CEST 2021 - git v4.9)
 Load: lg_fem lg_mesh lg_mesh3 eigenvalue parallelempi 
 load: init metis (v  5 )
 sizestack + 1024 =10832  ( 9808 )

Entering FreeFem-sources/idp/macro_ddm.idp
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
Entering FreeFem-sources/idp/macro_ddm.idp
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
Entering FreeFem-sources/idp/macro_ddm.idp
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
Entering FreeFem-sources/idp/macro_ddm.idp
  -- Square mesh : nb vertices  =1681 ,  nb triangles = 3200 ,  nb boundary edges 160
 --- global mesh of 4800 elements (prior to refinement) partitioned with metis  --metisA: 4-way Edge-Cut:       3, Balance:  1.01 Nodal=0/Dual 1
 (in 4.330397e-03)
0 true neighbor is 1 (1)
 --- partition of unity built (in 2.576466e-01)
0: 
1	
	  1	
480	
rank 0 sending/receiving 480 to 1
1 true neighbor is 0 (1)
1 true neighbor is 2 (2)
1 true neighbor is 3 (3)
1: 
3	
	  0	  2	  3	
480	190	964	
rank 1 sending/receiving 480 to 0
rank 1 sending/receiving 190 to 2
2 true neighbor is 1 (1)
2 true neighbor is 3 (2)
2: 
2	
	  1	  3	
190	426	
rank 2 sending/receiving 190 to 1
rank 0 received from 1 (1) with tag 0 and count 480
3 true neighbor is 1 (1)
3 true neighbor is 2 (2)
3: 
2	
	  1	  2	
964	426	
rank 3 sending/receiving 964 to 1
rank 3 sending/receiving 426 to 2
rank 1 sending/receiving 964 to 3
rank 2 sending/receiving 426 to 3
rank 1 received from 0 (0) with tag 0 and count 480
rank 1 received from 2 (2) with tag 0 and count 190
rank 1 received from 3 (3) with tag 0 and count 964
rank 2 received from 3 (3) with tag 0 and count 426
 --- global numbering created (in 4.596710e-04)
rank 2 received from 1 (1) with tag 0 and count 190
 --- global CSR created (in 6.911755e-04)
rank 3 received from 2 (2) with tag 0 and count 426
rank 3 received from 1 (1) with tag 0 and count 964
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 Warning: -- Your set of boundary condition is incompatible with the mesh label.
 --- system solved with PETSc (in 4.188831e-01)
times: compile 6.200000e-01s, execution 8.000000e-01s,  mpirank:0
 ######## We forget of deleting   0 Nb pointer,   0Bytes  ,  mpirank 0, memory leak =2456640
 CodeAlloc : nb ptr  7304,  size :654304 mpirank: 0
Ok: Normal End
times: compile 0.61s, execution 0.8s,  mpirank:1
 ######## We forget of deleting   0 Nb pointer,   0Bytes  ,  mpirank 1, memory leak =2457456
times: compile 0.6s, execution 0.79s,  mpirank:2
 CodeAlloc : nb ptr  7304,  size :654304 mpirank: 1
 ######## We forget of deleting   0 Nb pointer,   0Bytes  ,  mpirank 2, memory leak =2458448
 CodeAlloc : nb ptr  7304,  size :654304 mpirank: 2
times: compile 0.61s, execution 0.8s,  mpirank:3
 ######## We forget of deleting   0 Nb pointer,   0Bytes  ,  mpirank 3, memory leak =2457952
 CodeAlloc : nb ptr  7304,  size :654304 mpirank: 3