Changing matrix components in parallel with PETSc

Dear FreeFEM developers:

I was converting the code to a parallel version using PETSc, but I found that the number of matrix components changed with the number of processors I used. The test code and results are shown below:

Test 1: ff-mpirun -np 1 test.edp -v 0. Test results:

rows=1681, cols=237
total: nonzeros=322, allocated nonzeros=322

Test 2: ff-mpirun -np 2 test.edp -v 0. Test results:

rows=1681, cols=237
total: nonzeros=50, allocated nonzeros=50

Test 3: ff-mpirun -np 4 test.edp -v 0. Test results:

rows=1681, cols=237
total: nonzeros=81, allocated nonzeros=81

From the above test results, we observe that the number of non-zero components of matrix B depends on the number of processors used, while matrices A and C stay fixed. The version of FreeFEM I used is 4.12.

I would appreciate it if anyone can help me or give me a hint. Thanks in advance.

The following is the test code:

load "PETSc"
macro dimension() 2              //
include "macro_ddm.idp"

mesh Th = square(20, 20);
mesh Thb = emptymesh(Th); 

buildDmesh(Th); 
buildDmesh(Thb);

fespace Vh(Th, P2); 
fespace Vhb(Thb, P2);

varf TemTran(u, uu) = int2d(Th) ( dx(u)*dx(uu)+dy(u)*dy(uu) );

varf matLag(vv, uu) = int1d(Thb, 1, 3)(vv*uu);

Mat A; createMat(Th, A, P2);
Mat C; createMat(Thb, C, P2);
C = matLag(Vhb, Vhb);

matrix  vA = TemTran(Vh, Vh); A = vA;  
matrix  vB = matLag(Vhb, Vh); 
Mat B(A, C, vB);

ObjectView(B, format = "info");
cout << endl << endl;

This does not work like that. You want to call buildDmesh only once, and then do everything on the local mesh, e.g., emptymesh on the distributed mesh.

Hi, Dr. Pierre Jolivet,
Thanks for your prompt and kind response. Following your suggestion, I have made changes to the code, but I found that the number of non-zero components of matrix B still changes with the number of processors.

Test 1: ff-mpirun -np 1 test.edp -v 0. Test results:

rows=1681, cols=237
total: nonzeros=322, allocated nonzeros=322

Test 2: ff-mpirun -np 2 test.edp -v 0. Test results:

rows=1681, cols=206
total: nonzeros=242, allocated nonzeros=247

I also found that the test code cannot run when more than two processors are used.

The following is the revised code:

load "PETSc"
macro dimension()    2              //
include "macro_ddm.idp"

mesh Th = square(20, 20);

buildDmesh(Th); 
mesh Thb = emptymesh(Th); 

fespace Vh(Th, P2); 
fespace Vhb(Thb, P2);

varf TemTran(u, uu) = int2d(Th) ( dx(u)*dx(uu)+dy(u)*dy(uu) );

varf matLag(vv, uu) = int1d(Thb, 1, 3)(vv*uu);

Mat A; createMat(Th, A, P2);
Mat C; createMat(Thb, C, P2);
C = matLag(Vhb, Vhb);

matrix  vA = TemTran(Vh, Vh);   A = vA;  
matrix  vB = matLag(Vhb, Vh); 
Mat B(A, C, vB);

ObjectView(B, format = "info");

If you want to do volume/surface coupling, you should not use emptymesh, but extract instead. See this example FreeFem-sources/helmholtz-coupled-2d-PETSc-complex.edp at master · FreeFem/FreeFem-sources · GitHub, and in particular the restriction from volume to surface matrix, RVtoS and the ASurf distributed Mat.
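A rough, unverified sketch of that pattern is shown below (the names ThSurf, RVtoS, and ASurf follow the referenced example, but the square mesh, the labels 1 and 3, and the P1 spaces are assumptions, not the example itself):

load "PETSc"
load "msh3"
macro dimension() 2 //
include "macro_ddm.idp"

mesh Th = square(20, 20);
buildDmesh(Th);                          // decompose the volume mesh once

int[int] labs = [1, 3];
meshL ThSurf = extract(Th, label=labs);  // surface mesh extracted from the local (distributed) volume mesh

fespace Vh(Th, P1);
fespace VhSurf(ThSurf, P1);

Mat A; createMat(Th, A, P1);             // distributed volume Mat

matrix RVtoS = interpolate(VhSurf, Vh);  // volume-to-surface restriction matrix
Mat ASurf(A, restriction=RVtoS);         // distributed Mat on the surface unknowns

ObjectView(ASurf, format = "info");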


Hi, Dr. Pierre Jolivet,
Thanks for the example you shared with me. In fact, I want to solve a problem with a Dirichlet boundary condition using the Lagrange multiplier method.

For example, consider a heat transfer problem in a square domain, with temperatures T = 200 and T = 0 prescribed at the top and bottom boundaries. I want to use the Lagrange multiplier method to impose these constant temperature constraints instead of on(). The sequential code works well using emptymesh, but the parallel one fails because of the problem above.

Weak form:

\int_\Omega \nabla T \cdot \nabla w \; d\Omega + \int_\Gamma w\,\mathbf{n} \cdot \nabla T \; d\Gamma + \int_\Gamma \lambda w \; d\Gamma = 0,
\int_\Gamma \hat{\lambda}\,(T - T_c)\; d\Gamma = 0,

where \lambda is the Lagrange multiplier defined on the boundary \Gamma, \hat{\lambda} is the corresponding test function, and T_c denotes the prescribed constant temperature.

The attached code gives correct results when only one processor is used:
TestCode.edp (878 Bytes)

Do not use emptymesh and the code will work with one or multiple processes.
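For instance, a minimal sequential sketch of the formulation described above, with extract in place of emptymesh, could look like the following. Everything in it is an assumption rather than the attached code: the prescribed values of 200 on label 3 (top) and 0 on label 1 (bottom), the P2 temperature / P1 multiplier pairing, and the direct block solve.

load "msh3"

mesh Th = square(20, 20);
int[int] labs = [1, 3];
meshL ThL = extract(Th, label=labs);        // multiplier mesh: exactly labels 1 and 3

fespace Vh(Th, P2);                         // temperature
fespace VhL(ThL, P1);                       // Lagrange multiplier

func Tc = (y > 0.5 ? 200. : 0.);            // prescribed temperature: 200 on top, 0 on bottom

varf a(u, w)         = int2d(Th)(dx(u)*dx(w) + dy(u)*dy(w)); // int_Omega grad T . grad w
varf b(lam, w)       = int1d(ThL)(lam*w);                     // int_Gamma lambda w
varf rhs(unused, mu) = int1d(ThL)(Tc*mu);                     // int_Gamma mu T_c

matrix A = a(Vh, Vh);
matrix B = b(VhL, Vh);                      // rectangular coupling: rows in Vh, columns in VhL
real[int] g = rhs(0, VhL);

matrix K = [[A, B], [B', 0]];               // saddle-point matrix [A B; B^T 0]
real[int] zero(Vh.ndof);
zero = 0;
real[int] F = [zero, g];                    // block right-hand side [0; g]

set(K, solver=sparsesolver);
real[int] sol = K^-1 * F;

Vh T;
T[] = sol(0:Vh.ndof - 1);
plot(T, value = 1, wait = 1);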


Great thanks, Dr. Pierre Jolivet. I will try to use extract instead.

The example is a little bit complex, especially compared to what you want to do. If you run into trouble, don’t hesitate to upload your new code and I’ll help you fix it.

Hi, Dr. Pierre Jolivet,
Following your suggestion, I first tried extract in the sequential code, and it performs better than emptymesh since it extracts exactly boundaries 1 and 3. Then I tried to use extract in the parallel code but ran into some problems. Just as you said, the example code is not easy for me.

Following the example code, I first decompose the extracted boundary Thb:

mesh Th = square(20, 20);
fespace Ph(Th, P0);
int[int] labs = [1, 3];
meshL Thb = extract(Th, label=labs);
buildDmesh(Th); 
fespace PhL(Thb, P0); PhL Partion;
metisdual(Partion[], Thb, mpisize);
meshL ThL;
for(int i=0; i<mpisize; i++) ThL = trunc(Thb, abs(Partion-i) < 1e-3); 
fespace Vh(Th, P2); 
fespace VhL(ThL, P1);
cout << VhL.ndof << endl;

Then I checked the results: ff-mpirun -np 2 TestCode.edp -v 0

11
11

It works well. Then I continued by testing Mat B(A, restriction=rB).

varf TemTran(u, uu) = int2d(Th)( dx(u)*dx(uu)+dy(u)*dy(uu) );

varf matLag(vv, uu) = int1d(ThL)(vv*uu);

Mat A; createMat(Th, A, P2);
A = TemTran(Vh, Vh); 

matrix rB = interpolate(VhL, Vh);
matrix vB = matLag(VhL, Vh);

Mat B(A, restriction=rB);
B = vB;
ObjectView(B, format = "info");

However, I found that Mat B is a square matrix; it seems that B = vB did not work.

Mat Object: 2 MPI processes
type: mpiaij
rows=41, cols=41
total: nonzeros=0, allocated nonzeros=0

Another problem I found is that if I replace P1 with P2 in fespace VhL(ThL, P1), the code stops at Mat B(A, restriction=rB) and gives the following error message:

Assertion fail : (mList->n == mList->nnz)
line :3576, in file PETSc-code.hpp
err code 6 , mpirank 1

I have some trouble understanding Mat B(A, restriction=rB). Does it mean creating a Mat B with the same number of rows as A? What does ‘restriction’ mean here? How can I transfer the matrix vB to Mat B?

I would greatly appreciate it if you could kindly answer these questions. Thanks!
testCode.edp (742 Bytes)

Best,

I don’t understand your code.

  1. Why are you calling metisdual()?
  2. Why are you assembling a square Mat (B) with a rectangular matrix (vB)?

Does it mean creating a Mat B having the same number of rows as A?

No, it means creating a restricted Mat, i.e., a distributed fespace with fewer degrees of freedom.
So if you want to mix P2 and P1, you’ll need to make sure you are using the appropriate initial Mat.

How can I transfer the matrix vB to Mat B?

You should not do this: vB is rectangular and B is square, so that does not make sense.

I’m attaching a working testCode.edp (815 Bytes).
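The attachment itself is not reproduced in the thread; as an unverified sketch of the structure the answers above point to (it reuses only names and constructs that already appear earlier in the thread, such as rB, vB, TemTran, and matLag, and is a guess, not the attached testCode.edp), one could write:

load "PETSc"
load "msh3"
macro dimension() 2 //
include "macro_ddm.idp"

mesh Th = square(20, 20);
buildDmesh(Th);                       // no metisdual: buildDmesh already partitions the mesh

int[int] labs = [1, 3];
meshL ThL = extract(Th, label=labs);  // boundary of the local (distributed) mesh

fespace Vh(Th, P2);                   // temperature
fespace VhL(ThL, P1);                 // Lagrange multiplier

varf TemTran(u, uu) = int2d(Th)(dx(u)*dx(uu) + dy(u)*dy(uu));
varf matLag(vv, uu) = int1d(ThL)(vv*uu);

Mat A; createMat(Th, A, P2);          // distributed P2 volume Mat
A = TemTran(Vh, Vh);

matrix rB = interpolate(VhL, Vh);     // restriction from the P2 volume space to the P1 boundary space
Mat C(A, restriction=rB);             // distributed Mat for the multiplier unknowns
                                      // (for another pairing, e.g. P2 multipliers, the restriction
                                      // would be taken from a matching initial Mat, as noted above)

matrix vB = matLag(VhL, Vh);          // rectangular coupling (rows: Vh, columns: VhL)
Mat B(A, C, vB);                      // rectangular distributed Mat; no B = vB assignment needed

ObjectView(B, format = "info");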


Hi, Dr. Pierre Jolivet,
After studying the code you revised, I now understand the mistakes I made. I really learned a lot. Many thanks.

Best.