Memory allocation problem in MPI computation using PETSc

What do you mean by the command part[] = A.D? What should I insert instead of A.D? This is not clear to me, I’m sorry.

If foobar is the name of your PETSc Mat, then insert foobar.D, see, e.g., FreeFem-sources/navier-stokes-2d-PETSc.edp at master · FreeFem/FreeFem-sources · GitHub.
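For instance, a minimal sketch (foobar, Vh, and ThNo are illustrative names; the macro and buildMat setup is the usual one from macro_ddm.idp):

Mat foobar;
buildMat(Th, getARGV("-split", 1), foobar, Pk, mpiCommWorld);

Vh part;
part[] = foobar.D;                         // diagonal of the partition of unity
ThNo = trunc(Th, abs(part - 1.0) < 1e-1);  // keep the elements where the partition of unity equals 1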

Oh sorry, that’s the point… I was assuming that A.D was something like an acronym, while A is nothing but the name of the Mat!

I’m trying to implement this strategy and I’ll let you know if I need some more help (which is actually very likely :slight_smile: )

I was trying to understand how the solution you proposed works, so I put together this very simple script:

// testing PRJ's solution in a very simple case

load "PETSc"
load "msh3"
macro dimension()3//
include "macro_ddm.idp"

macro def(i)i// EOM
macro init(i)i// EOM
func Pk = P2;
int nn = 70;

mesh3 ThNo, Th = cube(nn,nn,nn);
real Volume = int3d(Th)(1.0);
if (mpirank == 0)
{
  cout << "Volume = " << Volume << endl;
}
fespace Vh(Th,Pk);

Mat AA;

buildMat(Th, getARGV("-split",1), AA, Pk, mpiCommWorld);

Vh part = AA.D;                            // diagonal of the partition of unity
ThNo = trunc(Th, abs(part - 1.0) < 1e-1);  // keep the elements where the partition of unity equals 1

func real integral(real[int]& X) {
  Vh uu;
  // map X from the PETSc numbering back to the FreeFEM numbering, exchanging ghost values
  ChangeNumbering(AA, uu[], X, inverse = true, exchange = true);
  real glob, loc = int3d(ThNo)(uu);              // local contribution on the truncated mesh
  mpiAllReduce(loc, glob, mpiCommWorld, mpiSUM); // sum the contributions of all processes
  return glob;
}

real tin = clock();

Vh valore = 2.0;
real[int] vPETSc;
ChangeNumbering(AA, valore[], vPETSc); // FreeFEM numbering -> PETSc numbering

real volx2 = integral(vPETSc);

if (mpirank == 0)
{
  cout << "volx2 = " << volx2 << endl;
}

real duration = clock() - tin;

if (mpirank == 0)
{
  cout << "Duration = " << duration << " [s]" << endl;
}

This script should compute, in parallel, the volume of the unit cube multiplied by a given factor (2 in this case). I have noticed that with one core the result is evaluated properly, while increasing the number of cores the result is underestimated. I am pretty sure that something is missing in my script, but I am not able to understand where (see the attached table for a description of the error).

My bad, I thought this would work, but the multiplicity scaling is wrong with such a fespace. Here is how to bypass the issue. Replace:

buildMat(Th, getARGV("-split",1), AA, Pk, mpiCommWorld);

Vh part = AA.D;
ThNo = trunc(Th, abs(part - 1.0) < 1e-1);

By:

fespace Ph(Th, P0);
Ph part;
partitionerSeq(part[], Th, mpisize);               // compute an element partitioning into mpisize subdomains
partitionerPar(part[], Th, mpiCommWorld, mpisize); // make the partitioning available on all processes

buildMatWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld);
part = part;                                       // re-interpolate the P0 field on the new (local) mesh Th

ThNo = trunc(Th, abs(part - mpirank) < 1e-1);      // keep the elements assigned to this process
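
As an optional sanity check (a sketch reusing the names above), each element should now be counted exactly once:

real locVol = int3d(ThNo)(1.0), globVol;
mpiAllReduce(locVol, globVol, mpiCommWorld, mpiSUM);
if (mpirank == 0)
{
  cout << "sum of the local volumes = " << globVol << endl; // should match the global volume
}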

Is this an acceptable solution for you?

Great! This is working very well in this simple test case! I’m now trying to translate it to the real application!

Sounds good, you know where to find me if this does not work or you need something else :slight_smile:

Here I am again! I’m trying to get step by step closer to the final result, but I have encountered a strange issue which I am not able to understand… everything seems to work smoothly, but at a certain point the execution stops (ExitCode 59), showing the error reported at the end of the attached .log file (…possibly illegal memory access).
The script I am using is attached too.

Do you have any guess about the reason for this crash?
NewImplTEST.log (751.0 KB)
NewImplementation_onlyPower.edp (6.9 KB)

You need to initialise real[int] bbb(vh.ndof); after the refinement, or alternatively, call bb.resize(vh.ndof); after buildMatWithPartitioning.
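
In other words, a sketch (assuming bbb and bb above refer to the same right-hand-side array of the attached script, and vh to its fespace):

// option 1: declare the array only after the refinement, so that vh.ndof is already the local size
real[int] bbb(vh.ndof);

// option 2: keep the earlier declaration and resize it after buildMatWithPartitioning
bbb.resize(vh.ndof);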

Thanks again for making me notice this!
In the very simple case I used as a test, I was able to get examples with very heavy meshes to work properly!
However, I’m still facing some issues in the real application: the problem arises once I try to assemble the matrix (AA = vAmp(vh,vh)). PETSc shows an error stating that “Argument is out of range”, as in the attached screenshot.

Do you know what could be the reason for this issue? Should I define the matrix based on other fespaces (maybe defined on ThNo)?

For completeness, the updated version of the “final script” is attached.

NewImplementation_onlyPower.edp (8.0 KB)

Thanks again for the support!

I’m here again to add some new information. Similar, but not identical, errors appear (Segmentation Violation, probably memory access out of range) whenever I try to solve a simple heat diffusion problem on the usual cube, as happens when running the attached script (note that the mesh is light, so the computation can easily be handled by a simple laptop!).

Any suggestion about the reason for the crash, @prj?

Sorry again for disturbing you a lot, and thank you very much for the support!

ProvaSoluzionPRJ_heatconduction.edp (3.1 KB)
cubotestL.mesh (2.9 MB)

EDIT: This is running on another machine, but the computed average temperature is 4000 [K], which is impossible given the side temperatures of 500 [K] and 300 [K].

For the first script, if you use edge elements, you need to define a special macro PkPart, cf. FreeFem-sources/maxwell-3d.edp at master · FreeFem/FreeFem-sources · GitHub. Then, you’ll have to call buildMatEdgeRecursiveWithPartitioning. For the second script, if you use a FreeFEM matrix and run the code sequentially, does it give the correct results?
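
For edge elements, the extra definitions look roughly like this (a sketch along the lines of the maxwell-3d.edp example; the component suffixes and names are illustrative):

macro def(i)[i, i#B, i#C]// EOM     vector field definition
macro init(i)[i, i, i]// EOM        vector field initialization
macro defPart(u)u// EOM             partition of unity definition
macro initPart(u)u// EOM            partition of unity initialization
macro ThPkPart()Edge03ds0// EOM     special FE space used for the domain partitioning
func Pk = Edge03d;                  // edge elements for the unknown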

Thanks again @prj for the suggestion, just to be sure:

  • I should add the macro definition macro ThPkPart()Edge03ds0// EOM at the beginning of the script (i.e. just after the def and init macros), right?
  • I should substitute buildMatWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld); with buildMatEdgeRecursiveWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld); so basically the same input arguments, is that right? Or are there different arguments to be provided to the new command?

(EDIT: I suppose I need to load/include some external “packages” / .idp files, since running with only these modifications the job fails, stating that the identifier buildMatEdgeRecursiveWithPartitioning does not exist.)

(EDIT2: Looking inside macro_ddm.idp, I noticed that there are two different macros, macro buildMatEdgeRecursive and macro buildMatEdgeWithPartitioning. I tried the second one and no errors appear when assembling the matrix, but I’m still waiting to see whether it is actually computing or not. The new version of the buildMat call is the following:)

buildMatEdgeWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld, Edge03ds0, defPart, initPart);

As for the second script, my fault… the problem is solved properly! However, I’m still not able to store the global result in a .vtu file. I was trying to do something like this:

load "iovtk"

real[int] Tglob;

mpiAllReduce(TT[],Tglob, mpiCommWorld, mpiSUM);

if (mpirank == 0)
{
  savevtk("TemperaturaBlocco.vtu",Th,Tglob);
}

but this generates an error telling me that it is not possible to store that kind of data (and if I define Tglob as a Vh function it does not work either…)

Thanks again!

  1. You need to use buildMatEdgeWithPartitioning; there is nothing else to include besides macro_ddm.idp.
  2. You should not load "iovtk" to use savevtk in parallel; it’s automatically shipped with load "PETSc". See FreeFem-sources/diffusion-mg-3d-PETSc.edp at master · FreeFem/FreeFem-sources · GitHub and the sketch below.
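
A sketch of what the parallel output could look like (TT and Th stand for the local solution and mesh from your snippet; the dataname and order parameters follow the usual savevtk interface):

load "PETSc"                    // the parallel savevtk overload ships with this plugin
// ... assemble and solve for TT on the local mesh Th ...
int[int] fforder = [1];         // output order of the (single) field
savevtk("TemperaturaBlocco.vtu", Th, TT, dataname = "T", order = fforder); // called by every process, not only rank 0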

Dear @prj,
I hope to disturb you for the last time, at least on this topic! :slight_smile:

I have introduced the modifications you suggested, but in the real application I’m still having problems. Indeed, the error “Caught signal number 7 BUS: Bus Error, possibly illegal memory access” now appears when calling partitionerSeq and partitionerPar.

The relevant part of the script I am using is attached below.

Thanks again for the support! Have a nice Sunday!

load "msh3"
load "PETSc"
load "mat_edgeP1"
load "Element_Mixte3d"
include "getARGV.idp"

int iresta = getARGV("-iresta",0);
int cm = 15; //getARGV("-cm",0);

string meshname = "FixedTFVV"+cm+".mesh";

lockOrientation = false;
mesh3 Th = readmesh3(meshname);
lockOrientation = true;

mesh3 ThNo;

// region definition

int pt = Th(0.0,9.0,0.0).region;

int pp = Th(0.0,10.0,0.0).region;

int cassa = Th(0.0,3.3,0.0).region;

int VV = Th(0.0,13.7,0.0).region;


macro dimension()3// EOM            // 2D or 3D
include "macro_ddm.idp"

macro defPart(u)u// EOM             // partition of unity definition
macro initPart(u)u// EOM            // partition of unity initialization


macro def(i)[i, i#B, i#C] //EOM
macro init(i)[i, i, i] //EOM
macro ThPkPart()Edge03ds0// EOM special FE for the domain partitioning

real mu = 4*pi*1.0e-7;
real nu=1.0/mu;

func ppp = P0;
func pj = [P0, P0, P0];
func Pk = Edge03d;
/*
fespace vhp(Th,ppp);
fespace vhj(Th,pj);
fespace vh(Th,Pk);
*/
real rhocase = 80e-8; // [Ohm*m] ref ENEA
real sigmaCase = 1.0/rhocase;

real rhoVV = 54.4e-8; // [Ohm*m] ref ENEA
real sigmaVV = 1.0/rhoVV;

//vhp Sigma = 1.0 + (sigmaCase-1.0)*(region == cassa) + (sigmaVV-1.0)*(region == VV);

// macros
macro Curl(Ax,Ay,Az) [dy(Az)-dz(Ay), dz(Ax)-dx(Az), dx(Ay)-dy(Ax)]//EOM

real tempo;


real Ip = 3.6e6; //[A]
real rmajor = 9.0;

func Xc = rmajor*(x/sqrt(x^2+y^2));
func Yc = rmajor*(y/sqrt(x^2+y^2));

func rr = sqrt(x^2+y^2);

func oriz = (rr-rmajor)*(region == pp);

func cosalpha = oriz/sqrt(z^2+oriz^2);

func sinalpha = z/sqrt(z^2+oriz^2)*(region == pp);

real rout = 1.4;
real rin = 0.9;

func sezione = (rout - rin)*2.0*pi*(rmajor+((rout+rin)/2.0)*cosalpha);

func J0p = Ip/sezione;



// current density waveform: zero before 0.5 s, linear ramp up to J0p between 0.5 s and 7.5 s, flat top until 20 s, linear ramp down over 57 ms, then zero
func Jmodtp = 0.0*(tempo<0.5) + J0p*((tempo-0.5)/7.0)*(tempo >= 0.5 && tempo < 7.5) + J0p*(tempo >= 7.5 && tempo < 20.0) + J0p*(1-(tempo-20.0)/0.057)*(tempo>=20.0 && tempo < 20.057) + 0.0;

func cosbeta = Xc/rmajor;
func sinbeta = Yc/rmajor;

real It = 18.0e6;
real rt = 0.75;
real sezt = pi*rt^2;

real J0t = It/sezt;

// same waveform as Jmodtp, scaled with J0t
func Jmodtt = 0.0*(tempo<0.5) + J0t*((tempo-0.5)/7.0)*(tempo >= 0.5 && tempo < 7.5) + J0t*(tempo >= 7.5 && tempo < 20.0) + J0t*(1-(tempo-20.0)/0.057)*(tempo>=20.0 && tempo < 20.057) + 0.0;


real tend = 25.0;
real dt;



if (mpirank == 0)
{
  cout << "Variational formulation defined" << endl;
}

Mat AA;

if (mpirank == 0)
{
  cout << "PETSc matrix defined" << endl;
}
//macro ThPkPart()Edge03ds0// EOM

fespace Ph(Th, P0);
Ph part;
partitionerSeq(part[], Th, mpisize);
partitionerPar(part[], Th, mpiCommWorld, mpisize);

if (mpirank == 0)
{
  cout << "partitioner done" << endl;
}

//buildMatWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld);
buildMatEdgeWithPartitioning(Th, part[], getARGV("-split",1), AA, Pk, mpiCommWorld, Edge03ds0, defPart, initPart);
if (mpirank == 0)
{
  cout << "buildMatEdgeWithPartitioning done" << endl;
}

part = part; // re-interpolate the P0 field on the new (local) mesh Th

Without the mesh: impossible to debug.

You are right, I’m sorry. I was hoping that, thanks to your expertise, you would be able to easily catch the mistake! :hugs:

The mesh is saved at this GDRIVE link. It is not such a small mesh, but executing just the first few lines should not be an issue.

This is a 1 GB mesh. The exact goal of this script is not to have to deal with such a large mesh, and to use the -split parameter instead. If it was not working before with the 1 GB mesh, it still won’t work here.

No, this mesh was actually working with the “old style” approach… let’s say that this was the biggest mesh that could be used!

By the way, I am checking whether it works with lighter meshes instead!

The problem remains. Dealing with large duplicated meshes is far from ideal. Alternatively, you could try to launch your script with the additional command line argument -Dpartitioner=parmetis, and see if it’s any better.
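
For example, the launch command would look something like this (a sketch; the process count and the -split value are illustrative, and ParMETIS must be available in your FreeFEM installation):

ff-mpirun -np 8 NewImplementation_onlyPower.edp -Dpartitioner=parmetis -split 2 -v 0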