FreeFem++ PETSc configuration problem on Linux

Dear developer,

I am trying to compile FreeFEM with PETSc on a cluster that uses the Slurm queuing system.
I configured with the Intel compilers and Intel MPI, then installed petsc-slepc as described in the installation guide.
However, after running ./reconfigure, it still does not find PETSc and SLEPc…

configure:13465: find real ( real ) petsc in /home/yim/myff/ff-petsc//r/lib/petsc/conf/petscvariables
configure:13816: " Warning PETSC MPI and FF++ MPI not the same: mpiexec != /ssoft/spack/external/intel/2018.4/impi/2018.4.274/bin64/mpirun or real != real ."
configure:13831: without petsc, slepc *****
configure:13854: find complex (complex) petsc in /home/yim/myff/ff-petsc//c/lib/petsc/conf/petscvariables
configure:14204: " Warning PETSC complex MPI and FF++ MPI not the same: mpiexec != /ssoft/spack/external/intel/2018.4/impi/2018.4.274/bin64/mpirun or complex != complex ."
configure:14217: without petsc complex *****

configure:23581: FreeFEM used download: yes
configure:23583: -- Dynamic load facility: yes
configure:23585: -- ARPACK (eigen value): yes
configure:23587: -- UMFPACK (sparse solver): yes
configure:23589: -- BLAS: yes
configure:23591: -- with MPI: yes
configure:23593: -- with PETSc: no / PETSc complex: no
configure:23595: -- with SLEPc: no / SLEPc complex: no
configure:23597: -- with hpddm: yes
configure:23599: -- with htool: yes
configure:23601: -- with bemtool: yes (need boost: yes and htool: yes)
configure:23603: -- without libs:

I did everything again with GCC and MVAPICH2 and still have the same problem.
Do you have any idea how to fix this?
Thank you.
Best,
Eunok

Do either 1. or 2.

  1. use the develop branch
  2. ./reconfigure MPIRUN=mpiexec
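
If you want to check which launcher configure picks up before re-running it, plain shell is enough (nothing FreeFEM-specific here; what the commands print depends on your loaded modules):

    # show which launcher is first in the PATH; it should live in the same
    # Intel MPI tree as mpiicc/mpirun, otherwise configure prints the warning above
    command -v mpiexec
    command -v mpirun
    # then re-run configure with the launcher you actually use
    ./reconfigure MPIRUN=mpiexec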

Dear prj,

Thanks a lot for your rapid response.
I tried both of your solutions and only the second one worked.
The first one, using the develop branch, did not even get through “make petsc-slepc”.
I also ran into the libparmmgtypes.h problem described in (Freefem installation error on ubuntu 18.04).

In the end, I had to run

./reconfigure MPIRUN=mpiexec --disable-parmmg

which worked for the installation!

However, I now face another problem.
When I run a code with PETSc, errors occur in buildMat.
The examples in examples/hpddm WITHOUT PETSc work fine, as does block-PETSc.edp, which does not call buildMat.

Is this related to the --disable-parmmg configuration?

The errors for the example diffusion-2d-PETSc.edp, run with 4 tasks and 12 GB of memory, are:

– Square mesh : nb vertices =1681 , nb triangles = 3200 , nb boundary edges 160
— global mesh of 3200 elements (prior to refinement) partitioned with metis --metisESCOA: 4-way Edge-Cut: 3, Balance: 1.01 Nodal=0/Dual 1
(in 4.019976e-03)
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[2]PETSC ERROR: to get more information on the crash.
[2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[2]PETSC ERROR: Signal received
[2]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[2]PETSC ERROR: Petsc Release Version 3.13.0, Mar 29, 2020
[2]PETSC ERROR: /home/yim/FreeFem-sources/src/mpi/FreeFem++-mpi on a named f372 by yim Sat Aug 22 10:34:18 2020
[2]PETSC ERROR: Configure options MAKEFLAGS= --prefix=/home/yim/FreeFem-install/ff-petsc//r --with-debugging=0 COPTFLAGS="-O3 -mtune=native" CXXOPTFLAGS="-O3 -mtune=native" FOPTFLAGS="-O3 -mtune=native" --with-cxx-dialect=C++11 --with-ssl=0 --with-x=0 --with-fortran-bindings=0 --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=real --download-openblas --download-scalapack --download-metis --download-ptscotch --download-mumps --download-hypre --download-parmetis --download-superlu --download-suitesparse --download-tetgen --download-slepc --download-hpddm --download-hpddm-commit=e8639ff PETSC_ARCH=fr
[2]PETSC ERROR: #1 User provided function() line 0 in unknown file
application called MPI_Abort(MPI_COMM_WORLD, 50152059) - process 2
In: PMI_Abort(50152059, application called MPI_Abort(MPI_COMM_WORLD, 50152059) - process 2)
slurmstepd: error: *** STEP 5023601.0 ON f366 CANCELLED AT 2020-08-22T10:34:18 ***
srun: error: f372: task 2: Exited with exit code 123
srun: Terminating job step 5023601.0
srun: error: f366: tasks 0-1: Killed
srun: error: f372: task 3: Killed

Thank you.
Best,
Eunok

Do you have multiple MPI installations on your machine? Where is mpiexec pointing?

Hmm… Yes, there are several MPIs on the system… But I use only Intel MPI from beginning to end.
So mpiexec should point to the location of the loaded intel-mpi module, no?
In the config.log,

MPICC='mpiicc'
MPICXX='mpiicpc'
MPIF77='mpiifort'
MPIFC='mpiifort'
MPIPROG='FreeFem++-mpi'
MPIRUN='mpiexec'
MPISCRIPT='ff-mpirun'
MPI_FALSE='#'
MPI_INCLUDE='-I/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/include -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib '
MPI_INC_DIR='/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/include'
MPI_LIB='-L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread '
MPI_LIBC='-L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread '
MPI_LIBFC='-L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread '
MPI_LIB_DIRS='/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt'
MPI_RUN_OPTION=''

Did I answer your question?
Thanks!
Eunok

No, you didn’t answer my question. Is mpiexec from your system, or from the Intel installation? How do you launch your script?

I see.
I launch the simulation through the Slurm queuing system, like this:

module add intel/18.0.5 intel-mpi/2018.4.274
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${INTEL_ROOT}/compiler/lib/intel64

dirff=/home/yim/FreeFem-sources/src/mpi/FreeFem++-mpi

srun $dirff diffusion-2d-PETSc.edp > output.txt
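
For completeness, the whole thing sits in a batch script roughly like this (the #SBATCH resources below are placeholders, not the exact values I use):

    #!/bin/bash
    #SBATCH --ntasks=4
    #SBATCH --mem-per-cpu=3G
    #SBATCH --time=00:30:00

    module add intel/18.0.5 intel-mpi/2018.4.274
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${INTEL_ROOT}/compiler/lib/intel64

    dirff=/home/yim/FreeFem-sources/src/mpi/FreeFem++-mpi

    # srun starts the MPI ranks through Slurm, no mpiexec involved
    srun $dirff diffusion-2d-PETSc.edp > output.txt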

How about mpiexec instead of srun?

No… it didn’t work…
I must use srun to submit a job…

/home/yim/FreeFem-sources/src/mpi/FreeFem++-mpi: /home/yim/FreeFem-sources/src/mpi/FreeFem++-mpi: [mpiexec@f355] HYDU_sock_write (../../utils/sock/sock.c:418): write error (Bad file descriptor)
[mpiexec@f355] HYD_pmcd_pmiserv_send_signal (../../pm/pmiserv/pmiserv_cb.c:253): unable to write data to proxy
slurmstepd: error: *** JOB 5028207 ON f355 CANCELLED AT 2020-08-22T15:50:38 ***

Do you think this is due to a PETSc/MPI problem?
Other hpddm examples run well even with PETSc’s buildMat;
for example, laplace-beltrami-3d-line-SLEPc.edp and stokes-fieldsplit-3d-PETSc.edp work!
I will check all the hpddm examples and tell you which ones work and which do not. Maybe we can find the reason why…
Thank you!

Could you copy/paste the result of make check in the examples/hpddm folder, please?
I think the issue comes from the fact that you are mixing different MPI implementations.
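Concretely, make sure a single MPI module is loaded for configuring, building and running, for example (module names taken from your earlier message, adapt to your site):

    module purge
    module add intel/18.0.5 intel-mpi/2018.4.274
    # both of these should resolve inside the same Intel MPI installation
    command -v mpiicc
    command -v mpiexec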

Indeed, there is a problem. When I run make -j4 check,
I get something like:


Making check in nw
make[2]: Entering directory `/home/yim/FreeFem-sources/src/nw'
icpc -g -DNDEBUG -O3 -mmmx -mavx -std=c++11 -DBAMG_LONG_LONG -DNCHECKPTR -fPIC -rdynamic -o FreeFem++-nw ../Graphics/sansrgraph.o ../mpi/parallelempi-empty.o ../fflib/ffapi.o ../lglib/liblg.a ../fflib/libff.a -Wl,-rpath,/home/yim/FreeFem-install/ff-petsc/r/lib -L/home/yim/FreeFem-install/ff-petsc/r/lib -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig /home/yim/FreeFem-sources/3rdparty/lib/libarpack.a -Wl,-rpath,/home/yim/FreeFem-install/ff-petsc/r/lib -L/home/yim/FreeFem-install/ff-petsc/r/lib -lopenblas -Wl,-rpath,/home/yim/FreeFem-install/ff-petsc/r/lib -L/home/yim/FreeFem-install/ff-petsc/r/lib -lopenblas -ldl -lm -lrt -L/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/compiler/lib/intel64_lin -L/ssoft/spack/humagne/v1/opt/spack/linux-rhel7-x86_E5v4_Mellanox/gcc-4.8.5/gcc-6.5.0-zh4rtcgdquhyaobmlogojyf7incbzjkf/lib/gcc/x86_64-pc-linux-gnu/6.5.0/ -L/ssoft/spack/humagne/v1/opt/spack/linux-rhel7-x86_E5v4_Mellanox/gcc-4.8.5/gcc-6.5.0-zh4rtcgdquhyaobmlogojyf7incbzjkf/lib/gcc/x86_64-pc-linux-gnu/6.5.0/../../../../lib64 -L/ssoft/spack/humagne/v1/opt/spack/linux-rhel7-x86_E5v4_Mellanox/gcc-4.8.5/gcc-6.5.0-zh4rtcgdquhyaobmlogojyf7incbzjkf/lib/gcc/x86_64-pc-linux-gnu/6.5.0/../../../../lib64/ -L/lib/../lib64 -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ -L/ssoft/spack/humagne/v1/opt/spack/linux-rhel7-x86_E5v4_Mellanox/gcc-4.8.5/gcc-6.5.0-zh4rtcgdquhyaobmlogojyf7incbzjkf/lib/gcc/x86_64-pc-linux-gnu/6.5.0/../../../ -L/lib64 -L/lib/ -L/usr/lib64 -L/usr/lib -L/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/ipp/lib/intel64 -L/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/tbb/lib/intel64/gcc4.7 -L/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/mkl/lib/intel64 -L/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/compiler/lib/intel64 -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc -lpthread -lirc_s -ldl
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/nw'
Making check in mpi
make[2]: Entering directory `/home/yim/FreeFem-sources/src/mpi'
make[2]: Nothing to be done for `check'.
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/mpi'
Making check in bamg
make[2]: Entering directory `/home/yim/FreeFem-sources/src/bamg'
make[2]: Nothing to be done for `check'.
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/bamg'
Making check in medit
make[2]: Entering directory `/home/yim/FreeFem-sources/src/medit'
make check-am
make[3]: Entering directory `/home/yim/FreeFem-sources/src/medit'
make[3]: Nothing to be done for `check-am'.
make[3]: Leaving directory `/home/yim/FreeFem-sources/src/medit'
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/medit'
Making check in bin-win32
make[2]: Entering directory `/home/yim/FreeFem-sources/src/bin-win32'
echo done
done
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/bin-win32'
Making check in ffgraphics
make[2]: Entering directory `/home/yim/FreeFem-sources/src/ffgraphics'
Making check in server
make[3]: Entering directory `/home/yim/FreeFem-sources/src/ffgraphics/server'
make[3]: Nothing to be done for `check'.
make[3]: Leaving directory `/home/yim/FreeFem-sources/src/ffgraphics/server'
Making check in client
make[3]: Entering directory `/home/yim/FreeFem-sources/src/ffgraphics/client'
make[3]: Nothing to be done for `check'.
make[3]: Leaving directory `/home/yim/FreeFem-sources/src/ffgraphics/client'
make[3]: Entering directory `/home/yim/FreeFem-sources/src/ffgraphics'
make[3]: Nothing to be done for `check-am'.
make[3]: Leaving directory `/home/yim/FreeFem-sources/src/ffgraphics'
make[2]: Leaving directory `/home/yim/FreeFem-sources/src/ffgraphics'
make[2]: Entering directory `/home/yim/FreeFem-sources/src'
make[2]: Nothing to be done for `check-am'.
make[2]: Leaving directory `/home/yim/FreeFem-sources/src'
make[1]: Leaving directory `/home/yim/FreeFem-sources/src'
Making check in plugin
make[1]: Entering directory `/home/yim/FreeFem-sources/plugin'
Making check in seq
make[2]: Entering directory `/home/yim/FreeFem-sources/plugin/seq'
make[3]: Entering directory `/home/yim/FreeFem-sources/plugin/seq'
MISSING lib mkl, Check the WHERE-LIBRARY files
eval ./ff-c++
make[3]: Leaving directory `/home/yim/FreeFem-sources/plugin/seq'
Warning missing plugin:
finish build list so
make[2]: Leaving directory `/home/yim/FreeFem-sources/plugin/seq'
Making check in mpi
make[2]: Entering directory `/home/yim/FreeFem-sources/plugin/mpi'
Warning missing mpi plugin:
finish compile load mpi solver !
make[2]: Leaving directory `/home/yim/FreeFem-sources/plugin/mpi'
make[2]: Entering directory `/home/yim/FreeFem-sources/plugin'
make[2]: Nothing to be done for `check-am'.
make[2]: Leaving directory `/home/yim/FreeFem-sources/plugin'
make[1]: Leaving directory `/home/yim/FreeFem-sources/plugin'
Making check in examples
make[1]: Entering directory `/home/yim/FreeFem-sources/examples'
Making check in 3d
make[2]: Entering directory `/home/yim/FreeFem-sources/examples/3d'
make check-TESTS
make[3]: Entering directory `/home/yim/FreeFem-sources/examples/3d'
make[4]: Entering directory `/home/yim/FreeFem-sources/examples/3d'
PASS: ArrayFE-3d.edp
PASS: cone.edp
PASS: convect-3d.edp
PASS: cylinder-3d.edp
PASS: beam-3d.edp
PASS: extract-boundary3d.edp
../../bin/test-driver-ff: line 127: 20032 Segmentation fault (core dumped) ${TEST_FFPP} ${FLAGS_FFPP_B} "$@" ${FLAGS_FFPP_A} > $log_file 2>&1
FAIL: 3d-Leman.edp
PASS: cube-period.edp
PASS: first.edp
PASS: Lac.edp
PASS: intlevelset3d.edp
../../bin/test-driver-ff: line 127: 20289 Segmentation fault (core dumped) ${TEST_FFPP} ${FLAGS_FFPP_B} "$@" ${FLAGS_FFPP_A} > $log_file 2>&1
FAIL: meditddm.edp
PASS: LaplaceRT-3d.edp
PASS: EqPoisson.edp
PASS: p.edp
PASS: periodic-3d.edp
PASS: Poisson.edp
PASS: pyramide.edp
PASS: sphere2.edp
PASS: sphere6.edp
PASS: Stokes.edp
PASS: TruncLac.edp
PASS: crack-3d.edp
PASS: NSI3d-carac.edp
PASS: cylinder.edp
PASS: Laplace3d.edp
PASS: NSI3d.edp
PASS: Poisson3d.edp
PASS: Poisson-cube-ballon.edp
PASS: schwarz-nm-3d.edp
PASS: tetgencube.edp
PASS: Period-Poisson-cube-ballon.edp
SKIP: bottle.edp
PASS: sphereincube.edp
PASS: refinesphere.edp
PASS: tetgenholeregion.edp
PASS: fallingspheres.edp
PASS: Laplace-Adapt-3d.edp

Testsuite summary for FreeFEM 4.6
TOTAL: 38
PASS: 35
SKIP: 1
XFAIL: 0
FAIL: 2
XPASS: 0
ERROR: 0
============================================================================
See examples/3d/test-suite.log
Please report to frederic.hecht@sorbonne-universite.fr

make[4]: *** [test-suite.log] Error 1
make[4]: Leaving directory `/home/yim/FreeFem-sources/examples/3d'
make[3]: *** [check-TESTS] Error 2
make[3]: Leaving directory `/home/yim/FreeFem-sources/examples/3d'
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `/home/yim/FreeFem-sources/examples/3d'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/home/yim/FreeFem-sources/examples'
make: *** [check-recursive] Error 1

The check exits before reaching hpddm…
By the way, can I run the check for hpddm only?
Thanks a lot.
Eunok

Just do make check in the examples/hpddm folder, precisely as I told you before.
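That is, something like (path assumed from your earlier messages):

    cd /home/yim/FreeFem-sources/examples/hpddm
    make check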

Dear prj,
Sorry for the late reply.
I did what you told me and here is the output.

make -j4 check
make check-TESTS
make[1]: Entering directory `/home/yim/FreeFem-sources/examples/hpddm'
make[2]: Entering directory `/home/yim/FreeFem-sources/examples/hpddm'
XFAIL: withPartitioning.edp
XFAIL: buildRecursive.edp
XFAIL: createPartition.edp
XFAIL: reconstructDmesh.edp
XFAIL: diffusion-2d.edp
XFAIL: diffusion-mg-2d.edp
XFAIL: diffusion-substructuring-2d.edp
XFAIL: diffusion-substructuring-withPartitioning-2d.edp
XFAIL: diffusion-3d.edp
XFAIL: diffusion-simple-3d.edp
XFAIL: diffusion-periodic-2d.edp
XFAIL: elasticity-2d.edp
XFAIL: elasticity-substructuring-2d.edp
XFAIL: elasticity-3d.edp
XFAIL: elasticity-simple-3d.edp
PASS: elasticity-block.edp
XFAIL: heat-2d.edp
XFAIL: heat-io-2d.edp
XFAIL: heat-3d.edp
XFAIL: helmholtz-2d.edp
XFAIL: helmholtz-mg-2d.edp
XFAIL: iterative.edp
XFAIL: maxwell-3d.edp
XFAIL: stokes-2d.edp
XFAIL: stokes-3d.edp
XFAIL: stokes-io-3d.edp
XFAIL: heat-torus-3d-surf.edp
XFAIL: bratu-2d-PETSc.edp
XFAIL: diffusion-2d-PETSc.edp
XFAIL: diffusion-3d-PETSc.edp
XFAIL: diffusion-periodic-2d-PETSc.edp
XFAIL: diffusion-periodic-balanced-2d-PETSc.edp
XFAIL: elasticity-2d-PETSc.edp
XFAIL: elasticity-3d-PETSc.edp
XFAIL: elasticity-SNES-3d-PETSc.edp
XFAIL: heat-2d-PETSc.edp
XFAIL: laplace-lagrange-PETSc.edp
XFAIL: natural-convection-fieldsplit-2d-PETSc.edp
XFAIL: neo-Hookean-2d-PETSc.edp
XFAIL: newton-2d-PETSc.edp
XFAIL: newton-adaptmesh-2d-PETSc.edp
XFAIL: newton-vi-2d-PETSc.edp
XFAIL: newton-vi-adaptmesh-2d-PETSc.edp
XFAIL: block-PETSc.edp
XFAIL: laplace-RT-2d-PETSc.edp
XFAIL: stokes-2d-PETSc.edp
XFAIL: stokes-fieldsplit-2d-PETSc.edp
XFAIL: stokes-block-2d-PETSc.edp
XFAIL: stokes-3d-PETSc.edp
XFAIL: stokes-fieldsplit-3d-PETSc.edp
XFAIL: transpose-solve-PETSc.edp
XFAIL: bratu-hpddm-2d-PETSc.edp
XFAIL: vi-2d-PETSc.edp
PASS: orego-Tao-PETSc.edp
XFAIL: heat-TS-2d-PETSc.edp
XFAIL: heat-TS-RHS-2d-PETSc.edp
XFAIL: advection-TS-2d-PETSc.edp
PASS: toy-Tao-PETSc.edp
XFAIL: minimal-surface-Tao-2d-PETSc.edp
XFAIL: Schur-complement-PETSc.edp
XFAIL: maxwell-2d-PETSc.edp
XFAIL: maxwell-3d-PETSc.edp
XFAIL: laplace-adapt-3d-PETSc.edp
XFAIL: diffusion-mg-3d-PETSc.edp
XFAIL: save-load-Dmesh.edp
XFAIL: navier-stokes-2d-PETSc.edp
XFAIL: transfer.edp
SKIP: distributed-parmmg.edp
SKIP: laplace-adapt-dist-3d-PETSc.edp
XFAIL: laplace-2d-SLEPc.edp
XFAIL: laplace-spherical-harmonics-2d-SLEPc.edp
XFAIL: laplace-torus-2d-SLEPc.edp
XFAIL: schrodinger-harmonic-oscillator-1d-SLEPc.edp
XFAIL: schrodinger-square-well-1d-SLEPc.edp
XFAIL: schrodinger-axial-well-2d-SLEPc.edp
XFAIL: schrodinger-harmonic-oscillator-2d-SLEPc.edp
XFAIL: laplace-beltrami-3d-surf-SLEPc.edp
XFAIL: laplace-beltrami-3d-line-SLEPc.edp
XFAIL: diffusion-2d-PETSc-complex.edp
XFAIL: helmholtz-2d-PETSc-complex.edp
XFAIL: helmholtz-mg-2d-PETSc-complex.edp
XFAIL: maxwell-mg-3d-PETSc-complex.edp
XFAIL: laplace-2d-SLEPc-complex.edp
XFAIL: navier-stokes-2d-SLEPc-complex.edp
XFAIL: helmholtz-3d-surf-PETSc-complex.edp
XFAIL: helmholtz-3d-line-PETSc-complex.edp
XFAIL: helmholtz-coupled-2d-PETSc-complex.edp

Testsuite summary for FreeFEM 4.6
TOTAL: 87
PASS: 3
SKIP: 2
XFAIL: 82
FAIL: 0
XPASS: 0
ERROR: 0
============================================================================
make[2]: Leaving directory `/home/yim/FreeFem-sources/examples/hpddm'
make[1]: Leaving directory `/home/yim/FreeFem-sources/examples/hpddm'


Most of them are XFAIL… What does this mean?
When I look at, for example, diffusion-3d.edp.log, it says this:


'mpiexec' -np 4 ../../src/mpi/FreeFem++-mpi -nw ./diffusion-3d.edp
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(805)…: fail failed
MPID_Init(1859)…: channel initialization failed
MPIDI_CH3_Init(126)…: fail failed
MPID_nem_init_ckpt(857)…: fail failed
MPIDI_CH3I_Seg_commit(427): PMI_KVS_Get returned 4
In: PMI_Abort(69777679, Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(805)…: fail failed
MPID_Init(1859)…: channel initialization failed
MPIDI_CH3_Init(126)…: fail failed
MPID_nem_init_ckpt(857)…: fail failed
MPIDI_CH3I_Seg_commit(427): PMI_KVS_Get returned 4)
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPID_nem_gen2_module_get_from_bc:262
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPID_nem_gen2_module_get_from_bc:262
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPID_nem_gen2_module_get_from_bc:262
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPID_nem_gen2_module_get_from_bc:262
[1] Abort: Error code in polled desc!
at line 2502 in file ../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c
[2] Abort: Error code in polled desc!
at line 2502 in file ../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 4159 RUNNING AT fidis
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 4162 RUNNING AT fidis
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

Intel® MPI Library troubleshooting guide:
https://software.intel.com/node/561764


HOWEVER, when I run the code with srun through Slurm, I get good results. Do you want me to redo the test with srun?

— 0/4 - diffusion-3d.edp - input parameters: refinement factor = 1 - overlap = 1
– Cube nv=8 nt=6 nbe=12 kind= 6
– Cube nv=8 nt=6 nbe=12 kind= 6
– Cube nv=8 nt=6 nbe=12 kind= 6
– FESpace: Nb of Nodes 8 Nb of DoF 8
– FESpace: Nb of Nodes 8 Nb of DoF 8
– Cube nv=1331 nt=6000 nbe=1200 kind= 6
– Cube nv=1331 nt=6000 nbe=1200 kind= 6
– Build Nodes/DF on mesh : n.v. 1331, n. elmt. 6000, n b. elmt. 1200
– Build Nodes/DF on mesh : n.v. 1331, n. elmt. 6000, n b. elmt. 1200
nb of Nodes 6000 nb of DoF 6000 DFon=0001
nb of Nodes 6000 nb of DoF 6000 DFon=0001
– FESpace: Nb of Nodes 6000 Nb of DoF 6000
– FESpace: Nb of Nodes 1331 Nb of DoF 1331
– FESpace: Nb of Nodes 6000 Nb of DoF 6000
– FESpace: Nb of Nodes 1331 Nb of DoF 1331
— global mesh of 6000 elements (prior to refinement) partitioned with metis --metisESCOA: 4-way Edge-Cut: 4, Balance: 1.02 Nodal=0/Dual 1
(in 1.995611e-02)
– FESpace: Nb of Nodes 400 Nb of DoF 400
– FESpace: Nb of Nodes 409 Nb of DoF 409
– FESpace: Nb of Nodes 662 Nb of DoF 662
– FESpace: Nb of Nodes 716 Nb of DoF 716
– Cube nv=8 nt=6 nbe=12 kind= 6
– FESpace: Nb of Nodes 8 Nb of DoF 8
– FESpace: Nb of Nodes 8 Nb of DoF 8
– Cube nv=1331 nt=6000 nbe=1200 kind= 6
– Cube nv=1331 nt=6000 nbe=1200 kind= 6
– Build Nodes/DF on mesh : n.v. 1331, n. elmt. 6000, n b. elmt. 1200
– Build Nodes/DF on mesh : n.v. 1331, n. elmt. 6000, n b. elmt. 1200
nb of Nodes 6000 nb of DoF 6000 DFon=0001
– FESpace: Nb of Nodes 6000 Nb of DoF 6000
nb of Nodes 6000 nb of DoF 6000 DFon=0001
– FESpace: Nb of Nodes 6000 Nb of DoF 6000
– FESpace: Nb of Nodes 1331 Nb of DoF 1331
– FESpace: Nb of Nodes 1331 Nb of DoF 1331
– FESpace: Nb of Nodes 392 Nb of DoF 392
– FESpace: Nb of Nodes 408 Nb of DoF 408
– FESpace: Nb of Nodes 645 Nb of DoF 645
– FESpace: Nb of Nodes 746 Nb of DoF 746
– Build Nodes/DF on mesh : n.v. 645, n. elmt. 2675, n b. elmt. 702
nb of Nodes 2675 nb of DoF 2675 DFon=0001
– FESpace: Nb of Nodes 2675 Nb of DoF 2675
– Build Nodes/DF on mesh : n.v. 716, n. elmt. 3013, n b. elmt. 752
– Build Nodes/DF on mesh : n.v. 662, n. elmt. 2756, n b. elmt. 714
nb of Nodes 3013 nb of DoF 3013 DFon=0001
– FESpace: Nb of Nodes 3013 Nb of DoF 3013
nb of Nodes 2756 nb of DoF 2756 DFon=0001
– FESpace: Nb of Nodes 2756 Nb of DoF 2756
– Build Nodes/DF on mesh : n.v. 746, n. elmt. 3174, n b. elmt. 764
nb of Nodes 3174 nb of DoF 3174 DFon=0001
– FESpace: Nb of Nodes 3174 Nb of DoF 3174
– FESpace: Nb of Nodes 513 Nb of DoF 513
– FESpace: Nb of Nodes 530 Nb of DoF 530
– FESpace: Nb of Nodes 547 Nb of DoF 547
– FESpace: Nb of Nodes 569 Nb of DoF 569
– FESpace: Nb of Nodes 513 Nb of DoF 513
– FESpace: Nb of Nodes 197 Nb of DoF 197
– FESpace: Nb of Nodes 530 Nb of DoF 530
– FESpace: Nb of Nodes 206 Nb of DoF 206
– FESpace: Nb of Nodes 513 Nb of DoF 513
– FESpace: Nb of Nodes 220 Nb of DoF 220
– FESpace: Nb of Nodes 513 Nb of DoF 513
– FESpace: Nb of Nodes 530 Nb of DoF 530
– FESpace: Nb of Nodes 220 Nb of DoF 220
– FESpace: Nb of Nodes 530 Nb of DoF 530
– FESpace: Nb of Nodes 547 Nb of DoF 547
– FESpace: Nb of Nodes 206 Nb of DoF 206
– FESpace: Nb of Nodes 547 Nb of DoF 547
– FESpace: Nb of Nodes 163 Nb of DoF 163
– FESpace: Nb of Nodes 547 Nb of DoF 547
– FESpace: Nb of Nodes 197 Nb of DoF 197
– FESpace: Nb of Nodes 547 Nb of DoF 547
– FESpace: Nb of Nodes 569 Nb of DoF 569
– FESpace: Nb of Nodes 163 Nb of DoF 163
– FESpace: Nb of Nodes 569 Nb of DoF 569
– FESpace: Nb of Nodes 220 Nb of DoF 220
– FESpace: Nb of Nodes 569 Nb of DoF 569
– FESpace: Nb of Nodes 220 Nb of DoF 220
– FESpace: Nb of Nodes 569 Nb of DoF 569
— partition of unity built (in 5.068111e-02)
– FESpace: Nb of Nodes 645 Nb of DoF 645
– FESpace: Nb of Nodes 662 Nb of DoF 662
– FESpace: Nb of Nodes 716 Nb of DoF 716
– FESpace: Nb of Nodes 746 Nb of DoF 746
– FESpace: Nb of Nodes 513 Nb of DoF 513
– FESpace: Nb of Nodes 530 Nb of DoF 530
– Cube nv=8 nt=6 nbe=12 kind= 6
– FESpace: Nb of Nodes 547 Nb of DoF 547
– Cube nv=8 nt=6 nbe=12 kind= 6
– FESpace: Nb of Nodes 569 Nb of DoF 569
– Cube nv=8 nt=6 nbe=12 kind= 6
– Cube nv=8 nt=6 nbe=12 kind= 6
times: compile 0.06s, execution 0.29s, mpirank:1
times: compile 0.05s, execution 0.29s, mpirank:2
######## We forget of deleting 0 Nb pointer, 0Bytes , mpirank 1, memory leak =956464
CodeAlloc : nb ptr 6807, size :631600 mpirank: 1
times: compile 0.05s, execution 0.29s, mpirank:3
######## We forget of deleting 0 Nb pointer, 0Bytes , mpirank 2, memory leak =809072
CodeAlloc : nb ptr 6807, size :631600 mpirank: 2
######## We forget of deleting 0 Nb pointer, 0Bytes , mpirank 3, memory leak =956480
CodeAlloc : nb ptr 6807, size :631600 mpirank: 3
times: compile 1.600000e-01s, execution 1.800000e-01s, mpirank:0
######## We forget of deleting 0 Nb pointer, 0Bytes , mpirank 0, memory leak =809072
CodeAlloc : nb ptr 6807, size :631600 mpirank: 0
Ok: Normal End


Oh, yes, ff-mpirun uses MPICH, but you need to use srun.
Could you please copy/paste the result of mpiicpc -show and mpiicc -show?

mpiicpc -show
icpc -I/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/include -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib -lmpicxx -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread

mpiicc -show
icc -I/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/include -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -L/ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib/release_mt -Xlinker -rpath -Xlinker /ssoft/spack/external/intel/2018.4/impi/2018.4.274/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread

I see that you are using the master branch of FreeFEM. Could you please retry from scratch with the develop branch?
If this still fails, could you please attach (do not copy/paste) to your post the files config.log and 3rdparty/ff-petsc/petsc-3.13.4/configure.log?
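By “from scratch” I mean roughly the standard sequence from the installation guide, in a clean checkout (a sketch only; adjust the prefix and options to your cluster):

    git clone --branch develop https://github.com/FreeFem/FreeFem-sources.git
    cd FreeFem-sources
    autoreconf -i
    ./configure --enable-download --enable-optim --prefix=/home/yim/FreeFem-install-dev
    cd 3rdparty/ff-petsc
    make petsc-slepc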

Dear prj,
Thanks a lot. I re-did what you told me and it failed at make petsc-slepc.
I wanted to attach the log files, but I cannot upload attachments (“Sorry, new users can not upload attachments.”).
Can I send them to your email?

The error messages are…

In file included from /home/yim/FreeFem-install-dev/ff-petsc/r/include/HPDDM.hpp(258),
from /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/include/petsc/private/petschpddm.h(49),
from /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/src/ksp/ksp/impls/hpddm/hpddm.cxx(1):
/home/yim/FreeFem-install-dev/ff-petsc/r/include/HPDDM_wrapper.hpp(58): catastrophic error: cannot open source file “mkl_spblas.h”

#include <mkl_spblas.h>

                      ^

compilation aborted for /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/src/ksp/ksp/impls/hpddm/hpddm.cxx (code 4)
gmake[5]: *** [fc/obj/ksp/ksp/impls/hpddm/hpddm.o] Error 4
gmake[5]: *** Waiting for unfinished jobs…
CC fc/obj/ksp/pc/impls/mat/pcmat.o
In file included from /home/yim/FreeFem-install-dev/ff-petsc/r/include/HPDDM.hpp(258),
from /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/include/petsc/private/petschpddm.h(49),
from /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/src/ksp/pc/impls/hpddm/hpddm.cxx(3):
/home/yim/FreeFem-install-dev/ff-petsc/r/include/HPDDM_wrapper.hpp(58): catastrophic error: cannot open source file “mkl_spblas.h”

#include <mkl_spblas.h>

                      ^

compilation aborted for /home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4/src/ksp/pc/impls/hpddm/hpddm.cxx (code 4)
gmake[5]: *** [fc/obj/ksp/pc/impls/hpddm/hpddm.o] Error 4
gmake[4]: *** [libs] Error 2
ERROR***********
Error during compile, check fc/lib/petsc/conf/make.log
Send it and fc/lib/petsc/conf/configure.log to petsc-maint@mcs.anl.gov


make[3]: *** [all] Error 1
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/yim/FreeFem-sources-develop/3rdparty/ff-petsc/petsc-3.13.4'
make[1]: *** [petsc-3.13.4/tag-make-complex] Error 2
make[1]: Leaving directory `/home/yim/FreeFem-sources-develop/3rdparty/ff-petsc'
make: *** [WHERE-all] Error 2

Best,
Eunok

Now I can attach files!

configure.log (1.0 MB) config.log (479.0 KB)

FreeFEM’s configure (not PETSc’s) is not detecting the MKL. Could you try this line instead:

$  ./configure --enable-download --enable-optim --prefix=/home/yim/FreeFem-install-dev --with-mkl=/ssoft/spack/external/intel/2018.4/compilers_and_libraries_2018.5.274/linux/mkl/lib/intel64

Then re-do make petsc-slepc. Upload the same two files if this still does not work. The FreeFEM compilation process on clusters is a little cumbersome, sorry about that…
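
Concretely, once the configure line above succeeds, the redo would look something like this (a sketch; the last steps are just the usual rebuild):

    cd 3rdparty/ff-petsc
    make petsc-slepc      # with MKL detected, HPDDM should now find mkl_spblas.h
    cd -
    ./reconfigure
    make -j4 && make install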

Thanks a lot for your effort…! But it failed again with the same errors…

config2.log (471.5 KB) configure.log (1.0 MB)