I am solving a 3D Stokes problem with PETSc in parallel, with this variationnal formulation grad(u)' * grad(v) + grad(uB)' * grad(vB) + grad(uC)' * grad(vC) - div(u) * q - div(v) * p + epsq * p * q

To solve this I am currently using the solver gmres with the LU preconditionning -ksp_type gmres -pc_type lu -pc_factor_mat_solver_type mumps.

However this method, is memory and time consuming due to the LU factorization (so I can only run small case even on HPC).

I have also try the solver block jacobi with a ILU preconditionning, but the the linear solver does not converge.

Could someone know a solver and a preconditionner to solve efficiently such a problem ?

I have tried the solver proposed on the FreeFEM git: https://github.com/FreeFem/FreeFem-sources/blob/develop/examples/hpddm/stokes-fieldsplit-3d-PETSc.edp

For the largest problem you tried with this, what was the performance of the method? Also, if the problem is symmetric with homogeneous Dirichlet boundary conditions (tgv = -2), you should consider switching to -pc_type cholesky.

With -ksp_type gmres -pc_type lu -pc_factor_mat_solver_type mumps I have been able to run a case with about 384 000 tetras (mesh3 cube(40, 40, 40)) on 144 CPUs and it takes 53 min.

I have not succeeded to run larger case with this method,

Sorry for my misunderstanding, I don’t have homogeneous Dirichlet boundary conditions. And my problem I think is not symmetric due to the extra term -eps*p*q.

The term -eps*p*q is symmetric, but if you don’t have homogeneous Dirichlet BC, tgv=-2 is useless indeed. 53 minutes for the above problem suggest that something is wrong on your machine.

Considering a case with 750 000 tetras (cube(50, 50, 50)) I have not been able to run it even using more that 200 CPUs.

The PETSc iterations don’t even start. This suggests that there is indeed a problem on the machine. But in the other hand it is known that LU factorization is time and memory consuming.