Efficient solver and preconditionner for 3D Stokes Problem

Loic · March 7, 2023, 7:51am

Dear all,

I am solving a 3D Stokes problem with PETSc in parallel, with this variationnal formulation grad(u)' * grad(v) + grad(uB)' * grad(vB) + grad(uC)' * grad(vC) - div(u) * q - div(v) * p + epsq * p * q

To solve this I am currently using the solver gmres with the LU preconditionning -ksp_type gmres -pc_type lu -pc_factor_mat_solver_type mumps.

However this method, is memory and time consuming due to the LU factorization (so I can only run small case even on HPC).

I have also try the solver block jacobi with a ILU preconditionning, but the the linear solver does not converge.

Could someone know a solver and a preconditionner to solve efficiently such a problem ?

Thank you in advance for your help,

Best regards,

Loïc,

prj · March 7, 2023, 8:06am

Use fieldsplit and look at the literature.

Loic · March 7, 2023, 8:29am

Thanks ! I will have a look on this.

Loïc,

Loic · March 7, 2023, 10:24am

I have tried the solver proposed on the FreeFEM git: https://github.com/FreeFem/FreeFem-sources/blob/develop/examples/hpddm/stokes-fieldsplit-3d-PETSc.edp

set(A, sparams = " -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -pc_fieldsplit_detect_saddle_point -fieldsplit_velocity_sub_pc_type lu "
               +  " -fieldsplit_pressure_sub_pc_type lu -fieldsplit_velocity_sub_pc_factor_mat_solver_type mumps -fieldsplit_pressure_sub_pc_factor_mat_solver_type mumps -ksp_monitor -ksp_view "
               + " -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_ksp_max_it 5 -fieldsplit_pressure_ksp_type gmres -fieldsplit_pressure_ksp_max_it 5 -ksp_rtol 1e-6");

However, this solver is slower that my previous solver and gives me an incorrect result.

Do you know what might be the cause ?
or how could I adapt it ?

Thank you in advance,

Best regards,

Loïc,

prj · March 7, 2023, 12:37pm

What does -ksp_converged_reason return?

Loic · March 7, 2023, 12:50pm

It returns

 0 KSP Residual norm 4.595362033965e+31 
 1 KSP Residual norm 3.698028393291e+16 
Linear solve converged due to CONVERGED_RTOL iterations 1

Loïc,

prj · March 7, 2023, 12:52pm

Are you using the default tgv value of 10^{30}?

Loic · March 7, 2023, 12:54pm

Yes I am using the default one.

prj · March 7, 2023, 1:01pm

Then switch to something else first.

Loic · March 7, 2023, 1:10pm

I have tried different value of tgv but the result remain not correct and the residuals remain big.

prj · March 7, 2023, 1:12pm

What do you get with tgv = -1 + -ksp_monitor_true_residual -pc_type lu?

Loic · March 7, 2023, 1:19pm

By doing this, I get the right result:

-->Solving with PETSc began....
  0 KSP preconditioned resid norm 2.720997719081e+02 true resid norm 4.595728221659e+01 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 1.879238360449e-04 true resid norm 1.609760664389e-10 ||r(i)||/||b|| 3.502732508859e-12

and it is much faster.

Loïc,

prj · March 7, 2023, 1:22pm

For the largest problem you tried with this, what was the performance of the method? Also, if the problem is symmetric with homogeneous Dirichlet boundary conditions (tgv = -2), you should consider switching to -pc_type cholesky.

Loic · March 7, 2023, 1:33pm

With -ksp_type gmres -pc_type lu -pc_factor_mat_solver_type mumps I have been able to run a case with about 384 000 tetras (mesh3 cube(40, 40, 40)) on 144 CPUs and it takes 53 min.

I have not succeeded to run larger case with this method,

Loïc,

Loic · March 7, 2023, 1:41pm

I have try tgv=-2 with -pc_type cholesky. The code converges, but the results are not correct.

prj · March 7, 2023, 1:42pm

Is your problem symmetric with homogeneous Dirichlet boundary conditions?

Loic · March 7, 2023, 1:45pm

Sorry for my misunderstanding, I don’t have homogeneous Dirichlet boundary conditions. And my problem I think is not symmetric due to the extra term -eps*p*q.

prj · March 7, 2023, 1:47pm

The term -eps*p*q is symmetric, but if you don’t have homogeneous Dirichlet BC, tgv=-2 is useless indeed. 53 minutes for the above problem suggest that something is wrong on your machine.

Loic · March 7, 2023, 1:55pm

Considering a case with 750 000 tetras (cube(50, 50, 50)) I have not been able to run it even using more that 200 CPUs.

The PETSc iterations don’t even start. This suggests that there is indeed a problem on the machine. But in the other hand it is known that LU factorization is time and memory consuming.

So I don’t know what is exactly the problem

Loïc,

prj · March 7, 2023, 2:00pm

Share the test case, I will let you know how much time it takes on a properly configured machine.

Topic		Replies	Views
High memory usage in vector space problems using PETSc General Discussion	2	820	October 18, 2020
Solver for unsteady N-S equation General Discussion	25	416	June 16, 2025
Iterative solution of the Navier-Stokes equation General Discussion	4	695	July 26, 2021
PETSc Stokes memory issue General Discussion	3	292	November 25, 2022
Supply PETSc numbering for shared-memory block matrix General Discussion	30	1546	September 19, 2022

Efficient solver and preconditionner for 3D Stokes Problem

Related topics