Hi all,
I’m using FF to solve an eigenproblem but I have come across some problems with SLEPc.
Problem
I’d like to solve the Hydrogen atom electron wave function and I’ve done it with a coarse mesh grid (40x40x40). But when it comes to a finner grid, for example, 100x100x100, the EigenValue
from ARPACK would fail. Even if EigenValue
can handle it, it takes much, much longer to solve it using a single CPU core. Then I turned to SLEPc.
According to the example from here, I just copied it and put the Hydrogen atom Schrodinger equation on it. Then the result seemed to be strange:
Unused createMat(Th, A, Pk)
I happened to write two redundant lines of
Mat A;
create(Th, A, Pk);
but never used A
again. A remarkable performance improvement could be observed due to the ff-mpirun -np $NUM_PROCS
. However, the results varied as NUM_PROCS
changed. Here are the results:
!!! ff-mpirun -np 1
!!! Eigenvalues:
---- 0 4.395276e+00 ==
---- 1 1.096251e+01 ==
---- 2 1.096251e+01 ==
---- 3 1.119920e+01 ==
---- 4 1.139153e+01 ==
---- 5 1.255011e+01 ==
---- 6 1.255011e+01 ==
---- 7 1.259630e+01 ==
---- 8 1.259631e+01 ==
---- 9 1.261844e+01 ==
times: compile 2.000000e-01s, execution 4.920000e+01s, mpirank:0
!!! ff-mpirun -np 20
!!! Eigenvalues:
---- 0 2.048881e+00 ==
---- 1 1.012404e+01 ==
---- 2 1.046671e+01 ==
---- 3 1.079496e+01 ==
---- 4 1.225954e+01 ==
---- 5 1.235036e+01 ==
---- 6 1.240903e+01 ==
---- 7 1.251413e+01 ==
---- 8 1.271387e+01 ==
---- 9 1.299447e+01 ==
times: compile 1.800000e-01s, execution 4.420000e+00s, mpirank:0
The time usage of -np 20
is about 10x shorter than -np 1
. But the results of eigenvalues differed.
Uncomment the two lines
If I uncomment the two unused lines, the results are shown below:
!!! ff-mpirun -np 1
!!! Eigenvalues:
---- 0 4.39528 ==
---- 1 10.9625 ==
---- 2 10.9625 ==
---- 3 11.1992 ==
---- 4 11.3915 ==
---- 5 12.5501 ==
---- 6 12.5501 ==
---- 7 12.5963 ==
---- 8 12.5963 ==
---- 9 12.6184 ==
times: compile 0.11s, execution 49.24s, mpirank:0
!!! ff-mpirun -np 20
!!! Eigenvalues:
---- 0 4.39528 ==
---- 1 10.9625 ==
---- 2 10.9625 ==
---- 3 11.1992 ==
---- 4 11.3915 ==
---- 5 12.5501 ==
---- 6 12.5501 ==
---- 7 12.5963 ==
---- 8 12.5963 ==
---- 9 12.6184 ==
times: compile 0.11s, execution 55.49s, mpirank:0
The results of eigenvalues are the same but time usage does not change very much, which means this script has not enabled parallel acceleration.
I am sure that there are issues within my code, but I’m still a newbie to FF.
Are there any patches to it? Thanks in advance!
MWE
load "SLEPc"
include "macro_ddm.idp"
include "cube.idp"
macro dimension() 3 // EOM
macro Grad(u) [dx(u), dy(u), dz(u)] // EOM
int nn = 40;
real ka = 3.8099821161548593; // hbar^2 / 2m_e
real kc = 14.39964547842567; // e^2 / (4pi epsilon_0)
mesh3 Th = cube(nn, nn, nn, [40*x-20, 40*y-20, 40*z-20]);
// 1 => y=0
// 2 => x=1
// 3 => y=1
// 4 => x=0
// 5 => z=0
// 6 => z=1
func Pk = P1;
//Mat A; // !!!
//createMat(Th, A, Pk); // !!! Uncomment these two lines to see the difference
fespace Vh(Th, P1);
cout << "Th : nv = " << Th.nv << " nt = " << Th.nt << endl;
real sigma = -14;
varf a(u1, u2) = int3d(Th) (
Grad(u1)' * Grad(u2) * ka
- 1.0 / sqrt(x^2 + y^2 + z^2) * u1 * u2 * kc
- sigma * u1 * u2
)
+ on(5, 6, u1=0.0)
;
varf b([u1], [u2]) = int3d(Th) ( u1 * u2 ) ;
matrix LocA = a(Vh, Vh);
matrix LocB = b(Vh, Vh);
int nev = 10;
real[int] ev(nev);
Vh[int] eV(nev);
string ssparams = "-eps_nev " + nev
+ " -eps_type krylovschur"
+ " -eps_target " + sigma
//+ " -eps_view "
+ " -st_type sinvert "
+ " -eps_gen_hermitian";
if (!mpirank) {
cout << "---------- " << ssparams << " ----------" << endl;
}
Mat DistA(LocA);
Mat DistB(LocB);
int k = EPSSolve(
DistA, DistB,
vectors = eV,
values = ev,
sparams = ssparams
);
k = min(k, nev);
int[int] Order= [1];
if (!mpirank) {
for (int i=0; i<k; ++i) {
//macro params()cmm = "Eigenvalue #" + i + " = " + ev[i], wait = 1, fill = 1//
cout << " ---- " << i << " " << ev[i] << " == " << endl;
//plotD(Th, eV[i], params)
//int[int] fforder = [1];
//savevtk("EV_" + i + ".vtu", Th, eV[i], order=fforder);
}
}
Logs
logs.zip (200.2 KB)