Computing a 3D integral with mixed/vector element variables is very slow

Hello,

I noticed that computing a 3D integral with [P1,P1,P1] elements is about ten times slower than computing the same integral with scalar P1 elements (after interpolating each component onto a P1 space). See the code below and the attached files:

essai.zip




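// Read a whole array of degrees of freedom from a text file into data.
func int readData(string fileName, real[int] &data){
	{
		ifstream f(fileName);
		f>>data;
	} // f goes out of scope here, which closes the file
	return 0;
}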

mesh3 Ths=readmesh3("Ths.mesh");
fespace Fhs12d(Ths,[P1,P1,P1]);
fespace Fhs1(Ths,P1);
Fhs12d [ux,uy,uz];
Fhs1 ux1, uy1, uz1;
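// ux[] is the array of degrees of freedom of the whole vector field [ux,uy,uz]
readData("ux.gptmp",ux[]);
plot(ux,cmm="ux");
// Interpolate each component onto the scalar P1 space
ux1=ux;
uy1=uy;
uz1=uz;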

real mu=1, lambda=1;
real sqrt2=sqrt(2);
macro e(vx,vy,vz) [dx(vx),dy(vy),dz(vz),(dx(vy)+dy(vx))/sqrt2,(dx(vz)+dz(vx))/sqrt2,(dy(vz)+dz(vy))/sqrt2]//
macro div(vx,vy,vz) (dx(vx)+dy(vy)+dz(vz))//
// Case 1: integrand built from the three scalar P1 functions
real cpu1=clock();
real objectiveValue=int3d(Ths)(2*mu*e(ux1,uy1,uz1)'*e(ux1,uy1,uz1)+lambda*div(ux1,uy1,uz1)*div(ux1,uy1,uz1));
cout << "Time 1 : "<<clock()-cpu1 << "  VALUE = " << objectiveValue << endl;

// Case 2: same integrand, built from the components of the [P1,P1,P1] function
real cpu2=clock();
real objectiveValue2=int3d(Ths)(2*mu*e(ux,uy,uz)'*e(ux,uy,uz)+lambda*div(ux,uy,uz)*div(ux,uy,uz));
cout << "Time 2 : "<<clock()-cpu2 << "  VALUE = "<< objectiveValue2 << endl;
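
For reference, here is the quantity that both calls are meant to compute, written out as a formula. The symbols J, \Omega and \varepsilon are my own notation and do not appear in the script: \Omega stands for the domain meshed by Ths and u = (ux, uy, uz). Because the shear terms in the macro e are divided by sqrt2, the product e(u)'*e(u) equals the full contraction \varepsilon(u) : \varepsilon(u).

```latex
\[
J(u) = \int_{\Omega} \Big( 2\mu\, \varepsilon(u) : \varepsilon(u)
       + \lambda\, (\nabla \cdot u)^{2} \Big)\, dx,
\qquad
\varepsilon(u) = \tfrac{1}{2}\left( \nabla u + \nabla u^{T} \right)
\]
\[
e(u)^{T} e(u)
  = \varepsilon_{11}^{2} + \varepsilon_{22}^{2} + \varepsilon_{33}^{2}
  + 2\left( \varepsilon_{12}^{2} + \varepsilon_{13}^{2} + \varepsilon_{23}^{2} \right)
  = \varepsilon(u) : \varepsilon(u)
\]
```

Since the expression is identical in both cases and interpolating a P1 component onto a P1 space should not change it, I would expect the two values to agree and only the timings to differ.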

Thanks,