Replicate int2d calculation

I am trying to understand how the numerical integration works when we use different quadrature formulations. I have created a minimal example which calculates a correlation matrix for a very simple lid driven cavity case.

I am comparing the quadrature formulations qf2pT and qf5pT. Two correlation matrices are calculated for each case: matrix C using int2d and matrix Cm using manual calculation.

I am able to replicate the results when I use qf2pT formulation with following result:

Correlation matrix: C
4 4	
	 0.001239412286 0.0002412006984 -0.000513817483 -0.0009667955009
	 0.0002412006984 0.0001079664125 -9.299777911e-05 -0.0002561693318
	 -0.000513817483 -9.299777911e-05 0.0002146702152 0.0003921450469
	 -0.0009667955009 -0.0002561693318 0.0003921450469 0.0008308197859
	
Correlation matrix: Cm
4 4	
	 0.001239412286 0.0002412006984 -0.000513817483 -0.0009667955009
	 0.0002412006984 0.0001079664125 -9.299777911e-05 -0.0002561693318
	 -0.000513817483 -9.299777911e-05 0.0002146702152 0.0003921450469
	 -0.0009667955009 -0.0002561693318 0.0003921450469 0.0008308197859

But when I change the formulation to qf5pT, the matrices C and Cm are no longer equal.

Correlation matrix: C
4 4	
	 0.0006628061602 0.0001533169954 -0.0002703254283 -0.0005457977273
	 0.0001533169954 7.228030626e-05 -5.741837901e-05 -0.0001681789226
	 -0.0002703254283 -5.741837901e-05 0.0001113690417 0.0002163747656
	 -0.0005457977273 -0.0001681789226 0.0002163747656 0.0004976018843
	
Correlation matrix: Cm
4 4	
	 0.0007177217879 0.00013111664 -0.0002998948947 -0.0005489435333
	 0.00013111664 5.907615925e-05 -5.06112115e-05 -0.0001395815878
	 -0.0002998948947 -5.06112115e-05 0.0001262935963 0.0002242125099
	 -0.0005489435333 -0.0001395815878 0.0002242125099 0.0004643126112

How can I obtain the same result while doing the manual evaluation of int2d ? I want to do this so that I can export the mesh, the mass matrix and the solution from FreeFem to evaluate inner products externally.

Thanks.

verbosity = 0;

int n=2;        // Number of boundary elements
int nsnsh=4;    // Number of snapshots
int i,j,k,l;

mesh Th=square(n,n);

fespace Vh(Th,P2);
fespace Wh(Th,P1);
Vh u,v,uu,vv,up,vp,uavg,vavg;
Wh p,pp,q,pavg;
// snapshot space:
Vh[int] usnsh(nsnsh),vsnsh(nsnsh),xh(2);
Vh uh,vh;
Wh[int] psnsh(nsnsh);
// average values:
uavg=0;vavg=0;pavg=0;  
up=0;vp=0;q=0;

xh[0] = x;
xh[1] = y;
varf bin (uh,vh) = int2d(Th,qft=qf5pT)(uh*vh);
matrix M = bin (Vh,Vh);

macro div(u,v) (dx(u)+dy(v)) //
macro grad(u) [dx(u),dy(u)] //

real Re=1,nu=1/Re,dt=.1;

problem NSunst([u,v,p],[uu,vv,pp],solver=sparsesolver)=
  int2d(Th)(nu*(grad(u)'*grad(uu)+grad(v)'*grad(vv)))
  +int2d(Th)([up,vp]'*grad(u)*uu+[up,vp]'*grad(v)*vv)
  -int2d(Th)(p*div(uu,vv)+pp*div(u,v))
  +int2d(Th)(1.e-10*p*pp)
  +on(1,2,4,u=0,v=0)
  +on(3,u=1,v=0);

for (j=0;j<nsnsh;j++){
  for (i=0;i<10;i++){
    NSunst;
    up=u;vp=v;q=p;
  }
  usnsh[j]=u;vsnsh[j]=v;psnsh[j]=p;
  uavg=uavg+usnsh[j];vavg=vavg+vsnsh[j];pavg=pavg+psnsh[j];

  // increase the Reynolds number by 50 for each snapshot
  Re += 50;nu=1/Re;
}

uavg[]=uavg[]/nsnsh;vavg[]=vavg[]/nsnsh;pavg[]=pavg[]/nsnsh;
// plot([uavg,vavg],value=1,cmm="uavg",wait=1);

for (int i=0;i<nsnsh;i++){
  usnsh[i]=usnsh[i]-uavg;
  vsnsh[i]=vsnsh[i]-vavg;
  psnsh[i]=psnsh[i]-pavg;
}


/* build the correlation matrix:
 * C_ij = (u_i,u_j)
 */
real[int,int] C(nsnsh,nsnsh),Cm(nsnsh,nsnsh);
C=0;Cm=0;
for (i=0;i<nsnsh;i++)
  for (j=i;j<nsnsh;j++){
    C(j,i) = int2d(Th,qft=qf5pT)(usnsh[i]*usnsh[j]+vsnsh[i]*vsnsh[j]);
    for(k=0;k<Vh.ndof;k++)
        Cm(j,i) += M(k,k)*(usnsh[i][][k]*usnsh[j][][k]+vsnsh[i][][k]*vsnsh[j][][k]);
  }
for (j=0;j<nsnsh;j++)
  for (i=j;i<nsnsh;i++){
    C(j,i)=C(i,j);
    Cm(j,i)=Cm(i,j);
  }

cout << "Correlation matrix: C"<< endl;
cout << C << endl;
cout << "Correlation matrix: Cm"<< endl;
cout << Cm << endl;

// {ofstream fc("out-C.dat");fc<<C<<endl;}
// {ofstream fcm("out-Cm.dat");fcm<<Cm<<endl;}

Note: qft is changed at two places: the mass matrix and the correlation matrix calculation.