In that paper the flow is taken incompressible, whereas in your code you use compressible. This is why the solution is different (but looks correct).
You can modify easily your code to do the incompressible case, by using the extra variables p,q as
solve bing([ux,uy,p],[vx,vy,q])
and adding
-int2d(Th)(p*Divergence(vx, vy)+q*Divergence(ux, uy))
But then it happens that the scheme converges extremely slowly (you need thousands of iterations). In order to improve it you should use the Augmented Lagrangian method as in that paper.
Another point is that your computation of the stream function could be misleading. You look for \psi satisfying
-\Delta\psi=\partial_y u_x-\partial_x u_y.
An important property that one would like that if \partial_x u_x+\partial_y u_y=0 then \partial_x\psi=u_y and -\partial_y\psi=u_x.
You see then that the Dirichlet condition on \psi is not appropriate. It is better to take
\frac{\partial\psi}{\partial n}=u_y n_x-u_x n_y on \partial\Omega.
We are thus led to the problem
-\Delta \psi=f in \Omega,\qquad\frac{\partial\psi}{\partial n}=g on \partial\Omega.
To solve this system you need the compatibility condition (to see that it is necessary, integrate the equation over \Omega)
\int_\Omega f=-\int_{\partial\Omega}g
Assuming this, the system is not invertible since \psi is defined up to a constant. You can determine it by imposing \int_{\partial\Omega}\psi=0.
In order to get the solution you can then solve
-\Delta \psi=f in \Omega,\qquad\frac{\partial\psi}{\partial n}+\varepsilon\psi=g on \partial\Omega,
for some small \varepsilon. This system is well-posed. This leads to the code
fespace Rh(Th, P2);
Rh dyg,dyg2;
solve streamlines (dyg, dyg2,solver=UMFPACK)
= int2d(Th)(
dx(dyg)*dx(dyg2)
+ dy(dyg)*dy(dyg2)
)
+ int2d(Th)(
- dyg2*(dy(ux) - dx(uy))
)
+ int1d(Th)(
- dyg2*(uy*N.x-ux*N.y)
)
+ int1d(Th)(1e-8*dyg*dyg2)
;