Stall in some multi-node parallel calculations

So, if I understand correctly, everything is now running to completion? The latest patch is not very clean; it’s more of a hack to work around the wrong behavior that you get on your system. But if it gets your jobs running, that is at least something.

Exactly. Thank you. I will test other codes and other numbers of nodes and processes, and let you know.

It seems to work well with various codes and numbers of nodes and processes. Thank you again.

So, what was the problem exactly?

Again, the patch I sent is kind of nasty. It will likely not work with meshes that contain very badly shaped elements.

Your MPI implementation is having difficulties with MPI_Waitany and/or MPI_Waitall combined with MPI_Status queries. I’d advise starting from a fresh installation using the Intel compilers and Intel MPI, which do not have such issues (that’s what I use daily). If it still fails with those, there may be something wrong with your interconnect, but I cannot solve such issues remotely.