mpiBarrier(mpiCommWorld) not working correctly?

It seems like mpiBarrier(mpiCommWorld) is not functioning as expected. Here’s the situation:

Given the following FreeFEM++ code:
/////////////////////////////////////////////////////////////////////////
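verbosity = 0;
mpiComm comm(mpiCommWorld, 0, 0); // not used in the loop below
// Each rank prints only on the iteration that matches its own rank.
for (int i = 0; i < mpisize; ++i) {
    if (i == mpirank) {
        cout << "mprocess = " << mpirank << " among = " << mpisize << endl;
    }
    // All ranks wait here before the next iteration starts.
    mpiBarrier(mpiCommWorld);
}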
/////////////////////////////////////////////////////////////////////////////
The output I get is:
mprocess = 0 among = 4
mprocess = 1 among = 4
mprocess = 3 among = 4
mprocess = 2 among = 4

However, I was expecting this output:
mprocess = 0 among = 4
mprocess = 1 among = 4
mprocess = 2 among = 4
mprocess = 3 among = 4

Is mpiBarrier(mpiCommWorld) not working correctly? It should ensure that all processes synchronize at each iteration, so only one process outputs at a time. Could you clarify if there’s an issue with my understanding or usage of mpiBarrier?

What is not working as expected? Why would you expect the output to be:

mprocess = 0 among = 4
mprocess = 1 among = 4
mprocess = 2 among = 4
mprocess = 3 among = 4

My guess is that, if you want to be sure of getting the expected output, you should flush cout.

Thank you very much for your reply.
The reason I expected the output to be:

mprocess = 0 among = 4
mprocess = 1 among = 4
mprocess = 2 among = 4
mprocess = 3 among = 4

is because in my code, I use a for loop where each process only executes the cout statement when its rank matches the current iteration index i. For example:

  • When i == 0, only process 0 executes cout.
  • When i == 1, only process 1 executes cout, and so on.

After each iteration, I use mpiBarrier(mpiCommWorld); to synchronize all processes. My understanding was that this synchronization would ensure that the output from process 0 is completed before process 1 starts its output, and so on.

However, the actual output is out of order, such as:

mprocess = 0 among = 4
mprocess = 1 among = 4
mprocess = 3 among = 4
mprocess = 2 among = 4

Regarding your suggestion about flushing cout, I was under the impression that using endl in my cout statements already performs a flush. Could you tell me why endl might not be sufficient in this case? Or could the issue be related to the way mpiBarrier is working in this specific context?

My bad, endl does actually flush the buffer. I’m not sure why you get this. What is your system? FWIW, on my Linux and macOS machines, I get your expected output.

Sorry for the late reply.
I’m using Windows 11, Intel Core i5 8265.

OK, not a big surprise then.
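
For anyone landing on this thread later: mpiBarrier only synchronizes the processes themselves. The lines printed by different ranks are typically forwarded to the terminal by the MPI launcher, and MPI gives no guarantee about the order in which they arrive there, so a barrier plus endl does not by itself order the console output. If the order matters, one option is to let a single rank do all the printing. Below is a minimal sketch, assuming the stream-style send and receive (processor(i) << and processor(i) >>) described in the FreeFEM MPI documentation. The names payload and received are only illustrative, and the payload is just the rank number to mirror the original example.
/////////////////////////////////////////////////////////////////////////
verbosity = 0;
int payload = mpirank; // illustrative data: each rank reports its own rank
if (mpirank == 0) {
    // Rank 0 prints its own line, then one line per other rank, in rank order.
    cout << "mprocess = " << payload << " among = " << mpisize << endl;
    for (int i = 1; i < mpisize; ++i) {
        int received;
        processor(i) >> received; // blocking receive from rank i
        cout << "mprocess = " << received << " among = " << mpisize << endl;
    }
} else {
    processor(0) << payload; // blocking send to rank 0
}
/////////////////////////////////////////////////////////////////////////
Since only rank 0 writes to cout here, the lines should come out in rank order regardless of how the launcher merges output from the other processes. No mpiBarrier is needed for the ordering, because the receives on rank 0 already happen one rank at a time.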