e> No, I disagree with your assessment.
I'm afraid that your disagreement is incorrect. You are certainly
welcome to maintain your position, but Dave will still be right.
e> Accessing a volatile is still far more efficient than the function
e> call overhead typically required for obtaining a mutex lock.
That is not necessarily true, and even if it were, it would not be
helpful. You should be caring about correctness and the larger issues
of efficiency, not about nigglingly small points of efficiency.
e> volatile is necessary in this context because you need to force the
e> compiler to flush the update to memory, even if that flush doesn't
e> occur immediately due to the vagaries of particular SMP
Using the volatile keyword is still insufficient to give you reliable
(i.e. correct) semantics, and this is the point that Dave and I have
been pushing. The sequence you should follow is this:
1. Develop your code. Make sure it is correct.
2. Benchmark it in realistic conditions.
3. Is it fast enough? If so, you're done. If not, continue.
4. Look at the algorithms and data structures you're using, and the
   overall structure of the synchronisation you are doing. See if
   you can make any changes that would have a large impact. Go to
5. Once you have everything working sensibly in the large and you
   still aren't getting quite the performance you need, start
   worrying about those inner loops.
Only during the last step should you start worrying about ways to
improve the performance of your code with respect to individual
mutexes or condition variables.
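As a rough illustration of step 2, here is a minimal C++ sketch of
timing a piece of code before deciding whether it needs any attention.
The names elapsed_ms and sum_to are hypothetical, made up for this
example, and a single wall-clock measurement like this is only a
starting point, not a realistic benchmark in itself.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run f() once and return the elapsed
// wall-clock time in milliseconds.
template <class F>
double elapsed_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for "your code": sums the integers 1..n.
long long sum_to(long long n) {
    assert(n >= 0);  // precondition for this sketch
    long long total = 0;
    for (long long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

You would call it as elapsed_ms([&] { result = sum_to(1000000); }),
keeping the result so that the optimiser is less likely to discard the
work, and repeat the measurement under a load that resembles
production before drawing any conclusions.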
At this point, you may be thinking about rewriting your inner loops in
assembly language and doing other platform-dependent things, depending
on how much you need to care about speed and portability, so more or
less anything goes, perhaps including use of your own synchronisation
This is the sort of thing you will only need to pay attention to if
you have a lot of time to spare and performance is of utmost
importance, though; up until near the end of step 5, you should use
whatever portable vendor-provided synchronisation constructs are
appropriate to your task, and you will find that this suffices for
99.99% of all your programming needs.
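To make "portable vendor-provided synchronisation constructs" a little
more concrete, here is a minimal sketch that uses std::mutex from the
C++ standard library (a POSIX pthread_mutex_t would play the same role
in C). The function name locked_count and its parameters are
hypothetical. The point it tries to show is that the lock, not
volatile, is what makes the shared update correct.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: several threads increment one shared counter.
// Every increment happens while holding the mutex, so the final value
// is exactly threads * iterations.
long locked_count(int threads, long iterations) {
    assert(threads > 0 && iterations >= 0);  // preconditions for this sketch

    long counter = 0;   // plain long; no volatile needed
    std::mutex guard;   // protects counter

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &guard, iterations] {
            for (long i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(guard);  // released at end of scope
                ++counter;
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();  // all increments are complete once every thread has joined
    }
    return counter;
}
```

With the lock in place, locked_count(4, 10000) returns 40000. Remove
the lock_guard line and mark counter volatile instead, and the program
has a data race; the result is then undefined and in practice often
comes up short.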
I absolutely guarantee you that trying to write your own portable
synchronisation code in C or C++ is a quick route to insanity and
humility. If you think you know enough to get it right without
having used your code in production work for a year or three, you just
haven't been bitten often enough by the subtle bugs in your code.
Let us pray: