<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: C++ atomics and memory ordering</title>
	<atom:link href="http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/feed/" rel="self" type="application/rss+xml" />
	<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/</link>
	<description>Concurrency, Multicore, Language Design, D, C++</description>
	<lastBuildDate>Thu, 29 Oct 2009 00:38:24 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<item>
		<title>By: Bartosz Milewski</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-127</link>
		<dc:creator>Bartosz Milewski</dc:creator>
		<pubDate>Wed, 24 Dec 2008 18:57:25 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-127</guid>
		<description>You&#039;re right. If data changes only this once and then remains immutable, it doesn&#039;t have to be atomic or volatile. Similarly if data is subsequently only mutated under a lock.</description>
		<content:encoded><![CDATA[<p>You&#8217;re right. If data changes only this once and then remains immutable, it doesn&#8217;t have to be atomic or volatile. Similarly if data is subsequently only mutated under a lock.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sergey</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-126</link>
		<dc:creator>Sergey</dc:creator>
		<pubDate>Wed, 24 Dec 2008 18:36:10 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-126</guid>
		<description>&gt; atomic&lt;bool&gt; ready = false;
&gt; atomic&lt;int&gt; data = 0;

In Java, only the ready flag has to be declared volatile; data can be a plain int.
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile</description>
		<content:encoded><![CDATA[<p>&gt; atomic&lt;bool&gt; ready = false;<br />
&gt; atomic&lt;int&gt; data = 0;</p>
<p>In Java, only the ready flag has to be declared volatile; data can be a plain int.<br />
<a href="http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile" rel="nofollow">http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile</a></p>
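<p>A minimal C++ sketch of the same idea, assuming the flag is published with a release store and read with an acquire load while the payload stays a plain int. The function name publish_and_read and the value 42 are illustrative and not taken from the original post.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: publishes a plain int through an atomic flag.
int publish_and_read() {
    int data = 0;                     // plain, non-atomic payload
    std::atomic<bool> ready{false};   // the only atomic: the publication flag
    int seen = -1;

    std::thread producer([&] {
        data = 42;                                      // ordinary write
        ready.store(true, std::memory_order_release);   // publish the write above
    });
    std::thread consumer([&] {
        // The acquire load that reads true synchronizes with the release store.
        while (!ready.load(std::memory_order_acquire))
            std::this_thread::yield();
        seen = data;                                    // reads the published value
    });

    producer.join();
    consumer.join();
    assert(ready.load());
    return seen;
}
```

<p>The payload is written once before the flag is set and never changed afterwards, which is the case the reply above describes.</p>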
]]></content:encoded>
	</item>
	<item>
		<title>By: The Inscrutable C++ Memory Model &#171;   Bartosz Milewski&#8217;s Programming Cafe</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-121</link>
		<dc:creator>The Inscrutable C++ Memory Model &#171;   Bartosz Milewski&#8217;s Programming Cafe</dc:creator>
		<pubDate>Tue, 23 Dec 2008 18:16:05 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-121</guid>
		<description>[...] Milewski under C++, Concurrency, Memory Model, Multicore, Programming, atomics &#160;  In my last post I made a mistake of publishing a piece of code that used C++ weak atomics&#8211;which are part of [...]</description>
		<content:encoded><![CDATA[<p>[...] Milewski under C++, Concurrency, Memory Model, Multicore, Programming, atomics &nbsp;  In my last post I made a mistake of publishing a piece of code that used C++ weak atomics&#8211;which are part of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bartosz Milewski</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-120</link>
		<dc:creator>Bartosz Milewski</dc:creator>
		<pubDate>Sat, 06 Dec 2008 21:17:25 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-120</guid>
		<description>I&#039;m glad we are having this exchange. It shows how hard it is to reason about relaxed memory models. I don&#039;t think Anthony&#039;s proof is correct (we are exchanging email about it), but he raised some very good points. When all the dust settles, I&#039;ll write a new blog post about proofs. 

[edit] Correction, after a long exchange with experts, and giving up on Java-specific proof techniques, I am convinced now that both Anthony and Dmitriy were right. I&#039;ll try to explain it in my next post.</description>
		<content:encoded><![CDATA[<p>I&#8217;m glad we are having this exchange. It shows how hard it is to reason about relaxed memory models. I don&#8217;t think Anthony&#8217;s proof is correct (we are exchanging email about it), but he raised some very good points. When all the dust settles, I&#8217;ll write a new blog post about proofs. </p>
<p>[edit] Correction, after a long exchange with experts, and giving up on Java-specific proof techniques, I am convinced now that both Anthony and Dmitriy were right. I&#8217;ll try to explain it in my next post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Another look at Peterson &#171; Blockheading about C++</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-118</link>
		<dc:creator>Another look at Peterson &#171; Blockheading about C++</dc:creator>
		<pubDate>Sat, 06 Dec 2008 15:05:25 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-118</guid>
		<description>[...] 6, 2008   I had another read of Bartosz Milewski&#039;s recent blog post about the different memory orderings of [...]</description>
		<content:encoded><![CDATA[<p>[...] 6, 2008   I had another read of Bartosz Milewski&#8217;s recent blog post about the different memory orderings of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anthony Williams</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-116</link>
		<dc:creator>Anthony Williams</dc:creator>
		<pubDate>Fri, 05 Dec 2008 14:09:14 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-116</guid>
		<description>I have written a post about this over on my blog: http://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html

In summary, Dmitriy&#039;s implementation is correct, yours is broken. The key is that he uses exchange on turn (_victim), whereas you use exchange on _interested[me] and a plain store on _victim. You need the store to _victim to be an exchange in order for everything to be guaranteed.</description>
		<content:encoded><![CDATA[<p>I have written a post about this over on my blog: <a href="http://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html" rel="nofollow">http://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html</a></p>
<p>In summary, Dmitriy&#8217;s implementation is correct, yours is broken. The key is that he uses exchange on turn (_victim), whereas you use exchange on _interested[me] and a plain store on _victim. You need the store to _victim to be an exchange in order for everything to be guaranteed.</p>
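<p>A rough sketch of the shape being described, with the write to _victim done as an exchange. It is not Dmitriy&#8217;s code and not the code from the post: every operation here uses the default memory_order_seq_cst, so it says nothing about which weaker orderings are sufficient. The class name PetersonLock and the helper count_with_peterson are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical two-thread Peterson lock; member names follow the discussion.
class PetersonLock {
    std::atomic<bool> _interested[2];
    std::atomic<int> _victim;
public:
    PetersonLock() : _victim(0) {
        _interested[0].store(false);
        _interested[1].store(false);
    }
    void lock(int me) {
        assert(me == 0 || me == 1);
        int he = 1 - me;
        _interested[me].store(true);   // seq_cst by default
        _victim.exchange(me);          // read-modify-write on _victim
        // Wait while the other thread is interested and we are the victim.
        while (_interested[he].load() && _victim.load() == me)
            std::this_thread::yield();
    }
    void unlock(int me) {
        _interested[me].store(false);
    }
};

// Two threads increment a plain int that is protected only by the lock.
int count_with_peterson(int iterations) {
    PetersonLock lock;
    int counter = 0;
    auto worker = [&](int me) {
        for (int i = 0; i < iterations; ++i) {
            lock.lock(me);
            ++counter;
            lock.unlock(me);
        }
    };
    std::thread t0(worker, 0);
    std::thread t1(worker, 1);
    t0.join();
    t1.join();
    return counter;
}
```

<p>If mutual exclusion holds, count_with_peterson(n) returns 2 * n.</p>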
]]></content:encoded>
	</item>
	<item>
		<title>By: Bartosz Milewski</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-115</link>
		<dc:creator>Bartosz Milewski</dc:creator>
		<pubDate>Thu, 04 Dec 2008 22:52:02 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-115</guid>
		<description>I realize that, on the x86, the use of &quot;lock xchg&quot; is preferable for anything stronger than release/acquire--I mentioned that in my previous posts. 

In general, however, you can&#039;t make assumptions about how atomic operations translate into fences or locked instructions when you don&#039;t know the underlying architecture. The only guarantees you have are, as you mentioned, the semantics of atomics in terms of happens-before and synchronizes-with relations. Notice, however, that very few people frame their arguments in those terms. 

The obvious rules of discussing algorithms written in terms of atomics are: (1) it&#039;s enough to show that it doesn&#039;t work on one processor to prove its incorrectness, and (2) to prove its correctness you can&#039;t use any processor-specific arguments. Of course that&#039;s much harder!

So even though I don&#039;t have a formal proof, I believe my implementation of the Peterson lock is correct. For all I know, Dmitriy&#039;s implementation might also be correct, but it&#039;s much harder to prove.</description>
		<content:encoded><![CDATA[<p>I realize that, on the x86, the use of &#8220;lock xchg&#8221; is preferable for anything stronger than release/acquire&#8211;I mentioned that in my previous posts. </p>
<p>In general, however, you can&#8217;t make assumptions about how atomic operations translate into fences or locked instructions when you don&#8217;t know the underlying architecture. The only guarantees you have are, as you mentioned, the semantics of atomics in terms of happens-before and synchronizes-with relations. Notice, however, that very few people frame their arguments in those terms. </p>
<p>The obvious rules of discussing algorithms written in terms of atomics are: (1) it&#8217;s enough to show that it doesn&#8217;t work on one processor to prove its incorrectness, and (2) to prove its correctness you can&#8217;t use any processor-specific arguments. Of course that&#8217;s much harder!</p>
<p>So even though I don&#8217;t have a formal proof, I believe my implementation of the Peterson lock is correct. For all I know, Dmitriy&#8217;s implementation might also be correct, but it&#8217;s much harder to prove.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anthony Williams</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-114</link>
		<dc:creator>Anthony Williams</dc:creator>
		<pubDate>Thu, 04 Dec 2008 21:49:32 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-114</guid>
		<description>&gt; As I discussed before, fences really should go _between_ memory operations. Atomics can only see a single operation at a time, so they have to decide whether to always put the fence before or after a given operation.

That&#039;s not true. Your code above need not have *any* fences on an x86, because the exchange is an XCHG instruction, which is serializing. See my post about the C++0x memory model on x86 CPUs. http://www.justsoftwaresolutions.co.uk/threading/intel-memory-ordering-and-c++-memory-model.html

On some architectures, memory_order_acq_rel will require a fence both before AND after the instruction (possibly of different sorts).

&gt; But the Standard never specifies where the fences go, so I can’t rely on it in my implementation.

The Standard doesn&#039;t specify the implementation details, that is true. It does however specify the required ordering constraints in terms of relationships called &quot;happens-before&quot; and &quot;synchronizes-with&quot;.

Incidentally, if you feel that you can reason better with fences, you could write them explicitly: use memory_order_relaxed for all the loads and stores and then use explicit atomic_thread_fence() calls to put the fences in.</description>
		<content:encoded><![CDATA[<p>&gt; As I discussed before, fences really should go _between_ memory operations. Atomics can only see a single operation at a time, so they have to decide whether to always put the fence before or after a given operation.</p>
<p>That&#8217;s not true. Your code above need not have *any* fences on an x86, because the exchange is an XCHG instruction, which is serializing. See my post about the C++0x memory model on x86 CPUs. <a href="http://www.justsoftwaresolutions.co.uk/threading/intel-memory-ordering-and-c++-memory-model.html" rel="nofollow">http://www.justsoftwaresolutions.co.uk/threading/intel-memory-ordering-and-c++-memory-model.html</a></p>
<p>On some architectures, memory_order_acq_rel will require a fence both before AND after the instruction (possibly of different sorts).</p>
<p>&gt; But the Standard never specifies where the fences go, so I can’t rely on it in my implementation.</p>
<p>The Standard doesn&#8217;t specify the implementation details, that is true. It does however specify the required ordering constraints in terms of relationships called &#8220;happens-before&#8221; and &#8220;synchronizes-with&#8221;.</p>
<p>Incidentally, if you feel that you can reason better with fences, you could write them explicitly: use memory_order_relaxed for all the loads and stores and then use explicit atomic_thread_fence() calls to put the fences in.</p>
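<p>A minimal sketch of that style on a simple flag-and-payload handoff, assuming C++11 std::atomic_thread_fence and relaxed operations on the flag. The function name publish_with_fences and the value 42 are illustrative only; this is not the Peterson lock from the post.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: relaxed atomic operations plus explicit fences.
int publish_with_fences() {
    int data = 0;                     // plain payload
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        data = 42;
        // Release fence: the write to data is ordered before the relaxed store below.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed))
            std::this_thread::yield();
        // Acquire fence: the read of data is ordered after the relaxed load above.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = data;
    });

    producer.join();
    consumer.join();
    assert(ready.load());
    return seen;
}
```

<p>The two fences pair up through the relaxed store and load of ready, so the ordering is written out between the memory operations rather than attached to either one.</p>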
]]></content:encoded>
	</item>
	<item>
		<title>By: Keeping your memory in order &#171; Blockheading about C++</title>
		<link>http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/#comment-113</link>
		<dc:creator>Keeping your memory in order &#171; Blockheading about C++</dc:creator>
		<pubDate>Thu, 04 Dec 2008 19:52:20 +0000</pubDate>
		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=331#comment-113</guid>
		<description>[...] 4, 2008   A short description of the different memory ordering for concurrency in C++0x; C++ atomics and memory ordering. The default for atomic variables is the familiar sequential ordering [...]</description>
		<content:encoded><![CDATA[<p>[...] 4, 2008   A short description of the different memory ordering for concurrency in C++0x; C++ atomics and memory ordering. The default for atomic variables is the familiar sequential ordering [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
