<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>  Bartosz Milewski's Programming Cafe</title>
	<atom:link href="http://bartoszmilewski.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://bartoszmilewski.wordpress.com</link>
	<description>Concurrency, Multicore, Language Design, D, C++</description>
	<lastBuildDate>Wed, 09 Dec 2009 20:53:00 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='bartoszmilewski.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/1662a596871dc0dd1822e9393505359c?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>  Bartosz Milewski's Programming Cafe</title>
		<link>http://bartoszmilewski.wordpress.com</link>
	</image>
			<item>
		<title>Unique Objects</title>
		<link>http://bartoszmilewski.wordpress.com/2009/11/30/unique-objects/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/11/30/unique-objects/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 00:42:11 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Multithreading]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Type System]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=1119</guid>
		<description><![CDATA[I&#8217;ve blogged before about the C++ unique_ptr not being unique and how true uniqueness can be implemented in an ownership-based type system. But I&#8217;ve been just scratching the surface.
The new push toward uniqueness is largely motivated by the demands of multithreaded programming. Unique objects are alias free and, in particular, cannot be accessed from more [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve blogged before about the C++ <a href="http://bartoszmilewski.wordpress.com/2009/05/21/unique_ptr-how-unique-is-it/">unique_ptr not being unique</a> and how true uniqueness can be implemented in an <a href="http://bartoszmilewski.wordpress.com/2009/06/02/race-free-multithreading-ownership/">ownership-based type system</a>. But I&#8217;ve been just scratching the surface.</p>
<p>The new push toward uniqueness is largely motivated by the demands of multithreaded programming. Unique objects are alias free and, in particular, cannot be accessed from more than one thread at a time. Because of that, they never require locking. They can also be safely passed between threads without the need for deep copying. In other words, they are a perfect vehicle for safe and efficient message passing. But there&#8217;s a rub&#8230;</p>
<p>The problem is this: How do you create and modify unique objects that have internal pointers? A classic example is a doubly linked list. Consider this Java code:</p>
<pre>public class Node {
    public Node _next;
    public Node _prev;
}
public class LinkedList {
    private Node _head;
    public void insert(Node n) {
        n._next = _head;
        if (_head != null)
            _head._prev = n;
        _head = n;
    }
}</pre>
<p>Suppose that you have a unique instance of an empty <code>LinkedList</code> and you want to insert a new link into it without compromising its uniqueness. </p>
<p>The first danger is that there might be external aliases to the node you are inserting&#8211;the node is not unique, it is shared. In that case, after the node is absorbed:</p>
<pre>_head = n;</pre>
<p><code>_head</code> would be pointing to an alias-contaminated object.  The list would &#8220;catch&#8221; aliases and that would break the uniqueness property. </p>
<p>The remedy is to require that the inserted node be unique too, and that its ownership be transferred from the caller to the <code>insert</code> method. (Notice however that, in the process of being inserted, the node loses its uniqueness, since there are potentially two aliases pointing to it from inside the list&#8211;one is <code>_head</code> and the other is the <code>_prev</code> field of the former head, <code>_head._next._prev</code>. Objects inside the list don&#8217;t have to be unique&#8211;they may be cross-linked.)</p>
<p>The second danger is that the method <code>insert</code> might &#8220;leak&#8221; aliases. The tricky part is when we let the external node, <code>n</code>, store the reference to our internal <code>_head</code>:</p>
<pre>n._next = _head;</pre>
<p>We know that this is safe here because the node started unique and it will be absorbed into the list, so this alias will become an internal alias, which is okay. But how do we convince the compiler to certify this code as safe and reject code that isn&#8217;t? Type system to the rescue!</p>
<h2>Types for Uniqueness</h2>
<p>There have been several approaches to uniqueness using the type system. To my knowledge, the most compact and comprehensive one was presented by Haller and Odersky in the paper, <a href="http://lamp.epfl.ch/~phaller/uniquerefs/capabilities_for_uniqueness_TR.pdf">Capabilities for External Uniqueness</a>, which I will discuss in this post. The authors not only presented the theory but also implemented the prototype of the system as an extension of Scala. Since not many people are fluent in Scala, I&#8217;ll translate their examples into pseudo-Java, hopefully not missing much. </p>
<p>Both in Scala and Java one can use annotations to extend the type system. Uniqueness introduces three such annotations, <code>@unique</code>, <code>@transient</code>, and <code>@exposed</code>; and two additional keywords, <code>expose</code> and <code>localize</code>.</p>
<h3>-Objects that are @unique</h3>
<p>As a first approximation you may think of a <code>@unique</code> object as a leak-proof version of C++ <code>unique_ptr</code>. Such an object is guaranteed to be &#8220;tracked&#8221; by at most one reference&#8211;no aliases are allowed. Also no <i>external</i> references are allowed to point to the object&#8217;s internals and, conversely, object internals may not reference any external objects. However, and this is a very important point, the insides of the <code>@unique</code> object may freely alias each other. Such a closed cross-linked mess is called a <i>cluster</i>. </p>
<p>Consider, for instance, a (non-empty) <code>@unique</code> linked list. Its cluster consists of a cross-linked set of nodes. It&#8217;s relatively easy for the compiler to guarantee that no external aliases are created to a <code>@unique</code> list&#8211;the tricky part is to allow the manipulation of list internals without breaking its uniqueness (Fig 1 shows our starting point). </p>
<div id="attachment_1134" class="wp-caption alignnone" style="width: 273px"><a href="http://bartoszmilewski.files.wordpress.com/2009/11/clusters.gif"><img src="http://bartoszmilewski.files.wordpress.com/2009/11/clusters.gif?w=263&#038;h=241" alt="" title="Clusters" width="263" height="241" class="size-full wp-image-1134" /></a><p class="wp-caption-text">Fig 1. The linked list and the node form separate clusters</p></div>
<p>Look at the definition of <code>insert</code>. Without additional annotations we would be able to call it with a node that is shared between several external aliases. After the node is included in the list, those aliases would be pointing to the internals of the list thus breaking its uniqueness. Because of that, the uniqueness-savvy compiler will flag a call to such un-annotated <code>insert</code> on a <code>@unique</code> list as an error. So how can we annotate <code>insert</code> so that it guarantees the preservation of uniqueness?</p>
<h3>-Exposing and Localizing</h3>
<p>Here&#8217;s the modified definition of <code>insert</code>:</p>
<pre>public void insert(<span style="color:#c00;">@unique</span> Node n) <span style="color:#c00;">@transient</span> {
    <span style="color:#c00;">expose (this) { list =&gt;</span>
        var node = <span style="color:#c00;">localize (n, list)</span>;
        node._next = list._head;
        if (list._head != null)
            list._head._prev = node;
        list._head = node;
    }
}</pre>
<p>Don&#8217;t worry, most of the added code can be inferred by the compiler, but I make it explicit here for the sake of presentation. Let&#8217;s go over some of the details.</p>
<p>The node <code>n</code> passed to <code>insert</code> is declared as <code>@unique</code>. This guarantees that it forms its own cluster and that <code>n</code> is the only reference to it. Also, <code>@unique</code> parameters to a method are <i>consumed</i> by that method. The caller thus loses her reference to it (the compiler invalidates it), as demonstrated in this example:</p>
<pre>@unique LinkedList lst = new @unique LinkedList();
@unique Node nd = new @unique Node();
lst.insert(nd);
nd._next; // error: nd has been consumed!</pre>
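<p>For readers coming from C++, consuming a <code>@unique</code> argument loosely resembles moving a <code>std::unique_ptr</code> into a function. The sketch below is only an analogy and not part of the Haller/Odersky system: C++ does not track clusters, and nothing is rejected at compile time if the caller touches the pointer afterwards; the moved-from pointer simply becomes null at run time. The helper <code>consume_demo</code> and the <code>value</code> field are made up for illustration.</p>

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node {
    std::unique_ptr<Node> next;  // owning link to the following node
    Node* prev = nullptr;        // non-owning back link: an alias internal to the list
    int value = 0;
};

class LinkedList {
public:
    // Taking the unique_ptr by value forces the caller to give up ownership.
    void insert(std::unique_ptr<Node> n) {
        if (head_)
            head_->prev = n.get();
        n->next = std::move(head_);
        head_ = std::move(n);
    }
    const Node* head() const { return head_.get(); }

private:
    std::unique_ptr<Node> head_;
};

// Hypothetical demo: returns true if the caller's pointer was emptied by the move.
bool consume_demo() {
    LinkedList lst;
    std::unique_ptr<Node> nd(new Node);
    nd->value = 42;
    lst.insert(std::move(nd));
    assert(lst.head()->value == 42);  // the list now owns the node
    return nd == nullptr;             // the caller's reference is gone
}
```

<p>Calling <code>consume_demo()</code> from <code>main</code> exercises the transfer. Unlike the <code>@unique</code> annotation, this only empties the caller&#8217;s pointer; dereferencing it afterwards is a run-time bug rather than a compile-time error.</p>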
<p>The method itself is annotated as <code>@transient</code>. It means that the <code>this</code> object is <code>@unique</code>, but not consumed by the method. In general, the <code>@transient</code> annotation may be applied to any parameter, not only <code>this</code>. You might be familiar with a different name for transient&#8211;borrowed. </p>
<p>Inside <code>insert</code>, the <code>this</code> parameter is explicitly exposed (actually, since the method is <code>@transient</code>, the compiler would expose <code>this</code> implicitly). </p>
<pre>expose (this) { list =&gt; ... }</pre>
<p>The new name for the exposed <code>this</code> is <code>list</code>.</p>
<p>Once a cluster is <i>exposed</i>, some re-linking of its constituents is allowed. The challenge is to disallow any re-linking that would leak aliases. Here&#8217;s the trick: to guarantee no leakage, the compiler assigns the exposed object a special type&#8211;its original type tagged by a unique identifier. A fresh identifier is created for the scope of each <code>expose</code> statement. All members of the exposed cluster carry the same tag. Since the compiler type-checks every assignment, it automatically makes sure that both sides have the same tag. </p>
<p>Now we need one more ingredient&#8211;bringing the <code>@unique</code> node into the cluster. This is done by <i>localizing</i> the parameter <code>n</code> to the same cluster as <code>list</code>. </p>
<pre>var node = localize (n, list);</pre>
<p>The <code>localize</code> statement does two things. It consumes <code>n</code> and it returns a reference to it that is tagged by the same tag as its second parameter. From that point on, <code>node</code> has the same tagged type as all the exposed nodes inside the <code>list</code>, and all assignments type-check. </p>
<div id="attachment_1136" class="wp-caption alignnone" style="width: 281px"><a href="http://bartoszmilewski.files.wordpress.com/2009/11/exposed.gif"><img src="http://bartoszmilewski.files.wordpress.com/2009/11/exposed.gif?w=271&#038;h=260" alt="Exposed list and localized node" title="Exposed" width="271" height="260" class="size-full wp-image-1136" /></a><p class="wp-caption-text">Fig 2. The list has been exposed: all references are tagged. The node has been localized (given the same tag as the list). Re-linking is now possible without violating the type system.</p></div>
<p>Note that, in my pseudo-Java, I didn&#8217;t specify the type of <code>node</code> returned by <code>localize</code>. That&#8217;s because tagged types are never explicitly mentioned in the program. They are the realm of the compiler. </p>
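<p>The tagging idea can be imitated, very roughly, with C++ templates: if the tag is a type parameter, nodes exposed under different tags have unrelated types and cannot be linked to each other. This is a sketch under heavy assumptions, not how the Scala prototype is implemented. Here the tags <code>ScopeA</code> and <code>ScopeB</code> are written by hand, whereas the real system generates a fresh tag for each <code>expose</code> statement; the names <code>TaggedNode</code>, <code>link</code>, and <code>relink_demo</code> are hypothetical.</p>

```cpp
#include <type_traits>

// A node whose type carries a compile-time tag.
template<class Tag>
struct TaggedNode {
    TaggedNode* next = nullptr;  // may only point to nodes with the same tag
    TaggedNode* prev = nullptr;
};

// Stand-ins for the tags of two different expose scopes.
struct ScopeA {};
struct ScopeB {};

// Re-linking type-checks only when both sides carry the same tag.
template<class Tag>
void link(TaggedNode<Tag>& first, TaggedNode<Tag>& second) {
    first.next = &second;
    second.prev = &first;
}

// Pointers to nodes from different scopes are not interchangeable.
static_assert(!std::is_convertible<TaggedNode<ScopeB>*, TaggedNode<ScopeA>*>::value,
              "nodes from different scopes must not be linkable");

bool relink_demo() {
    TaggedNode<ScopeA> head, node;
    link(head, node);            // OK: same tag
    // TaggedNode<ScopeB> stranger;
    // link(head, stranger);     // would not compile: tags differ
    return head.next == &node && node.prev == &head;
}
```

<p>What this sketch cannot express is consumption: nothing stops the caller from keeping an untagged reference around, which is exactly the part that <code>localize</code> takes care of.</p>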
<h2>Functional Decomposition</h2>
<p>The last example was somewhat trivial in that the code that worked on exposed objects fit neatly into one method. But a viable type system cannot impose restrictions on structuring the code. The basic requirement for any programming language is to allow functional decomposition&#8211;delegating work to separate subroutines, which can be re-used in other contexts. That&#8217;s why we have to be able to define functions that operate on exposed and/or localized objects.</p>
<p>Here&#8217;s an example from Haller/Odersky that uses recursion within the <code>expose</code> statement. <code>append</code> is a method of a singly-linked list:</p>
<pre>void append(@unique SinglyLinkedList other) @transient
{
    expose(this) { list =&gt;
        if (list._next == null)
            list._next = other; // localize and consume
        else
            list._next.<span style="color:#c00;">append(other)</span>;
    }
}</pre>
<p>In the first branch of the if statement, a <code>@unique</code> parameter, <code>other</code>, is (implicitly) localized and consumed. In the second branch, it is recursively passed to <code>append</code>. Notice an important detail, the subject of <code>append</code>, <code>list._next</code>, is not <code>@unique</code>&#8211;it is exposed. Its type has been tagged by a unique tag. But the <code>append</code> method is declared as <code>@transient</code>. It turns out that both unique and exposed arguments may be safely accepted as transient parameters (including the implicit <code>this</code> parameter). </p>
<p>Because of this rule, it&#8217;s perfectly safe to forgo the explicit <code>expose</code> inside a transient method. The <code>append</code> method may be thus simplified to:</p>
<pre>void append(<span style="color:#c00;">@unique</span> SinglyLinkedList other) <span style="color:#c00;">@transient</span>
{
    // 'this' is implicitly exposed
    if (_next == null)
        _next = other; // localize and consume
    else
        _next.append(other);
}</pre>
<p>Things get a little more interesting when you try to reuse <code>append</code> inside another method. Consider the implementation of <code>insert</code>:</p>
<pre>void insert(@unique SinglyLinkedList other) @transient
{
    var locOther = localize(other, this);
    if (locOther != null)
    {
        locOther.<span style="color:#c00;">append</span>(_next);
        _next = locOther;
    }
}</pre>
<p>The <code>insert</code> method is transient&#8211;it works on unique or exposed lists. It accepts a unique list, <code>other</code>, which is consumed by the <code>localize</code> statement. The <code>this</code> reference is implicitly exposed with the same tag as <code>locOther</code>, so the last statement <code>_next=locOther</code> type-checks. The only thing that doesn&#8217;t type-check is the argument to <code>append</code>, which is supposed to be unique, but here it&#8217;s exposed instead. </p>
<p>This time there is no safe conversion to help us, so if we want to be able to reuse <code>append</code>, we have to modify its definition. First of all, we&#8217;ll mark its parameter as <code>@exposed</code>. An exposed parameter is tagged by the caller. In order for <code>append</code> to work, the <code>this</code> reference must also be tagged by the caller&#8211;with <i>the same</i> tag. Otherwise the assignment, <code>_next=other</code>, inside <code>append</code>, wouldn&#8217;t type-check. It follows that the <code>append</code> method must also be marked as <code>@exposed</code> (when there is more than one exposed parameter, they all have to be tagged with the same tag). </p>
<p>Here&#8217;s the new version of <code>append</code>:</p>
<pre>void append(<span style="color:#c00;">@exposed</span> SinglyLinkedList other) <span style="color:#c00;">@exposed</span>
{
    if (_next == null)
        _next = other; // both exposed under the same tag
    else
        _next.append(other); // both exposed under the same tag
}</pre>
<p>Something interesting happened to <code>append</code>. Since it now operates on exposed objects, it&#8217;s the caller&#8217;s responsibility to expose and localize unique objects (this is exactly what we did in <code>insert</code>).  Interestingly, <code>append</code> will now also operate on non-annotated types. You may, for instance, <code>append</code> one non-unique list to another non-unique list and it will type-check! That&#8217;s because non-annotated types are equivalent to exposed types with a null tag&#8211;they form a global cluster of their own. </p>
<p>This kind of polymorphism (non-annotated/annotated) means that in many cases you don&#8217;t have to define separate classes for use with unique objects. What Haller and Odersky found is that almost all class methods in Scala&#8217;s collection library required only the simplest <code>@exposed</code> annotations, with no changes to their implementation. That&#8217;s why they proposed to use the <code>@exposed</code> annotation on whole classes. </p>
<h2>Conclusion</h2>
<p>Every time I read a paper about Scala I&#8217;m impressed. It&#8217;s a language that has very solid theoretical foundations and yet is very practical&#8211;on a par with Java, whose runtime it uses. I like Scala&#8217;s approach towards concurrency, with strong emphasis on safe and flexible message passing. Like functional languages, Scala supports immutable messages. With the addition of uniqueness, it will also support safe <i>mutable</i> messages. Neither kind requires synchronization (outside of that provided by the message queue), or deep copying. </p>
<p>There is still a gap in Scala&#8217;s concurrency model&#8211;it&#8217;s possible to share objects between threads without any protection. It&#8217;s up to the programmer to declare shared methods as <code>synchronized</code>&#8211;just like in Java; but there is no overall guarantee of data-race freedom. So far, only ownership systems have been able to deliver that guarantee, but I wouldn&#8217;t be surprised if Martin Odersky had something else up his sleeve for Scala. </p>
<p>I&#8217;d like to thank Philipp Haller for reading the draft of this post and providing valuable comments. Philipp told me that a new version of the prototype is in the works, which will simplify the system further, both for the programmer and the implementer.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/11/30/unique-objects/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>

		<media:content url="http://bartoszmilewski.files.wordpress.com/2009/11/clusters.gif" medium="image">
			<media:title type="html">Clusters</media:title>
		</media:content>

		<media:content url="http://bartoszmilewski.files.wordpress.com/2009/11/exposed.gif" medium="image">
			<media:title type="html">Exposed</media:title>
		</media:content>
	</item>
		<item>
		<title>Haskell/C++ Video and Slides</title>
		<link>http://bartoszmilewski.wordpress.com/2009/10/26/haskellc-video-and-slides/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/10/26/haskellc-video-and-slides/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 20:27:34 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Functional Programming]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=1094</guid>
		<description><![CDATA[The video of my talk, Haskell and C++ Template Metaprogramming, is now available; and so are the slides. I pretty much covered the material from my last blog post, but many people (including me) find a video presentation easier to follow. 
This is also a plug for the Northwest C++ Users Group that meets in [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The video of my talk, <a href="http://vimeo.com/7211030">Haskell and C++ Template Metaprogramming</a>, is now available; and so are the <a href="http://www.nwcpp.org/Downloads/2009/Haskell_and_C___Template.pdf">slides</a>. I pretty much covered the material from my last <a href="http://bartoszmilewski.wordpress.com/2009/10/21/what-does-haskell-have-to-do-with-c/">blog post</a>, but many people (including me) find a video presentation easier to follow. </p>
<p>This is also a plug for the <a href="http://www.nwcpp.org">Northwest C++ Users Group</a> that meets in Redmond every third Wednesday of the month. If you live in Seattle or on the east side of Lake Washington, check it out. You won&#8217;t be disappointed. </p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/10/26/haskellc-video-and-slides/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>What Does Haskell Have to Do with C++?</title>
		<link>http://bartoszmilewski.wordpress.com/2009/10/21/what-does-haskell-have-to-do-with-c/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/10/21/what-does-haskell-have-to-do-with-c/#comments</comments>
		<pubDate>Wed, 21 Oct 2009 23:02:33 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[D Programming Language]]></category>
		<category><![CDATA[Functional Programming]]></category>
		<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=1045</guid>
		<description><![CDATA[If you want to understand C++ template metaprogramming (TMP) you have to know functional programming. Seriously. I want you to think of TMP as maximally obfuscated (subset of) Haskell, and I&#8217;ll illustrate this point by point. If you don&#8217;t know Haskell, don&#8217;t worry, I&#8217;ll explain the syntax as I go. 
The nice thing about single-paradigm [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you want to understand C++ template metaprogramming (TMP) you have to know functional programming. Seriously. I want you to think of TMP as maximally obfuscated (subset of) Haskell, and I&#8217;ll illustrate this point by point. If you don&#8217;t know Haskell, don&#8217;t worry, I&#8217;ll explain the syntax as I go. </p>
<p>The nice thing about single-paradigm languages like Haskell is that they have very simple syntax (think of Lisp that does everything with just a bunch of parentheses). I will start the Haskell-C++TMP mapping with basics, like functions and recursion, but I&#8217;ll try to cover a lot more, including higher-order functions, pattern matching, list comprehension (did you know it was expressible in C++?), and more. </p>
<p>Keep in mind that my Haskell examples are runtime functions operating on runtime data whereas their C++ TMP equivalents are compile-time templates operating mostly on types. Operations on types are essential in providing correct and efficient implementations of parametrized classes and functions. </p>
<p>By necessity the examples are simple, but the same mapping may be applied to much more complex templates from the C++ Standard Library and Boost. As a bonus, I&#8217;ll also explain the hot new thing, variadic templates.</p>
<h2>Functional Approach to Functions</h2>
<p>How do you implement useful functions if you don&#8217;t have mutable variables, <var>if</var> statements, or loops? To a C++ programmer that might seem like an impossible task. But that&#8217;s the reality of the C++ compile-time language that forms the basis of TMP. Functional programming to the rescue! </p>
<p>As a warm-up, let&#8217;s see how Haskell implements a simple function, the factorial:</p>
<pre>fact 0 = 1
fact n = n * fact (n - 1)</pre>
<p>The first line states that the factorial of zero is one. The second line defines factorial for a non-zero argument, <var>n</var> (strictly speaking it only works for positive arguments). It does it using recursion: the factorial of <var>n</var> is equal to <var>n</var> times the factorial of <var>n-1</var>. The recursion stops when <var>n</var> is equal to zero, in which case the first definition kicks in. Notice that the definition of the function <var>fact</var> is split into two sub-definitions. So when you call:</p>
<pre>fact 4</pre>
<p>the first definition is looked up first and, if it doesn&#8217;t match the argument (which it doesn&#8217;t), the second one comes into play. This is the simplest case of <i>pattern matching</i>: 4 doesn&#8217;t match the &#8220;pattern&#8221; 0, but it matches the pattern <var>n</var>.</p>
<p>Here&#8217;s almost <i>exactly</i> the same code expressed in C++ TMP:</p>
<pre>template<span style="color:#00f;">&lt;int n&gt;</span> struct
<span style="color:#00f;">fact</span> {
    static const int value = <span style="color:#00f;">n * fact&lt;n - 1&gt;</span>::value;
};

template&lt;&gt; struct
<span style="color:#00f;">fact&lt;0&gt;</span> { // specialization for n = 0
    static const int value = <span style="color:#00f;">1</span>;
};</pre>
<p>You might notice how the horrible syntax of C++ TMP obscures the simplicity and elegance of this code. But once you are equipped with the C++/Haskell decoder ring, things become a lot clearer. </p>
<p>Let&#8217;s analyze this code. Just like in Haskell, there are two definitions of <var>fact</var>, except that their order is inverted. This is because C++ requires a template specialization to follow the template&#8217;s general definition (or declaration, as we&#8217;ll see later). The pattern matching of arguments in C++ does not follow the order of declarations but rather is based on &#8220;best match&#8221;. If you instantiate the template with argument zero:</p>
<pre>cout &lt;&lt; "Factorial of 0 = " &lt;&lt; fact&lt;0&gt;::value &lt;&lt; endl;</pre>
<p>the second pattern, <var>fact&lt;0&gt;</var>, is a better fit. Otherwise the first one, <var>&lt;int n&gt;</var>, is used.</p>
<p>Notice also the weird syntax for &#8220;function call&#8221;</p>
<pre>fact&lt;n&gt;::value</pre>
<p>and for the &#8220;return statement&#8221;</p>
<pre>static const int value = n * fact&lt;n - 1&gt;::value;</pre>
<p>This all makes sense if you look at templates as definitions of parameterized types, which was their initial purpose in C++. In that interpretation, we are defining a <var>struct</var> called <var>fact</var>, parameterized by an integer <var>n</var>, whose sole member is a static const integer called <var>value</var>. Moreover, this template is specialized for the case of <var>n</var> equal to zero. </p>
<p>Now I want you to forget about what I just said and put on the glasses which make the C++ code look like the corresponding Haskell code. </p>
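<p>If you&#8217;d like to convince yourself that the &#8220;calls&#8221; really happen at compile time, here is a small self-contained sketch. It repeats the <var>fact</var> template from above and checks it with <code>static_assert</code>, which is a C++0x feature; the typedef name <code>buffer_t</code> is made up for the example.</p>

```cpp
template<int n> struct fact {
    static const int value = n * fact<n - 1>::value;
};

template<> struct fact<0> {  // specialization for n = 0
    static const int value = 1;
};

// The "calls" are evaluated by the compiler; nothing is computed at run time.
static_assert(fact<0>::value == 1, "fact 0 should be 1");
static_assert(fact<4>::value == 24, "fact 4 should be 24");

// A compile-time constant can be used wherever the language demands one,
// for instance as an array bound.
typedef char buffer_t[fact<3>::value];
static_assert(sizeof(buffer_t) == 6, "the array has 3! elements");
```

<p>If any of the assertions were false, the program would fail to compile rather than fail when run.</p>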
<p>Here&#8217;s another example&#8211;this time of a predicate (a function returning a Boolean):</p>
<pre>is_zero 0 = True
is_zero x = False</pre>
<p>Let&#8217;s spice it up a little for C++ and define a predicate on types rather than integers. The following compile-time function returns <var>true</var> only when the type <var>T</var> is a pointer:</p>
<pre>template&lt;class T&gt; struct
<span style="color:#00f;">isPtr</span> {
    static const bool value = false;
};

template&lt;class U&gt; struct
<span style="color:#00f;">isPtr&lt;U*&gt;</span> {
    static const bool value = true;
};</pre>
<p>This time the actual argument to <var>isPtr</var> is first matched to the more specialized pattern, <var>U*</var>, and, if that fails, the general pattern is used. </p>
<p>We can add yet another specialization, which will pattern-match a const pointer:</p>
<pre>template&lt;class U&gt; struct
<span style="color:#00f;">isPtr&lt;U * const&gt;</span> {
    static const bool value = true;
};</pre>
<p>Type predicates like these may be used, for instance, to select more flexible and efficient implementations of parameterized containers. Think of the differences between a vector of values vs. a vector of pointers.</p>
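<p>To make the idea of selecting an implementation a bit more concrete, here is one possible sketch. It uses the result of <var>isPtr</var> as a defaulted template parameter and specializes on it. The helper names <code>Access</code> and <code>readInt</code> are hypothetical and stand in for whatever a real container would do differently for pointers.</p>

```cpp
#include <type_traits>

template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };
template<class U> struct isPtr<U* const> { static const bool value = true; };

// Choose how to reach the stored value, depending on the predicate.
template<class T, bool = isPtr<T>::value>
struct Access {  // general case: the element is the value itself
    static const T& get(const T& elem) { return elem; }
};

template<class T>
struct Access<T, true> {  // pointer case: look through the pointer
    static const typename std::remove_pointer<T>::type& get(T elem) { return *elem; }
};

// Works uniformly for an int and for a pointer to an int.
template<class T>
int readInt(const T& elem) { return Access<T>::get(elem); }

static_assert(!isPtr<int>::value, "int is not a pointer");
static_assert(isPtr<int*>::value, "int* is a pointer");
static_assert(isPtr<int* const>::value, "int* const is a pointer");
```

<p>With these definitions, <code>readInt(x)</code> and <code>readInt(&amp;x)</code> both yield the value of an <code>int x</code>; the choice between the two versions of <code>get</code> is made by the compiler.</p>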
<h2>Lists</h2>
<p>The basic data structure in functional languages is the list. Haskell&#8217;s lists are introduced using square brackets. For instance, a list of three numbers, 1, 2, 3, looks like:</p>
<pre>[1, 2, 3]</pre>
<p>List processing in functional languages follows the standard pattern: a list is split into <var>head</var> and <var>tail</var>, an operation is performed on the head, and the tail is processed using recursion. The splitting is done by pattern matching: the Haskell pattern being <var>(head:tail)</var> (to be precise, the colon in parentheses represents the <var>cons</var> operation&#8211;the creation of a list by prepending an element to an existing list). </p>
<p>Here&#8217;s a simple function, <var>count</var>, that calculates the length of a list:</p>
<pre>count [] = 0
count (head:tail) = 1 + count tail</pre>
<p>The first pattern, [], matches an empty list; the second a non-empty one. Notice that a function call in Haskell doesn&#8217;t use parentheses around arguments, so <var>count tail</var> is interpreted as a call to <var>count</var> with the argument <var>tail</var>.</p>
<p>Before C++0x, TMP was severely crippled by the lack of a list primitive. People used separate definitions for a list of zero, one, two, etc.,  elements, and even used special macros to define them. This is no longer true in C++0x, thanks to <i>variadic templates</i> and <i>template parameter packs</i>. Here&#8217;s our Haskell <var>count</var> translated into C++ TMP:</p>
<pre>// Just a declaration
template&lt;class... list&gt; struct
<span style="color:#00f;">count</span>;

template&lt;&gt; struct
<span style="color:#00f;">count&lt;&gt;</span> {
    static const int value = 0;
};

template&lt;class head, class... tail&gt; struct
<span style="color:#00f;">count&lt;head, tail...&gt;</span> {
    static const int value = 1 + count&lt;tail...&gt;::value;
};</pre>
<p>First we have a declaration (not a definition) of the template <var>count</var> that takes a variable number of type parameters (the keyword <var>class</var> or <var>typename</var> introduces a type parameter). They are packed into a template parameter pack, <var>list</var>. </p>
<p>Once this general declaration is visible, specializations may follow in any order. I arranged them to follow the Haskell example. The first one matches the empty list and returns zero. The second one uses the pattern, <var>&lt;head, tail&#8230;&gt;</var>. This pattern will match any non-empty list and split it into the <var>head</var> and the (possibly empty) <var>tail</var>. </p>
<p>To &#8220;call&#8221; a variadic template, you instantiate it with an arbitrary number of arguments and retrieve its member, <var>value</var>, e.g.,</p>
<pre>int n = count&lt;int, char, long&gt;::value; // returns 3</pre>
<p>A few words about variadic templates: A variadic template introduces a template parameter pack using the notation <var>class&#8230; pack</var> (or <var>int&#8230; ipack</var>, etc&#8230;). About the only thing you may do with a pack is to expand it, typically in order to pass it to another variadic template (you may also ask for its length with <var>sizeof&#8230;(pack)</var>). The expansion is done by following the name of the pack with three dots, as in <var>tail&#8230;</var>. You&#8217;ll see more examples later.</p>
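<p>To see the recursive <var>count</var> at work, here is a small self-checking sketch. The namespace <var>demo</var> and the helper <var>builtin_count</var> are my own additions: the namespace only keeps <var>count</var> apart from <var>std::count</var>, and the helper compares the recursive result with the built-in <var>sizeof...</var> query.</p>

```cpp
namespace demo {  // hypothetical namespace, keeps `count` apart from std::count

// Just a declaration
template<class... list> struct count;

// Empty list: the recursion stops here.
template<> struct count<> {
    static const int value = 0;
};

// Non-empty list: split into head and tail, recurse on the tail.
template<class head, class... tail> struct count<head, tail...> {
    static const int value = 1 + count<tail...>::value;
};

// Hypothetical helper: the length as reported by the language itself.
template<class... list> struct builtin_count {
    static const int value = sizeof...(list);
};

static_assert(count<>::value == 0, "empty list");
static_assert(count<int, char, long>::value == 3, "three types");
static_assert(count<int, char, long>::value ==
              builtin_count<int, char, long>::value,
              "recursive and built-in counts agree");

}  // namespace demo
```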
<p>Variadic templates have many applications such as type-safe printf, tuples (objects that store an arbitrary number of differently typed arguments), variants, and many more. </p>
<h2>Higher-Order Functions and Closures</h2>
<p>The real power of functional programming comes from treating functions as first-class citizens. It means that you may pass functions to other functions and return functions from functions. Functions operating on functions are called higher-order functions. Surprisingly, it seems like compile-time C++ has better support for higher-order functions than run-time C++.</p>
<p>Let&#8217;s start with a Haskell example. I want to define a function that takes two predicate functions and returns another predicate function that combines the two using logical <var>OR</var>. Here it is in Haskell:</p>
<pre>or_combinator f1 f2 =
    \x -&gt; (f1 x) || (f2 x)</pre>
<p>The <var>or_combinator</var> returns an anonymous function (the famous &#8220;lambda&#8221;) that takes one argument, <var>x</var>, calls both <var>f1</var> and <var>f2</var> with it, and returns the logical <var>OR</var> of the two results. The return value of <var>or_combinator</var> is this freshly constructed function. I can then call this function with an arbitrary argument. For instance, here I&#8217;m checking if 2 is either zero or one (guess what, it isn&#8217;t!):</p>
<pre>(or_combinator is_zero is_one) 2</pre>
<p>I put the parentheses around the function and its arguments for readability, although they are not strictly necessary. 2 is the argument to the function returned by <var>or_combinator</var>. </p>
<p>The lambda that&#8217;s returned from <var>or_combinator</var> is actually a closure. It &#8220;captures&#8221; the two arguments, <var>f1</var> and <var>f2</var> passed to <var>or_combinator</var>. They may be used long after the call to <var>or_combinator</var> has returned. </p>
<p>It might take some getting used to before you are comfortable with functions taking functions and returning functions, but it&#8217;s much easier to learn this stuff in Haskell than in the obfuscated C++. Indeed, here&#8217;s an almost direct translation of this example:</p>
<pre>template&lt;template&lt;class&gt; class <span style="color:#00f;">f1</span>, template&lt;class&gt; class <span style="color:#00f;">f2</span>&gt; struct
<span style="color:#00f;">or_combinator</span> {
    template&lt;class T&gt; struct
    <span style="color:#00f;">lambda</span> {
        static const bool value = <span style="color:#00f;">f1&lt;T&gt;</span>::value <span style="color:#00f;">|| f2&lt;T&gt;</span>::value;
    };
};</pre>
<p>Since in the metalanguage a function is represented by a template, the template <var>or_combinator</var> takes two such templates as arguments. It &#8220;calls&#8221; these templates using the standard syntax <var>f&lt;T&gt;::value</var>. Actually, the <var>or_combinator</var> doesn&#8217;t call these functions. Instead it defines a new template, which I call <var>lambda</var>, that takes the argument <var>T</var> and calls those functions. This template acts like a closure&#8211;it captures the two templates that are the arguments to <var>or_combinator</var>.</p>
<p>Here&#8217;s how you may use the <var>or_combinator</var> to combine two tests, <var>isPtr</var> and <var>isConst</var> and apply the result to the type <var>const int</var>:</p>
<pre>std::cout
   &lt;&lt; "or_combinator&lt;isPtr, isConst&gt;::lambda&lt;const int&gt;::value = "
   &lt;&lt; or_combinator&lt;isPtr, isConst&gt;::lambda&lt;const int&gt;::value
   &lt;&lt; std::endl;</pre>
<p>Such logical combinators are essential for predicate composability.</p>
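<p>Here is a compact, self-checking sketch of the combinator in use. The definition of <var>isConst</var> is an assumption on my part (one plausible way to write it); <var>isPtr</var> is shortened to the two cases needed here.</p>

```cpp
template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };

// Assumption: one plausible definition of isConst, written for this sketch.
template<class T> struct isConst { static const bool value = false; };
template<class T> struct isConst<const T> { static const bool value = true; };

template<template<class> class f1, template<class> class f2>
struct or_combinator {
    // The nested template plays the role of the returned closure:
    // it still "sees" f1 and f2 when it is instantiated later.
    template<class T> struct lambda {
        static const bool value = f1<T>::value || f2<T>::value;
    };
};

static_assert(or_combinator<isPtr, isConst>::lambda<const int>::value,
              "const int satisfies isConst");
static_assert(or_combinator<isPtr, isConst>::lambda<int*>::value,
              "int* satisfies isPtr");
static_assert(!or_combinator<isPtr, isConst>::lambda<int>::value,
              "plain int satisfies neither predicate");
```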
<h2>Higher-Order Functions Operating on Lists</h2>
<p>Once you combine higher-order functions with lists you have a powerful functional language at your disposal. Higher-order functions operating on lists look very much like algorithms. Let me show you some classic examples. Here&#8217;s the function (or algorithm), <var>all</var>, that returns <var>true</var> if and only if all elements of a list satisfy a given predicate.</p>
<pre>all pred [] = True
all pred (head:tail) = (pred head) &amp;&amp; (all pred tail)</pre>
<p>By now you should be familiar with all the techniques I used here, like pattern matching or list recursion. </p>
<p>Here&#8217;s the same code obfuscated by the C++ syntax:</p>
<pre>template&lt;template&lt;class&gt; class predicate, class... list&gt; struct
<span style="color:#00f;">all</span>;

template&lt;template&lt;class&gt; class predicate&gt; struct
<span style="color:#00f;">all&lt;predicate&gt;</span> {
    static const bool value = true;
};

template&lt;
    template&lt;class&gt; class predicate,
    class head,
    class... tail&gt; struct
<span style="color:#00f;">all&lt;predicate, head, tail...&gt;</span> {
    static const bool value =
        predicate&lt;head&gt;::value
        &amp;&amp; all&lt;predicate, tail...&gt;::value;
};</pre>
<p>Except for the initial declaration required by C++ there is a one-to-one paradigm match between the two implementations. </p>
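<p>A short sketch of <var>all</var> applied to a concrete predicate may help. It reuses a two-case <var>isPtr</var>; the lists of types are arbitrary examples of mine.</p>

```cpp
template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };

template<template<class> class predicate, class... list> struct all;

// Empty list: vacuously true, like `all pred [] = True`.
template<template<class> class predicate> struct all<predicate> {
    static const bool value = true;
};

// Non-empty list: test the head, then recurse on the tail.
template<template<class> class predicate, class head, class... tail>
struct all<predicate, head, tail...> {
    static const bool value =
        predicate<head>::value && all<predicate, tail...>::value;
};

static_assert(all<isPtr>::value, "empty list");
static_assert(all<isPtr, int*, char*>::value, "all pointers");
static_assert(!all<isPtr, int*, char>::value, "char is not a pointer");
```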
<p>Another useful algorithm, a veritable workhorse of functional programming, is &#8220;fold right&#8221; (together with its dual partner, &#8220;fold left&#8221;). It folds a list while accumulating the results (that&#8217;s why in runtime C++ the closely related left-fold algorithm is called &#8220;accumulate&#8221;). Here&#8217;s the Haskell implementation:</p>
<pre>foldr f init [] = init
foldr f init (head:tail) =
    f head (foldr f init tail)</pre>
<p>Function <var>f</var>, which is the first argument to <var>foldr</var>, takes two arguments, the current element of the list and the accumulated value. Its purpose is to process the element and incorporate the result in the accumulator. The new accumulated value is then returned. It is totally up to the client to decide what kind of processing to perform, how it is accumulated, and what kind of value is used. The second argument, <var>init</var>, is the initial value for the accumulator.</p>
<p>Here&#8217;s how it works: The result of <var>foldr</var> is generated by acting with <var>f</var> on the head of the list and whatever has been accumulated by processing the tail of the list. The algorithm recurses until the tail is empty, in which case it returns the initial value. At runtime this type of algorithm would make N recursive calls before starting to pop the stack and accumulate the results. </p>
<p>For instance, <var>foldr</var> may be used to sum the elements of a list (<var>so_far</var> is the accumulator, which is initialized to zero):</p>
<pre>add_it elem so_far = elem + so_far
sum_it lst = foldr add_it 0 lst</pre>
<p>The accumulator function is <var>add_it</var>. If, instead, I wanted to calculate the product of all elements, I&#8217;d use a function <var>mult_it</var> and the starting value of one. You get the idea.</p>
<p>Here&#8217;s the same algorithm in C++ TMP:</p>
<pre>template&lt;template&lt;class, int&gt; class, int, class...&gt; struct
<span style="color:#00f;">fold_right</span>;

template&lt;template&lt;class, int&gt; class f, int init&gt; struct
<span style="color:#00f;">fold_right&lt;f, init&gt;</span> {
    static const int value = init;
};

template&lt;template&lt;class, int&gt; class f, int init, class head, class...tail&gt; struct
<span style="color:#00f;">fold_right&lt;f, init, head, tail...&gt;</span> {
    static const int value = f&lt;head, fold_right&lt;f, init, tail...&gt;::value&gt;::value;
};</pre>
<p>Once you understand the Haskell version, this complex code suddenly becomes transparent (if it doesn&#8217;t, try squinting <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  ). </p>
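<p>To make the role of <var>f</var> concrete, here is a sketch with two accumulator functions. Both <var>add_one</var> and <var>add_if_ptr</var> are hypothetical names invented for this example; each takes the current element (a type) and the value accumulated so far (an <var>int</var>).</p>

```cpp
template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };

template<template<class, int> class, int, class...> struct fold_right;

template<template<class, int> class f, int init>
struct fold_right<f, init> {
    static const int value = init;   // empty list: return the initial value
};

template<template<class, int> class f, int init, class head, class... tail>
struct fold_right<f, init, head, tail...> {
    // Fold the tail first, then combine its result with the head.
    static const int value =
        f<head, fold_right<f, init, tail...>::value>::value;
};

// Hypothetical accumulator: counts every element.
template<class T, int so_far> struct add_one {
    static const int value = 1 + so_far;
};

// Hypothetical accumulator: counts only the pointer types.
template<class T, int so_far> struct add_if_ptr {
    static const int value = (isPtr<T>::value ? 1 : 0) + so_far;
};

static_assert(fold_right<add_one, 0, int, char, long>::value == 3, "");
static_assert(fold_right<add_if_ptr, 0, int*, char, long*>::value == 2, "");
static_assert(fold_right<add_one, 7>::value == 7, "empty list yields init");
```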
<h2>Lists of Numbers</h2>
<p>Let&#8217;s now switch to integers for a moment. Haskell defines a function <var>sum</var> that adds all elements of a list:</p>
<pre>sum [] = 0
sum (head:tail) = head + (sum tail)</pre>
<p>We can do the same in C++ TMP (in five times as many lines of code):</p>
<pre>template&lt;int...&gt; struct
<span style="color:#00f;">sum</span>;

template&lt;&gt; struct
<span style="color:#00f;">sum&lt;&gt;</span> {
    static const int value = 0;
};

template&lt;int i, int... tail&gt; struct
<span style="color:#00f;">sum&lt;i, tail...&gt;</span> {
    static const int value = i + sum&lt;tail...&gt;::value;
};</pre>
<h2>List Comprehension</h2>
<p>Haskell has one more trick up its sleeve for operating on lists without explicit recursion. It&#8217;s called <i>list comprehension</i>. It&#8217;s a way of defining new lists based on existing lists. The nomenclature and notation are borrowed from Set Theory, where you often encounter definitions such as: S is a set of elements, where&#8230; Let&#8217;s look at a simple Haskell example:</p>
<pre>[x * x | x &lt;- [3, 4, 5]]</pre>
<p>This is a set (list) of elements <var>x * x</var>, where <var>x</var> is from the list <var>[3, 4, 5]</var>. </p>
<p>Remember our recursive definition of <var>count</var>? Using list comprehension it&#8217;s reduced to a one-liner:</p>
<pre>count lst = sum [1 | x &lt;- lst]</pre>
<p>Here, we create a list of ones, one for each element of the list. Our result is the sum of those ones. To make this definition more amenable to translation into C++, let&#8217;s define an auxiliary function <var>one</var> that, for any argument <var>x</var>, returns 1.</p>
<pre>one x = 1</pre>
<p>Here&#8217;s the modified definition of <var>count</var>:</p>
<pre>count lst = sum [one x | x &lt;- lst]</pre>
<p>Now we are ready to convert this code to C++ TMP:</p>
<pre>template&lt;class T&gt; struct
<span style="color:#00f;">one</span> {
    static const int value = 1;
};

template&lt;class... lst&gt; struct
<span style="color:#00f;">count</span> {
    static const int value = sum&lt;one&lt;lst&gt;::value...&gt;::value;
};</pre>
<p>Here our list is stored in a template parameter pack, <var>lst</var>. If we wanted to expand this pack, we&#8217;d use the notation <var>lst&#8230;</var>, but that&#8217;s not what&#8217;s happening here. The ellipsis appears after the pattern containing the pack:</p>
<pre>one&lt;lst&gt;::value...</pre>
<p>Compare this with the equivalent Haskell:</p>
<pre>[one x | x &lt;- lst]</pre>
<p>In C++, when the ellipsis follows a pattern that contains a pack, it&#8217;s not the pack that&#8217;s expanded, but the whole pattern is repeated for each element of the pack. Here, if our list were <var>&lt;int, char, void*&gt;</var>, the pattern would be expanded to:</p>
<pre>&lt;one&lt;int&gt;::value, one&lt;char&gt;::value, one&lt;void*&gt;::value&gt;</pre>
<p>The subsequent call to <var>sum</var> would be made with those arguments. </p>
<p>Notice that a different positioning of the ellipsis would result in a completely different expansion. This pattern:</p>
<pre>one&lt;lst...&gt;::value</pre>
<p>would result in the call to <var>one</var> with the list of types, which would be an error.</p>
<p>Here&#8217;s another example of pattern expansion: a function that counts the number of pointers in a list of types:</p>
<pre>template&lt;class... lst&gt; struct
<span style="color:#00f;">countPtrs</span> {
    static const int value = sum&lt;isPtr&lt;lst&gt;::value ...&gt;::value;
};</pre>
<p>In this case the pattern is:</p>
<pre>isPtr&lt;lst&gt;::value ...</pre>
<p>and it expands into a list of Booleans. (I&#8217;m taking advantage of the fact that <var>false</var> is zero and <var>true</var> is one, when converted to integers.)</p>
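<p>Putting the pieces of this section together gives the following self-checking sketch. The namespace <var>demo</var> is my addition, there only to keep <var>count</var> from clashing with <var>std::count</var>; everything inside follows the definitions above.</p>

```cpp
namespace demo {  // hypothetical namespace, avoids clashes with std names

template<int...> struct sum;
template<> struct sum<> { static const int value = 0; };
template<int i, int... tail> struct sum<i, tail...> {
    static const int value = i + sum<tail...>::value;
};

template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };

template<class T> struct one { static const int value = 1; };

template<class... lst> struct count {
    // The pattern one<lst>::value is repeated once per element of lst.
    static const int value = sum<one<lst>::value...>::value;
};

template<class... lst> struct countPtrs {
    // Each Boolean is converted to 0 or 1 before being summed.
    static const int value = sum<isPtr<lst>::value ...>::value;
};

static_assert(count<int, char, void*>::value == 3, "three elements");
static_assert(countPtrs<int, char, void*>::value == 1, "one pointer");
static_assert(countPtrs<>::value == 0, "empty list");

}  // namespace demo
```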
<p>You may find a more complex practical example in the Gregor, J&auml;rvi, and Powell paper (see bibliography).</p>
<h2>Continuations</h2>
<p>List comprehension can be used to define some very useful higher-order functions. One such function is <var>map</var>, which takes a list and applies a unary function to each element, resulting in a new list. You might be familiar with the runtime implementation of this algorithm in C++ STL under the name of <var>transform</var>. This is what <var>map</var> looks like in Haskell:</p>
<pre>map f lst = [f x | x &lt;- lst]</pre>
<p>Here, <var>f</var> is the unary function and <var>lst</var> is the input list. You have to admire the terseness and elegance of this notation. </p>
<p>The first impulse would be to translate it into C++ TMP as:</p>
<pre>template&lt;template&lt;class&gt; class f, class... lst&gt; struct
<span style="color:#00f;">map</span> {
    typedef f&lt;lst&gt;... type;
};</pre>
<p>This is surprisingly terse too. The problem is that it doesn&#8217;t compile. As far as I know there is no way for a template to &#8220;return&#8221; a variable list of elements. In my opinion, this is a major language design flaw, but that&#8217;s just me. </p>
<p>There are several workarounds, none of them too exciting. One is to define a separate entity called a <var>typelist</var> along the lines of:</p>
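<pre>// Just a declaration
template&lt;class... list&gt; struct
<span style="color:#00f;">typelist</span>;

template&lt;class hd, class... tl&gt; struct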
<span style="color:#00f;">typelist&lt;hd, tl...&gt;</span> {
    typedef hd head;
    typedef typelist&lt;tl...&gt; tail;
};</pre>
<p>(As a matter of fact I have implemented typelists and related algorithms both in C++ and D.)</p>
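<p>For illustration, here is a minimal sketch of how a typelist lets <var>map</var> &#8220;return&#8221; its result as a single type. It is not the implementation mentioned above: this <var>typelist</var> is a bare wrapper without <var>head</var> and <var>tail</var>, and the names <var>map_to_list</var> and <var>add_ptr</var> are hypothetical.</p>

```cpp
#include <type_traits>  // std::is_same

// Simplified wrapper: it only carries the pack, nothing else.
template<class...> struct typelist {};

// Hypothetical unary "function" on types: T -> T*
template<class T> struct add_ptr { typedef T* type; };

// The mapped pack cannot be returned directly, but it can be
// wrapped in a typelist, which is an ordinary type.
template<template<class> class f, class... lst> struct map_to_list {
    typedef typelist<typename f<lst>::type...> type;
};

static_assert(std::is_same<map_to_list<add_ptr, int, char>::type,
                           typelist<int*, char*> >::value,
              "each element was mapped through add_ptr");
static_assert(std::is_same<map_to_list<add_ptr>::type, typelist<> >::value,
              "mapping the empty list gives the empty list");
```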
<p>Another approach is to use continuations. Template parameter packs cannot be returned, but they can be passed to variadic templates (after expansion). So how about defining an algorithm like <var>map</var> to take one additional function that would consume the list that is the result of mapping? Such a function is often called a continuation, since it continues the calculation where normally one would return the result. First, let&#8217;s do it in Haskell:</p>
<pre>map_cont cont f lst = cont [f x | x &lt;- lst]</pre>
<p>The function <var>map_cont</var> is just like <var>map</var> except that it takes a continuation, <var>cont</var>, and applies it to the result of mapping. We can test it by defining yet another implementation of <var>count</var>:</p>
<pre>count_cont lst = map_cont sum one lst</pre>
<p>The continuation here is the function <var>sum</var> that will be applied to the list produced by acting with function <var>one</var> on the list <var>lst</var>. Since this is quite a handful, let me rewrite it in a more familiar notation of runtime C++:</p>
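<pre>#include &lt;list&gt;

int map_cont(int (*cont)(std::list&lt;int&gt;), int (*f)(int), std::list&lt;int&gt; lst) {
    std::list&lt;int&gt; tmp;
    for (auto it = lst.begin(); it != lst.end(); ++it)
        tmp.push_back(f(*it)); // push_back keeps the original element order
    return cont(tmp);
}</pre>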
<p>Now for the same thing in compile-time C++:</p>
<pre>template&lt;template&lt;class...&gt; class cont,
         template&lt;class&gt; class f,
         class... lst&gt; struct
<span style="color:#00f;">map_cont</span> {
    static const int value =
        cont&lt;typename f&lt;lst&gt;::type ...&gt;::value;
};</pre>
<p>It&#8217;s a one-to-one mapping of Haskell code&#8211;and it has very little in common with the iterative runtime C++ implementation. Also notice how loose the typing is in the TMP version as compared with the runtime version. The continuation is declared as a variadic template taking types, function <var>f</var> is declared as taking a type, and the list is a variadic list of types. Nothing is said about return types, except for the constraints that <var>f</var> returns a type (because <var>cont</var> consumes a list of types) and <var>cont</var> returns an <var>int</var>. Actually, this last constraint can be relaxed if we turn integers into types&#8211;a standard trick (hack?) in TMP:</p>
<pre>template&lt;int n&gt; struct
<span style="color:#00f;">Int</span> {
    static const int value = n;
};</pre>
<p>Loose typing&#8211;or &#8220;kinding,&#8221; as it is called for types of types&#8211;is an essential part of compile-time programming. In fact the popularity of the above trick shows that C++ kinding might be already too strong.</p>
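<p>Here is a sketch of how the pieces might fit together to give a compile-time <var>count_cont</var>. Two things are assumptions of mine: <var>one</var> is adapted to return the type <var>Int&lt;1&gt;</var> rather than a number (since <var>f</var> must return a type), and <var>sum_types</var> is a hypothetical continuation that adds up the <var>value</var> members of a list of such types.</p>

```cpp
template<int n> struct Int {
    static const int value = n;
};

// Adapted for this sketch: returns the type Int<1> for any argument.
template<class T> struct one {
    typedef Int<1> type;
};

// Hypothetical continuation: sums the values of a list of Int<n> types.
template<class... ints> struct sum_types;
template<> struct sum_types<> {
    static const int value = 0;
};
template<class head, class... tail> struct sum_types<head, tail...> {
    static const int value = head::value + sum_types<tail...>::value;
};

template<template<class...> class cont,
         template<class> class f,
         class... lst> struct map_cont {
    // Map f over lst and hand the resulting list straight to cont.
    static const int value = cont<typename f<lst>::type ...>::value;
};

template<class... lst> struct count_cont {
    static const int value = map_cont<sum_types, one, lst...>::value;
};

static_assert(count_cont<int, char, void*>::value == 3, "three elements");
static_assert(count_cont<>::value == 0, "empty list");
```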
<h2>The D Digression</h2>
<p>I&#8217;m grateful to Andrei Alexandrescu for reviewing this post. Since he objected to the sentence, &#8220;I&#8217;m disappointed that the D programming language followed the same path as C++ rather than lead the way,&#8221; I feel compelled to support my view with at least one example.  Consider various implementations of <var>all</var>. </p>
<p>In Haskell, beside the terse and elegant version I showed before:</p>
<pre>all pred [] = True
all pred (head:tail) = (pred head) &amp;&amp; (all pred tail)</pre>
<p>there is also a slightly more verbose one:</p>
<pre>all pred list =
    if null list then True
    else pred (head list) &amp;&amp; all pred (tail list)</pre>
<p>which translates better into D. The D version (taken from its standard library, Phobos) is not as short as Haskell&#8217;s, but follows the same functional paradigm:</p>
<pre>template allSatisfy(alias F, T...) {
    static if (T.length == 1)
    {
        alias F!(T[0]) <span style="color:#00f;">allSatisfy</span>;
    }
    else
    {
        enum bool <span style="color:#00f;">allSatisfy</span> = F!(T[0]) &amp;&amp; allSatisfy!(F, T[1 .. $]);
    }
}</pre>
<p>It definitely beats C++, there&#8217;s no doubt about it. There are some oddities about it though. Notice that the type tuple, <var>T&#8230;</var> gets the standard list treatment, but the split into the head and tail follows the array/slice notation. The head is <var>T[0]</var> and the tail is an array slice, <var>T[1..$]</var>. Instead of using <var>value</var> for the return value, D uses the &#8220;eponymous hack,&#8221; as I call it. The template &#8220;returns&#8221; the value using its own name. It&#8217;s a hack because it breaks down if you want to define the equivalent of &#8220;local variable&#8221; inside a template. For instance, the following code doesn&#8217;t compile:</p>
<pre>template allSatisfy(alias F, T...) {
    static if (T.length == 1)
    {
        alias F!(T[0]) allSatisfy;
    }
    else
    {
        <span style="color:#00f;">private enum bool tailResult = allSatisfy!(F, T[1..$]);</span>
        enum bool allSatisfy = F!(T[0]) &amp;&amp; tailResult;
    }
}</pre>
<p>This breaks one of the fundamental properties of any language: decomposability. You want to be able to decompose your calculation into smaller chunks and then combine them together. Of course, you may still use decomposition if you don&#8217;t use the eponymous hack and just call your return value <var>value</var>. But that means modifying all the calling sites. </p>
<p>By the way, this is how decomposition works in Haskell:</p>
<pre>all2 pred [] = True
all2 pred (head:tail) = (pred head) &amp;&amp; tailResult
    where tailResult = all2 pred tail</pre>
<p>Anyway, the important point is that there is <i>no reason</i> to force the functional programming paradigm at compile-time. The following hypothetical syntax would be much easier for programmers who are used to imperative programming:</p>
<pre>template allSatisfy(alias Pred, T...) {
    foreach(t; T)
        if (!Pred!(t))
            return false;
    return true;
}</pre>
<p>In fact it&#8217;s much closer to the parameterized runtime function proposed by Andrei:</p>
<pre>bool all(alias pred, Range)(Range r) {
    foreach (e; r)
        if (!pred(e))
            return false;
    return true;
}</pre>
<p>This is why I&#8217;m disappointed that the D programming language followed the same path as C++ rather than lead the way.</p>
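<p>For comparison only, here is a sketch of what an imperative, loop-based compile-time <var>allSatisfy</var> can look like in C++. It assumes a compiler supporting C++14 or later, where <var>constexpr</var> functions may contain loops, so it does not apply to the C++0x discussed in this post; the function itself is my own illustration, not part of any library.</p>

```cpp
template<class T> struct isPtr { static const bool value = false; };
template<class U> struct isPtr<U*> { static const bool value = true; };

// Requires C++14 or later (loops inside constexpr functions).
template<template<class> class Pred, class... T>
constexpr bool allSatisfy() {
    // The leading `true` keeps the array non-empty when T is an empty pack.
    const bool results[] = { true, Pred<T>::value... };
    for (bool r : results)
        if (!r)
            return false;
    return true;
}

static_assert(allSatisfy<isPtr, int*, char*>(), "all pointers");
static_assert(!allSatisfy<isPtr, int*, char>(), "char is not a pointer");
static_assert(allSatisfy<isPtr>(), "empty list");
```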
<h2>Conclusions</h2>
<p>I have argued that some familiarity with Haskell may be really helpful in understanding and designing templates in C++. You might ask why C++ chose such horrible syntax to do compile-time functional programming. Well, it didn&#8217;t. The ability to do compile-time calculations in C++ was <i>discovered</i> rather than built into the language. It was a very fruitful discovery, as the subsequent developments, especially the implementation of the Boost MPL, have shown. However, once the functional paradigm and its weird syntax took root in C++ TMP, it stayed there forever. I&#8217;m aware of only one effort to rationalize C++ TMP, by Daveed Vandevoorde in <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1471.pdf">Reflective Metaprogramming in C++</a>. </p>
<p>I have tested all code in this blog using the Hugs interpreter for Haskell and the GNU C++ compiler v. 4.4.1 with the special switch <var>-std=c++0x</var>. The names I used in the blog might conflict with the standard definitions. For instance, Hugs defines its own <var>map</var> and <var>foldr</var>.</p>
<p>If you want to learn more about template metaprogramming, I recommend two books:</p>
<ol>
<li>Andrei Alexandrescu, Modern C++ Design</li>
<li>David Abrahams and Aleksey Gurtovoy, C++ Template Metaprogramming</li>
</ol>
<p>The reference I used for variadic templates was the paper by <a href="http://www.osl.iu.edu/~dgregor/cpp/variadic-templates.pdf">Douglas Gregor, Jaakko J&auml;rvi, and Gary Powell</a></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/10/21/what-does-haskell-have-to-do-with-c/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>Ownership Systems against Data Races</title>
		<link>http://bartoszmilewski.wordpress.com/2009/09/22/ownership-systems-against-data-races/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/09/22/ownership-systems-against-data-races/#comments</comments>
		<pubDate>Tue, 22 Sep 2009 21:24:27 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Type System]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=1041</guid>
		<description><![CDATA[Here&#8217;s the video from my recent talk to the Northwest C++ Users Group (NWCPP) about how to translate the data-race free type system into a system of user-defined annotations in C++. I start with the definition of a data race and discuss various ways to eliminate them. Then I describe the ownership system and give [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Here&#8217;s the <a href="http://www.vimeo.com/6689999">video from my recent talk</a> to the <a href="http://www.nwcpp.org">Northwest C++ Users Group (NWCPP)</a> about how to translate the data-race free type system into a system of user-defined annotations in C++. I start with the definition of a data race and discuss various ways to eliminate them. Then I describe the ownership system and give a few examples of annotated programs.</p>
<p>Here are the <a href="http://www.nwcpp.org/Downloads/2009/Ownership_Systems_against_Data_Races.pdf">slides from the presentation</a>. They include extensive notes.</p>
<p>By the way, we are looking for speakers at the NWCPP, not necessarily  related to C++. We are thinking of changing the charter to include all programming languages. If you are near Seattle on Oct 21 09 (or any third Wednesday of the month), and you are ready to give a 60 min presentation, please contact me.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/09/22/ownership-systems-against-data-races/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>Template Metaprogramming Made Easy (Huh?)</title>
		<link>http://bartoszmilewski.wordpress.com/2009/09/08/template-metaprogramming-made-easy-huh/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/09/08/template-metaprogramming-made-easy-huh/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 16:49:52 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[D Programming Language]]></category>
		<category><![CDATA[Functional Programming]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=1016</guid>
		<description><![CDATA[&#8220;I&#8217;ve been doing some template metaprogramming lately,&#8221; he said nonchalantly. 
Why is it funny? Because template metaprogramming is considered really hard. I mean, &#252;ber-guru-level hard. I&#8217;m lucky to be friends with two such gurus, Andrei Alexandrescu who wrote the seminal &#8220;Modern C++ Design,&#8221; and Eric Niebler, who implemented the Xpressive library for Boost; so I [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>&#8220;I&#8217;ve been doing some template metaprogramming lately,&#8221; he said nonchalantly. </p>
<p>Why is it funny? Because template metaprogramming is considered <i>really</i> hard. I mean, &uuml;ber-guru-level hard. I&#8217;m lucky to be friends with two such gurus, Andrei Alexandrescu who wrote the seminal &#8220;<a href="http://www.amazon.com/exec/obidos/ASIN/0201704315/modecdesi-20">Modern C++ Design</a>,&#8221; and Eric Niebler, who implemented the <a href="http://www.boost.org/doc/libs/1_38_0/doc/html/xpressive.html">Xpressive</a> library for Boost; so I know the horrors. </p>
<p>But why is template metaprogramming so hard? A big part of it is that C++ templates are rather ill suited for metaprogramming, to put it mildly. They are fine for simple tasks like parameterized containers and some generic algorithms, but not for operations on types or lists of types. To make things worse, C++ doesn&#8217;t provide a lot in terms of reflection, so even such simple tasks as deciding whether a given type is a pointer are hard (see the example later). Granted, C++0x offers some improvements, like template parameter packs; but guru-level creds are still required.</p>
<p>So, I&#8217;ve been doing some template metaprogramming lately&#8230; in D. The D programming language doesn&#8217;t have the baggage of compatibility with previous botched attempts, so it makes many things considerably easier on programmers. But before I get to it, I&#8217;d like to talk a little about the connection between generic programming and functional programming, give a short intro to functional programming, and then show some examples in C++ and D that involve pattern matching and type lists.</p>
<h2>It&#8217;s not your father&#8217;s language</h2>
<p>The key to understanding metaprogramming is to realize that it&#8217;s done in a different language than the rest of your program. Both in C++ and D you use a form of functional language for that purpose. First of all, no mutation! If you pass a list of types to a template, it won&#8217;t be able to append another type to it. It will have to create a completely new list using the old list and the new type as raw materials.</p>
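<p>The &#8220;no mutation&#8221; rule can be sketched in a few lines of C++0x. The names <var>typelist</var> and <var>append</var> are hypothetical, chosen for this illustration: appending a type does not change the existing list, it produces a new one.</p>

```cpp
#include <type_traits>  // std::is_same

// Hypothetical list of types: a bare wrapper around a parameter pack.
template<class...> struct typelist {};

// Hypothetical "function": builds a new list from an old list and a type.
template<class List, class T> struct append;

template<class T, class... Ts> struct append<typelist<Ts...>, T> {
    typedef typelist<Ts..., T> type;
};

typedef typelist<int, char> before;
typedef append<before, long>::type after;

static_assert(std::is_same<after, typelist<int, char, long> >::value,
              "the result is a brand new list");
static_assert(std::is_same<before, typelist<int, char> >::value,
              "the original list is untouched");
```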
<p>Frankly, I don&#8217;t know why mutation should be disallowed at compile time (all template calculations are done at compile time). In fact, for templates that are used in D mixins, I proposed not to invent a new language but to use a subset of D that included mutation. It worked just fine and made mixins much easier to use (for an example, see my <a href="http://www.ddj.com/cpp/212201754">DrDobbs article</a>).</p>
<p>Once you disallow mutation, you&#8217;re pretty much stuck with the functional paradigm. For instance, you can&#8217;t have loops, which require a mutable loop counter or some other mutable state, so you have to use recursion. </p>
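<p>The classic illustration of recursion standing in for a loop is a compile-time factorial. This is only a sketch (the name <var>Factorial</var> is my own choice), but it shows the shape: a general case that &#8220;calls&#8221; itself with a smaller argument, and a specialization that ends the recursion.</p>

```cpp
// General case: n! = n * (n-1)!
template<unsigned n>
struct Factorial {
    static const unsigned long value = n * Factorial<n - 1>::value;
};

// Full specialization: the base case that stops the recursion.
template<>
struct Factorial<0> {
    static const unsigned long value = 1;
};

static_assert(Factorial<0>::value == 1, "base case");
static_assert(Factorial<5>::value == 120, "5! == 120");
```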
<p>You&#8217;d think functional programmers would love template metaprogramming; except that they flip over the horrendous syntax of C++ templates. The one thing going for functional programming is that it&#8217;s easy to define and implement. You can describe typeless lambda calculus with just a few formulas in operational semantics. </p>
<p>One thing is important though: the meta-language can&#8217;t be strongly typed, because a strongly typed language requires another language to implement generic algorithms on top of it. So to terminate the succession of meta-meta-meta&#8230; languages there&#8217;s a need for either a typeless, or at least dynamically-typed, top-level meta-language. My suspicion is that C++0x concepts failed so miserably because they dragged the metalanguage in the direction of strong typing. The nails in the coffin for C++ concepts were concept maps, the moral equivalent of implicit conversions in strongly-typed languages. </p>
<p>Templates are still not totally typeless. They distinguish between type arguments (introduced by <var>typename</var> or <var>class</var> in C++), template template arguments, and typed template arguments. Here&#8217;s an example that shows all three kinds:</p>
<pre>template&lt;class T, template&lt;class X&gt; class F, int n&gt;</pre>
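<p>Here is a small sketch of how such a parameter list might be put to use. <var>Box</var> and <var>Shelf</var> are made-up names for illustration: <var>T</var> is a type argument, <var>F</var> a template template argument, and <var>n</var> a typed (<var>int</var>) argument.</p>

```cpp
#include <type_traits>

// A one-parameter template to pass as the template template argument.
template<class X>
struct Box { X item; };

// All three kinds of template parameters in one declaration.
template<class T, template<class X> class F, int n>
struct Shelf {
    typedef F<T> element;        // F "applied" to T
    element slots[n];            // n of them
    static const int size = n;
};

typedef Shelf<char, Box, 4> CharShelf;

static_assert(CharShelf::size == 4, "the int argument is a compile-time constant");
static_assert(std::is_same<CharShelf::element, Box<char> >::value,
              "F applied to T gives Box<char>");
```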
<h2>Functional Programming in a Nutshell</h2>
<p>&#8220;Functions operating on functions&#8221;&#8211;that&#8217;s the gist of functional programming. The rest is syntactic sugar. Some of this sugar is very important. For instance, you want to have built-in integers and lists for data types, and pattern matching for dispatching. </p>
<h3>-Functions</h3>
<p>Here&#8217;s a very simple compile-time function in the C++ template language:</p>
<pre>template&lt;class T&gt;
struct IsPtr {
    static const bool apply = false;
};</pre>
<p>If it doesn&#8217;t look much like a function to you, here it is in more normal ad-hoc notation:</p>
<pre>IsPtr(T) {
    return false;
}</pre>
<p>You can &#8220;execute&#8221; or &#8220;call&#8221; this meta-function by instantiating the template <var>IsPtr</var> with a type argument and accessing its member <var>apply</var>:</p>
<pre>IsPtr&lt;int&gt;::apply;</pre>
<p>There is nothing magical about &#8220;apply&#8221;; you can call it anything (&#8220;result&#8221; or &#8220;value&#8221; are other popular identifiers). This particular meta-function returns a Boolean, but any compile-time constant may be returned. What&#8217;s more important, a type or a template may be returned as well. But let&#8217;s not get ahead of ourselves.</p>
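<p>As a brief preview of a meta-function that returns a type rather than a constant, consider this sketch. The name <var>AddPtr</var> and the member name <var>type</var> are my assumptions for illustration; the only change from <var>IsPtr</var> is that a nested typedef takes the place of the static constant.</p>

```cpp
#include <type_traits>

// AddPtr(T) "returns" the type T*.
template<class T>
struct AddPtr {
    typedef T* type;
};

static_assert(std::is_same<AddPtr<int>::type, int*>::value,
              "AddPtr(int) is int*");
// The result of one call can be fed to another.
static_assert(std::is_same<AddPtr<AddPtr<char>::type>::type, char**>::value,
              "calls compose");
```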
<h3>-Pattern matching</h3>
<p>You might be wondering what the use is for a function (I&#8217;ll be dropping the &#8220;meta-&#8221; prefix in what follows) that always returns <var>false</var> and is called <var>IsPtr</var>. Enter the next weapon in the arsenal of functional programmers: pattern matching. What we need here is to be able to match function arguments to different patterns and execute different code depending on the match. In particular, we&#8217;d like to return a different value, <var>true</var>, for T matching the pattern <var>T*</var>. In the C++ metalanguage this is done by partial template specialization. It&#8217;s enough to define another template of the same name that matches a more specialized pattern, <var>T*</var>:</p>
<pre>template&lt;class T&gt;
struct IsPtr&lt;T*&gt; {
    static const bool apply = true;
};</pre>
<p>When faced with the call, </p>
<pre>IsPtr&lt;int*&gt;::apply</pre>
<p>the compiler will first look for specializations of the template <var>IsPtr</var>, starting with the most specialized one. In our case, the argument <var>int*</var> matches the pattern <var>T*</var> so the version returning <var>true</var> will be instantiated. Accessing the <var>apply</var> member of this instantiation will result in the Boolean value <var>true</var>, which is exactly what we wanted. Let me rewrite this example using less obfuscated syntax.</p>
<pre>IsPtr(T*) {
    return true;
}
IsPtr(T) { // default case
    return false;
}</pre>
<p>D template syntax is slightly less complex than that of C++. The above example will read:</p>
<pre>template IsPtr(T) {
    static if (is (T dummy: U*, U))
        enum IsPtr = true;
    else
        enum IsPtr = false;
}
// Compile-time tests
static assert( IsPtr!(int*) );
static assert( !IsPtr!(int) );
</pre>
<p>As you can see, D offers compile-time <var>if</var> statements and more general pattern matching. The syntax of pattern matching is not as clear as it could be (what&#8217;s with the <var>dummy</var>?), but it&#8217;s more flexible. Compile-time constants are declared as <var>enum</var>s. </p>
<p>There is one little trick (a hack?) that makes the syntax of &#8220;function call&#8221; a little cleaner. If, inside the template, you define a member of the same name as the template itself (I call it the &#8220;eponymous&#8221; member), then you don&#8217;t have to use the &#8220;apply&#8221; syntax. The &#8220;call&#8221; looks more like a call, except for the exclamation mark before the argument list (a D tradeoff for not using angle brackets). You&#8217;ll see later how the eponymous trick fails for more complex cases.</p>
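<p>For comparison with the D compile-time tests, here is the C++ <var>IsPtr</var> assembled into one compilable unit, with the equivalent checks written as C++0x <var>static_assert</var>s. Treat it as a sketch; the assertion messages are my own.</p>

```cpp
// Default case: not a pointer.
template<class T>
struct IsPtr {
    static const bool apply = false;
};

// More specialized pattern: anything of the form T*.
template<class T>
struct IsPtr<T*> {
    static const bool apply = true;
};

// Compile-time tests
static_assert(IsPtr<int*>::apply, "int* matches the pattern T*");
static_assert(!IsPtr<int>::apply, "int falls through to the default case");
```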
<h3>-Lists</h3>
<p>The fundamental data structure in all functional languages is a list. Lists are very easy to operate upon using recursive algorithms and, as it turns out, they can be used to define arbitrarily complex data structures. No wonder C++0x felt obliged to introduce a compile-time type list as a primitive. It&#8217;s called a <i>template parameter pack</i> and the new syntax is:</p>
<pre>template&lt;class... T&gt; struct Foo;</pre>
<p>You can instantiate such a template with zero arguments,</p>
<pre>Foo&lt;&gt;</pre>
<p>one argument,</p>
<pre>Foo&lt;int&gt;</pre>
<p>or more arguments,</p>
<pre>Foo&lt;int, char*, void*&gt;</pre>
<p>How do you iterate over a type list? Well, there is no iteration in the metalanguage so the best you can do is to use recursion. To do that, you have to be able to separate the head of the list from its tail. Then you perform the action on the head and call yourself recursively with the tail. The head/tail separation is done using pattern matching. </p>
<p>Let me demonstrate a simple example from the paper <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2080.pdf">Variadic Templates</a> by Gary Powell et al. It calculates the length of a pack using recursion. First, the base case&#8211;a length-zero list:</p>
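<pre>// Primary template, declared only; the specializations supply the definitions
template&lt;typename... Args&gt;
struct count;
template&lt;&gt;
struct count &lt;&gt; {
    static const int value = 0;
};</pre>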
<p>That is the full specialization of a template, so it will be tried first. Here&#8217;s the general case:</p>
<pre>template&lt;typename Head, typename... Tail&gt;
struct count&lt;Head, Tail...&gt; {
    static const int value = 1 + count&lt;Tail...&gt;::value;
};</pre>
<p>Let&#8217;s see what it would look like in &#8220;normal&#8221; notation:</p>
<pre>count() {
    return 0;
}
count(head, tail) {
    return 1 + count(tail);
}</pre>
<p>And here&#8217;s the D version:</p>
<pre>template count(T...) {
    static if (T.length == 0)
        enum count = 0;
    else
        enum count = 1 + count!(T[1..$]);
}
// tests
static assert( count!() == 0);
static assert( count!(int, char*, char[]) == 3);</pre>
<p><var>T&#8230;</var> denotes a type tuple, which supports array-like access. To get to the tail of the list, D uses array slicing, where <var>T[1..$]</var> denotes the slice of the array starting from index 1 up to the length of the array (denoted by the dollar sign). I&#8217;ll explain the important differences between C++ pack and D tuple (including pack expansion) in the next installment.</p>
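<p>The C++ version can be tested at compile time in the same way as the D one. The sketch below puts the pieces together, including the primary template declaration that the specializations need. The last two assertions cross-check the recursion against <var>sizeof...</var>, the C++0x operator that yields the length of a pack directly.</p>

```cpp
// Primary template: declared, never defined.
template<typename... Args>
struct count;

// Base case: the empty list.
template<>
struct count<> {
    static const int value = 0;
};

// General case: split off the head, recurse on the tail.
template<typename Head, typename... Tail>
struct count<Head, Tail...> {
    static const int value = 1 + count<Tail...>::value;
};

// The same answer via the built-in sizeof... operator.
template<typename... Args>
struct builtin_count {
    static const int value = sizeof...(Args);
};

// tests
static_assert(count<>::value == 0, "empty list");
static_assert(count<int, char*, void*>::value == 3, "three types");
static_assert(builtin_count<>::value == count<>::value, "agrees on empty list");
static_assert(builtin_count<int, char*, void*>::value == 3, "agrees on three types");
```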
<h2>Conclusion</h2>
<p>When looked upon from the functional perspective, template metaprogramming doesn&#8217;t look as intimidating as it seems at first. Knowing this interpretation makes you wonder if there isn&#8217;t a better syntax or even a better paradigm for metaprogramming.</p>
<p>I&#8217;ll discuss more interesting parts of template metaprogramming in the next installment (this one is getting too big already). In particular, I&#8217;ll show examples of higher order meta-functions like <var>Filter</var> or <var>Not</var> and some interesting tricks with type lists.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/09/08/template-metaprogramming-made-easy-huh/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>Spawning a Thread, the D way</title>
		<link>http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 21:51:24 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=960</guid>
		<description><![CDATA[Spawning a thread in non-functional languages is considered a very low-level primitive. Often spawn or CreateThread takes a function pointer and an untyped (void) pointer to &#8220;data&#8221;. The newly created thread will execute the function, passing it the untyped pointer, and it&#8217;s up to the function to cast the data into something more palatable. This [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Spawning a thread in non-functional languages is considered a very low-level primitive. Often <var>spawn</var> or <var>CreateThread</var> takes a function pointer and an untyped (void) pointer to &#8220;data&#8221;. The newly created thread will execute the function, passing it the untyped pointer, and it&#8217;s up to the function to cast the data into something more palatable. This is indeed the lowest of the lowest. It&#8217;s the stinky gutters of programming. </p>
<p>Isn&#8217;t it much nicer to create a <var>Thread</var> or a <var>Runnable</var> object and let the ugly casting be done under the covers? But, as I argued before, the <var>Thread</var> object doesn&#8217;t really buy you much in terms of the most important safety issue: the avoidance of data races. So we can have a <var>Thread</var> object instead of a void pointer, and a <var>run</var> method that understands the format of the <var>Thread</var> object (or <var>Runnable</var>, take your pick). But because the <var>Thread</var> /<var>Runnable</var> object has reference semantics, we still end up inadvertently sharing data between threads. Unless the programmer consciously avoids or synchronizes shared access, he or she is left exposed to the most vile concurrency bugs&#8211;<i>by default</i>!</p>
<p>As they say, cooks cover their mistakes with sauces; doctors, with six feet of dirt; language designers, with objects. </p>
<h2>Requirements</h2>
<p>But enough ranting! I have the opportunity to design the <var>spawn</var> function for D and I don&#8217;t want to do any more cover-ups beyond hiding the ugly systems&#8217; APIs. Here are my design requirements:</p>
<ul>
<li><var>spawn</var> should take an arbitrary function as the main argument. It should refuse (at compile time) delegates or closures, which would introduce back-door sharing. (This might be relaxed later as we gain experience in controlling the sharing.)</li>
<li>It should take a variable number of arguments of the types compatible with those of the function parameters. It should detect type mismatches at compile time.</li>
<li>It should refuse the types of arguments that are prone to introducing data races. For now, I&#8217;ll allow only value types, immutable types, and explicitly shared types (<var>shared</var> is a type modifier in D).</li>
</ul>
<p>I wish I could use the more precise race-free type system that I&#8217;ve been describing in my previous posts, but since I can&#8217;t get it into D2, there&#8217;s still a little bit of &#8220;programmer beware&#8221; in this implementation.</p>
<p>These requirements seem like a tall order for any language other than D. I wouldn&#8217;t say it&#8217;s a piece of cake in D, but it&#8217;s well within the reach of a moderately experienced programmer.</p>
<h2>Unit Tests</h2>
<p>Let me start by writing a little use case for my design (Oh, the joys of extreme programming!):</p>
<pre>S s = { 3.14 };
Tid tid = <span style="color:red;">spawn</span>(&amp;thrFun, 2, s, "hello");
tid.join;</pre>
<p>Here&#8217;s the definition of the function, <var>thrFun</var>:</p>
<pre>void thrFun(int i, S s, string str) {
    writeln("thread function called with: ", i, ", ", s.fl, " and ", str);
}</pre>
<p>Its parameter types fulfill the restrictions I listed above. The <var>int</var> is a value and so is <var>S</var> (structs are value types in D, unless they contain references):</p>
<pre>struct S {
    float fl;
}</pre>
<p>Interestingly, the string is okay too, because its reference part is immutable. In D, a string is defined as an array of immutable characters, <var>immutable (char)[]</var>.</p>
<p>Besides positive tests, the even more important cases are negative. For instance, I don&#8217;t want <var>spawn</var> to accept a function that takes an <var>Object</var> as argument. Objects are reference types and (if not declared <var>shared</var>) can sneak in unprotected sharing. </p>
<p>How do you build unit tests whose compilation should fail? Well, D has a trick for that (ignore the ugly syntax): </p>
<pre>void fo(Object o) {}
assert (!__traits(compiles,
    (Object o) { return spawn(&amp;fo, o); }));</pre>
<p>This code asserts that the function literal (a lambda), </p>
<pre>(Object o){ return spawn(&amp;fo, o); }</pre>
<p>does not compile with the thread function <var>fo</var>. Now that&#8217;s one useful construct worth remembering!</p>
<h2>Implementation</h2>
<p>Without further ado, I present you with the implementation of <var>spawn</var> that passes all the above tests (and more):</p>
<pre>Tid spawn(T...)(void function(T) fp, T args)
    if (isLocalMsgTypes!(T))
{
    return core.thread.spawn( <span style="color:#c00;">(){ fp(args); }</span>);
}</pre>
<p>This attractively terse code uses quite a handful of D features, so let me first read it out loud for kicks:</p>
<ul>
<li><var>spawn</var> is a function template returning the <var>Tid</var> (Thread ID) structure. <var>Tid</var> is a reference-counted handle, see my previous blog.</li>
<li>It is parameterized by a type tuple <var>T&#8230;</var>.</li>
<li>It takes the following parameters:
<ul>
<li>a pointer to a function, <var>fp</var>, taking arguments of the types specified by the tuple <var>T&#8230;</var></li>
<li>a variable number of parameters, <var>args</var>, of types <var>T&#8230;</var>. </li>
</ul>
</li>
<li>The type tuple <var>T&#8230;</var> must obey the predicate <var>isLocalMsgTypes</var>, which is defined elsewhere.</li>
<li>The implementation of <var>spawn</var> calls the (in general, unsafe) function <var>core.thread.spawn</var> (defined in the module <var>core.thread</var>) with the following closure (nested function):
<pre>    (){ fp(args); }</pre>
<p>which captures local variables, <var>args</var>.</li>
</ul>
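<p>For readers more at home in C++, the same shape can be roughly approximated with C++0x variadic templates and <var>std::enable_if</var>. This is only a sketch under loud assumptions: the predicate <var>IsLocalMsgTypes</var> below is a crude stand-in of my own that merely rejects raw pointers and references. C++ has no <var>immutable</var> or <var>shared</var> qualifiers, so a class that hides a pointer inside would slip through.</p>

```cpp
#include <thread>
#include <type_traits>

// Hypothetical stand-in for isLocalMsgTypes: a list of types passes only
// if none of them is a raw pointer or a reference.
template<class... T>
struct IsLocalMsgTypes : std::true_type {};   // empty list: nothing to reject

template<class Head, class... Tail>
struct IsLocalMsgTypes<Head, Tail...>
    : std::integral_constant<bool,
          !std::is_pointer<Head>::value &&
          !std::is_reference<Head>::value &&
          IsLocalMsgTypes<Tail...>::value> {};

// Accepts only a plain function pointer whose parameter types pass the
// predicate. std::thread copies the arguments into the new thread.
template<class... T>
typename std::enable_if<IsLocalMsgTypes<T...>::value, std::thread>::type
spawn(void (*fp)(T...), T... args) {
    return std::thread(fp, args...);
}

static_assert(IsLocalMsgTypes<int, double>::value, "values are fine");
static_assert(!IsLocalMsgTypes<int, char*>::value, "raw pointers are rejected");
```

<p>With a thread function declared as <var>void work(int, double)</var>, the call <var>spawn(&amp;work, 2, 3.14)</var> would compile and return a <var>std::thread</var> to be joined, whereas a function taking <var>char*</var> would find no matching <var>spawn</var>.</p>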
<p>As you may guess, the newly spawned thread runs the closure, so it has access to captured <var>args</var> from the original thread. In general, that&#8217;s a recipe for a data race. What saves the day is the predicate <var>isLocalMsgTypes</var>, which defines what types are safe to pass as inter-thread messages. </p>
<p>Note the important point: there should be no difference between the constraints imposed on the types of parameters passed to <var>spawn</var> and the types of messages that can be sent to a thread. You can think of spawn parameters as initial messages sent to a nascent thread. As I said before, message types include value types, immutable types and shared types (no support for unique types yet).</p>
<h2>Useful D features</h2>
<p>Let me explain some of the D novelties I used in the definition of <var>spawn</var>. </p>
<p>A function with two sets of parenthesized parameters is automatically a template&#8211;the first set are template parameters, the second, runtime parameters. </p>
<h3>-Tuples</h3>
<p>Type tuples, like <var>T&#8230;</var>, represent arbitrary lists of types. Similar constructs have also been introduced in C++0x, presumably under pressure from Boost, to replace the unmanageably complex type lists. </p>
<p>What are the things that you can do with a type-tuple in D? You can retrieve its length (<var>T.length</var>), access its elements by index, or slice it; all at compile time. You can also define a variable-argument-list function, like <var>spawn</var> and use one symbol for a whole list of arguments, as in <var>T args</var>:</p>
<pre>Tid spawn(T...)(void function(T) fp, T args)</pre>
<p>Now let&#8217;s go back to my test:</p>
<pre>Tid tid = <span style="color:red;">spawn</span>(&amp;thrFun, 2, s, "hello");</pre>
<p>I spawn a thread to execute a function of three arguments, <var>void thrFun(int i, S s, string str)</var>. The <var>spawn</var> template is instantiated with a type tuple <var>(int, S, string)</var>. At compile time, this tuple is successfully tested by the predicate <var>isLocalMsgTypes</var>. The actual arguments to <var>spawn</var>, besides the pointer to function, are <var>(2, s, &#8220;hello&#8221;)</var>, which indeed are of the correct types. They appear inside <var>spawn</var> under the collective name, <var>args</var>. They are then used as a collective argument to <var>fp</var> inside the closure, <var>(){ fp(args); }</var>. </p>
<h3>-Closures</h3>
<p>The closure captures the arguments to <var>spawn</var>. It is then passed to the internal function (not a template anymore), </p>
<pre>core.thread.spawn(void delegate() dlg)</pre>
<p>When the new thread is created, it calls the closure <var>dlg</var>, which calls <var>fp</var> with the captured arguments. At that point, the value arguments, <var>i</var> and <var>s</var> are copied, along with the shallow part of the string, <var>str</var>. The deep part of the string, the buffer, is not copied&#8211;and for a good reason too&#8211; it is immutable, so it can safely be read concurrently. At that point, the thread function is free to use those arguments without worrying about races. </p>
<h3>-Restricted Templates</h3>
<p>The <var>if</var> statement before the body of a template is D&#8217;s response to C++0x DOA concepts (yes, after years of design discussions, concepts were finally killed with extreme prejudice). </p>
<pre>if (isLocalMsgTypes!(T))</pre>
<p>The <var>if</var> is used to create &#8220;restricted templates&#8221;. It contains a logical compile-time expression that is checked before the template is instantiated. If the expression is <var>false</var>, the template doesn&#8217;t match and you get a compile error. Notice that template restrictions not only produce better error messages, but can also impose restrictions that are otherwise impossible or very hard to enforce. Without the restriction, <var>spawn</var> could be called with an unsuitable type, e.g. an <var>Object</var> not declared as <var>shared</var> and the compiler wouldn&#8217;t even blink. </p>
<p>(I will talk about template restrictions and templates in general in a future blog.)</p>
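<p>The closest C++0x analogue I know of is <var>std::enable_if</var>, and the &#8220;does this call compile?&#8221; question asked earlier with <var>__traits(compiles, ...)</var> can be approximated with a small detection template. The sketch below assumes a made-up function <var>sendMsg</var> that is restricted to non-pointer types; <var>CanSend</var> is likewise my own name.</p>

```cpp
#include <type_traits>
#include <utility>

// A restricted template: it takes part in overload resolution only
// when T is not a pointer type.
template<class T>
typename std::enable_if<!std::is_pointer<T>::value, bool>::type
sendMsg(T) { return true; }

// CanSend<T>: true if the expression sendMsg(t) compiles for a t of type T.
template<class T, class = void>
struct CanSend : std::false_type {};

// Chosen only when the decltype expression is well formed.
template<class T>
struct CanSend<T, decltype(void(sendMsg(std::declval<T>())))>
    : std::true_type {};

// Positive and negative compile-time tests
static_assert(CanSend<int>::value, "values may be sent");
static_assert(!CanSend<int*>::value, "sending a raw pointer does not compile");
```

<p>Unlike the D constraint, a failed <var>enable_if</var> tends to surface as a long &#8220;no matching function&#8221; diagnostic, which is one reason the D syntax reads better.</p>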
<h3>-Message Types</h3>
<p>Besides values, we may also pass to <var>spawn</var> objects that are declared as <var>immutable</var> or <var>shared</var> (in fact, we may pass them inside values as well). In D, <var>shared</var> objects are supposed to provide their own synchronization&#8211;their methods must either be <var>synchronized</var> or lock free. An example of a shared object that you&#8217;d want to pass to <var>spawn</var> is a message queue&#8211;to be shared between the parent thread and the spawned thread.</p>
<p>You might remember that my race-free type system proposal included <var>unique</var> types, which would be great for message passing, and consequently as arguments to <var>spawn</var> (there is a uniqueness proposal for Scala, and there&#8217;s the Kilim message-passing system for Java based on unique types). Unfortunately, unique types won&#8217;t be available in D2. Instead some kind of specialized <var>Unique</var> library classes might be defined for that purpose.</p>
<h2>Conclusion</h2>
<p>The D programming language has two faces. On the one hand, it&#8217;s easy to use even for a beginner. On the other hand, it provides enough expressive power to allow for the creation of sophisticated and safe libraries. What I tried to accomplish in this post is to give a peek at D from the perspective of a library writer. In particular I described mechanisms that help make the concurrency library safer to use. </p>
<p>This is still work in progress, so don&#8217;t expect to see it in the current releases of D2. </p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/09/01/spawning-a-thread-the-d-way/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>The Anatomy of Reference Counting</title>
		<link>http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/#comments</comments>
		<pubDate>Wed, 19 Aug 2009 19:27:06 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[D Programming Language]]></category>
		<category><![CDATA[Multithreading]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=997</guid>
		<description><![CDATA[What is there to reference counting that is not obvious? In any language that supports deterministic destruction and the overloading of the copy constructor and the assignment operator it should be trivial. Or so I thought until I decided to implement a simple ref-counted thread handle in D. Two problems popped up: 

How does reference [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>What is there to reference counting that is not obvious? In any language that supports deterministic destruction and the overloading of the copy constructor and the assignment operator it should be trivial. Or so I thought until I decided to implement a simple ref-counted thread handle in D. Two problems popped up: </p>
<ol>
<li>How does reference counting interact with garbage collection?</li>
<li>How to avoid data races in a multithreaded environment?</li>
</ol>
<p>In purely garbage-collected languages, like Java, you don&#8217;t implement reference counting, period. Which is pretty bad, if you ask me. GC is great at managing memory, but not so good at managing other resources. When a program runs out of memory, it forces a collection and reclaims unused memory. When it runs out of, say, system thread handles, it doesn&#8217;t reclaim unused handles&#8211;it just dies. You can&#8217;t use GC to manage system handles. So, as far as system resources go, Java forces the programmer to use the moral equivalent of C&#8217;s <var>malloc</var> and <var>free</var>. The programmer must free the resources explicitly.</p>
<p>In C++ you have <var>std::shared_ptr</var> for all your reference-counting needs, but you don&#8217;t have garbage collection for memory&#8211;at least not yet. (There is also Microsoft&#8217;s C++/CLI, which mixes the two systems.)</p>
<p>D offers the best of both worlds: GC <i>and</i> deterministic destruction. So let&#8217;s use GC to manage memory and reference counting (or other policies, like uniqueness) to manage other limited resources. </p>
<h2>First attempt</h2>
<p>The key to reference counting is to have a &#8220;value&#8221; type, like the <var>shared_ptr</var> in C++, that can be cloned and passed between functions at will. Internally this value type must have access to a shared chunk of memory that contains the reference count. In <var>shared_ptr</var> this chunk is a separately allocated integral type&#8211;the counter. The important thing is that all clones of the same resource share the same counter. The counter&#8217;s value reflects how many clones there are. Copy constructors and assignment operators take care of keeping the count exact. When the count goes to zero, that is, when the last copy of the resource goes out of scope, the resource is automatically freed (for instance, by calling <var>CloseHandle</var>). </p>
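<p>In C++ terms, the scheme just described might be sketched as follows. Everything here is an assumption made for illustration: handles are plain <var>int</var>s, <var>closeHandle</var> merely bumps a global tally in place of a real system call, and the counter is a plain <var>int</var>, so this version is only good for a single thread.</p>

```cpp
#include <cassert>

// Stand-ins for a system resource (assumptions for illustration only).
typedef int Handle;
int g_closed = 0;                           // how many handles were released
void closeHandle(Handle) { ++g_closed; }

class RcHandle {
public:
    explicit RcHandle(Handle h) : h_(h), count_(new int(1)) {}
    RcHandle(const RcHandle& other) : h_(other.h_), count_(other.count_) {
        ++*count_;                          // one more clone
    }
    RcHandle& operator=(const RcHandle& other) {
        if (count_ != other.count_) {       // already sharing: nothing to do
            release();                      // let go of the old resource
            h_ = other.h_;
            count_ = other.count_;
            ++*count_;
        }
        return *this;
    }
    ~RcHandle() { release(); }
    int useCount() const { return *count_; }
private:
    void release() {
        if (--*count_ == 0) {               // last clone: free the resource
            closeHandle(h_);
            delete count_;
        }
    }
    Handle h_;
    int* count_;                            // shared by all clones
};

// Usage: two clones share one counter; the handle is closed exactly once.
inline int demo() {
    int before = g_closed;
    {
        RcHandle a(42);
        RcHandle b = a;
        assert(a.useCount() == 2);
    }
    return g_closed - before;               // number of handles closed
}
```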
<p>In my first attempt, I decided that the memory allocated for the counter should be garbage-collected. After all, the important thing is to release the handle&#8211;the memory will take care of itself. </p>
<pre>struct RcHandle {
   shared Counter _counter; // GC'd shared Counter object
   HANDLE _h;
   ~this() { // destructor
      if (_counter.dec() == 0) // access the counter
         CloseHandle(_h);
   }
   // methods that keep the counter up to date
}</pre>
<p><var>RcHandle</var> is a struct, which is a value type in D. <var>Counter</var> is a class, which is a reference type; so <var>_counter</var> really hides a pointer to shared memory.</p>
<p>A few tests later I got a nasty surprise. My program faulted while trying to access an already deallocated counter. How did that happen? How could the garbage collector deallocate my counter if I still had a reference to it?</p>
<p>Here&#8217;s what I did in my test: I embedded the ref-counted handle inside another garbage-collected object:</p>
<pre>class Embedder { // GC'd class object
   RcHandle _rc;
}</pre>
<p>When the time came to collect that object (which happened after the program ran to completion), its finalizer was called. Whenever an object contains fields that have non-trivial destructors, the compiler generates a finalizer that calls the destructors of those embedded objects&#8211;<var>_rc</var> in this case. The destructor of the ref-counted handle checks the reference count stored in the counter. Unfortunately the counter didn&#8217;t exist anymore. Hours of debugging later I had the answer.</p>
<p>What happened is that the garbage collector had two objects on its list: the embedder and the counter. It just so happened that the collector decided to collect those two objects in reverse order: the counter first, then the embedding object. So, by the time it got to the finalizer of the embedding object, the counter was gone! </p>
<p>What I discovered (with the help of other members of the D team who were involved in the discussion)  was that there are some limitations on mixing garbage collection with deterministic destruction. There is a general rule:</p>
<p><b>An object&#8217;s destructor must not access any garbage-collected objects embedded in it.</b></p>
<p>Since the destructor of the ref-counted handle must have access to the counter, the counter must not be garbage-collectible. That means only one thing: it has to be allocated using <var>malloc</var> and explicitly deallocated using <var>free</var>. Which brings us to the second problem.</p>
<h2>Concurrency</h2>
<p>What can be simpler than an atomic reference count? On most processors you can atomically increment and decrement a memory location. You can even decrement and test the value in one uninterruptible operation. Problem solved! Or is it?</p>
<p>There is one tiny complication&#8211;the location that you are atomically modifying might disappear. I know, this is totally counter-intuitive. After all, the management of the counter follows the simple rule: the last to leave the room turns off the light. If the destructor of <var>RcHandle</var> sees the reference count going from one to zero, it knows that no other <var>RcHandle</var> has access to it, and it can safely <var>free</var> the counter. Who can argue with cold logic?</p>
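<p>The &#8220;last to leave turns off the light&#8221; rule can be sketched with the C++0x <var>std::atomic</var>. The <var>Counter</var> type and its method names are my assumptions. The decrement and the test happen in one atomic step, which is the part that looks bulletproof; the sketch says nothing about the lifetime of the counter itself.</p>

```cpp
#include <atomic>
#include <cassert>

struct Counter {
    std::atomic<int> refs;
    Counter() : refs(1) {}                  // the creator holds the first reference

    void inc() { refs.fetch_add(1); }

    // Atomically decrement; returns true only for the caller that took
    // the count from one to zero, i.e. the last one out of the room.
    bool decAndTest() { return refs.fetch_sub(1) == 1; }
};

// Usage: two owners, only the second release reports "last".
inline bool demo() {
    Counter c;
    c.inc();
    bool first = c.decAndTest();            // false: one owner remains
    bool second = c.decAndTest();           // true: count reached zero
    assert(!first);
    return second;
}
```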
<p>Here&#8217;s the troubling scenario: <var>RcHandle</var> is embedded in an object that is visible from two threads: </p>
<pre>class Embedder {
   RcHandle _rcH;
}
shared Embedder emb;</pre>
<p>Thread 1 tries to overwrite the handle:</p>
<pre>RcHandle myHandle1;
emb._rcH = myHandle1;</pre>
<p>while Thread 2 tries to copy the same handle to a local variable:</p>
<pre>RcHandle myHandle2 = emb._rcH;</pre>
<p>Consider the following interleaving:</p>
<ol>
<li>T2: Load the address of the <var>_counter</var> embedded in <var>_rcH</var>.</li>
<li>T1: Swap <var>emb._rcH</var> with <var>myHandle1</var>.</li>
<li>T1: Decrement the counter that was embedded in <var>_rcH</var>. If it&#8217;s zero (and it is, in this example), free the counter.</li>
<li>T2: Increment the <var>_counter</var>. Oops! This memory location has just been freed.</li>
</ol>
<p>The snag is that there is a window between T2 reading the pointer in (1), and incrementing the location it&#8217;s pointing to in (4). Within that window, the reference count does not match the actual number of clients having access to the pointer. If T1 happens to do its ugly deed of freeing the counter within that window, the race may turn out deadly. (This problem has been known for some time and there were various proposals to fix it, for instance using DCAS, as in this paper on <a href="http://research.sun.com/people/moir/pubs/LFRC-DC02.pdf">Lock-Free Reference Counting</a>.)</p>
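<p>The four-step interleaving can be replayed deterministically in a single thread. The C++ sketch below is only a model: <var>Counter</var>, <var>Handle</var> and <var>replayInterleaving</var> are hypothetical names, and a <var>freed</var> flag stands in for the real call to <var>free</var> so that the replay itself stays well-defined.</p>

```cpp
#include <cassert>

// Stand-in for the malloc'ed counter; 'freed' records what free() would do.
struct Counter { int count; bool freed; };
struct Handle  { Counter* counter; };

// Replays steps 1-4 in order. Returns true if the counter T2 is about to
// increment in step 4 was already released by T1 in step 3.
bool replayInterleaving() {
    Counter shared = {1, false};  // owned only by emb._rcH
    Counter fresh  = {1, false};  // owned by myHandle1
    Handle embHandle = {&shared};
    Handle myHandle1 = {&fresh};

    Counter* seenByT2 = embHandle.counter;   // 1. T2 loads the counter address

    Counter* old = embHandle.counter;        // 2. T1 swaps the two handles
    embHandle.counter = myHandle1.counter;
    myHandle1.counter = old;
    assert(seenByT2 == old);                 // both threads hold the same address

    if (--old->count == 0) old->freed = true;  // 3. T1 decrements and "frees"

    return seenByT2->freed;                  // 4. T2 would now increment freed memory
}
```

<p>In the real program step 4 is a write to deallocated memory, which is undefined behavior rather than a flag that can be inspected.</p>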
<p>Should we worry? After all, the C++ <var>shared_ptr</var> also exposes this race and nobody is crying havoc. It turns out that it all boils down to the responsibilities of the shared object (and I&#8217;m grateful to <a href="http://erdani.org/">Andrei</a> for pointing it out). </p>
<p><b>A shared object should not willy-nilly expose its implementation to clients</b></p>
<p>If the clients of <var>Embedder</var> want access to the handle, they should call a <var>synchronized</var> method. Here&#8217;s the correct, race-free implementation of the <var>Embedder</var> in the scenario I just described.</p>
<pre>class Embedder {
private:
   RcHandle _rcH;
public:
   synchronized RcHandle GetHandle() const { return _rcH; }
   synchronized void SetHandle(RcHandle h) { _rcH = h; }
   ...
}</pre>
<p>The method <var>GetHandle</var> copies <var>_rcH</var> and increments its count under the <var>Embedder</var>&#8217;s lock. Another thread calling <var>SetHandle</var> has no chance of interleaving with that action, because it is forced to use the same lock. D2 actually enforces this kind of protection for shared objects, so my original example wouldn&#8217;t even compile.</p>
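<p>C++ has no <var>synchronized</var> keyword, so the same discipline has to be spelled out by hand. Here is a rough sketch, assuming <var>std::shared_ptr&lt;int&gt;</var> as a stand-in for <var>RcHandle</var> and a <var>std::mutex</var> as the embedder&#8217;s lock; the lower-case method names and <var>exampleEmbedder</var> are illustrative.</p>

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Sketch: the embedder, not the handle, owns the synchronization.
class Embedder {
public:
    explicit Embedder(std::shared_ptr<int> h) : _rcH(std::move(h)) {}

    // The copy, and with it the reference-count increment, happens under the lock.
    std::shared_ptr<int> getHandle() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _rcH;
    }

    // The overwrite, and the release of the old value, happens under the same lock.
    void setHandle(std::shared_ptr<int> h) {
        std::lock_guard<std::mutex> lock(_mutex);
        _rcH = std::move(h);
    }

private:
    mutable std::mutex _mutex;
    std::shared_ptr<int> _rcH;
};

// Usage: a handle obtained through getHandle keeps its target alive
// even after the embedder has been pointed elsewhere.
void exampleEmbedder() {
    Embedder emb(std::make_shared<int>(1));
    std::shared_ptr<int> mine = emb.getHandle();
    assert(*mine == 1 && mine.use_count() == 2);
    emb.setHandle(std::make_shared<int>(2));
    assert(*mine == 1 && mine.use_count() == 1);
    assert(*emb.getHandle() == 2);
}
```

<p>Unlike D2, nothing in C++ stops a client from bypassing the lock if <var>_rcH</var> is made public; the protection here is by convention only.</p>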
<p>You might be thinking right now that all this is baloney because I&#8217;m trying to fit a square peg into a round hole. I&#8217;m imposing value semantics on a non-atomic object, and you cannot atomically overwrite a non-atomic object. However, using this logic, you could convince yourself that a <var>double</var> cannot have value semantics (on most common architectures doubles are too large to be atomic). And yet you can safely pass doubles between threads and assign them to each other (which is the same as overwriting). It&#8217;s only when you embed a <var>double</var> inside a shared object that you <i>have to</i> protect it from concurrent access. And it&#8217;s not the <var>double</var> protecting itself&#8211;it&#8217;s the shared embedder that is responsible for synchronization. It&#8217;s exactly the same with <var>RcHandle</var>.</p>
<h2>Conclusions</h2>
<p>Just when you thought you knew everything about reference counting, you discover something you haven&#8217;t thought about. The take-home message is that mixing garbage collection with deterministic destruction is not a trivial matter, and that a race-free reference-counted object is vulnerable to races when embedded in another shared object. Something to keep in mind when programming in D or in C++.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>On Actors and Casting</title>
		<link>http://bartoszmilewski.wordpress.com/2009/07/16/on-actors-and-casting/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/07/16/on-actors-and-casting/#comments</comments>
		<pubDate>Thu, 16 Jul 2009 18:28:01 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[D Programming Language]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Multithreading]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Type System]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=935</guid>
		<description><![CDATA[Is the Actor model just another name for message passing between threads? In other words, can you consider a Java Thread object with a message queue an Actor? Or is there more to the Actor model? Bartosz investigates. 
I&#8217;ll start with listing various properties that define the Actor Model. I will discuss implementation options in [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bartoszmilewski.wordpress.com&blog=3549518&post=935&subd=bartoszmilewski&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Is the Actor model just another name for message passing between threads? In other words, can you consider a Java <var>Thread</var> object with a message queue an Actor? Or is there more to the Actor model? Bartosz investigates. </p>
<p>I&#8217;ll start with listing various properties that define the Actor Model. I will discuss implementation options in several languages.</p>
<h2>Concurrency</h2>
<p><b>Actors are objects that execute concurrently.</b> Well, sort of. Erlang, for instance, is not an object-oriented language, so we can&#8217;t really talk about &#8220;objects&#8221;. An actor in Erlang is represented by a thing called a Process ID (Pid). But that&#8217;s nitpicking. The second part of the statement is more interesting. Strictly speaking, an actor may execute concurrently but at times it will not. For instance, in Scala, actor code may be executed by the calling thread. </p>
<p>Caveats aside, it&#8217;s convenient to think of actors as objects with a thread inside.</p>
<h2>Message Passing</h2>
<p><b>Actors communicate through message passing.</b> Actors don&#8217;t communicate using shared memory (or at least pretend not to). The only way data may be passed between actors is through messages. </p>
<p>Erlang has a primitive send operation denoted by the exclamation mark. To send a message <var>Msg</var> to the process (actor) <var>Pid</var> you write:</p>
<pre>Pid ! Msg</pre>
<p>The message is copied to the address space of the receiver, so there is no sharing. </p>
<p>If you were to imitate this mechanism in Java, you would create a <var>Thread</var> object with a mailbox (a concurrent message queue), with no public methods other than <var>put</var> and <var>get</var> for passing messages. Enforcing copy semantics in Java is impossible so, strictly speaking, mailboxes should only store built-in types. Note that passing Java <var>String</var>s is okay, since strings are immutable. </p>
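<p>As a rough illustration of such a mailbox, here is a minimal blocking queue, sketched in C++ rather than Java so that messages can be passed by value. The class name <var>Mailbox</var> and the function <var>exampleMailbox</var> are assumptions made for this sketch; only <var>put</var> and <var>get</var> come from the description above.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// A mailbox for one message type. Messages are taken by value, so the
// sender keeps no reference to what it has sent.
template <typename T>
class Mailbox {
public:
    void put(T msg) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _queue.push_back(std::move(msg));
        }
        _ready.notify_one();  // wake one waiting receiver
    }

    // Blocks until a message is available, then removes and returns it.
    T get() {
        std::unique_lock<std::mutex> lock(_mutex);
        _ready.wait(lock, [this] { return !_queue.empty(); });
        assert(!_queue.empty());
        T msg = std::move(_queue.front());
        _queue.pop_front();
        return msg;
    }

private:
    std::mutex _mutex;
    std::condition_variable _ready;
    std::deque<T> _queue;
};

// Usage: messages come out in the order they were put in.
void exampleMailbox() {
    Mailbox<std::string> box;
    box.put("hello");
    box.put("world");
    assert(box.get() == "hello");
    assert(box.get() == "world");
}
```

<p>Passing by value only isolates the message if <var>T</var> itself holds no pointers or references to shared state.</p>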
<h3>-Typed messages</h3>
<p>Here&#8217;s the first conundrum: in Java, as in any statically typed language, messages have to be typed. If you want to process more than one type of message, it&#8217;s not enough to have just one mailbox per actor. In Erlang, which is dynamically typed, one canonical mailbox per actor suffices. In Java, mailboxes have to be abstracted from actors. So an actor may have one mailbox for accepting strings, another for integers, etc. You build actors from those smaller blocks.</p>
<p>But having multiple mailboxes creates another problem: How to block, waiting for messages from more than one mailbox at a time without breaking the encapsulation? And when one of the mailboxes fires, how to retrieve the correct type of a message from the appropriate mailbox? I&#8217;ll describe a few approaches.</p>
<h3>-Pattern matching</h3>
<p>Scala, which is also a statically typed language, uses the power of functional programming to solve the typed messages problem. The <var>receive</var> statement uses pattern matching, which can match different types. It looks like a switch statement whose <var>case</var> labels are patterns. A pattern may specify the type it expects. You may send a string, or an integer, or a more complex data structure to an actor. A single <var>receive</var> statement inside the actor code may match any of those.</p>
<pre>receive {
    case s: String =&gt; println("string: "+ s)
    case i: Int =&gt; println("integer: "+ i)
    case m =&gt; println("unknown: "+ m)
}</pre>
<p>In Scala the type of a variable is specified after the colon, so <var>s:String</var> declares the variable <var>s</var> of the type <var>String</var>. The last case is a catch-all.</p>
<p>This is a very elegant solution to the difficult problem of marrying object-oriented programming to functional programming&#8211;a task at which Scala excels. </p>
<h3>-Casting</h3>
<p>Of course, we always have the option of escaping the type system. A mailbox could be just a queue of <var>Object</var>s. When a message is received, the actor could try casting it to each of the expected types in turn or use reflection to find out the type of the message. Here&#8217;s what Martin Odersky, the creator of Scala,  has to say about it:</p>
<blockquote><p>The most direct (some would say: crudest) form of decomposition uses the type-test and type-cast instructions available in Java and many other languages.</p></blockquote>
<p>In the paper he co-authored with Emir and Williams (<a href="http://lampwww.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf">Matching Objects With Patterns</a>) he gives the following evaluation of this method:</p>
<blockquote><p>Evaluation: Type-tests and type-casts require zero overhead for the class hierarchy. The pattern matching itself is very verbose, for both shallow and deep patterns. In particular, every match appears as both a type-test and a subsequent type-cast. The scheme raises also the issue that type-casts are potentially unsafe because they can raise ClassCastExceptions. Type-tests and type-casts completely expose representation. They have mixed characteristics with respect to extensibility. On the one hand, one can add new variants without changing the framework (because there is nothing to be done in the framework itself). On the other hand, one cannot invent new patterns over existing variants that use the same syntax as the type-tests and type-casts.
</p></blockquote>
<p>The best one could do in C++ or D is to write generic code that hides casting from the client. Such generic code could use <i>continuations</i> to process messages after they&#8217;ve been cast. A continuation is a function that you pass to another function to be executed after that function completes (strictly speaking, a real continuation never returns, so I&#8217;m using this word loosely). The above example could be rewritten in C++ as:</p>
<pre>#include &lt;iostream&gt;
#include &lt;string&gt;

void onString(std::string const &amp; s) {
    std::cout &lt;&lt; "string: " &lt;&lt; s &lt;&lt; std::endl;
}
void onInt(int i) {
    std::cout &lt;&lt; "integer: " &lt;&lt; i &lt;&lt; std::endl;
}

// receive is the variadic template described in the text
receive&lt;std::string, int&gt; (&amp;onString, &amp;onInt);</pre>
<p>where <var>receive</var> is a variadic template (available in C++0x). It would do the dynamic casting and call the appropriate function to process the result. The syntax is awkward and less flexible than that of Scala, but it works. </p>
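<p>Here is one way such a <var>receive</var> might be put together. This is only a sketch under several assumptions: messages derive from a hypothetical <var>Message</var> base class and are wrapped in a hypothetical <var>Typed&lt;T&gt;</var>, the message is passed in explicitly instead of being pulled from a mailbox, and <var>tryHandle</var>, <var>describe</var> and <var>exampleReceive</var> are illustrative helpers.</p>

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical message hierarchy: the mailbox would store Message pointers.
struct Message { virtual ~Message() {} };

template <typename T>
struct Typed : Message {
    explicit Typed(T v) : value(std::move(v)) {}
    T value;
};

// Type-test plus type-cast, hidden from the client.
template <typename T, typename F>
bool tryHandle(const Message& msg, F& handler) {
    if (const Typed<T>* p = dynamic_cast<const Typed<T>*>(&msg)) {
        handler(p->value);
        return true;
    }
    return false;
}

// One type, one continuation.
template <typename T, typename F>
bool receive(const Message& msg, F handler) {
    return tryHandle<T>(msg, handler);
}

// Two or more types: try the first, then recurse on the rest.
template <typename T, typename T2, typename... Ts, typename F, typename... Fs>
bool receive(const Message& msg, F handler, Fs... rest) {
    return tryHandle<T>(msg, handler) || receive<T2, Ts...>(msg, rest...);
}

// Mirrors the three-way Scala match: string, integer, catch-all.
std::string describe(const Message& msg) {
    std::string out = "unknown";
    receive<std::string, int>(msg,
        [&](const std::string& s) { out = "string: " + s; },
        [&](int i) { out = "integer: " + std::to_string(i); });
    return out;
}

void exampleReceive() {
    assert(describe(Typed<std::string>("hi")) == "string: hi");
    assert(describe(Typed<int>(7)) == "integer: 7");
    assert(describe(Typed<double>(1.5)) == "unknown");
}
```

<p>The return value of <var>receive</var> tells the caller whether any continuation matched, which is how the catch-all case is handled here.</p>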
<p>The use of lambdas might make things a bit clearer. Here&#8217;s an example in D using lambdas (function literals), courtesy of Sean Kelly and Jason House:</p>
<pre>receive(
    (string s){ writefln("string: %s", s); },
    (int i){ writefln("integer: %s", i); }
);</pre>
<p>Interestingly enough, Scala&#8217;s <var>receive</var> is a library function with the pattern matching block playing the role of a continuation. Scala has syntactic sugar to make lambdas look like curly-braced blocks of code. Actually, each case statement is interpreted by Scala as a <i>partial function</i>&#8211;a function that is not defined for all values (or types) of arguments. The pattern matching part of <var>case</var> becomes the <var>isDefinedAt</var> method of this partial function object, and the code after that becomes its <var>apply</var> method. Of course, partial functions could also be implemented in C++ or D, but with a lot of superfluous awkwardness&#8211;lambda notation doesn&#8217;t help when partial functions are involved.</p>
<h3>-Isolation</h3>
<p>Finally, there is the problem of <b>isolation</b>. A message-passing system must be protected from data sharing. As long as the message is a primitive type and is passed by value (or an immutable type passed by reference), there&#8217;s no problem. But when you pass a mutable <var>Object</var> as a message, in reality you are passing a reference (a handle) to it. Suddenly your message is shared and may be accessed by more than one thread at a time. You either need additional synchronization outside of the Actor model or risk data races. Languages that are not strictly functional, including Scala, have to deal with this problem. They usually pass this responsibility, conveniently, to the programmer.</p>
<h3>-Kilim</h3>
<p>Java is not a good language for implementing the Actor model. You can extend Java though, and there is one such extension worth mentioning called <a href="http://www.malhar.net/sriram/kilim/kilim_ecoop08.pdf">Kilim</a> by Sriram Srinivasan and Alan Mycroft from Cambridge, UK. Messages in Kilim are restricted to objects with no internal aliasing, which have move semantics. The pre-processor (weaver) checks the structure of messages and generates appropriate Java code for passing them around. I tried to figure out how Kilim deals with waiting on multiple mailboxes, but there isn&#8217;t enough documentation available on the Web. The authors mention using the <var>select</var> statement, but never provide any details or examples.</p>
<p><i>Correction: Sriram was kind enough to provide an example of the use of <var>select</var>:</i></p>
<pre>int n = Mailbox.select(mb0, mb1, .., timeout);</pre>
<p><i>The return value is the index of the mailbox, or -1 for the timeout. Composability is an important feature of the message passing model.</i></p>
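<p>The contract of <var>select</var> (index of a ready mailbox, or -1 on timeout) can be illustrated with a small C++ sketch. This is not Kilim&#8217;s implementation: <var>MailboxGroup</var> is a hypothetical class in which all mailboxes share one lock and one condition variable, and every message is an <var>int</var>, so the typed-messages problem is left aside.</p>

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical: a fixed set of int mailboxes sharing one lock and one
// condition variable, so a single wait covers all of them.
class MailboxGroup {
public:
    explicit MailboxGroup(std::size_t n) : _boxes(n) {}

    void put(std::size_t box, int msg) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _boxes.at(box).push_back(msg);
        }
        _ready.notify_all();
    }

    // Index of the first non-empty mailbox, or -1 if the timeout expires.
    int select(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(_mutex);
        int found = -1;
        _ready.wait_for(lock, timeout, [&] {
            found = firstNonEmpty();
            return found >= 0;
        });
        return found;
    }

    // Precondition: the mailbox is non-empty (e.g. select just returned it).
    int get(std::size_t box) {
        std::lock_guard<std::mutex> lock(_mutex);
        assert(!_boxes.at(box).empty());
        int msg = _boxes.at(box).front();
        _boxes.at(box).pop_front();
        return msg;
    }

private:
    int firstNonEmpty() const {
        for (std::size_t i = 0; i < _boxes.size(); ++i)
            if (!_boxes[i].empty()) return static_cast<int>(i);
        return -1;
    }

    std::mutex _mutex;
    std::condition_variable _ready;
    std::vector<std::deque<int>> _boxes;
};
```

<p>With a single consumer, a call to <var>select</var> followed by <var>get</var> on the returned index never blocks; with several consumers the two steps would have to be merged into one operation.</p>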
<h2>Dynamic Networks</h2>
<p>Everything I described so far is common to CSP (Communicating Sequential Processes) and the Actor model. Here&#8217;s what makes actors more general:</p>
<p><b>Connections between actors are dynamic.</b> Unlike processes in CSP, actors may establish communication channels dynamically. They may pass messages containing references to actors (or mailboxes). They can then send messages to those actors. Here&#8217;s a Scala example:</p>
<pre>receive {
    case (name: String, actor: Actor) =&gt;
        actor ! lookup(name)
}</pre>
<p>The original message is a tuple combining a string and an actor object. The receiver sends the result of <var>lookup(name)</var> to the actor it has just learned about. Thus a new communication channel between the receiver and the unknown actor can be established at runtime. (In Kilim the same is possible by passing mailboxes via messages.)</p>
<h2>Actors in D</h2>
<p>The D programming language with <a href="http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/">my proposed race-free type system</a> could dramatically improve the safety of message passing. A race-free type system distinguishes between various types of sharing and enforces synchronization when necessary. For instance, since an <var>Actor</var> would be shared between threads, it would have to be declared <var>shared</var>. All objects inside a shared actor, including the mailbox, would automatically inherit the shared property. A shared message queue inside the mailbox could only store value types, <var>unique</var> types with move semantics, or reference types that are either immutable or are monitors (provide their own synchronization). These are exactly the types of messages that may be safely passed between actors. Notice that this is more than is allowed in Erlang (value types only) or Kilim (unique types only), but doesn&#8217;t include &#8220;dangerous&#8221; types that even Scala accepts (not to mention Java or C++). </p>
<p>I will discuss message queues in the next installment.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/07/16/on-actors-and-casting/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>What&#8217;s Wrong with the Thread Object?</title>
		<link>http://bartoszmilewski.wordpress.com/2009/07/07/whats-wrong-with-the-thread-object/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/07/07/whats-wrong-with-the-thread-object/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 22:37:55 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=923</guid>
		<description><![CDATA[I started writing a post about implementing actors in D when I realized that there was something wrong with the way thread spawning interacts with data sharing. Currently D&#8217;s Thread class closely mimics its Java counterpart, so I started wondering if this may be a more general problem&#8211;a problem with mixing object-oriented paradigm with multithreading.
In [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bartoszmilewski.wordpress.com&blog=3549518&post=923&subd=bartoszmilewski&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I started writing a post about implementing actors in D when I realized that there was something wrong with the way thread spawning interacts with data sharing. Currently D&#8217;s <var>Thread</var> class closely mimics its Java counterpart, so I started wondering if this may be a more general problem&#8211;a problem with mixing object-oriented paradigm with multithreading.</p>
<p>In functional languages like Erlang or Concurrent ML, you start a thread by calling a function&#8211;<var>spawn</var> or <var>create</var>, respectively. The argument to this function is another function&#8211;the one to be executed in a new thread. If you think of a new thread as a semi-separate application, the thread function is its &#8220;main&#8221;. You may also pass arguments to it&#8211;the equivalent of <var>argc</var>, <var>argv</var>, only more general. Those arguments are, of course, passed by value (we&#8217;re talking <i>functional</i> programming after all).</p>
<p>In non-functional languages it&#8217;s possible and often desirable to <i>share</i> data between threads. It&#8217;s very important to know which variables are shared and which aren&#8217;t, because shared variables require special handling&#8211;they need synchronization. You may share global variables with a thread, or you may pass shared variables to it during thread creation. You may also provide access to more shared variables during the thread&#8217;s lifetime by attaching and detaching them from the original shared variables&#8211;that&#8217;s how message queues may be described in this language.</p>
<p>In the object-oriented world everything is an object so, predictably, a (capital letter) <var>Thread</var> is an object too. An object combines data with code. There is a piece of data associated with every thread&#8211;the thread ID or a handle&#8211;and there are things you can do to a thread, like pause it, wait for its termination, etc.</p>
<p>But there&#8217;s more: A <var>Thread</var> in Java has a thread function. It&#8217;s a method called <var>run</var>. The user defines his or her own thread by inheriting from <var>Thread</var> and overriding <var>run</var>. Since <var>run</var> takes no arguments, data has to be passed to the new thread by making it part of the derived thread object. In general a thread object contains two types of data: shared and non-shared. The non-shared data are the value arguments passed by the creator to the thread function, possibly some return values, plus state used internally by the thread function. Thread data may form some logical abstraction or, as it often happens, have the structure of a kitchen sink after a dinner party. </p>
<p>Because of the presence of shared state, a <var>Thread</var> object must be considered shared, thus requiring synchronization. As I described in my previous posts, public methods of a shared object must either be synchronized or lock free. But what about the <var>run</var> method? It cannot be synchronized because that would make the whole thread object inaccessible to other threads, including its creator (which may be okay for a daemon thread, but would be a fatal flaw in general). </p>
<p>It makes perfect sense for <var>run</var> to be private, since nobody should call it from the outside (or, for that matter, from the inside). But the reason for private methods not requiring synchronization is that they are only called from public methods, which must be synchronized. This is not true for <var>run</var>&#8211;<var>run</var> is not called from under any lock! So essentially <var>run</var> has unfettered access to all (potentially shared) data stored in <var>this</var>. Unless the programmer is <i>very</i> disciplined, the potential for data races is high. And that&#8217;s where Java stands right now (the <var>Runnable</var> interface has the same problems).</p>
<p>If you&#8217;ve been following my blog, you know that I&#8217;m working on a <a href="http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/">type system that eliminates races</a>. If I can convince Walter and Andrei, this system might be implemented in the D programming language. As I mentioned, D&#8217;s treatment of threads comes directly from Java and therefore is inherently unsafe. Like in Java, D&#8217;s <var>Thread</var> object has a <var>run</var> method. </p>
<p>So how could D, with the race-free type system, improve on Java? One possibility is to make the <var>run</var> method <var>lockfree</var>. A <var>lockfree</var> method (public or private) is not synchronized but its access to <var>this</var> is severely restricted. It can only operate on <var>lockfree</var> data members (if any), unless it takes the object&#8217;s lock or calls a synchronized method (all public methods of a shared object are, by default, synchronized). Let me give you a small example:</p>
<pre>class Counter: Thread {
public:
    // public methods are implicitly synchronized
    void inc() { ++_cnt; }
    void dec() { --_cnt; }
private:
    override void run() <span style="color:red;">lockfree</span> {
        inc(); // ok: calling a synchronized method
        wait(10000);
        synchronized(this) { // ok: explicit synchronization on this
            _cnt = _cnt * _cnt;
        }
        // _cnt /= 2; // error: not synchronized!
    }
    int _cnt;
}
// usage:
auto cnt = new <span style="color:red;">shared</span> Counter;
cnt.start;
cnt.inc;
cnt.join;</pre>
<p>This approach would work and guarantee freedom from races, but I don&#8217;t think it fits D. Unlike Java, which is so OO that even <var>main</var> is a method of an object, D is a multi-paradigm language. It doesn&#8217;t <i>have to</i> force threads into an OO paradigm. And the thread function, being a thread equivalent of <var>main</var>, doesn&#8217;t <i>have to</i> be a method of any object. In my opinion, the functional approach to thread creation would serve D much better.</p>
<p>Here&#8217;s how I see it:</p>
<pre>class Counter {
public:
    void inc() { ++_cnt; }
    void dec() { --_cnt; }
    int get() const { return _cnt; }
    void set(int cnt) { _cnt = cnt; }
private:
    int _cnt;
}

void counterFun(shared Counter cnt) {
    cnt.inc;
    Thread.sleep(10000);
    synchronized(cnt) {
        int c = cnt.get;
        cnt.set(c * c);
    }
}
// usage:
auto cnt = new shared Counter;
Thread thr = Thread.spawn(&amp;counterFun, cnt);
thr.start;
cnt.inc;
thr.join;</pre>
<p>I find it more logical to have <var>start</var> and <var>join</var> operate on a different object, <var>thr</var>, rather than on the counter, <var>cnt</var>. The static template method <var>spawn</var> accepts a function that takes an arbitrary number of arguments that are either values or shared or unique objects. In particular, you could pass to it your (shared) communication channels or message queues to implement the message-passing paradigm. </p>
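<p>For comparison, C++11&#8217;s <var>std::thread</var> already follows this functional style: its constructor takes a function plus arguments, and sharing has to be requested explicitly with <var>std::ref</var>. The sketch below is a loose C++ analogue, not the proposed D API; the counter locks itself with a <var>std::mutex</var> because C++ has no <var>shared</var> qualifier, and <var>spawnAndCount</var> and <var>exampleSpawn</var> are illustrative names.</p>

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// A counter that provides its own synchronization.
class Counter {
public:
    void inc() { std::lock_guard<std::mutex> lock(_mutex); ++_cnt; }
    int get() const { std::lock_guard<std::mutex> lock(_mutex); return _cnt; }
private:
    mutable std::mutex _mutex;
    int _cnt = 0;
};

// The thread function is a free function; what it shares is visible
// in its signature.
void counterFun(Counter& cnt, int times) {
    for (int i = 0; i < times; ++i) cnt.inc();
}

// Spawns 'threads' workers on one shared counter and returns the final count.
int spawnAndCount(int threads, int times) {
    Counter cnt;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i)
        pool.emplace_back(counterFun, std::ref(cnt), times);  // starts immediately
    for (std::thread& t : pool) t.join();
    return cnt.get();
}

void exampleSpawn() {
    assert(spawnAndCount(4, 1000) == 4000);  // no increments are lost
}
```

<p>One difference from the D sketch: a <var>std::thread</var> starts running as soon as it is constructed, so there is no separate <var>start</var> call.</p>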
<p>Such a <var>spawn</var> primitive could be used to build more complex classes&#8211;including the equivalents of Java&#8217;s <var>Thread</var> or Scala&#8217;s <var>Actor</var>. </p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/07/07/whats-wrong-with-the-thread-object/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
		<item>
		<title>Multithreading Tutorial: Globals</title>
		<link>http://bartoszmilewski.wordpress.com/2009/06/23/multithreading-tutorial-globals/</link>
		<comments>http://bartoszmilewski.wordpress.com/2009/06/23/multithreading-tutorial-globals/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 17:39:34 +0000</pubDate>
		<dc:creator>Bartosz Milewski</dc:creator>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[Concurrency]]></category>
		<category><![CDATA[D Programming Language]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Multithreading]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Type System]]></category>

		<guid isPermaLink="false">http://bartoszmilewski.wordpress.com/?p=871</guid>
		<description><![CDATA[If it weren&#8217;t for the multitude of opportunities to shoot yourself in the foot, multithreaded programming would be easy. I&#8217;m going to discuss some of these &#8220;opportunities&#8221; in relation to global variables. I&#8217;ll talk about general issues and discuss the ways compilers can detect them. In particular, I&#8217;ll show the protections provided by my proposed [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bartoszmilewski.wordpress.com&blog=3549518&post=871&subd=bartoszmilewski&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If it weren&#8217;t for the multitude of opportunities to shoot yourself in the foot, multithreaded programming would be easy. I&#8217;m going to discuss some of these &#8220;opportunities&#8221; in relation to global variables. I&#8217;ll talk about general issues and discuss the ways compilers can detect them. In particular, I&#8217;ll show the protections provided by my proposed <a href="http://bartoszmilewski.wordpress.com/2009/05/26/race-free-multithreading/">extensions to the type system</a>.</p>
<h2>Global Variables</h2>
<p>There are so many ways the sharing of global variables between threads can go wrong, it&#8217;s scary. </p>
<p>Let me start with the simplest example: the declaration of a global object of class <var>Foo</var> (in an unspecified language with Java-like syntax). </p>
<pre>Foo TheFoo = new Foo;</pre>
<p>In C++ or Java, TheFoo would immediately be <em>visible to all threads</em>, even if <var>Foo</var> provided no synchronization whatsoever (strictly speaking Java doesn&#8217;t have global variables, but static data members play the same role). </p>
<p>If the programmer doesn&#8217;t do anything to protect shared data, the default immediately exposes her to data races. </p>
<p>The D programming language (version 2.0, also known as D2) makes a better choice&#8211;global variables are, by default, thread local. That takes away the danger of accidental sharing. If the programmer <em>wants to</em> share a global variable, she has to declare it as such:</p>
<pre>shared Foo TheFoo = new <span style="color:red;">shared</span> Foo;</pre>
<p>It&#8217;s still up to the designer of the class <var>Foo</var> to provide appropriate synchronization. </p>
<p>Currently, the only multithreaded guarantee for shared objects in D2 is the absence of <i>low-level</i> data races on multiprocessors&#8211;and even that, only in the safe subset of D. What are low-level data races? They are the races that break some lock-free algorithms, such as the infamous Double-Checked Locking Pattern. If I were to explain this to a Java programmer, I&#8217;d say that all data members in a shared object are <var>volatile</var>. This property propagates transitively to all objects the current object has access to. </p>
<p>Still, the following implementation of a shared object in D would most likely be incorrect even in the absence of low-level data races:</p>
<pre>class Foo {
    private int[] _arr;
    public void append(int i) {
       _arr ~= i; // array append
    }
}

auto TheFoo = new shared Foo;</pre>
<p>The problem is that an array in D has two fields: the length and the pointer to a buffer. In <var>shared Foo</var>, each of them would be updated atomically, but the duo would not. So two threads calling <var>TheFoo.append</var> could interleave their updates in an unpredictable way, possibly leading to loss of data.</p>
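<p>To make the lost update concrete, here is a minimal Java sketch that replays one such interleaving deterministically on a single thread. The class <var>TornAppend</var> and its two <var>volatile</var> fields are hypothetical stand-ins for the length and buffer of a D array, and the sketch ignores buffer reallocation; it illustrates the hazard and is not D&#8217;s actual array implementation.</p>
<pre>class TornAppend {
    // Hypothetical stand-in for a D array: a length plus a reference to a buffer.
    // Both fields are volatile, so each individual field access is atomic and visible.
    volatile int length = 0;
    volatile int[] buffer = new int[4];

    // Replays, on one thread, an interleaving that two unsynchronized appenders could produce.
    static int[] replay() {
        TornAppend arr = new TornAppend();
        int seenByA = arr.length;   // thread A reads length 0
        int seenByB = arr.length;   // thread B reads length 0 before A writes it back
        arr.buffer[seenByA] = 10;   // A stores its element in slot 0
        arr.length = seenByA + 1;   // A publishes length 1
        arr.buffer[seenByB] = 20;   // B overwrites slot 0
        arr.length = seenByB + 1;   // B publishes length 1 again
        return new int[] { arr.length, arr.buffer[0] };
    }

    public static void main(String[] args) {
        int[] result = replay();
        // prints "length = 1, slot 0 = 20"
        System.out.println("length = " + result[0] + ", slot 0 = " + result[1]);
    }
}</pre>
<p>Two appends were issued, yet the array ends up with length 1 and the value 10 is gone, even though no single field access was torn.</p>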
<p>My race-free type system goes further&#8211;it eliminates all data races, both low- and high-level. The same code would work differently in my scheme. When an object is declared <var>shared</var>, all its methods are automatically synchronized. <var>TheFoo.append</var> would take <var>Foo</var>&#8217;s lock and make the whole append operation atomic. (For the advanced programmer who wants to implement lock-free algorithms my scheme offers a special <var>lockfree</var> qualifier, which I&#8217;ll describe shortly.)</p>
<p>Now suppose that you were cautious enough to design your Java/D2 class <var>Foo</var> to be thread safe:</p>
<pre>class Foo {
    private int[] _arr;
    public <span style="color:red;">synchronized</span> void append(int i) {
       _arr ~= i; // array append
    }
}</pre>
<p>Does it mean your global variable, <var>TheFoo</var>, is safe to use? Not in Java. Consider this:</p>
<pre>static Foo TheFoo; // static = global
// Thread 1
TheFoo = new Foo();
// Thread 2
while (TheFoo == null)
    continue;
TheFoo.append(1);</pre>
<p>You won&#8217;t even know what hit you when your program fails. I will direct the reader to one of my <a href="http://bartoszmilewski.wordpress.com/2008/08/04/multicores-and-publication-safety/">older posts</a> that explains the problems of <i>publication safety</i> on a multiprocessor machine. The bottom line is that, in order to make your program work correctly in Java, you have to declare <var>TheFoo</var> as <var>volatile</var> (or <var>final</var>, if you simply want to prevent such usage). Again, it looks like in Java the defaults are stacked against safe multithreading.</p>
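<p>In Java terms, the <var>volatile</var> fix might look like the following minimal sketch. The class name <var>SafePublication</var>, the simplified <var>Foo</var> that merely counts appends, and the <var>run</var> helper are assumptions made for illustration; <var>Thread.onSpinWait</var> requires Java 9 or later.</p>
<pre>class SafePublication {
    // Simplified, hypothetical Foo: it only counts how many appends it has seen.
    static class Foo {
        private int count;
        synchronized void append(int i) { count++; }
        synchronized int size() { return count; }
    }

    // volatile: the publishing write happens-before any read that observes it,
    // so the reader sees a fully constructed Foo.
    static volatile Foo theFoo;

    // Thread 2: spin until the handle becomes non-null, then use the object.
    static void spinThenAppend() {
        while (theFoo == null) {
            Thread.onSpinWait();
        }
        theFoo.append(1);
    }

    static int run() {
        theFoo = null;
        Thread reader = new Thread(SafePublication::spinThenAppend);
        reader.start();
        theFoo = new Foo(); // Thread 1: publish the object
        try {
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return theFoo.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}</pre>
<p>Without the <var>volatile</var> qualifier, the Java memory model would allow the spinning thread to loop forever or to observe a partially constructed object.</p>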
<p>This is not a problem in D2, since <var>shared</var> implies volatile.</p>
<p>In my scheme, the default behavior of <var>shared</var> is different. It works like Java&#8217;s <var>final</var>. Code that tries to rebind the shared object (re-assign to the handle) would not compile. This is to prevent accidental lock-free programming. (If you haven&#8217;t noticed, the code that waits on the handle of <var>TheFoo</var> to switch from null to non-null <i>is</i> lock-free. The handle is not protected by any lock.) Unlike D2, I don&#8217;t want to make lock-free programming &#8220;easy,&#8221; because it <i>isn&#8217;t</i> easy. It&#8217;s almost as if D2 were <i>endorsing</i> lock-free programming by giving the programmer a false sense of security.</p>
<p>So what do you do if you really want to spin on the handle? You declare your object <var>lockfree</var>.</p>
<pre><span style="color:red;">lockfree</span> Foo TheFoo;</pre>
<p><var>lockfree</var> implies <var>shared</var> (it doesn&#8217;t make sense otherwise), but it also makes the handle &#8220;volatile&#8221;. All accesses to it will be made sequentially consistent (on the x86, it means all stores will compile to <var>xchg</var>).</p>
<p>Note that <var>lockfree</var> is shallow&#8211;data members of <var>TheFoo</var> don&#8217;t inherit the <var>lockfree</var> qualification. Instead, they inherit the implied <var>shared</var> property of <var>TheFoo</var>. </p>
<p>It&#8217;s not only object handles that can be made <var>lockfree</var>. Other atomic data types like integers, Booleans, etc., can also be <var>lockfree</var>. A <var>lockfree</var> <var>struct</var> is also possible&#8211;it is treated as a tuple all of whose elements are <var>lockfree</var>. There is no atomicity guarantee for the <var>struct</var> as a whole. Methods can be declared <var>lockfree</var> to turn off default synchronization.</p>
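<p>Java has no <var>lockfree</var> qualifier, but the classes in <var>java.util.concurrent.atomic</var> are a rough analog for a single lock-free integer. The sketch below is only that analogy, not part of my proposal; the class <var>LockFreeCounter</var> and the thread and iteration counts are made up for the example.</p>
<pre>import java.util.concurrent.atomic.AtomicInteger;

class LockFreeCounter {
    // Rough Java analog of a lockfree integer: reads and writes have volatile semantics.
    static final AtomicInteger hits = new AtomicInteger();

    static void hitManyTimes() {
        for (int i = 0; i != 1000; i++) {
            hits.incrementAndGet(); // atomic read-modify-write; no monitor is acquired
        }
    }

    static int run() {
        hits.set(0);
        Thread[] workers = new Thread[4];
        for (int t = 0; t != workers.length; t++) {
            workers[t] = new Thread(LockFreeCounter::hitManyTimes);
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return hits.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 4000
    }
}</pre>
<p>As with a <var>lockfree</var> <var>struct</var>, two such counters updated together would each be atomic on their own, with no guarantee for the pair.</p>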
<h2>Conclusion</h2>
<p>Even the simplest case of sharing a global variable between threads is fraught with danger. My proposal unobtrusively eliminates the most common traps. The defaults are carefully chosen to let beginners avoid the pitfalls of multithreaded programming.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://bartoszmilewski.wordpress.com/2009/06/23/multithreading-tutorial-globals/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c018f213204496b4bbf481e7c8e6c15c?s=96&#38;d=http%3A%2F%2Fa.wordpress.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">Bartosz Milewski</media:title>
		</media:content>
	</item>
	</channel>
</rss>