<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GHC Mutterings</title>
	<atom:link href="http://ghcmutterings.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://ghcmutterings.wordpress.com</link>
	<description>thoughts from the developers of GHC</description>
	<lastBuildDate>Mon, 25 Jan 2010 12:40:05 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='ghcmutterings.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/5cdcede448107b3ed8eaeae0ff715f9c?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>GHC Mutterings</title>
		<link>http://ghcmutterings.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://ghcmutterings.wordpress.com/osd.xml" title="GHC Mutterings" />
	<atom:link rel='hub' href='http://ghcmutterings.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Yielding more improvements in parallel performance</title>
		<link>http://ghcmutterings.wordpress.com/2010/01/25/yielding-more-improvements-in-parallel-performance/</link>
		<comments>http://ghcmutterings.wordpress.com/2010/01/25/yielding-more-improvements-in-parallel-performance/#comments</comments>
		<pubDate>Mon, 25 Jan 2010 11:38:34 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=104</guid>
		<description><![CDATA[
GHC&#8217;s parallel GC makes heavy use of hand-written spinlocks.  These are basically mutexes like those provided by the Unix pthreads API or, equivalently, Windows CriticalSections, except that they have no support for blocking the thread and waking it up; they just spin until the lock is acquired.  I did it this way for a [...] ]]></description>
			<content:encoded><![CDATA[<p>
GHC&#8217;s parallel GC makes heavy use of hand-written spinlocks.  These are basically mutexes like those provided by the Unix pthreads API or, equivalently, Windows <tt>CriticalSections</tt>, except that they have no support for blocking the thread and waking it up; they just spin until the lock is acquired.  I did it this way for a few reasons:
</p>
<ol>
<li>I sometimes need to acquire a lock on one thread and release it on another. This is not supported by traditional mutexes.</li>
<li>We expect all threads to be running and contention to be short-lived.</li>
<li>I like to know exactly what code is running.</li>
</ol>
<p>
Unfortunately, when using all the cores on the machine, it is common for one or more of our threads to be descheduled, and assumption (2) no longer holds.  When this happens, one or more of the other threads can be left spinning, waiting for the descheduled thread, and no progress is made (the CPU just gets warmer) until the OS decides to reschedule it, which might take a whole time slice.
</p>
<p>
This is something we knew about (see <a href="http://hackage.haskell.org/trac/ghc/ticket/3553">ticket #3553</a>), but didn&#8217;t have a good solution for, and was recently encountered in another context <a href="http://hackage.haskell.org/trac/ghc/ticket/3758">here</a>.  It has been called the &#8220;last core parallel slowdown&#8221; and seems to affect Linux more than other OSs, presumably due to the way the Linux scheduler works.  In my experience the effect is far more dramatic when using the 8th core of an 8-core box than when using the second core of a dual-core.
</p>
<p>
This problem was present in GHC 6.10, but is exacerbated in GHC 6.12 because we now do minor GCs in parallel, which can mean hundreds of all-core synchronisations per second.  The reason we do minor GCs in parallel is for locality; the results in our <a href="http://www.haskell.org/~simonmar/papers/multicore-ghc.pdf">ICFP&#8217;09 paper</a> clearly show the performance benefits here.
</p>
<p>
Using traditional mutexes instead of our hand-rolled spinlocks would help, but it&#8217;s not as simple as just swapping out our spinlocks for mutexes: as noted above, sometimes we acquire one of these locks on one thread and release it on another, and pthreads doesn&#8217;t allow that. It might be possible to restructure things such that this doesn&#8217;t happen, but I haven&#8217;t found a good way yet.  An alternative is to use condition variables, but this led to a severe reduction in performance when I tried it (see ticket #3553).
</p>
<p>
So as an experiment I tried adding an occasional call to &#8216;yield&#8217; (<tt>sched_yield</tt> on Unix, <tt>SwitchToThread</tt> on Windows) inside the code that acquires a spinlock, with a tunable number of spins between each call to yield (I&#8217;m using 1000). I also added a &#8216;yield&#8217; in the GC&#8217;s wait loop, so that threads with no work to do during parallel GC will repeatedly yield between searching for work.
</p>
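<p>
To make the idea concrete, here is a minimal sketch of a spin-then-yield lock, written in Haskell rather than the C of the runtime.  It is an illustration only, not the actual patch: <tt>SpinLock</tt>, <tt>acquire</tt>, <tt>release</tt> and <tt>spinsBeforeYield</tt> are names made up for this example, and <tt>Control.Concurrent.yield</tt> yields to GHC&#8217;s own scheduler rather than calling <tt>sched_yield</tt>.
</p>
<pre>
import Control.Concurrent (yield)
import Control.Monad (unless)
import Data.IORef (IORef, atomicModifyIORef', atomicWriteIORef, newIORef)

-- Hypothetical spinlock: False means free, True means held.
newtype SpinLock = SpinLock (IORef Bool)

newSpinLock :: IO SpinLock
newSpinLock = SpinLock &lt;$&gt; newIORef False

-- Number of failed attempts between yields (the post uses 1000).
spinsBeforeYield :: Int
spinsBeforeYield = 1000

acquire :: SpinLock -&gt; IO ()
acquire (SpinLock ref) = go spinsBeforeYield
  where
    go n = do
      -- Atomically mark the lock as held; the result says whether it was free.
      got &lt;- atomicModifyIORef' ref (\held -&gt; (True, not held))
      unless got $
        if n &lt;= 0
          then yield &gt;&gt; go spinsBeforeYield  -- give other threads a chance
          else go (n - 1)                   -- keep spinning

release :: SpinLock -&gt; IO ()
release (SpinLock ref) = atomicWriteIORef ref False
</pre>
<p>
Note that nothing ties <tt>release</tt> to the thread that called <tt>acquire</tt>, which is the property from reason (1) above.
</p>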
<p>
To my surprise, this change helped not only the &#8220;last core&#8221; case, but also parallel performance across the board. I imagine one reason for this is that the yields help reduce contention on the memory bus, particularly in the parallel GC.
</p>
<p>
Here are the results, measuring the benchmark programs used in our ICFP&#8217;09 paper.  First, using all 8 cores of an 8-core machine, running 64-bit programs on Fedora 9, comparing GHC before and after the patch:
</p>
<pre>
---------------------------
        Program   Elapsed
---------------------------
           gray    -40.3%
         mandel    -41.4%
        matmult    -12.7%
         parfib     -3.3%
        partree     -4.9%
           prsa    -14.6%
            ray    -10.9%
       sumeuler     -1.8%
---------------------------
 Geometric Mean    -17.7%
</pre>
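<p>
For readers wondering how the summary line relates to the rows: assuming it is the geometric mean of the after/before elapsed-time ratios (an assumption, since the post does not say how it is computed), the following sketch recomputes it from the eight figures in the table above.  The names <tt>changes</tt> and <tt>geoMeanChange</tt> are invented for this example.
</p>
<pre>
import Text.Printf (printf)

-- Elapsed-time changes from the 8-core table above, in percent.
changes :: [Double]
changes = [-40.3, -41.4, -12.7, -3.3, -4.9, -14.6, -10.9, -1.8]

-- Geometric mean of the after/before ratios, as a percentage change.
geoMeanChange :: [Double] -&gt; Double
geoMeanChange ps = (exp (sum (map log ratios) / fromIntegral (length ratios)) - 1) * 100
  where
    ratios = [1 + p / 100 | p &lt;- ps]

main :: IO ()
main = printf "%+.1f%%\n" (geoMeanChange changes)
</pre>
<p>
Under that assumption the result should agree with the -17.7% shown in the table.
</p>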
<p>
Now, using only 7 cores of the 8-core:
</p>
<pre>
--------------------------
        Program   Elapsed
--------------------------
           gray    -15.6%
         mandel    -18.8%
        matmult     -5.5%
         parfib     +2.9%
        partree     -5.0%
           prsa     -9.3%
            ray     +1.7%
       sumeuler     -0.9%
--------------------------
 Geometric Mean     -6.6%
</pre>
<p>
We even see a benefit when not using all the cores.
</p>
<p>
Here&#8217;s the difference between 7 and 8 cores, both with the new patch:
</p>
<pre>
--------------------------
        Program   Elapsed
--------------------------
           gray    +39.1%
         mandel     -8.0%
        matmult     -4.1%
         parfib    -10.0%
        partree     -1.3%
           prsa    -15.3%
            ray    +37.3%
       sumeuler    -11.4%
--------------------------
 Geometric Mean     +1.5%
</pre>
<p>
So the &#8220;last core&#8221; problem affects only two of our benchmarks (ray and gray), whereas the others all now improve when adding the last core.</p>
<p>
This nicely illustrates the problem of extrapolating from a single benchmark &#8211; programs vary greatly in their behaviour, and in my experience it&#8217;s impossible to find a single program that is &#8220;representative&#8221;. Using larger programs usually doesn&#8217;t help: large programs often have small inner loops.  This group of 8 programs is quite meager, and expanding it is something I&#8217;d like to do (send me your parallel programs!).
</p>
<p>
Here&#8217;s the comparison on a dual-core, using 32-bit programs on Ubuntu Karmic, comparing GHC before and after the patch:</p>
<pre>
-------------------------
        Program  Elapsed
-------------------------
           gray   -17.2%
         mandel   -13.4%
        matmult    -6.7%
         parfib    +0.4%
        partree    -1.5%
           prsa    -1.0%
            ray    +1.6%
       sumeuler    -8.7%
-------------------------
 Geometric Mean    -6.0%
</pre>
<p>
And since this is such a trivial patch, we can merge it into 6.12.2, which should hopefully be in the next <a href="http://hackage.haskell.org/platform/">Haskell Platform</a> release.
</p>
<p>
This is really an interim solution, since the real solution is not to do &#8220;stop-the-world&#8221; GC at all, at least for minor collections.  That&#8217;s something we&#8217;re also working on (hopefully for GHC 6.14), but in the meantime this patch gives some nice improvements for the 6.12 line, and shows that you can actually push a stop-the-world design quite a long way.</p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2010/01/25/yielding-more-improvements-in-parallel-performance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>Parallelism /= Concurrency</title>
		<link>http://ghcmutterings.wordpress.com/2009/10/06/parallelism-concurrency/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/10/06/parallelism-concurrency/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 22:37:48 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=94</guid>
		<description><![CDATA[If you want to make programs go faster on parallel hardware, then you need some kind of concurrency.  Right?
In this article I&#8217;d like to explain why the above statement is false, and why we should be very clear about the distinction between concurrency and parallelism.  I should stress that these ideas are not mine, and [...] ]]></description>
			<content:encoded><![CDATA[<p>If you want to make programs go faster on parallel hardware, then you need some kind of concurrency.  Right?</p>
<p>In this article I&#8217;d like to explain why the above statement is false, and why we should be very clear about the distinction between concurrency and parallelism.  I should stress that these ideas are not mine, and are by no means new, but I think it&#8217;s important that this issue is well understood if we&#8217;re to find a way to enable everyday programmers to use multicore CPUs.  I was moved to write about this after reading Tim Bray&#8217;s articles on <a href="http://www.tbray.org/ongoing/When/200x/2009/09/27/Concur-dot-next">Concur.next</a>: while I agree with a lot of what&#8217;s said there, particularly statements like</p>
<blockquote><p>Exposing real pre-emptive threading with shared mutable data structures to application programmers is <em>wrong</em></p></blockquote>
<p>it seems that parallelism and concurrency are still being conflated. Yes, we need concurrency in our languages, but if all we want to do is make programs run faster on a multicore, concurrency should be a last resort.</p>
<p>First, I&#8217;ll try to establish the terminology.</p>
<p>A <strong>concurrent</strong> program is one with multiple threads of control.  Each thread of control has effects on the world, and those threads are interleaved in some arbitrary way by the scheduler.  We say that a concurrent programming language is <em>non-deterministic</em>, because the total effect of the program may depend on the particular interleaving at runtime.  The programmer has the tricky task of controlling this non-determinism using synchronisation, to make sure that the program ends up doing what it was supposed to do regardless of the scheduling order.  And that&#8217;s no mean feat, because there&#8217;s no reasonable way to test that you have covered all the cases.  This is regardless of what synchronisation technology you&#8217;re using: yes, STM is better than locks, and message passing has its advantages, but all of these are just ways to communicate between threads in a non-deterministic language.</p>
<p>A <strong>parallel</strong> program, on the other hand, is one that merely runs on multiple processors, with the goal of hopefully running faster than it would on a single CPU.</p>
<p>So where did this dangerous assumption that Parallelism == Concurrency come from?  It&#8217;s a natural consequence of languages with side-effects: when your language has side-effects everywhere, then any time you try to do more than one thing at a time you essentially have non-determinism caused by the interleaving of the effects from each operation.  So in side-effecty languages, the only way to get parallelism is concurrency; it&#8217;s therefore not surprising that we often see the two conflated.</p>
<p>However, in a side-effect-free language, you are free to run different parts of the program at the same time without observing any difference in the result.  This is one reason that our salvation lies in programming languages with controlled side-effects.  The way forward for those side-effecty languages is to start being more explicit about the effects, so that the effect-free parts can be identified and exploited.</p>
<p>It pains me to see <a href="http://www.sauria.com/blog/2009/10/05/the-cambrian-period-of-concurrency/">Haskell&#8217;s concurrency compared against the concurrency support in other languages</a>, when the goal is simply to make use of multicore CPUs (<strong>Edit</strong>: Ted <a href="http://www.sauria.com/blog/2009/10/06/concurrency-parallelism/">followed up with a clarification</a>).   It&#8217;s missing the point: yes of course Haskell has the best concurrency support <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> , but for this problem domain it has something even better: deterministic parallelism.  In Haskell you can use multicore CPUs without getting your hands dirty with concurrency and non-determinism, without having to get the synchronisation right, and with a guarantee that the parallel program gives the same answer every time, just more quickly.</p>
<p>There are two facets to Haskell&#8217;s deterministic parallelism support:</p>
<ul>
<li>par/pseq and Strategies.  These give you a way to add parallelism to an existing program, usually without requiring much restructuring.  For instance, there&#8217;s a parallel version of &#8216;map&#8217;.  Support for this kind of parallelism is maturing with the soon-to-be-released GHC 6.12.1, where we made some <a href="http://www.haskell.org/~simonmar/bib/multicore-ghc-09_abstract.html">significant performance improvements</a> over previous versions.</li>
<li><a href="http://www.cse.unsw.edu.au/~chak/papers/PLKC08.html">Nested Data Parallelism</a>.  This is for taking advantage of parallelism in algorithms that are best expressed by composing operations on (possibly nested) arrays.  The compiler takes care of flattening the array structure, fusing array operations, and dividing the work amongst the available CPUs.  Data-Parallel Haskell will let us take advantage of GPUs and many-core machines for large-scale data-parallelism in the future.  Right now, DPH support in GHC is experimental, but work on it continues.</li>
</ul>
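<p>As a small illustration of the par/pseq style, here is a sketch of a parallel &#8216;map&#8217; built from the two primitives alone.  It is not the Strategies library itself: <strong>parMapWHNF</strong> and <strong>nfib</strong> are names invented for this example, and <strong>par</strong> and <strong>pseq</strong> are imported from <strong>GHC.Conc</strong> only so that the sketch needs nothing beyond the base package.</p>
<pre>
import GHC.Conc (par, pseq)

-- A deliberately slow function, so there is some work to parallelise.
nfib :: Int -&gt; Integer
nfib n
  | n &lt; 2     = 1
  | otherwise = nfib (n - 1) + nfib (n - 2) + 1

-- Hypothetical helper: like 'map', but sparks each element for
-- evaluation (to weak head normal form) in parallel.
parMapWHNF :: (a -&gt; b) -&gt; [a] -&gt; [b]
parMapWHNF _ []       = []
parMapWHNF f (x : xs) = y `par` (ys `pseq` (y : ys))
  where
    y  = f x
    ys = parMapWHNF f xs

main :: IO ()
main = print (sum (parMapWHNF nfib [20 .. 27]))
</pre>
<p>Built with <strong>-threaded</strong> and run with <strong>+RTS -N</strong>, the sparks can be picked up by other cores; run on a single core, the program prints exactly the same number, which is the point of deterministic parallelism.</p>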
<p>That&#8217;s not to say that concurrency doesn&#8217;t have its place.  So when should you use concurrency?  Concurrency is most useful as a method for structuring a program that needs to communicate with multiple external clients simultaneously, or respond to multiple asynchronous inputs.  It&#8217;s perfect for a GUI that needs to respond to user input while talking to a database and updating the display at the same time, for a network application that talks to multiple clients simultaneously, or a program that communicates with multiple hardware devices, for example.  Concurrency lets you structure the program as if each individual communication is a sequential task, or a <em>thread</em>, and in these kinds of settings it&#8217;s often the ideal abstraction.  STM is vitally important for making this kind of programming more tractable.</p>
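<p>A minimal sketch of that structure, using only <strong>forkIO</strong> and <strong>MVar</strong> from the base package: each client gets its own sequential thread, and the main thread waits for all of them.  The client names and the <strong>serveClient</strong> function are placeholders invented for this example; a real program would do its socket or GUI work where the comment is.</p>
<pre>
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_)

-- Hypothetical client handler: ordinary sequential code that reports
-- back through an MVar when it has finished.
serveClient :: MVar String -&gt; String -&gt; IO ()
serveClient done name = do
  -- ... talk to the client here, one step after another ...
  putMVar done ("finished serving " ++ name)

main :: IO ()
main = do
  let clients = ["alice", "bob", "carol"]
  -- One thread per client; the scheduler interleaves them as it likes.
  dones &lt;- forM clients $ \c -&gt; do
    done &lt;- newEmptyMVar
    _ &lt;- forkIO (serveClient done c)
    return done
  -- Wait for every handler before exiting.
  forM_ dones $ \d -&gt; takeMVar d &gt;&gt;= putStrLn
</pre>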
<p>As luck would have it, we can run concurrent programs in parallel without changing their semantics.  However, concurrent programs are often not compute-bound, so there&#8217;s not a great deal to be gained by actually running them in parallel, except perhaps for lower latency.</p>
<p>Having said all this, there is some overlap between concurrency and parallelism.  Some algorithms use multiple threads for parallelism deliberately; for example, search-type problems in which multiple threads search branches of a problem space, where knowledge gained in one branch may be exploited in other concurrent searches.  SAT-solvers and game-playing algorithms are good examples.  An open problem is how to incorporate this kind of non-deterministic parallelism in a safe way: in Haskell these algorithms would end up in the IO monad, despite the fact that the result could be deterministic.  Still, I believe these kinds of problems are in the minority, and we can get a long way with purely deterministic parallelism.</p>
<p>You&#8217;ll be glad to know that with GHC you can freely mix parallelism and concurrency on multicore CPUs to your heart&#8217;s content.  Knock yourself out <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/10/06/parallelism-concurrency/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>Heads up: what you need to know about Unicode I/O in GHC 6.12.1</title>
		<link>http://ghcmutterings.wordpress.com/2009/09/30/heads-up-what-you-need-to-know-about-unicode-io-in-ghc-6-12-1/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/09/30/heads-up-what-you-need-to-know-about-unicode-io-in-ghc-6-12-1/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 10:07:18 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=88</guid>
		<description><![CDATA[The GHC 6.12.1 release candidate will be out shortly, and it includes a newly rewritten I/O library including Unicode support.  Here&#8217;s what you need to know to make sure your applications/libraries continue to work with GHC 6.12.1.
We expect the release candidate phase to last a couple of weeks or so, depending on how many problems [...] ]]></description>
			<content:encoded><![CDATA[<p>The GHC 6.12.1 release candidate will be out shortly, and it includes a newly rewritten I/O library including Unicode support.  Here&#8217;s what you need to know to make sure your applications/libraries continue to work with GHC 6.12.1.</p>
<p>We expect the release candidate phase to last a couple of weeks or so, depending on how many problems arise, after which 6.12.1 will be released.  However, 6.12 is not currently scheduled to become part of the <a href="http://hackage.haskell.org/platform/">Haskell Platform</a> until the next platform release, due around February 2010, so package authors have a grace period for testing before 6.12.1 becomes more widely used.</p>
<p>The new System.IO docs can be found <a href="http://www.haskell.org/ghc/dist/current/docs/libraries/base/System-IO.html">here</a>; in particular, the Unicode-related functionality is <a href="http://www.haskell.org/ghc/dist/current/docs/libraries/base/System-IO.html#23">here</a>.</p>
<h3>Console and text I/O</h3>
<p>If you are reading from or writing to the console, or reading/writing text files in the local encoding, then use the <strong>System.IO</strong> functions for doing text I/O (<strong>openFile</strong>, <strong>readFile</strong>, <strong>hGetContents</strong>, <strong>putStr</strong>, etc.), and you will automatically benefit from the new Unicode support.  Text written will be encoded according to the current locale, or code page on Windows, and text read will be decoded accordingly.</p>
<p>If you need to use a particular encoding (e.g. UTF-8), then the <strong>hSetEncoding</strong> function lets you set the encoding on a Handle, e.g.</p>
<pre>  hSetEncoding stdout utf8</pre>
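<p>As a slightly fuller sketch, the following program fixes the encoding on <strong>stdout</strong> and on a file Handle for both writing and reading.  The file name <strong>example.txt</strong> is just a placeholder for this example.</p>
<pre>
import System.IO

main :: IO ()
main = do
  -- Force UTF-8 on stdout, whatever the locale says.
  hSetEncoding stdout utf8

  -- Write a line containing a non-ASCII character (\955 is a Greek lambda).
  withFile "example.txt" WriteMode $ \h -&gt; do
    hSetEncoding h utf8
    hPutStrLn h "\955-calculus"

  -- Read it back, decoding with the same encoding.
  withFile "example.txt" ReadMode $ \h -&gt; do
    hSetEncoding h utf8
    s &lt;- hGetLine h
    putStrLn s
</pre>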
<h3>Binary I/O</h3>
<p>If you&#8217;re reading or writing binary data, or for some other reason you want to bypass the Unicode encoding/decoding that the IO library now does, you have two options:</p>
<ul>
<li>Use <strong>openBinaryFile</strong> or <strong>hSetBinaryMode</strong> to put the Handle into binary mode.  No encoding/decoding or newline translation will be done.</li>
<li>Use <strong>hGetBuf</strong>/<strong>hPutBuf</strong>, or the I/O operations provided by <strong>Data.ByteString</strong>, which all operate with binary data.</li>
</ul>
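<p>A short sketch covering both options; the file name <strong>example.bin</strong> and the byte values are made up for the example.</p>
<pre>
import qualified Data.ByteString as B
import System.IO

main :: IO ()
main = do
  -- Bytes that text mode could mangle: not valid UTF-8, plus a CR/LF pair.
  let bytes = B.pack [0x00, 0xFF, 0x0D, 0x0A, 0x80]

  -- Option 1: open the file in binary mode from the start.
  withBinaryFile "example.bin" WriteMode $ \h -&gt; B.hPut h bytes

  -- Option 1, variant: switch an already-open Handle to binary mode.
  h &lt;- openFile "example.bin" ReadMode
  hSetBinaryMode h True

  -- Option 2: ByteString I/O always works on raw bytes.
  -- (The strict hGetContents closes the Handle when it is done.)
  back &lt;- B.hGetContents h
  print (back == bytes)
</pre>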
<h3>Using utf8-string</h3>
<p>If you&#8217;re using utf8-string in certain ways then you might get incorrect results.</p>
<ul>
<li>The operations in <strong>System.IO.UTF8</strong> add a UTF-8 wrapper around the corresponding <strong>System.IO</strong> operation.  Unless the underlying Handle is in binary mode, these operations will result in garbage being read or written.  For example, if you want to use <strong>System.IO.UTF8.print</strong>, then call <strong>hSetBinaryMode stdout True</strong> first.  Better still, just use <strong>System.IO.print</strong> directly.  If you need to fix the encoding to UTF-8 rather than using the locale encoding, then call <strong>hSetEncoding handle utf8</strong>.</li>
<li>The rest of the operations in utf8-string will continue to work as before.</li>
</ul>
<h3>Newline handling</h3>
<p>There is a new API for newline translation in <strong>System.IO</strong>.  By default, Handles in text mode translate newlines to or from the native representation for the current platform, that is &#8220;\r\n&#8221; on Windows and &#8220;\n&#8221; on other platforms.  You can change this default using <strong>hSetNewlineMode</strong>, for example to be able to read a file with either Windows or Unix line-ending conventions:</p>
<pre> hSetNewlineMode handle universalNewlineMode</pre>
<p>where <strong>universalNewlineMode</strong> translates from &#8220;\r\n&#8221; to &#8220;\n&#8221; on input, leaving &#8220;\n&#8221; alone, and translates &#8220;\n&#8221; to the native newline representation on output.</p>
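<p>Here is a small sketch of the effect, which writes a file with Windows line endings and reads it back with universal newlines.  The file name <strong>crlf.txt</strong> is a placeholder; <strong>noNewlineTranslation</strong> is used on the writing side so that the bytes on disk are the same on every platform.</p>
<pre>
import System.IO

main :: IO ()
main = do
  -- Write explicit "\r\n" line endings, with no translation on output.
  withFile "crlf.txt" WriteMode $ \h -&gt; do
    hSetNewlineMode h noNewlineTranslation
    hPutStr h "one\r\ntwo\r\n"

  -- Read it back: "\r\n" is translated to "\n", so 'lines' sees clean lines.
  withFile "crlf.txt" ReadMode $ \h -&gt; do
    hSetNewlineMode h universalNewlineMode
    s &lt;- hGetContents h
    print (lines s)
</pre>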
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/09/30/heads-up-what-you-need-to-know-about-unicode-io-in-ghc-6-12-1/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>GHC Status Update (from the Haskell Implementors Workshop)</title>
		<link>http://ghcmutterings.wordpress.com/2009/09/15/ghc-status-update-from-the-haskell-implementors-workshop/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/09/15/ghc-status-update-from-the-haskell-implementors-workshop/#comments</comments>
		<pubDate>Tue, 15 Sep 2009 09:06:30 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=85</guid>
		<description><![CDATA[The video of Simon Peyton Jones&#8217; GHC Status Update presentation at the Haskell Implementors Workshop is now online.  Lots of details about the goodies that will shortly be arriving in GHC 6.12.1.
]]></description>
			<content:encoded><![CDATA[<p>The video of Simon Peyton Jones&#8217; GHC Status Update presentation at the Haskell Implementors Workshop is now <a href="http://www.vimeo.com/6570515">online</a>.  Lots of details about the goodies that will shortly be arriving in GHC 6.12.1.</p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/09/15/ghc-status-update-from-the-haskell-implementors-workshop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>Visualising the Haskell package dependency graph</title>
		<link>http://ghcmutterings.wordpress.com/2009/07/06/visualising-the-haskell-package-dependency-graph/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/07/06/visualising-the-haskell-package-dependency-graph/#comments</comments>
		<pubDate>Mon, 06 Jul 2009 14:24:59 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=78</guid>
		<description><![CDATA[This is a graph showing the dependencies between the packages that come with GHC.  I just added some (trivial) support to the ghc-pkg tool to generate the output in dot format, and generated the above graph with
ghc-pkg dot &#124; tred &#124; dot -Tsvg &#62;pkgs.svg
Note the &#8220;tred&#8221; filter, which eliminates clutter from transitive edges.  The [...] ]]></description>
			<content:encoded><![CDATA[<div id="attachment_77" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.haskell.org/~simonmar/pkg.svg"><img class="size-medium wp-image-77" title="pkg" src="http://ghcmutterings.files.wordpress.com/2009/07/pkg.png?w=300&#038;h=257" alt="Package dependency tree" width="300" height="257" /></a><p class="wp-caption-text">Package dependency graph</p></div>
<p>This is a graph showing the dependencies between the packages that come with GHC.  I just added some (trivial) support to the ghc-pkg tool to generate the output in dot format, and generated the above graph with</p>
<pre>ghc-pkg dot | tred | dot -Tsvg &gt;pkgs.svg</pre>
<p>Note the &#8220;tred&#8221; filter, which eliminates clutter from transitive edges.  The &#8216;ghc-pkg dot&#8217; command should be in GHC 6.12.1.</p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/07/06/visualising-the-haskell-package-dependency-graph/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>

		<media:content url="http://ghcmutterings.files.wordpress.com/2009/07/pkg.png?w=300" medium="image">
			<media:title type="html">pkg</media:title>
		</media:content>
	</item>
		<item>
		<title>New paper: Parallel Performance Tuning for Haskell</title>
		<link>http://ghcmutterings.wordpress.com/2009/06/22/new-paper-parallel-performance-tuning-for-haskell/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/06/22/new-paper-parallel-performance-tuning-for-haskell/#comments</comments>
		<pubDate>Mon, 22 Jun 2009 14:38:56 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=72</guid>
		<description><![CDATA[Here&#8217;s our Haskell Symposium paper about parallel profiling with GHC and ThreadScope:
Parallel Performance Tuning for Haskell (Don Jones Jr., Simon Marlow, Satnam Singh) Haskell &#8217;09: Proceedings of the second ACM SIGPLAN symposium on Haskell, Edinburgh, Scotland, ACM, 2009
Abstract:
Parallel Haskell programming has entered the mainstream with support now included in GHC for multiple parallel programming models, [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s our Haskell Symposium paper about parallel profiling with GHC and ThreadScope:</p>
<p><a href="http://www.haskell.org/~simonmar/papers/threadscope.pdf">Parallel Performance Tuning for Haskell</a> (Don Jones Jr., Simon Marlow, Satnam Singh) <em>Haskell &#8217;09: Proceedings of the second ACM SIGPLAN symposium on Haskell</em>, Edinburgh, Scotland, ACM, 2009</p>
<p>Abstract:</p>
<blockquote><p>Parallel Haskell programming has entered the mainstream with support now included in GHC for multiple parallel programming models, along with multicore execution support in the runtime. However, tuning programs for parallelism is still something of a black art. Without much in the way of feedback provided by the runtime system, it is a matter of trial and error combined with experience to achieve good parallel speedups.</p>
<p>This paper describes an early prototype of a parallel profiling system for multicore programming with GHC. The system comprises three parts: fast event tracing in the runtime, a Haskell library for reading the resulting trace files, and a number of tools built on this library for presenting the information to the programmer. We focus on one tool in particular, a graphical timeline browser called ThreadScope.</p>
<p>The paper illustrates the use of ThreadScope through a number of case studies, and describes some useful methodologies for parallelizing Haskell programs.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/06/22/new-paper-parallel-performance-tuning-for-haskell/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>The new GHC build system is here!</title>
		<link>http://ghcmutterings.wordpress.com/2009/04/28/the-new-ghc-build-system-is-here/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/04/28/the-new-ghc-build-system-is-here/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 10:29:50 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=65</guid>
		<description><![CDATA[The new GHC build system has now been merged in.   GHC developers can look forward to increases in productivity and faster build times thanks to the new non-recursive make design.
Here are some quick stats:
Lines of build-system code (including Makefile and Haskell code):

old build system: 7793
new build system: 5766 (about 2000 fewer lines, or [...]]]></description>
			<content:encoded><![CDATA[<p>The new GHC build system has <a href="http://www.haskell.org/pipermail/cvs-ghc/2009-April/048308.html">now been merged in</a>.   GHC developers can look forward to increases in productivity and faster build times thanks to the new <a href="http://hackage.haskell.org/trac/ghc/wiki/Building/Architecture/Idiom/NonRecursiveMake">non-recursive make</a> design.</p>
<p>Here are some quick stats:</p>
<p>Lines of build-system code (including Makefile and Haskell code):</p>
<ul>
<li>old build system: 7793</li>
<li>new build system: 5766 (about 2000 fewer lines, or a 26% reduction)</li>
</ul>
<p>Furthermore, this doesn&#8217;t count the code for &#8216;cabal make&#8217;, which is still in Cabal but is no longer used by GHC.</p>
<p>Time to validate with -j2 (the default; test suite is still single-threaded):</p>
<ul>
<li>old:  28 mins</li>
<li>new: 28 mins</li>
</ul>
<p>Single and dual-core builds don&#8217;t see much difference.  However, adding more cores starts to demonstrate the improved parallelism: validate with -j4 (still single-threaded test suite):</p>
<ul>
<li>old: 25.3 mins</li>
<li>new: 24.0 mins</li>
</ul>
<p>Parallelism in the new build system is a lot better.  It can build libraries in parallel with each other, profiled libraries in parallel with non-profiled libraries, and even libraries in parallel with stage 2.  There&#8217;s very little explicit ordering in the new build system: we only tell make about dependencies.</p>
<p>Time to do &#8216;make&#8217; when the tree is fully up-to-date:</p>
<ul>
<li>old: 2m 41s</li>
<li>new: 4.1s</li>
</ul>
<p>Time to do &#8216;make distclean&#8217;:</p>
<ul>
<li>old: 5.7s</li>
<li>new: 1.0s</li>
</ul>
<p>We also have <a href="http://hackage.haskell.org/trac/ghc/wiki/Building">all-new build-system documentation</a>.</p>
<p>The biggest change you&#8217;ll notice is that the build system now expresses all the dependencies, so whatever you change, you should be able to say &#8216;make&#8217; to bring everything up to date.  Sometimes this can result in more rebuilding than you were expecting, or more than is strictly necessary, but it should save time in the long run as we run into fewer problems caused by things being inconsistent or out-of-date in the build.</p>
<p>We stretched GNU make to its limits.  On the whole it performed pretty well: even for a build of this size, the time and memory consumed by make itself are negligible.  The most annoying problem we encountered was the need to <a>split the build into phases</a> to work around GNU make&#8217;s lack of support for dependencies between included makefiles.</p>
<p>On the whole I&#8217;m now convinced that non-recursive make is not only useful but practical, provided you stick to some clear <a href="http://hackage.haskell.org/trac/ghc/wiki/Building/Architecture#Idioms">idioms</a> in your build-system design.</p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/04/28/the-new-ghc-build-system-is-here/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>New paper: Runtime Support for Multicore Haskell</title>
		<link>http://ghcmutterings.wordpress.com/2009/03/03/new-paper-runtime-support-for-multicore-haskell/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/03/03/new-paper-runtime-support-for-multicore-haskell/#comments</comments>
		<pubDate>Tue, 03 Mar 2009 12:56:49 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=58</guid>
		<description><![CDATA[Here&#8217;s a paper on the internals of GHC&#8217;s parallelism support, showing some nice improvements in parallel performance over GHC 6.10.1: &#8220;Runtime Support for Multicore Haskell&#8221;.   Abstract:
Purely functional programs should run well on parallel hardware because of the absence of side effects, but it has proved hard to realise this potential in practice.  Plenty of [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a paper on the internals of GHC&#8217;s parallelism support, showing some nice improvements in parallel performance over GHC 6.10.1: &#8220;<a href="http://www.haskell.org/~simonmar/papers/multicore-ghc.pdf">Runtime Support for Multicore Haskell</a>&#8221;.   Abstract:</p>
<blockquote><p>Purely functional programs should run well on parallel hardware because of the absence of side effects, but it has proved hard to realise this potential in practice.  Plenty of papers describe promising ideas, but vastly fewer describe real implementations with good wall-clock performance.  We describe just such an implementation, and quantitatively explore some of the complex design tradeoffs that make such implementations hard to build.  Our measurements are necessarily detailed and specific, but they are reproducible, and we believe that they offer some general insights.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/03/03/new-paper-runtime-support-for-multicore-haskell/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>Ever wondered how big a closure is?</title>
		<link>http://ghcmutterings.wordpress.com/2009/02/12/53/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/02/12/53/#comments</comments>
		<pubDate>Thu, 12 Feb 2009 11:52:56 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=53</guid>
		<description><![CDATA[Somebody asked me yesterday how to find out how big the runtime representation of a type is.  I hacked this up using the internal unpackClosure# primitive:

{-# LANGUAGE MagicHash,UnboxedTuples #-}
module Size where

import GHC.Exts
import Foreign

unsafeSizeof :: a -&#62; Int
unsafeSizeof a =
  case unpackClosure# a of
    (# x, ptrs, nptrs #) -&#62;
  [...]]]></description>
			<content:encoded><![CDATA[<p>Somebody asked me yesterday how to find out how big the runtime representation of a type is.  I hacked this up using the internal unpackClosure# primitive:</p>
<pre>
{-# LANGUAGE MagicHash,UnboxedTuples #-}
module Size where

import GHC.Exts
import Foreign

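-- Approximate size in bytes of the heap closure for an evaluated value.
unsafeSizeof :: a -&gt; Int
unsafeSizeof a =
  case unpackClosure# a of
    -- ptrs holds the pointer fields, nptrs the non-pointer words
    (# x, ptrs, nptrs #) -&gt;
      sizeOf (undefined::Int) + -- one word for the header
        -- coerce the pointer array so sizeofByteArray# can measure it too
        I# (sizeofByteArray# (unsafeCoerce# ptrs)
             +# sizeofByteArray# nptrs)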
</pre>
<p>Try it in GHCi:</p>
<pre>
Prelude&gt; :!ghc -c Size.hs
Prelude&gt; :l Size
Ok, modules loaded: Size.
Prelude Size&gt; unsafeSizeof 3.3
16
Prelude Size&gt; unsafeSizeof "a"
24
Prelude Size&gt; unsafeSizeof (1,2,3,4)
40
Prelude Size&gt; unsafeSizeof True
8
Prelude Size&gt; unsafeSizeof $! Data.Complex.(:+) 2 3
24
Prelude Size&gt; unsafeSizeof $! 3
16
Prelude Size&gt; unsafeSizeof $! (3::Integer)
16
Prelude Size&gt; unsafeSizeof $! (3::Int)
16
Prelude Size&gt; unsafeSizeof $! (2^64::Integer)
24
</pre>
<p>I&#8217;m on a 64-bit machine, obviously.  It doesn&#8217;t always do the right thing, but for ordinary algebraic types it should work most of the time.  Remember to use $!, as the size returned for an unevaluated thunk is always just one word (unpackClosure# doesn&#8217;t work for thunks).</p>
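<p>The numbers are easier to read as word counts.  As a rough sketch (the helper name <tt>closureBytes</tt> is mine, and the field counts in the comments are my reading of the constructors involved), a constructor closure is one header word plus one word per field:</p>
<pre>
-- Bytes for a constructor closure: a header word, then one word per
-- pointer field and per non-pointer word.
closureBytes :: Int -&gt; Int -&gt; Int -&gt; Int
closureBytes wordSize ptrs nptrs = wordSize * (1 + ptrs + nptrs)

-- On a 64-bit machine (8-byte words):
--   closureBytes 8 4 0   -- (1,2,3,4): four pointer fields
--   closureBytes 8 2 0   -- "a": a cons cell with a head and a tail
--   closureBytes 8 0 1   -- a boxed Double: one unboxed word
--   closureBytes 8 0 0   -- True: a constructor with no fields
</pre>
<p>These come to 40, 24, 16 and 8 bytes, matching the session above.</p>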
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/02/12/53/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
		<item>
		<title>Benchmarking recent improvements in parallelism</title>
		<link>http://ghcmutterings.wordpress.com/2009/01/09/46/</link>
		<comments>http://ghcmutterings.wordpress.com/2009/01/09/46/#comments</comments>
		<pubDate>Fri, 09 Jan 2009 13:49:26 +0000</pubDate>
		<dc:creator>simonmar</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ghcmutterings.wordpress.com/?p=46</guid>
		<description><![CDATA[Over the last few months we&#8217;ve been making various improvements to the performance of parallel programs with GHC.  I thought I&#8217;d post a few benchmarks so you can see where we&#8217;ve got to.  This is a fairly random collection of 6 parallel benchmarks (all using par/seq-style parallelism rather than explicit threading with forkIO). [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last few months we&#8217;ve been making various improvements to the performance of parallel programs with GHC.  I thought I&#8217;d post a few benchmarks so you can see where we&#8217;ve got to.  This is a fairly random collection of 6 parallel benchmarks (all using par/seq-style parallelism rather than explicit threading with forkIO).  The main point here is that I took the programs unchanged &#8211; I haven&#8217;t made any attempt to modify the programs themselves to make them parallelize better (although other people might have done so in the past).  The focus has been on changing the GHC runtime to optimize these existing programs.  The programs come mostly from old benchmarks for the GUM implementation of Parallel Haskell, and the sources can be found <a href="http://darcs.haskell.org/nofib/parallel">here.</a></p>
<ul>
<li>matmult: matrix multiply</li>
<li>parfib: our old friend fibonacci, in parallel</li>
<li>partree: some operations on a tree in parallel</li>
<li>prsa: decode an RSA-encoded message in parallel</li>
<li>ray: a ray-tracer</li>
<li>sumeuler:  <tt>sum . map euler</tt></li>
</ul>
<p>Here are the results.  The first column of numbers is the time in seconds taken for GHC 6.10.1 to run the programs on one CPU, and the following three columns show the change in elapsed time relative to that baseline when the programs are run on 4 CPUs (actually 4 cores of my 8-core x86_64 box) with respectively GHC 6.8.3, 6.10.1, and my current working version (HEAD + a couple of patches).</p>
<pre>------------------------------------------------------------
  Program   6.10.1   6.8.3 -N4  6.10.1 -N4  ghc-simonmar -N4
------------------------------------------------------------
  matmult     8.55   -60.0%     -63.7%       -72.0%
   parfib     9.65   -72.6%     -70.2%       -76.3%
  partree     8.03   +26.4%     +52.7%       -40.7%
     prsa     9.52   +13.8%     -44.1%       -68.2%
      ray     7.04   +16.5%     +11.8%       +28.0%
 sumeuler     9.64   -71.2%     -73.1%       -74.0%
  -1 s.d.    -----   -68.8%     -71.6%       -78.0%
  +1 s.d.    -----   +20.1%      +6.4%       -27.1%
  Average    -----   -38.8%     -45.0%       -59.9%</pre>
<p>The target is -75%: that&#8217;s a speedup of 4 on 4 cores.  As you can see, 6.10.1 is already doing better than 6.8.3, but the current version has made some dramatic improvements and is getting close to the ideal speedup on several of the programs.  Something odd is going on with ray, which gets slower on 4 cores with all three versions; I don&#8217;t know what yet!</p>
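<p>To relate the percentages to speedups: a change of d% in elapsed time corresponds to a speedup of 1 / (1 + d/100) over the single-CPU 6.10.1 run.  A small sketch (the function names are mine, not part of any benchmark tooling):</p>
<pre>
-- Convert a percentage change in elapsed time into a speedup factor.
speedup :: Double -&gt; Double
speedup pct = 1 / (1 + pct / 100)

-- The percentage change that an ideal speedup on n cores would show.
idealChange :: Int -&gt; Double
idealChange n = (1 / fromIntegral n - 1) * 100
</pre>
<p>For example, <tt>idealChange 4</tt> is -75, matmult&#8217;s -72.0% works out to a speedup of about 3.6, and partree&#8217;s +52.7% under 6.10.1 is a slowdown to roughly 0.65 of the single-CPU speed.</p>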
<p>Here&#8217;s a summary of the improvements we made:</p>
<ul>
<li>Lock-free work-stealing queues for load-balancing of sparks (par).  This work was originally done by Jost Berthold during his internship at MSR in the summer of 2008, and after further improvements was merged into the GHC mainline after the 6.10.1 release.</li>
<li>Improvements to parallel GC: we now use the same threads for GC as for executing the program, and have made improvements to the barrier (stopping threads to do GC), and improvements to affinity (making sure each GC thread traverses data local to that CPU).  Some of this has yet to hit the mainline, but it will shortly.</li>
<li>Eager blackholing: this reduces the chance that multiple threads repeat the same work in a parallel program.  It&#8217;s a compile-time option (-feager-blackholing in the HEAD) and it costs a little execution time to turn it on, but it can improve parallelism quite a lot.</li>
<li>Running sparks in batches.  Previously, each time we ran a spark we created a new thread for it.  Threads are lightweight, but the cost can still be high relative to the size of the spark.  So now we have Haskell threads that repeatedly run sparks (stealing from other CPUs if necessary) until there are no more sparks to run, eliminating the context-switch and thread-creation overhead for sparks.  This means we can push the granularity quite a lot: parfib speeds up even with a very low threshold now.</li>
</ul>
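<p>For readers who have not seen par/seq-style parallelism, here is a minimal sketch in the spirit of parfib.  It is not the code from the benchmark suite; the function names and the threshold parameter are my own, and it assumes <tt>par</tt> and <tt>pseq</tt> from Control.Parallel in the parallel package.</p>
<pre>
import Control.Parallel (par, pseq)

-- Plain sequential version, used at or below the threshold.
nfib :: Int -&gt; Integer
nfib n
  | n &lt; 2     = 1
  | otherwise = nfib (n - 1) + nfib (n - 2)

-- Above the threshold, spark one branch with par and evaluate the
-- other with pseq before combining, so each spark carries enough work.
parfib :: Int -&gt; Int -&gt; Integer
parfib threshold n
  | n &lt; 2 || n &lt;= threshold = nfib n
  | otherwise                 = x `par` (y `pseq` (x + y))
  where
    x = parfib threshold (n - 1)
    y = parfib threshold (n - 2)
</pre>
<p>Compile with -threaded and run with +RTS -N4 to use 4 cores.  The threshold sets the spark granularity: the lower it is, the more (and smaller) sparks are created, which is the case the batching change above is meant to make cheap.</p>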
<p>We&#8217;re on the lookout for more parallel benchmarks: each new program we find tends to stress the runtime in a different way, so the more code we have, the better.  Even if (or especially if) your program doesn&#8217;t go faster on a multicore &#8211; send it to us and we&#8217;ll look into it.</p>
]]></content:encoded>
			<wfw:commentRss>http://ghcmutterings.wordpress.com/2009/01/09/46/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">simonmar</media:title>
		</media:content>
	</item>
	</channel>
</rss>