Async I/O: Dream vs. Parallel Extensions
We’ve been following the Parallel Programming with .NET blog pretty closely and are excited about what the Task Parallel Library and PLINQ will bring to .NET 4.0. Many of the problems they are addressing are the same ones we’ve faced for a while, and they were our motivation for creating Dream in the first place.
Yesterday, the Parallel Programming with .NET blog posted an article entitled “Parallel Extensions and I/O” illustrating how to use Parallel Extensions to asynchronously retrieve a number of resources, each identified by a Uri. Since dealing with web resources is one of the primary purposes of Dream, we thought it would be interesting to show the simple and compact syntax you can use with Dream right now to achieve the same result.
I will cover their approach briefly for contrast, but you should read their article first to get an understanding of the problem and the proposed solution. Here’s the scenario in a nutshell. We have a list of Uris for web resources like this:
static string[] Resources = new string[] {
"http://www.microsoft.com", "http://www.msdn.com",
"http://www.msn.com", "http://www.bing.com"
};
And we want to parallelize fetching these resources and do so asynchronously. The final result should be a collection of byte arrays, one for each resource.
In Dream, we have wrapped HttpWebRequest with a class called Plug, which provides a fluent interface for defining web requests and retrieving data. For a more in-depth look at Plug, you should read Consuming REST services and TDD with Plug.
Instead of using an async callback pattern for Plug, we use the Dream Result<T> object, which functions as a completion handle. In the case of Plug, the return value of .GetAsync() is a Result<DreamMessage>, where DreamMessage is a representation of the HTTP response with built-in conversion into common data types such as Text, Xml, or bytes. This means we can fire off a number of asynchronous requests, collect all their Result objects, and then iterate over them, waiting for their completion.
Given this infrastructure, we can write two simple LINQ expressions to execute this work:
// first, we start all requests asynchronously
// (the ToArray() forces immediate execution)
Result<DreamMessage>[] requests = (from resource in Resources
select Plug.New(resource).GetAsync()).ToArray();
// second, we block on each result object until it has completed
// and convert the response into a byte array
byte[][] data = (from request in requests
select request.Wait().ToBytes()).ToArray();
The trick in the above code is LINQ's deferred execution. The first expression only sets up a select that will fire off the requests; no attempt to start them is made until we iterate over the returned IEnumerable. By calling .ToArray(), we force that iteration right away, and the call still returns almost immediately because each request runs asynchronously.
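To see the deferred execution on its own, here is a minimal sketch that uses only standard LINQ and no Dream types. The Start method and the Log list are hypothetical stand-ins for "kick off a request"; they are not part of Plug or Dream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Records every call to Start so we can see when the select actually runs.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for starting a request; not a Dream API.
    public static int Start(string resource)
    {
        Log.Add("started " + resource);
        return resource.Length;
    }

    public static void Main()
    {
        string[] resources = { "a", "bb", "ccc" };

        // Building the query does not call Start at all.
        IEnumerable<int> query = from resource in resources
                                 select Start(resource);
        Console.WriteLine(Log.Count); // prints 0

        // ToArray() iterates the query, which calls Start once per element.
        int[] results = query.ToArray();
        Console.WriteLine(Log.Count); // prints 3
    }
}
```

Without the .ToArray(), nothing would be started until some later code happened to enumerate the query.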
Had we called .GetAsync().Wait() in a single expression instead of splitting .GetAsync() and .Wait() into two steps, each request would have had to finish before the next one was issued. In the above approach, however, the first .ToArray() starts all the requests, while the second .ToArray() waits on each result object in turn until all have completed their requests. Note that completion, in this case, includes reading the response message asynchronously as well. Hence, we achieve our goal with minimal overhead and very little code!
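The difference between the fused and the split form can be sketched without Dream as well. The example below is an illustration under assumptions, not Dream code: it swaps in Task<byte[]> for Result<DreamMessage>, and FakeGetAsync is a hypothetical stand-in for Plug.New(uri).GetAsync() that simply completes after roughly 200 ms. It also assumes a runtime that provides Task.Delay (.NET 4.5 or later).

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class StartThenWaitDemo
{
    // Hypothetical stand-in for Plug.New(uri).GetAsync(): a fake request that
    // completes after about 200 ms and yields a dummy payload.
    public static Task<byte[]> FakeGetAsync(string uri)
    {
        return Task.Delay(200).ContinueWith(_ => new byte[uri.Length]);
    }

    // Start and wait in one expression: each request must finish
    // before the next one is started.
    public static byte[][] Fused(string[] uris)
    {
        return (from uri in uris
                select FakeGetAsync(uri).Result).ToArray();
    }

    // Start everything first, then wait: the requests overlap.
    public static byte[][] Split(string[] uris)
    {
        Task<byte[]>[] requests = (from uri in uris
                                   select FakeGetAsync(uri)).ToArray();
        return (from request in requests
                select request.Result).ToArray();
    }

    public static void Main()
    {
        string[] uris = { "a", "bb", "ccc", "dddd" };

        Stopwatch timer = Stopwatch.StartNew();
        Fused(uris);
        Console.WriteLine("fused: {0} ms", timer.ElapsedMilliseconds);

        timer.Restart();
        Split(uris);
        Console.WriteLine("split: {0} ms", timer.ElapsedMilliseconds);
    }
}
```

With four fake requests, the fused version should take roughly four delays in a row, while the split version should take about one, since all of the waits overlap.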
With Plug doing the heavy lifting, asynchronous execution of web calls becomes very simple and, best of all, it’s available today in Dream under the Apache 2.0 license.