A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

A stack overflow exception came in today for an application I'm working on.  Turns out that if you send rather large files around, this guy suddenly pops up and takes you down.  I have to say this wasn't something I was expecting; I was actually pretty sure the code we were running was stable.

What caused the problem?  Turns out Stream.BeginRead isn't so asynchronous after all.  In fact, Stream.BeginRead just does a Read operation and then immediately calls your callback.  Okay, so it is up to derived classes to override BeginRead if they want asynchronous operation.  Well, I have to say, my friend, that sucks.  After all, you spend a good deal of time building a model and you expect that model to work.  Take the following asynchronous-aware code to read a stream all the way to the end.

myState.Stream.BeginRead(myState.Buffer, 0, myState.Buffer.Length, new AsyncCallback(ReadMore), myState);

private void ReadMore(IAsyncResult ar) {
    MyState myState = (MyState) ar.AsyncState;
    int bytesRead = myState.Stream.EndRead(ar);

    if ( bytesRead > 0 ) {
        // Do some work
        myState.Stream.BeginRead(myState.Buffer, 0, myState.Buffer.Length, new AsyncCallback(ReadMore), myState);
    } else {
        // Our operation is done.  Do some final work
    }
}

Can you spot the stack overflow yet?  Well, since we are passing ReadMore as our asynchronous callback delegate, and since things are actually happening synchronously behind the scenes, each call to ReadMore recurses further into our stack space.  You get one stack frame for ReadMore and one stack frame for BeginRead each time through.  This is what the two patterns look like if BeginRead is really asynchronous and if it cheats.

If BeginRead is really asynchronous:

Thread1 stack:
CurrentMethod() calls BeginRead() which returns
Thread2 stack:
ReadOperation() calls AsyncCallback() calls EndRead() calls BeginRead() which returns

If BeginRead cheats:

Thread1 stack:
CurrentMethod() calls BeginRead() calls ReadOperation() calls AsyncCallback() calls EndRead() calls BeginRead() calls repeat...

You see the difference?  Big difference.  You probably wouldn't even hit this in your program unless you made a LOT of reads.  Takes a lot of nested reads to exhaust the stack.  Now a bunch of people are probably raising their hands saying “I knew that!  I knew they cheated and I wrote my code accordingly.”  How do you write your code accordingly?  If you want to support generic reading from files, the network, and memory, you have to take a Stream class.  Now how in the hell do you write code to asynchronously read from all of those stream types and truly alleviate the stack overflow?  I have some ideas, namely abstracting out a class that knows how to consume streams asynchronously without using BeginRead/EndRead.  In the case of various types of streams you could even special case it to use the real asynchronous behavior instead of your own.  If I come up with anything solid, maybe I'll share it.
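
To see the recursion described above in isolation, here is a minimal sketch that counts how deep the callback nests. SyncResult, SyncOnlyStream, and RecursionDemo are hypothetical names made up for this example; SyncOnlyStream simply assumes a BeginRead that does a blocking Read and invokes the callback before returning, which is the behavior described in this post, not a copy of the framework's code.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical result object for a read that finished before BeginRead returned.
sealed class SyncResult : IAsyncResult
{
    public readonly int BytesRead;
    private readonly object state;
    private readonly ManualResetEvent done = new ManualResetEvent(true);

    public SyncResult(int bytesRead, object state) { BytesRead = bytesRead; this.state = state; }

    public object AsyncState { get { return state; } }
    public WaitHandle AsyncWaitHandle { get { return done; } }
    public bool CompletedSynchronously { get { return true; } }
    public bool IsCompleted { get { return true; } }
}

// Hypothetical stream that mimics the blocking BeginRead described in the post.
class SyncOnlyStream : MemoryStream
{
    public SyncOnlyStream(byte[] data) : base(data) { }

    public override IAsyncResult BeginRead(byte[] buffer, int offset, int count, AsyncCallback callback, object state)
    {
        SyncResult result = new SyncResult(Read(buffer, offset, count), state);
        if (callback != null) callback(result);   // runs on the caller's stack
        return result;
    }

    public override int EndRead(IAsyncResult asyncResult)
    {
        return ((SyncResult)asyncResult).BytesRead;
    }
}

static class RecursionDemo
{
    private static int depth;                     // ReadMore frames currently on the stack
    public static int MaxDepth;                   // deepest nesting observed
    private static readonly byte[] buffer = new byte[16];

    public static void ReadAll(Stream stream)
    {
        depth = 0;
        MaxDepth = 0;
        stream.BeginRead(buffer, 0, buffer.Length, new AsyncCallback(ReadMore), stream);
    }

    // Same shape as the ReadMore callback in the post, plus depth bookkeeping.
    private static void ReadMore(IAsyncResult ar)
    {
        depth++;
        if (depth > MaxDepth) MaxDepth = depth;

        Stream stream = (Stream)ar.AsyncState;
        if (stream.EndRead(ar) > 0)
            stream.BeginRead(buffer, 0, buffer.Length, new AsyncCallback(ReadMore), stream);

        depth--;
    }
}
```

Calling RecursionDemo.ReadAll(new SyncOnlyStream(new byte[160])) nests ReadMore 11 deep: ten 16-byte reads plus the final zero-byte read. The depth grows with the number of reads, so a large enough stream eventually exhausts the stack.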

Published Thursday, May 06, 2004 5:21 PM by Justin Rogers

Comments

Thursday, May 06, 2004 8:29 PM by Justin Rogers

# re: A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

Just for laughs: note that Stream uses Stream.SynchronousAsyncResult for its IAsyncResult implementation. Isn't that funny? Even funnier is its implementation of CompletedSynchronously:

IL_0000: ldc.i4.1
IL_0001: ret

In other words, it ALWAYS completes synchronously.

Damn, I just thoroughly read the docs (V1.0 docs on this machine, I'll check the V1.1 docs later) and everything in there seems to IMPLY that the methods really are asynchronous. They do state that the asynchronous versions are implemented on top of the synchronous Read/Write versions and that if you replace the Read/Write methods then things will still work for you. That would be cool IF the BeginRead/BeginWrite methods were actually asynchronous, say by queuing a ThreadPool work item.
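
For illustration, here is a minimal sketch of what pushing the blocking Read onto the ThreadPool could look like. PooledReader and PooledReadResult are hypothetical names invented for this sketch; this is an assumption about one possible design, not how Stream is actually implemented.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical IAsyncResult that is completed later by a thread-pool work item.
sealed class PooledReadResult : IAsyncResult
{
    private readonly ManualResetEvent done = new ManualResetEvent(false);
    private readonly object state;
    public int BytesRead;
    public Exception Error;

    public PooledReadResult(object state) { this.state = state; }

    public object AsyncState { get { return state; } }
    public WaitHandle AsyncWaitHandle { get { return done; } }
    public bool CompletedSynchronously { get { return false; } }
    public bool IsCompleted { get { return done.WaitOne(0); } }

    public void Complete() { done.Set(); }
}

static class PooledReader
{
    // Queues the blocking Read on the thread pool and returns immediately.
    public static IAsyncResult BeginRead(Stream stream, byte[] buffer, int offset, int count,
                                         AsyncCallback callback, object state)
    {
        PooledReadResult result = new PooledReadResult(state);
        ThreadPool.QueueUserWorkItem(delegate
        {
            try { result.BytesRead = stream.Read(buffer, offset, count); }
            catch (Exception e) { result.Error = e; }   // rethrown from EndRead
            result.Complete();
            if (callback != null) callback(result);     // runs on the pool thread
        });
        return result;
    }

    // Blocks until the queued read has finished, then returns its byte count.
    public static int EndRead(IAsyncResult ar)
    {
        PooledReadResult result = (PooledReadResult)ar;
        result.AsyncWaitHandle.WaitOne();
        if (result.Error != null) throw result.Error;
        return result.BytesRead;
    }
}
```

Because every read starts on a fresh work item, a callback that issues the next PooledReader.BeginRead returns right away and the stack unwinds between reads.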

They even have recommendations for multiple simultaneous asynchronous reads, and I quote "Multiple simultaneous asynchronous requests render the request completion order uncertain." Based on the fact that the implementation is SYNCHRONOUS, I don't think that will be a problem.

Friday, May 07, 2004 9:48 AM by Steve

# re: A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

That is a real stinker... It sort of defeats the purpose of even using it to begin with.

Monday, May 10, 2004 8:42 PM by Alexei

# re: A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

There are two parts to this problem: one is on your side, the other is in the Stream class.

1) The Stream class's problem is that it should not be issuing a blocking call in BeginRead().

2) Your problem is that after every BeginXXX call you should be checking ar.CompletedSynchronously and, if it is true, proceeding on the same thread. In fact, sync completions give you a perf improvement, since an async completion needs to flow the execution context onto a new thread.


IAsyncResult ar = BeginDoSomething(... new AsyncCallback(MyCallback), myState);
if (!ar.CompletedSynchronously)
    return; // MyCallback will finish the work on another thread.
ProcessSomething(ar, true);

void MyCallback(IAsyncResult ar)
{
    if (ar.CompletedSynchronously)
        return; // The thread that called BeginDoSomething handles this case.
    ProcessSomething(ar, false);
}

void ProcessSomething(IAsyncResult ar, bool onMainThread)
{
    try {
        EndDoSomething(ar);
    }
    catch (Exception e)
    {
        if (IsNotExpectedException(e))
            throw; // This may and should bring your process down.

        if (onMainThread)
            throw; // Something on the main thread may want to handle it.

        //
        // If something on the main thread could handle an exception call it here.
        //
        HandleException(e);
        return;
    }
    HandleSuccess();
}

Wednesday, June 09, 2004 9:42 PM by Brian Grunkemeyer

# re: A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

There are a number of issues here, and I think they're all solved.

First, the stack overflow exception can be avoided by using the CompletedSynchronously property, as Alexei points out. Based on its result, you can tell whether you can recurse or whether you should return and have your caller issue another BeginXxx call. I think you'll find this if you read through enough conceptual docs on our async design pattern, but it is pretty buried.
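
As a rough sketch of that pattern applied to Stream reads: the callback returns immediately on a synchronous completion, and whoever called BeginRead keeps looping instead of recursing. FlatReader, ReadState, SyncResult, and SyncOnlyStream are hypothetical names invented for this example; SyncOnlyStream exists only to force synchronous completion and is an assumption about the behavior, not the framework's code.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical result object for a read that finished before BeginRead returned.
sealed class SyncResult : IAsyncResult
{
    public readonly int BytesRead;
    private readonly object state;
    private readonly ManualResetEvent done = new ManualResetEvent(true);

    public SyncResult(int bytesRead, object state) { BytesRead = bytesRead; this.state = state; }

    public object AsyncState { get { return state; } }
    public WaitHandle AsyncWaitHandle { get { return done; } }
    public bool CompletedSynchronously { get { return true; } }
    public bool IsCompleted { get { return true; } }
}

// Hypothetical stream whose BeginRead always completes synchronously.
class SyncOnlyStream : MemoryStream
{
    public SyncOnlyStream(byte[] data) : base(data) { }

    public override IAsyncResult BeginRead(byte[] buffer, int offset, int count, AsyncCallback callback, object state)
    {
        SyncResult result = new SyncResult(Read(buffer, offset, count), state);
        if (callback != null) callback(result);   // runs on the caller's stack
        return result;
    }

    public override int EndRead(IAsyncResult asyncResult)
    {
        return ((SyncResult)asyncResult).BytesRead;
    }
}

class ReadState
{
    public Stream Stream;
    public byte[] Buffer = new byte[4096];
    public long Total;                                            // bytes consumed so far
    public ManualResetEvent Done = new ManualResetEvent(false);   // set at end of stream
}

static class FlatReader
{
    public static void Start(ReadState st)
    {
        IAsyncResult ar = st.Stream.BeginRead(st.Buffer, 0, st.Buffer.Length, new AsyncCallback(OnRead), st);
        if (ar.CompletedSynchronously) Pump(st, ar);
    }

    private static void OnRead(IAsyncResult ar)
    {
        // Synchronous completions are handled by whoever called BeginRead.
        if (ar.CompletedSynchronously) return;
        Pump((ReadState)ar.AsyncState, ar);
    }

    private static void Pump(ReadState st, IAsyncResult ar)
    {
        while (true)
        {
            int n = st.Stream.EndRead(ar);
            if (n == 0) { st.Done.Set(); return; }    // end of stream
            st.Total += n;                            // "do some work" goes here

            ar = st.Stream.BeginRead(st.Buffer, 0, st.Buffer.Length, new AsyncCallback(OnRead), st);
            if (!ar.CompletedSynchronously) return;   // OnRead continues on another thread
        }
    }
}
```

With a stream that completes synchronously, the whole read runs inside the one while loop, so the stack depth stays constant no matter how large the stream is.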

Secondly, Stream's BeginRead is synchronous like this. Stream's async code paths are implemented in terms of the sync code paths, and the sync code paths are implemented in terms of the async ones. So if you subclass Stream, you only have to implement the sync or the async code paths, and you get the other one for free. Additionally, your users have a consistent programming model.

Thirdly, I've reimplemented Stream's BeginRead & BeginWrite methods in Whidbey to do their work on a threadpool thread instead of their own thread (via async delegates). Your code will probably work if you pick up a more recent build. (I made the change to Whidbey around February or March - you might not see it until Beta 1 or a very recent community drop. I wanted to make this change much earlier in Whidbey, but I was blocked until we added a Semaphore class to fully insulate users from an obscure race I would have otherwise exposed in user code. All the other sync primitives have sometimes-unfortunate threading restrictions, some of which weren't easily discoverable in V1 & V1.1.)

Additionally note that FileStream & NetworkStream do support true async IO using an IO completion port. However, for FileStream, the OS restricts file handles to only one type of IO - either sync or async. So you must explicitly open the file asynchronously if you want fast async IO. (You can do this by passing in FileOptions.Asynchronous in Whidbey, or by passing true to the "useAsync" parameter before that.) If you open it in the wrong mode, it may hurt your perf by up to 10x on pathological cases, and you may end up with the naive implementations from Stream in these cases.
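
As a small sketch of opening a file for asynchronous IO through the FileOptions.Asynchronous constructor overload mentioned above (AsyncFileDemo and ReadFirstChunk are hypothetical names for this example, and the 4096-byte buffer size is an arbitrary choice):

```csharp
using System;
using System.IO;

static class AsyncFileDemo
{
    // Opens the file in asynchronous mode and reads one chunk with BeginRead/EndRead.
    public static int ReadFirstChunk(string path, byte[] buffer)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                              FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            // No callback here: EndRead blocks until the read has completed.
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);
            return fs.EndRead(ar);
        }
    }
}
```

Real code would pass a callback and check CompletedSynchronously rather than blocking in EndRead; the blocking call only keeps the sketch short.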

I hope this helps.

Brian Grunkemeyer
CLR Base Class Library Team

Wednesday, June 09, 2004 10:31 PM by Justin Rogers

# re: A not so asynchronous mistake anyone can make when using the System.IO.Stream class.

You mean something like what I have below? I've wrapped the concept of synchronous completion so that I have enough state to figure out when I need to loop and when the asynchronous code is doing the work for me. I'll probably blog the code below and link it to this entry. Also note that I've implemented a ThreadPool override of the Stream class that provides true asynchronous actions. So I'm guessing we are on the same page.

using System;
using System.IO;

// Read-only wrapper that overrides only the synchronous Read, so BeginRead
// falls back to the base Stream implementation.
public class JustAStreamWrapper : Stream {
    private Stream underlyingStream;

    public JustAStreamWrapper(Stream stream) {
        this.underlyingStream = stream;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { return -1; } }
    public override long Position { get { return -1; } set { } }

    public override void Flush() { }
    public override long Seek(long pos, SeekOrigin origin) { return -1; }
    public override void SetLength(long length) { }

    public override int Read(byte[] buffer, int offset, int count) {
        return underlyingStream.Read(buffer, offset, count);
    }

    public override void Write(byte[] buffer, int offset, int count) {
    }
}

public class SynchronousCompletionHandler {
    private bool firstCall;
    private Stream underlyingStream;
    private byte[] userArray;

    public SynchronousCompletionHandler(Stream underlyingStream, byte[] userArray) {
        this.underlyingStream = underlyingStream;
        this.firstCall = true;
        this.userArray = userArray;
    }

    public Stream Stream { get { return this.underlyingStream; } }

    // Returns true only the first time it is read.
    public bool FirstCall {
        get { bool retVal = this.firstCall; this.firstCall = false; return retVal; }
    }

    public byte[] UserArray { get { return this.userArray; } }
}

public class NonAsyncStream {
    private static void Main(string[] args) {
        try {
            DoNoStackOverflow(args[0]);
            Console.WriteLine("Past the NSOE method");
            DoStackOverflow(args[0]);
        } catch {
            Console.WriteLine("SOE");
        } finally {
            Console.WriteLine("SOE");
        }
    }

    private static void DoNoStackOverflow(string largeFile) {
        SynchronousCompletionHandler sch = new SynchronousCompletionHandler(new JustAStreamWrapper(File.OpenRead(largeFile)), new byte[4096]);
        sch.Stream.BeginRead(sch.UserArray, 0, sch.UserArray.Length, new AsyncCallback(End_ReadNoOverflow), sch);
    }

    private static void End_ReadNoOverflow(IAsyncResult ar) {
        SynchronousCompletionHandler sch = ar.AsyncState as SynchronousCompletionHandler;
        if ( sch == null ) {
            return;
        }

        // Read FirstCall unconditionally so the flag is consumed even when
        // the first completion is truly asynchronous.
        bool firstCall = sch.FirstCall;
        if ( ar.CompletedSynchronously && !firstCall ) {
            // Nested synchronous callback: the loop further up the stack handles it.
            return;
        }

        while(sch.Stream.EndRead(ar) > 0) {
            // ProcessArray(sch.UserArray);
            ar = sch.Stream.BeginRead(sch.UserArray, 0, sch.UserArray.Length, new AsyncCallback(End_ReadNoOverflow), sch);

            // Truly asynchronous: the callback continues on another thread.
            if ( !ar.CompletedSynchronously ) { return; }
        }

        // Signal final completion
    }

    private static byte[] foo = new byte[4096]; // Very small so we overflow with smaller files
    private static void DoStackOverflow(string largeFile) {
        JustAStreamWrapper jasw = new JustAStreamWrapper(File.OpenRead(largeFile));

        jasw.BeginRead(foo, 0, foo.Length, new AsyncCallback(End_Read), jasw);
    }

    private static void End_Read(IAsyncResult ar) {
        Stream stream = (Stream) ar.AsyncState;

        if ( stream.EndRead(ar) > 0 ) {
            Console.WriteLine("One Time Through");
            stream.BeginRead(foo, 0, foo.Length, new AsyncCallback(End_Read), stream);
        } else {
            Console.WriteLine("Less than good");
        }
    }
}

Monday, June 08, 2009 3:54 AM by LA.NET [EN]

# Multithreading: the IAsyncResult interface

In the last post, we’ve started looking at the APM model used by the .NET framework. Today we’re going
