Threads are a popular way of getting on with multiple things at once. They are not the only way of doing this, but they are particularly attractive if you are faced with an API that is resolutely synchronous.
A common question that emerges once you have kicked off some concurrent work is: how do I stop it? Here are two popular reasons for wanting to stop some work in progress:
- You need to shut down the program.
- The user cancelled the operation.
In the first case, it is often acceptable to drop everything mid flow and not bother shutting down cleanly, because the internal state of the program no longer matters, and the OS will release many resources held by our program when it exits. The only concern is if the program stores state persistently - it is important to make sure that any such state is consistent when our program exits. However, if we are relying on a database for such state, we can still often get away with abandoning things mid flow, particularly if we are using transactions - aborting a transaction rolls everything back to where it was before the transaction started, so this should be sufficient to return the system to a consistent state.
There are of course cases where dropping everything on the floor will not work. If the application stores its state on disk without the aid of a database, it will need to take steps to make sure that the on-disk representation is consistent before abandoning an operation. And in some cases, a program may have interactions in progress with external systems or services that require explicit cleanup beyond what will happen automatically. However, if you have designed your system to be robust in the face of a sudden failure (e.g. loss of power) then it should be acceptable simply to abandon work in progress rather than cleaning up neatly when shutting the program down. (Indeed there is a school of thought that says that if your program requires explicit shutdown, it is not sufficiently robust - for a truly robust program, sudden termination should always be a safe way to shut down. And given that, some say, you may as well make this your normal mode of shutdown - it's a very quick way of shutting down!)
User-initiated cancellation of a single operation is an entirely different matter, however.
If the user chooses to cancel an operation for some reason - maybe it is taking too long - she will expect to be able to continue using the program afterwards. It is therefore not acceptable simply to drop everything on the floor, because the OS is not about to tidy up after us. Our program has to live with its internal state after the operation has been cancelled. It is therefore necessary for cancellation to be done in an orderly fashion, so that the program's state is still internally consistent once the operation is complete.
Bearing this in mind, consider the use of Thread.Abort. This is, unfortunately, a popular choice for cancelling work, because it usually manages to stop the target thread no matter what it was up to. This means you will often see its use recommended on mailing lists and newsgroups as a way of stopping work in progress, but it is really only appropriate if you are in the process of shutting down the program, because it makes it very hard to be sure what state the program will be in afterwards.
Asynchronous Exceptions
The problem with Thread.Abort is that it can interrupt the progress of the target thread at any point. It does so by raising an 'asynchronous' exception, an exception that could emerge at more or less any point in your program. (This has nothing to do with the .NET async pattern by the way - that's about doing work without hogging the thread that started the work.)
Most exceptions are synchronous, meaning that it is possible to determine the points in a program at which such an exception might be thrown. For example, when you call System.Int32.Parse you know that it will throw a FormatException if there is something wrong with the string you pass it. Most importantly you know that it won't wait until you've executed a few lines of code before saying "oh by the way, here's an exception." If the call to Int32.Parse returns normally, you know that you won't be seeing a FormatException.
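As a rough illustration of why synchronous exceptions are manageable, here is a minimal sketch in which the only place a FormatException can appear is the call to int.Parse, so a single try block around that call is enough. The ParseExample class and ParseOrDefault method are hypothetical names made up for this example.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ParseExample
{
    // Returns the parsed value, or the fallback if the text is not a valid integer.
    public static int ParseOrDefault(string text, int fallback)
    {
        try
        {
            // A FormatException can only originate from this call.
            return int.Parse(text);
        }
        catch (FormatException)
        {
            // We know exactly which operation failed, so recovery is simple.
            return fallback;
        }
    }
}
```

Once int.Parse has returned, no later line in the method needs to worry about a FormatException turning up.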
With asynchronous exceptions on the other hand, you never know where they might emerge - they could be thrown at more or less any point in your program's execution. This makes them rather hard to deal with - how are you supposed to cope gracefully with exceptions if you have no idea where they will emerge?
This is a particularly big problem for finally blocks. If you're doing your exception handling properly, you'll most likely have far more finally (or using) blocks in your code than you have catch blocks. This is because in order to recover successfully from an error, your code will need to tidy up after itself. And since C# doesn't support C++-style scope-based destructor execution, finally blocks (and their close cousins, using blocks) are the only sane way of ensuring that such tidying is performed reliably.
Consider this code:
    using (FileStream fs = File.Open(myDataFile, FileMode.Open,
                                     FileAccess.ReadWrite, FileShare.None))
    {
        // ...do stuff with data file...
    }
This using block is really shorthand for this:
    FileStream fs = File.Open(myDataFile, FileMode.Open,
                              FileAccess.ReadWrite, FileShare.None);
    try
    {
        // ...do stuff with data file...
    }
    finally
    {
        IDisposable disp = fs;
        disp.Dispose();
    }
The compiler will generate that finally block for us. (We could write it out in full like this every time, we just don't usually bother, because the first example is much more succinct and easier to read.) The whole idea of the using statement here is that it guarantees to close the file regardless of whether we leave the using block normally, or by throwing an exception.
Asynchronous exceptions weaken this guarantee.
Suppose the code above will be working on the file for some time, and you've decided to do it on some worker thread. Now suppose the user has chosen to cancel the operation, and your UI thread calls Thread.Abort to stop the operation. Most of the time, this will actually work. However, there's one situation in which it goes horribly wrong.
Suppose the worker thread had very nearly finished when the user decided to abort the operation. What happens if the worker thread has just entered the compiler-generated finally block when the UI thread calls Thread.Abort? If the worker thread is now in the finally block, it is outside of the try block. This means that if the ThreadAbortException gets raised at this point, the remainder of the finally block won't run to completion. And if the worker thread hadn't quite managed to call Dispose yet, or it had but the FileStream object hadn't quite managed to close the file yet, the file isn't going to get closed.
At best, the FileStream's finalizer will eventually run and close the file. But it's conceivable that the FileStream.Dispose method might set its internal state to indicate that the file has been closed before it really closes the handle. Most classes aren't written to behave predictably if you start injecting asynchronous exceptions onto the thread you're calling their methods on.
The bottom line is that if an asynchronous exception occurs at the wrong moment, the file will remain open, possibly until the process exits. Since the file was opened for exclusive access, this means further attempts to open the file will fail until the process exits. The user will learn to hate your program.
This kind of thing is what makes async exceptions evil. And since Thread.Abort works by raising an asynchronous exception, we can conclude that Thread.Abort is evil. Hence the title.
Non-Evil Cancellation
Since Thread.Abort is bad news, how should we cancel operations? I think that if you're looking at how to do something to the worker thread to stop it, you're looking at it from the wrong angle. (So I wouldn't recommend Thread.Interrupt either, although at least it doesn't raise exceptions asynchronously.)
The approach I always recommend is dead simple. Have a volatile bool field that is visible both to your worker thread and your UI thread. If the user clicks cancel, set this flag. Meanwhile, on your worker thread, test the flag from time to time. If you see it get set, stop what you're doing.
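The flag-based approach might look something like the sketch below. The CancellableWorker class and its members are hypothetical names chosen for illustration, and the Thread.Sleep call merely stands in for one unit of real work. The important part is that the worker only checks the flag at a point where it knows its state is consistent.

```csharp
using System;
using System.Threading;

// Hypothetical worker class, for illustration only.
public class CancellableWorker
{
    // volatile so that a write from the UI thread is seen by the worker thread.
    private volatile bool cancelRequested;

    // Number of work items finished before the loop stopped.
    public int ItemsProcessed { get; private set; }

    // True if the loop stopped because the flag was set.
    public bool WasCancelled { get; private set; }

    // Called from the UI thread when the user clicks cancel.
    public void RequestCancel()
    {
        cancelRequested = true;
    }

    // Runs on the worker thread.
    public void DoWork(int itemCount)
    {
        for (int i = 0; i < itemCount; i++)
        {
            // Test the flag between items, where state is known to be consistent.
            if (cancelRequested)
            {
                WasCancelled = true;
                return; // any finally/using blocks unwind normally
            }

            Thread.Sleep(10); // stand-in for one unit of real work
            ItemsProcessed++;
        }
    }
}
```

A caller would typically start DoWork on a new Thread, call RequestCancel from the UI thread when the user clicks cancel, and read ItemsProcessed only after the worker has finished (for example after Thread.Join). Later versions of .NET offer CancellationTokenSource and CancellationToken, which support the same cooperative style of cancellation.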
The issue most people initially have with this approach is that it doesn't forcibly stop the thread in whatever it's in the middle of. That's actually a good thing though - it's much easier to keep your program's internal state consistent if you get to choose when to abort an operation. And in any case, if you're concerned about how long it will take for an operation to grind to a halt, just pretend to the user that it has been cancelled as soon as they click cancel, and then let it grind to a halt on the worker thread in its own sweet time. Of course, in some scenarios you will actually need to make the user wait until you've managed to stop the operation, but in the cases where there's no good reason to do this, just relax and let things come to a halt in their own time.
Alternative Non-Evil Cancellation
There is a completely different approach you can take: use processes. If you fire up an entirely separate process to do the background work, then you can nuke it with impunity, because you don't care about its internal state, and it won't affect your process's internal state. (Although if it modifies persistent state you still need to take care to leave that state consistent.)
It feels pretty stone-age if you're used to using multiple threads, because it's so much effort marshalling data into and out of the remote process, but it is a workable approach. It also allows the child process to be incredibly shabby about releasing resources because it knows it's not going to live long. The main problem is that processes are relatively heavyweight on Windows. So for those two reasons, I tend to prefer the in-process solution.
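For completeness, here is a rough sketch of the process-based approach, using System.Diagnostics.Process. The ChildProcessRunner class and RunWithTimeout method are hypothetical names, and a timeout stands in for the user clicking cancel. It assumes the background work has been packaged as a separate executable, and it leaves out the marshalling of data in and out of the child entirely.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class ChildProcessRunner
{
    // Starts the given program and kills it if it has not finished within
    // timeoutMs. Returns true if the child finished by itself, false if it
    // had to be killed.
    public static bool RunWithTimeout(string fileName, string arguments, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process child = Process.Start(startInfo))
        {
            if (child.WaitForExit(timeoutMs))
            {
                return true;
            }

            try
            {
                // Safe to kill outright: the child's internal state dies with it.
                child.Kill();
            }
            catch (InvalidOperationException)
            {
                // The child exited between the timeout and the Kill call.
            }

            child.WaitForExit();
            return false;
        }
    }
}
```

Nothing in the parent depends on the child's internal state, so killing it leaves the parent consistent. Any persistent state the child touches still needs its own protection.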