No matter how many client sites I visit, I invariably stumble across Windows Communication Foundation (WCF) client-side code that leaks memory and resources. This can crash an application, or exhaust connection limits so that further service calls are rejected. The damage is not confined to the client machine: if connections are not correctly closed on the client, the server can hit its maximum connection limit and reject service calls from other clients as well!

These types of issues are caused by developers performing incorrect error handling and incorrect disposal of the client and connection. Microsoft really is to blame for starting this mess, as they created an API that does not conform to typical practices - or even their own guidelines! For instance, Microsoft’s guideline is never to throw an exception from a Dispose method - but the WCF client code does exactly that. At the end of the day however, it is the responsibility of individual developers to be aware of the issues and design restrictions and create software that works properly.

In many cases, WCF client-side code resides in small or medium-sized user applications that do not get used hard or long enough to exhibit a noticeable degradation in system resources and upset users. But as previously stated, depending on the number of users, it still may be causing issues on servers.

When it goes wrong - for instance, on a server - there can be a spectacular flurry of activity! It can be amazing to see how quickly IT departments can move when a server hosting core business services starts failing due to custom server applications leaking resources and using many Gigabytes of memory. Unfortunately it takes such an event for many businesses to pay attention to the need for higher-quality software development practices and testing.

In the past I have seen proud companies who develop and sell custom n-tier products “work around” known problems in their proprietary code by recommending that their clients recycle the IIS Application Pools frequently and run an excessive number of servers. I never really appreciated that band-aid attitude towards software development practices or clients.

This series of articles will show you how to fix the root cause - by correctly calling WCF services, handling errors and disposing of WCF clients.

Update 2014-10-01: This article is not about the overall design of WCF. Other parts of WCF are designed very well, especially with regard to extensibility. The article focuses narrowly on the design decisions behind the common issues I see too often on so many client sites.

The Problems

The “Using” Statement and Understanding States

One of the problems Microsoft caused stems from not following their own guideline that a Dispose method should never throw an exception. The using statement is an excellent and common pattern for automatically disposing IDisposable objects at the end of a code block. However, the Microsoft article Avoiding Problems with the Using Statement explains that even though the WCF client is disposable, developers should NOT use the using statement with it.

The C# “using” statement results in a call to Dispose(). This is the same as Close(), which may throw exceptions when a network error occurs. Because the call to Dispose() happens implicitly at the closing brace of the “using” block, this source of exceptions is likely to go unnoticed both by people writing the code and reading the code. This represents a potential source of application errors.
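To see why this matters, here is a minimal sketch that uses a hypothetical stand-in class, FakeClient, rather than a real WCF client, so it runs without any service. It assumes only that Dispose() can throw, as the quote above describes. When the body of the using block throws and Dispose() then throws as well, the caller only ever sees the exception from Dispose().

```csharp
using System;

// Hypothetical stand-in for a generated WCF client; NOT a real WCF type.
// Its Dispose() throws, mimicking Close() failing on a broken connection.
public sealed class FakeClient : IDisposable
{
    public void CallService()
    {
        throw new InvalidOperationException("service call failed");
    }

    public void Dispose()
    {
        throw new InvalidOperationException("close failed during Dispose");
    }
}

public static class UsingMaskingDemo
{
    // Returns the message of the exception that actually reaches the caller.
    public static string Run()
    {
        try
        {
            using (var client = new FakeClient())
            {
                client.CallService(); // throws the "real" error
            } // Dispose() runs here and throws again, replacing the first exception
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }

        return "no exception";
    }
}
```

Calling UsingMaskingDemo.Run() returns "close failed during Dispose". The original "service call failed" error is lost, which is exactly the kind of hidden failure the quote warns about.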

Microsoft demonstrates in that article how to clean up “correctly” when an exception occurs.

try
{
    ...
    client.Close();
}
catch (CommunicationException e)
{
    ...
    client.Abort();
}
catch (TimeoutException e)
{
    ...
    client.Abort();
}
catch (Exception e)
{
    ...
    client.Abort();
    throw;
}

It is important to understand that if the client is in the Faulted state, then the only action client code should take is to call the Abort method. As documented in Expected Exceptions, TimeoutException, CommunicationException and any class derived from CommunicationException are ‘expected’ exceptions from the WCF client.

If an expected exception occurs, the client may or may not be usable afterwards. To determine if the client is still usable, check that the State property is CommunicationState.Opened. If it is still opened, then it is still usable. Otherwise you should abort the client and release all references to it.

Caution: You may observe that clients that have a session are often no longer usable after an exception, and clients that do not have a session are often still usable after an exception. However, neither of these is guaranteed, so if you want to try to continue using the client after an exception your application should check the State property to verify the client is still opened.

Code that calls a client communication method must catch the TimeoutException and CommunicationException.

However, all this talk of checking the State property is cautioned against by Accessing Services Using a WCF Client:

Checking the value of the ICommunicationObject.State property is a race condition and is not recommended to determine whether to reuse or close a channel.

If you were to check the State property in order to determine whether to Abort or Close, depending on your approach there could be a race condition. For instance the following code could result in a race condition:

if (this.State == CommunicationState.Faulted)
{
    this.Abort();
}
else
{
    this.Close();
}

The race condition arises because the State may be Opened at the moment it is checked, yet become Faulted in the short time before the Close method is reached. Close would then throw an exception. This code alone is not sufficient.

The Communication State Enumeration documentation explains the meaning of each possible State:

The Closed state is equivalent to being disposed and the configuration of the object can still be inspected.

The Faulted state is used to indicate that the object has transitioned to a state where it can no longer be used. There are two primary scenarios where this can happen:

  • If the Open method fails for any reason, the object transitions to the faulted state.
  • If a session-based channel detects an error that it cannot recover from, it transitions to the faulted state. This can happen for instance if there is a protocol error (that is, it receives a protocol message at an invalid time) or if the remote endpoint aborts the session.

An object in the Faulted state is not closed and may be holding resources. The Abort method should be used to close an object that has faulted. If Close is called on an object in the Faulted state, a CommunicationObjectFaultedException is thrown because the object cannot be gracefully closed.

The article Understanding State Changes describes how the State property can transition to different states. It expands upon, and somewhat contradicts, the enumeration documentation by indicating that if the object is in the Faulted state, the Close method will call Abort for you and return.

The Close() method can be called at any state. It tries to close the object normally. If an error is encountered, it terminates the object. The method does nothing if the current state is Closing or Closed. Otherwise it sets the state to Closing. If the original state was Created, Opening or Faulted, it calls Abort().

I’m confused - how about you?

To remedy my confusion, I went to the source of truth - the code. The reference source for CommunicationObject shows us that the Close method handles being called from any State value, and that it will perform an Abort if required. In addition, if the State was Faulted, the Close method actually will throw a CommunicationObjectFaultedException.

...
switch (originalState)
{
    case CommunicationState.Created:
    case CommunicationState.Opening:
    case CommunicationState.Faulted:
        this.Abort();
        if (originalState == CommunicationState.Faulted)
        {
            throw TraceUtility.ThrowHelperError(this.CreateFaultedException(), Guid.Empty, this);
        }
        break;
...
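This behaviour can be observed without a real service. The sketch below is an illustration under stated assumptions: DemoCommunicationObject is a hypothetical class of my own that derives from the real CommunicationObject base class and exposes its protected Fault() method so that a fault can be simulated. It needs a reference to System.ServiceModel.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// Hypothetical test double: a do-nothing CommunicationObject whose only
// purpose is to let us force the Faulted state. Not part of WCF itself.
public sealed class DemoCommunicationObject : CommunicationObject
{
    protected override TimeSpan DefaultCloseTimeout { get { return TimeSpan.FromSeconds(5); } }
    protected override TimeSpan DefaultOpenTimeout { get { return TimeSpan.FromSeconds(5); } }

    protected override void OnOpen(TimeSpan timeout) { }
    protected override void OnClose(TimeSpan timeout) { }
    protected override void OnAbort() { }

    // The asynchronous paths are not exercised by this sketch.
    protected override IAsyncResult OnBeginOpen(TimeSpan timeout, AsyncCallback callback, object state) { throw new NotSupportedException(); }
    protected override void OnEndOpen(IAsyncResult result) { throw new NotSupportedException(); }
    protected override IAsyncResult OnBeginClose(TimeSpan timeout, AsyncCallback callback, object state) { throw new NotSupportedException(); }
    protected override void OnEndClose(IAsyncResult result) { throw new NotSupportedException(); }

    // Exposes the protected Fault() method so the demo can simulate a failure.
    public void SimulateFault() { Fault(); }
}

public static class FaultedCloseDemo
{
    // Returns true if Close() on a faulted object threw
    // CommunicationObjectFaultedException; finalState is the state afterwards.
    public static bool Run(out CommunicationState finalState)
    {
        var obj = new DemoCommunicationObject();
        obj.Open();
        obj.SimulateFault();      // State is now Faulted

        bool threw = false;
        try
        {
            obj.Close();          // calls Abort() internally, then throws
        }
        catch (CommunicationObjectFaultedException)
        {
            threw = true;
        }

        finalState = obj.State;
        return threw;
    }
}
```

Going by the reference source excerpt above, Run should return true and report a final state of Closed: Close() did abort the object for us, and then threw anyway.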

The article Understanding State Changes also explicitly states that the Abort method can throw exceptions.

The Abort() method does nothing if the current state is Closed or if the object has been terminated before (for example, possibly by having Abort() executing on another thread). Otherwise it sets the state to Closing and calls OnClosing() (which raises the Closing event), OnAbort(), and OnClosed() in that order (does not call OnClose because the object is being terminated, not closed). OnClosed() sets the state to Closed and raises the Closed event. If any of these throw an exception, it is re-thrown to the caller of Abort.

No sample code from Microsoft or anywhere else that I have seen handles the situation where the Abort method throws an exception.

Stack Overflow has an interesting thread about how to work around the using block issue and perform the close/abort pattern. Its top-voted solution for the close/abort pattern, which addresses the race condition, is:

bool success = false;
try
{
    if (State != CommunicationState.Faulted)
    {
        Close();
        success = true;
    }
}
finally
{
    if (!success)
    {
        Abort();
    }
}

In this code, if the State is already Faulted or the Close method throws an exception, then the finally block calls the Abort method. Note that the exception from Close is not swallowed: there is no catch block, so it propagates to the caller once the finally block has run. That’s pretty good. But as we now know, the Abort method can throw exceptions, and that code does not handle it.

Exception Catching Order

The article Sending and Receiving Faults shows us that we need to catch the exceptions in a specific order - especially in relation to the SOAP-based FaultException.

Because FaultException<TDetail> derives from FaultException, and FaultException derives from CommunicationException, it is important to catch these exceptions in the proper order. If, for example, you have a try/catch block in which you first catch CommunicationException, all specified and unspecified SOAP faults are handled there; any subsequent catch blocks to handle a custom FaultException<TDetail> exception are never invoked.
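The hierarchy behind that ordering rule can be checked directly with reflection. This is a small sketch that needs a reference to System.ServiceModel. FaultHierarchyDemo is a name made up for this example, and FaultException<string> is used only as an arbitrary typed fault.

```csharp
using System;
using System.ServiceModel;

public static class FaultHierarchyDemo
{
    // True when 'derived' inherits (directly or indirectly) from 'parent'.
    public static bool Derives(Type derived, Type parent)
    {
        return derived.IsSubclassOf(parent);
    }

    public static void Print()
    {
        // Typed SOAP fault -> untyped SOAP fault -> CommunicationException
        Console.WriteLine(Derives(typeof(FaultException<string>), typeof(FaultException)));   // True
        Console.WriteLine(Derives(typeof(FaultException), typeof(CommunicationException)));   // True

        // TimeoutException is unrelated to CommunicationException, so its
        // position relative to the other catch blocks does not matter.
        Console.WriteLine(Derives(typeof(TimeoutException), typeof(CommunicationException))); // False
    }
}
```

In C# specifically, placing catch (CommunicationException) before catch (FaultException) in the same try statement is rejected by the compiler (error CS0160) rather than silently skipped, so the most-derived-first ordering is at least enforced within a single try/catch.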

Remember that one operation can return any number of specified faults. Each fault is a unique type and must be handled separately.

Closing the channel can throw exceptions if the connection cannot be cleanly closed or is already closed, even if all the operations returned properly.

Typically, client object channels are closed in one of the following ways:

  • When the WCF client object is recycled.
  • When the client application calls ClientBase.Close.
  • When the client application calls ICommunicationObject.Close.
  • When the client application calls an operation that is a terminating operation for a session.

In all cases, closing the channel instructs the channel to begin closing any underlying channels that may be sending messages to support complex functionality at the application level. For example, when a contract requires sessions a binding attempts to establish a session by exchanging messages with the service channel until a session is established. When the channel is closed, the underlying session channel notifies the service that the session is terminated. In this case, if the channel has already aborted, closed, or is otherwise unusable (for example, when a network cable is unplugged), the client channel cannot inform the service channel that the session is terminated and an exception can result.

Abort the Channel If Necessary

Because closing the channel can also throw exceptions, then, it is recommended that in addition to catching fault exceptions in the correct order, it is important to abort the channel that was used in making the call in the catch block. If the fault conveys error information specific to an operation and it remains possible that others can use it, there is no need to abort the channel (although these cases are rare). In all other cases, it is recommended that you abort the channel. For a sample that demonstrates all of these points, see Expected Exceptions.

And here is the sample code from that article:

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using Microsoft.WCF.Documentation;

public class Client
{
  public static void Main()
  {
    SampleServiceClient wcfClient = new SampleServiceClient();

    try
    {
      wcfClient.SampleMethod("hello");

      wcfClient.Close();
    }
    catch (TimeoutException timeProblem)
    {
      wcfClient.Abort();
    }
    catch (FaultException<MyCustomFault> myCustomFault)
    {
      wcfClient.Abort();
    }
    catch (FaultException<MyOtherCustomFault> myOtherCustomFault)
    {
      wcfClient.Abort();
    }
    catch (FaultException unknownFault)
    {
      wcfClient.Abort();
    }
    catch (CommunicationException commProblem)
    {
      wcfClient.Abort();
    }
  }
}

Note the following about this sample code:

  • The client is not closed or aborted if there is an unexpected exception; and
  • There is no concern about catching exceptions from the Abort method.

Other Exceptions

There is one more type of exception that never seems to be mentioned in sample code or in any literature I have seen related to WCF, and that is the ThreadAbortException.

When this exception is raised, the runtime executes all the finally blocks before ending the thread. Because the thread can do an unbounded computation in the finally blocks or call Thread.ResetAbort to cancel the abort, there is no guarantee that the thread will ever end.

The ThreadAbortException is a special exception that can occur asynchronously. If the WCF client is called from within a thread, and if the thread is aborted, then it might be prudent to clean up the client before the thread finishes.
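As a rough illustration of that clean-up opportunity, the sketch below aborts a worker thread and records whether its finally block ran. It assumes the .NET Framework, where Thread.Abort is supported (on .NET Core and later it throws PlatformNotSupportedException). The class and member names are made up for this example, and the finally block marks the spot where a real application would close or abort its WCF client.

```csharp
using System;
using System.Threading;

public static class ThreadAbortDemo
{
    // Returns true if the worker's finally block executed after Thread.Abort().
    public static bool Run()
    {
        bool cleanedUp = false;
        var started = new ManualResetEventSlim(false);

        var worker = new Thread(() =>
        {
            try
            {
                started.Set();                   // signal that we are inside the try block
                Thread.Sleep(Timeout.Infinite);  // stands in for a long-running service call
            }
            finally
            {
                cleanedUp = true;                // a real client would be closed/aborted here
            }
        });

        worker.Start();
        started.Wait();   // make sure the worker has entered the try block
        worker.Abort();   // raises ThreadAbortException in the worker (.NET Framework only)
        worker.Join();    // wait for the finally block to finish and the thread to exit

        return cleanedUp;
    }
}
```

On the .NET Framework, ThreadAbortDemo.Run() should return true: the abort interrupts the sleep, but the finally block still runs before the thread ends.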

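The top-voted solution from Stack Overflow does partially and elegantly handle this situation, because its finally block still runs when the thread is aborted. The same applies to other asynchronous exceptions such as OutOfMemoryException. StackOverflowException is the exception to the rule: since .NET Framework 2.0 it terminates the process by default, so no clean-up code can run.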

Oh My!

So does all that sound complicated enough? No wonder so many don’t get it right…

How To Do It Correctly?

In my opinion, the most correct solution would:

  • Perform the Close/Abort pattern without a race condition
  • Handle the situation when the service operation throws exceptions
  • Handle the situations when both the Close and Abort methods throw exceptions
  • Handle asynchronous exceptions such as the ThreadAbortException

Below is my proposed solution for correctly using a WCF client.

SampleServiceClient client = null;

try
{
    client = new SampleServiceClient();

    var response = client.SampleOperation(1234);

    // Do some business logic
}
catch (FaultException<MyCustomException>)
{
    // Do some business logic for this SOAP Fault Exception
}
catch (FaultException)
{
    // Do some business logic for this SOAP Fault Exception
}
catch (CommunicationException)
{
    // Catch this expected exception so it is not propagated further.
    // Perhaps write this exception out to log file for gathering statistics...
}
catch (TimeoutException)
{
    // Catch this expected exception so it is not propagated further.
    // Perhaps write this exception out to log file for gathering statistics...
}
catch (Exception)
{
    // An unexpected exception that we don't know how to handle.
    // Perhaps write this exception out to log file for support purposes...
    throw;
}
finally
{
    // This will:
    // - be executed if any exception was thrown above in the 'try' (including ThreadAbortException); and
    // - ensure that CloseOrAbortServiceChannel() itself will not be interrupted by a ThreadAbortException
    //   (since it is executing from within a 'finally' block)
    CloseOrAbortServiceChannel(client);

    // Unreference the client
    client = null;
}

private void CloseOrAbortServiceChannel(ICommunicationObject communicationObject)
{
    bool isClosed = false;

    if (communicationObject == null || communicationObject.State == CommunicationState.Closed)
    {
        return;
    }

    try
    {
        if (communicationObject.State != CommunicationState.Faulted)
        {
            communicationObject.Close();
            isClosed = true;
        }
    }
    catch (CommunicationException)
    {
        // Catch this expected exception so it is not propagated further.
        // Perhaps write this exception out to log file for gathering statistics...
    }
    catch (TimeoutException)
    {
        // Catch this expected exception so it is not propagated further.
        // Perhaps write this exception out to log file for gathering statistics...
    }
    catch (Exception)
    {
        // An unexpected exception that we don't know how to handle.
        // Perhaps write this exception out to log file for support purposes...
        throw;
    }
    finally
    {
        // If State was Faulted or any exception occurred while doing the Close(), then do an Abort()
        if (!isClosed)
        {
            AbortServiceChannel(communicationObject);
        }
    }
}

private static void AbortServiceChannel(ICommunicationObject communicationObject)
{
    try
    {
        communicationObject.Abort();
    }
    catch (Exception)
    {
        // An unexpected exception that we don't know how to handle.
        // If we are in this situation:
        // - we should NOT retry the Abort() because it has already failed and there is nothing to suggest it could be successful next time
        // - the abort may have partially succeeded
        // - the actual service call may have been successful
        //
        // The only thing we can do is hope that the channel's resources have been released.
        // Do not rethrow this exception because the actual service operation call might have succeeded
        // and an exception closing the channel should not stop the client doing whatever it does next.
        //
        // Perhaps write this exception out to log file for gathering statistics and support purposes...
    }
}

Well, that’s quite depressing, isn’t it! Imagine that you have an application that calls many services. If you were to tell me that I should duplicate all that code every time I want to make a service operation call, as a developer I won’t be happy.

Unfortunately that is exactly the situation that Microsoft has forced upon developers.

You could take some short-cuts and not do all the exception handling, but on the day that one of those perhaps rare exceptions happens (and it will!), you will be glad that you handled those edge cases.

Please Tell Me There Is a Better Way!

In my next article, I will show you how to use some programming tricks to significantly reduce the amount of code that developers have to write and provide a nice, clean API for working with WCF clients.
