
XFactor

By: Cole Francis, Architect, PSC, LLC


THE PROBLEM

So, what do you do when you’re building a website and you have a long-running client-side call to a Web API layer?  Naturally, you’re going to do what most developers do and call the Web API asynchronously.  This way, your code can continue to cruise along until a result finally returns from the server.

But, what if matters are actually worse than that?  What if your Web API Controller code contacts a Repository POCO that then calls a stored procedure through the Entity Framework?  And, what if the Entity Framework leverages a project-dedicated database, as well as a system-of-record database, and calls to your system-of-record database sporadically fail?

Like most software developers, you would lean towards looking at the log files, which offer traceability for your code.  But, what if there wasn’t any logging baked into the code?  Even worse, what if this problem only occurred sporadically?  And, when it occurs, orders don’t make it into the system-of-record database, which means that things like order changes and financial transactions don’t occur.  Have you ever been in a situation like this one?


PART I – HERE COMES ELMAH

From a programmatic perspective, let’s hypothetically assume that the initial code had the controller code calling the repository POCO in a simple For/Next loop that iterates a hardcoded 10 times.  So, if just one of the 10 iterating attempts succeeds, then it means that the order was successfully processed.  In this case, the processing thread would break free from the critical section in the For/Next loop and continue down its normal processing path.  This, my fellow readers, is what’s commonly referred to as “Optimistic Programming”.
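To make the anti-pattern concrete, here is a minimal sketch of what such a loop might look like.  The original code isn’t shown in this post, so every name below (OptimisticOrderProcessor, the placeTheOrder delegate) is hypothetical, and the failing repository call is simulated.

```csharp
using System;

public static class OptimisticOrderProcessor
{
    // Hypothetical reconstruction: try the call a hardcoded 10 times and
    // swallow every failure. Returns true if any attempt succeeded.
    public static bool PlaceOrder(Action placeTheOrder)
    {
        for (int attempt = 0; attempt < 10; attempt++)
        {
            try
            {
                placeTheOrder();
                return true;   // break out on the first success
            }
            catch (Exception)
            {
                // No logging, no delay, no record of what went wrong
            }
        }

        return false;          // all 10 attempts failed, and nobody knows why
    }
}

public static class Program
{
    public static void Main()
    {
        int calls = 0;

        // Simulated repository call that always fails
        bool placed = OptimisticOrderProcessor.PlaceOrder(() =>
        {
            calls++;
            throw new InvalidOperationException("Simulated deadlock");
        });

        Console.WriteLine("Placed: {0} after {1} calls", placed, calls);
    }
}
```

Notice that when every attempt fails, the only evidence is a false return value.  The exception type, the message, and the stack trace are all thrown away.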

The term, “Optimistic Programming”, hangs itself on the notion that your code will always be bug-free and operate on a normal execution path.  It’s this type of programming that provides a developer with an artificial comfort level.  After all, at least one of the 10 iterative calls will surely succeed.  Right?  Um…right?  Well, not exactly.

Jack Ganssle, from the Ganssle Group, does an excellent job explaining why this development approach can often lead to catastrophic consequences.  He does this in his 2008 online rant entitled, “Optimistic Programming”.  Sure, his article is practically ten years old at this point, but his message continues to be relevant to this very day.

The bottom line is that without knowing all of the possible failure points, their potential root causes, and all the alternative execution paths a thread can tread down if an exception occurs, you’re probably setting yourself up for failure.  I mean, are 10 attempts really any better than one?  Are 10,000 calls really any better than 10?  Not only are these flimsy hypotheses with little or no real evidence to back them up, but they further convolute and mask the underlying root cause of practically any issue that arises.  The real question is, “Why are 10 attempts necessary when only one should suffice?”

So, what do you do in a situation when you have very little traceability into an ailing application in Production, but you need to know what’s going on with it…like yesterday!  Well, the first thing you do is place a phone call to The PSC Group, headquartered in Schaumburg, IL.  The second thing you do is ask for the help of Blago Stephanov, known internally to our organization as “The X-Factor”, and for a very good reason.  This guy is great at his craft and can accelerate the speed of development and problem solving by at least a factor of 2…that’s no joke.

In this situation, Blago recommends using a platform like Elmah for logging and tracing unhandled errors.  Elmah is a droppable, pluggable logging framework that dynamically captures all unhandled exceptions.  It also offers color-coded stack traces with line numbers that can help pinpoint exactly where the exception was thrown.  Even more impressive, it’s very quick to implement and requires low personal involvement during integration and setup.  In a nutshell, its implementation is quick and it makes debugging a breeze.

Additionally, Elmah comes with a web page that allows you to remotely view the unhandled exceptions.  This is a fantastic function for determining the various paths, both normal and alternate, that lead up to an unhandled error. Elmah also allows developers to manually record their own information by using the following syntax.

ErrorSignal.FromCurrentContext().Raise(ex);

 

Regardless, Elmah’s capabilities go well beyond just recording exceptions. For all practical purposes, you can record just about any information you desire. If you want to know more about Elmah, then you can read up on it in the project’s documentation.  Also, you’ll be happy to know that you can buy it for the low, low price of…free.  It just doesn’t get much better than this.


PART II – ONE REALLY COOL (AND EXTREMELY RELIABLE) RE-TRY PATTERN

So, after implementing Elmah, let’s say that we’re able to track down the offending lines of code, and in this case the code was failing in a critical section that iterates 10 times before succeeding or failing silently.  We would have been very hard-pressed to find it without the assistance of Elmah.

Let’s also assume that the underlying cause is that the code was experiencing deadlocks in the Entity Framework’s generated classes whenever order updates to the system-of-record database occur.  So, thanks to Elmah, at this point we finally have some decent information to build upon.  Elmah provides us with the stack trace information where the error occurred, which means that we would be able to trace the exception back to the offending line(s) of code.

After we do this, Blago recommends that we craft a better approach in the critical section of the code.  This approach provides more granular control over any programmatic retries if a deadlock occurs.  So, how is this better, you might ask?  Well, keep in mind from your earlier reading that the code was simply looping 10 times in a For/Next loop.  By implementing his recommended approach, we can control not only the number of reattempts, but also the wait time between them, and we can log any meaningful exceptions that occur.

 

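       /// <summary>
       /// Places orders in a system-of-record DB
       /// </summary>
       /// <param name="orderId">The order to place (the int type is an assumption)</param>
       /// <returns>An http response object</returns>
       [HttpGet]
       public IHttpActionResult PlaceOrder(int orderId)
       {
           using (var or = new OrderRepository())
           {
               // Retry with the defaults: up to 5 attempts, 300 ms apart
               Retry.DoVoid(() => or.PlaceTheOrder(orderId));
               return Ok();
           }
       }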

 

The above Retry.DoVoid() method calls into the following generic logic, which performs its job flawlessly.  What’s more, you can see in the example below where Elmah is being leveraged to log any exceptions that we might encounter.

 

using Elmah;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace PSC.utility
{
   /// <summary>
   /// Provides reliable and traceable retry logic
   /// </summary>
   public static class Retry
   {
       /// <summary>
       /// Retry logic
       /// </summary>
       /// <returns>Fire and forget</returns>
       public static void DoVoid(Action action, int retryIntervallInMS = 300, int retryCount = 5)
       {
           Do<object>(() =>
           {
               action();
               return null;
           }, retryIntervallInMS, retryCount);
       }

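       /// <summary>
       /// Invokes the action, retrying up to retryCount times with a pause between attempts
       /// </summary>
       /// <returns>The result of the first successful attempt</returns>
       public static T Do<T>(Func<T> action, int retryIntervallInMS = 300, int retryCount = 5)
       {
           var exceptions = new List<Exception>();
           TimeSpan retryInterval = TimeSpan.FromMilliseconds(retryIntervallInMS);

           for (int retry = 0; retry < retryCount; retry++)
           {
               bool success = true;

               try
               {
                   // Pause before every attempt except the first
                   if (retry > 0)
                   {
                       Thread.Sleep(retryInterval);
                   }
                   return action();
               }
               catch (Exception ex)
               {
                   // Remember the failure and log it through Elmah
                   success = false;
                   exceptions.Add(ex);
                   ErrorSignal.FromCurrentContext().Raise(ex);
               }
               finally
               {
                   // Log a late success; retry is zero-based, so add 1 for the attempt count
                   if (retry > 0 && success) {
                       ErrorSignal.FromCurrentContext().Raise(new Exception(string.Format("The call was attempted {0} times. It finally succeeded.", retry + 1)));
                   }
               }
           }

           // Every attempt failed: surface all of the collected exceptions
           throw new AggregateException(exceptions);
       }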
   }
}
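One possible refinement, which is not part of the code above: the Retry class reattempts after every exception, including ones that will never succeed on a second try.  The sketch below assumes a caller-supplied predicate (isTransient, a hypothetical name) that decides which failures are worth retrying.  The SelectiveRetry class is also hypothetical, and it swaps Elmah for a plain log delegate so that it can run in a console application.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SelectiveRetry
{
    // Retries only while isTransient(ex) returns true; any other exception
    // propagates immediately. The log delegate stands in for Elmah here.
    public static T Do<T>(
        Func<T> action,
        Func<Exception, bool> isTransient,
        Action<string> log,
        int retryIntervalInMs = 300,
        int retryCount = 5)
    {
        var exceptions = new List<Exception>();

        for (int attempt = 1; attempt <= retryCount; attempt++)
        {
            try
            {
                // Pause before every attempt except the first
                if (attempt > 1)
                {
                    Thread.Sleep(retryIntervalInMs);
                }

                return action();
            }
            catch (Exception ex) when (isTransient(ex))
            {
                exceptions.Add(ex);
                log(string.Format("Attempt {0} failed: {1}", attempt, ex.Message));
            }
        }

        throw new AggregateException(exceptions);
    }
}

public static class Program
{
    public static void Main()
    {
        int calls = 0;

        // Simulated call that hits two "deadlocks" before succeeding
        string result = SelectiveRetry.Do(
            () =>
            {
                calls++;
                if (calls < 3)
                {
                    throw new TimeoutException("Simulated deadlock");
                }
                return "order placed";
            },
            ex => ex is TimeoutException,
            Console.WriteLine,
            retryIntervalInMs: 10);

        Console.WriteLine("{0} after {1} calls", result, calls);
    }
}
```

In a real deadlock scenario, the predicate would inspect whatever exception the data layer actually throws.  TimeoutException is used here only as a stand-in.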

As you can see, the aforementioned Retry() pattern offers a much more methodical and reliable approach to invoke retry actions in situations where our code might be failing a few times before actually succeeding.  But, even if the logic succeeds, we still have to ask ourselves questions like, “Why isn’t one call enough?” and “Why are we still dealing with the odds of success?”

For one thing, we have absolutely no verifiable proof that looping and reattempting 10 times achieves the necessary “odds of success”.  Therefore, the real question is why there should be any speculation at all in this matter.  After all, we’re talking about pushing orders into a system-of-record database for revenue purposes, and the ability to process orders shouldn’t boil down to “odds of success”.  It should just work…every time!

Nonetheless, what this approach will buy us is one very valuable thing, and that’s enough time to track down the issue’s root cause.  So, with this approach in place, our number one focus would now be to find and solve the core problem.


PART III – PROBLEM SOLVED

So, at this point we’ve resigned ourselves to the fact that, although the aforementioned retry logic doesn’t hurt a thing, it masks the core problem.

Blago recommends that the next step is to load test the failing method by creating a large pool of concurrent users (e.g. 1,000), all simulating the order update function at the exact same time.  I’ll take it one step further by recommending that we also begin analyzing and profiling the SQL Server stored procedures that are being called by the Entity Framework and rejected.

I recommend that we first review the execution plans of the failing stored procedures, making sure their compiled execution plans aren’t lopsided.  If we happen to notice that too much time is being spent on individual tasks inside the stored procedure’s execution plan, then our goal should be to optimize them.  Ideally, what we want to see is an even distribution of time optimally spread across the various execution paths inside our stored procedures.
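The shape of such a load test can be sketched in a few lines.  This is only an illustration under stated assumptions: LoadTest and fakeOrderUpdate are hypothetical names, the order update is simulated with a short delay, and a real test would exercise the deployed Web API, most likely through a dedicated load-testing tool.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LoadTest
{
    // Starts `users` concurrent calls to the supplied operation and
    // returns how many of them threw.
    public static async Task<int> RunAsync(Func<int, Task> operation, int users)
    {
        Task<bool>[] calls = Enumerable.Range(1, users)
            .Select(userId => TryAsync(operation, userId))
            .ToArray();

        bool[] results = await Task.WhenAll(calls);
        return results.Count(succeeded => !succeeded);
    }

    // Runs one simulated user on the thread pool and reports success or failure
    private static async Task<bool> TryAsync(Func<int, Task> operation, int userId)
    {
        try
        {
            await Task.Run(() => operation(userId));
            return true;
        }
        catch (Exception)
        {
            return false;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Simulated order update: every 10th caller fails
        Func<int, Task> fakeOrderUpdate = async userId =>
        {
            await Task.Delay(20);
            if (userId % 10 == 0)
            {
                throw new TimeoutException("Simulated deadlock");
            }
        };

        int failures = LoadTest.RunAsync(fakeOrderUpdate, 1000).GetAwaiter().GetResult();
        Console.WriteLine("{0} of 1000 simulated calls failed", failures);
    }
}
```

Counting failures per run gives a simple baseline, so the same test can be repeated after each database change to see whether the failure rate actually drops.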

In our hypothetical example, we’ll assume there are a couple of SQL Server tables using composite keys to comprise a unique record on the Order table.

Let’s also assume that during the ordering process, there’s a query that leverages the secondary key to retrieve additional data before sending the order along to the system-of-record database.  However, because the composite keys are uniquely clustered, getting the data back out of the table using a single column proves to be too much of a strain for the growing table.  Ultimately, this leads to query timeouts and deadlocks, particularly under load.

To this end, optimizing the offending stored procedures by creating a non-clustered, non-unique index for the key attributes in the offending tables will vastly improve their efficiency.  Once the SQL optimizations are complete, the next step should be to perform more load tests and to leverage SQL Server Profiler to gauge the impact of our changes.  At this point, the deadlocks should disappear completely.


LET’S SUMMARIZE, SHALL WE

The moral of this story is really twofold.  (1) Everyone should have an “X-Factor” on their project; (2) You can’t beat great code traceability and logging in a solution. If option (1) isn’t possible, then at a minimum make sure that you implement option (2).

Ultimately, logging and traceability help out immeasurably on a project, particularly where root cause analysis is imperative to track down unhandled exceptions and other issues.  It’s through the introduction of Elmah that we were able to quickly identify and resolve the enigmatic database deadlock problems that plagued our hypothetical solution.

While this particular scenario is completely conjectural, situations like these aren’t all that uncommon to run across in the field.  Regardless, most of this could have been prevented by following Jack Ganssle’s 10-year-old advice, which is to make sure that you check those goesintas and goesoutas!  But, chances are that you probably won’t.

Thanks for reading and keep on coding! 🙂

CouplingDesignPatterns

Author: Cole Francis, Architect

BACKGROUND PROBLEM

My last editorial focused on building out a small application using a simple Service Locator Pattern, which exposed a number of cons whenever the pattern is used in isolation. As you might recall, one of the biggest problems that developers and architects have with this pattern is the way that service object dependencies are created and then inconspicuously hidden from their callers inside the service object register of the Service Locator Class. This behavior can result in a solution that successfully compiles at build-time but then inexplicably crashes at runtime, often offering no insight into what went wrong.

THE REAL PROBLEM

I think it’s fair to say that when some developers think about design patterns they don’t always consider the possibility of combining one design pattern with another in order to create a more extensible and robust framework. Opportunities like these are overlooked because the potential for a pattern’s extensibility isn’t always obvious to its implementer.

For this very reason, I think it’s important to demonstrate how certain design patterns can be coupled together to create some very malleable application frameworks, and to prove my point I took the Service Locator Pattern I covered in my previous editorial and combined it with a very basic Factory Pattern.

Combining these two design patterns provides us with the ability to clearly separate the “what to do” from the “when to do it” concerns. It also offers build-time type checking and the ability to test each layer of the application using an object’s interface. Enough chit-chat. Let’s get on with the demo!

THE SOLUTION

Suppose we are a selective automobile manufacturer and offer two well-branded models:

    (1) A luxury model named “The Drifter”.
    (2) A sport luxury model named “The Showdown”.

To keep things simple, I’ve included very few parts for each make’s model. So, while each model is equipped with its own engine and emblem, both models share the same high-end stereo package and high-performance tires. Shown below is a snapshot of the ServiceLocator Class, which looks nearly identical to the one I included in my last editorial. For this reason, I’m not color-coding anything inside the class except where I’ve made changes to it. I’ve also kept the color-coding consistent throughout the rest of the code examples in order to depict how the different classes and design patterns get tied together:


using System;
using System.Collections.Generic;
using FactoryPatternExample.Vehicles.Models;

namespace FactoryPatternExample
{
    public class ServiceLocator
    {
        #region Member Variables

        /// <summary>
        /// An early loaded dictionary object acting as a memory map for each interface's concrete type
        /// </summary>
        private IDictionary<object, object> services;

        #endregion

        #region IServiceLocator Methods

        /// <summary>
        /// Resolves the concrete service type using a passed in interface
        /// </summary>
        public T Resolve<T>()
        {
            try
            {
                return (T)services[typeof(T)];
            }
            catch (KeyNotFoundException)
            {
                throw new ApplicationException("The requested service is not registered");
            }
        }

        /// <summary>
        /// Extends the service locator capabilities by allowing an interface and concrete type to 
        /// be passed in for registration (e.g. if you wrap the assembly and wish to extend the 
        /// service locator to new types added to the extended project)
        /// </summary>
        public void Register<T>(object resolver)
        {
            // Adds the service, or replaces an existing registration for the same interface
            this.services[typeof(T)] = resolver;
        }

        #endregion

        #region Constructor(s)

        /// <summary>
        /// The service locator constructor, which resolves a supplied interface with its corresponding concrete type
        /// </summary>
        public ServiceLocator()
        {
            services = new Dictionary<object, object>();

            // Registers the service in the locator
            this.services.Add(typeof(IDrifter_LuxuryVehicle), new Drifter_LuxuryVehicle());
            this.services.Add(typeof(IShowdown_SportVehicle), new Showdown_SportVehicle());
        }

        #endregion
    }
}


Where the abovementioned code differs from a basic Service Locator implementation is when we add our vehicles to the service register’s Dictionary object in the ServiceLocator() Class Constructor. When this occurs, the following parts are registered using a Factory Pattern that gets invoked in the Constructor of the shared Vehicle() Base Class (highlighted in yellow, below):


 
namespace FactoryPatternExample.Vehicles.Models
{
    public class Drifter_LuxuryVehicle : Vehicle, IDrifter_LuxuryVehicle
    {
        /// <summary>
        /// Factory Pattern for the luxury vehicle line of automobiles
        /// </summary>
        public override void CreateVehicle()
        {
            Parts.Add(new Parts.Emblems.SilverEmblem());
            Parts.Add(new Parts.Engines._350_LS());
            Parts.Add(new Parts.Stereos.HighEnd_X009());
            Parts.Add(new Parts.Tires.HighPerformancePlus());
        }
    }
}



 
namespace FactoryPatternExample.Vehicles.Models
{
    public class Showdown_SportVehicle : Vehicle, IShowdown_SportVehicle
    {
        /// <summary>
        /// Factory Pattern for the sport luxury vehicle line of automobiles
        /// </summary>
        public override void CreateVehicle()
        {
            Parts.Add(new Parts.Emblems.GoldEmblem());
            Parts.Add(new Parts.Engines._777_ProSeries());
            Parts.Add(new Parts.Stereos.HighEnd_X009());
            Parts.Add(new Parts.Tires.HighPerformancePlus());
        }
    }
}


As you can see from the code above, both subtype classes inherit from the Vehicle() Base Class, but each subtype implements its own distinctive interface (e.g. IDrifter_LuxuryVehicle and IShowdown_SportVehicle). Forcing each subclass to implement its own unique interface is what ultimately allows a calling application to distinguish one vehicle type from another.

Additionally, it’s the Vehicle() Base Class that calls the CreateVehicle() Method inside its Constructor. But, because the CreateVehicle() Method in the Vehicle() Base Class is overridden by each subtype, each subtype is given the ability to add its own set of exclusive parts to the list of parts in the base class. As you can see, I’ve hardcoded all of the parts in my example out of convenience, but they can originate just as easily from a data backing store.



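using System.Collections.Generic;

namespace FactoryPatternExample.Vehicles
{
    public abstract class Vehicle : IVehicle
    {
        // The parts list that each subtype fills in through CreateVehicle()
        List<Part> _parts = new List<Part>();

        public Vehicle()
        {
            // Invokes the subtype's override of the factory method
            this.CreateVehicle();
        }

        public List<Part> Parts 
        { 
            get
            {
                return _parts;
            }
        }

        // Factory Method
        public abstract void CreateVehicle();
    }
}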


As for the caller (e.g. a client application), it only needs to resolve an object using that object’s interface via the Service Locator in order to obtain access to its publicly exposed methods and properties. (see below):


FactoryPatternExample.ServiceLocator serviceLocator = new FactoryPatternExample.ServiceLocator();
IDrifter_LuxuryVehicle luxuryVehicle = serviceLocator.Resolve<IDrifter_LuxuryVehicle>();

if (luxuryVehicle != null)
{
     foreach (Part part in ((IVehicle)(luxuryVehicle)).Parts)
     {
          Console.WriteLine(string.Concat("   - ", part.Label, ": ", part.Description));
     }
}

Here are the results after making a few minor tweaks to the UI code:

[Image: The Results]

What’s even more impressive is that the Service Locator now offers compile-time type checking and the ability to test each layer of the code in isolation thanks to the inclusion of the Factory Pattern:

[Image: BuildTimeError]
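One way to get this kind of build-time failure is to push the interface-to-class relationship into a generic constraint.  The sketch below is an assumption about how that could be done, not the code from this demo: TypedServiceLocator, IDrifter, and Drifter are simplified, hypothetical stand-ins for the classes shown earlier.

```csharp
using System;
using System.Collections.Generic;

public interface IVehicle { string Name { get; } }
public interface IDrifter : IVehicle { }

public class Drifter : IDrifter
{
    public string Name { get { return "The Drifter"; } }
}

public class TypedServiceLocator
{
    private readonly IDictionary<Type, object> services = new Dictionary<Type, object>();

    // The constraint makes the compiler reject a registration whose
    // concrete type does not implement the requested interface.
    public void Register<TInterface, TConcrete>()
        where TInterface : class
        where TConcrete : class, TInterface, new()
    {
        services[typeof(TInterface)] = new TConcrete();
    }

    public TInterface Resolve<TInterface>() where TInterface : class
    {
        object service;
        if (!services.TryGetValue(typeof(TInterface), out service))
        {
            throw new InvalidOperationException(
                "The requested service is not registered: " + typeof(TInterface).Name);
        }

        return (TInterface)service;
    }
}

public static class Program
{
    public static void Main()
    {
        var locator = new TypedServiceLocator();
        locator.Register<IDrifter, Drifter>();

        // The next line would not compile, because string does not implement IDrifter:
        // locator.Register<IDrifter, string>();

        IDrifter drifter = locator.Resolve<IDrifter>();
        Console.WriteLine(drifter.Name);
    }
}
```

Because callers only ever see IDrifter, a test can register a fake implementation of the interface and exercise the calling layer without touching the real vehicle classes.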

In summary, many of the faux pas experienced when implementing the Service Locator Design Pattern can be overcome by coupling it with a slick little Factory Design Pattern. What’s more, if we apply this same logic both equitably and ubiquitously across all design patterns, then it seems unfair to take a single design pattern and criticize its integrity and usefulness in complete sequestration, because it’s often the combination of multiple design patterns that makes frameworks and applications more integral and robust. Thanks for reading and keep on coding! 🙂