I want you to think about time estimates for a moment. Specifically, let’s talk about something we all enjoy: our commute. The last significant commute I had was 70 miles each way through two metropolitan traffic patterns. Yikes! Does anybody want to guess how long it took me to get to work? On a highway with a speed limit of 65 mph, and assuming I never dropped below that speed from my driveway to my parking space at work, it would take just over an hour. But what about traffic signals, stop signs, and traffic congestion? How can we account for these uncertainties?
Of the many types of risk that we discuss, I believe that schedule risk is one of the most difficult to deal with through reserves. There are two scheduling methods that we will discuss. Both rely upon a network diagram that is created through the use of a diagramming method, typically the Precedence Diagramming Method.
- Critical Path Method (CPM): This method establishes the longest-duration sequence of activities from the start to the finish of the project through the use of a forward and backward pass. The Critical Path is the path with the least amount of float, usually zero, but possibly a positive or negative value. Typically, the individual activity estimates are calculated as a function of optimistic (best-case), most-likely, and pessimistic (worst-case) scenario estimates.
- Critical Chain Method (CCM): This method strips activity duration down to the bare minimum of what could be considered reasonable. By forcing a sense of urgency, it counteracts the natural tendency described by Parkinson’s Law, where work expands to fill the time available.
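The forward and backward pass mentioned for CPM can be sketched in a few lines of Python. This is a minimal illustration, not a scheduling tool: the four-activity network, its durations, and the function name `cpm` are all made up for the example, and it assumes simple finish-to-start relationships with no lags.

```python
from collections import deque

def cpm(durations, predecessors):
    """Forward and backward pass over an activity-on-node network.

    durations: {activity: duration}
    predecessors: {activity: [activities that must finish first]}
    Returns (early_start, early_finish, late_start, late_finish, total_float).
    """
    successors = {a: [] for a in durations}
    indegree = {a: 0 for a in durations}
    for act in durations:
        for pred in predecessors.get(act, []):
            successors[pred].append(act)
            indegree[act] += 1

    # Topological order (Kahn's algorithm) so predecessors are visited first.
    queue = deque(a for a in durations if indegree[a] == 0)
    order = []
    while queue:
        act = queue.popleft()
        order.append(act)
        for succ in successors[act]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)

    # Forward pass: earliest start is the latest early finish among predecessors.
    es, ef = {}, {}
    for act in order:
        es[act] = max((ef[p] for p in predecessors.get(act, [])), default=0)
        ef[act] = es[act] + durations[act]
    project_end = max(ef.values())

    # Backward pass: latest finish is the earliest late start among successors.
    ls, lf = {}, {}
    for act in reversed(order):
        lf[act] = min((ls[s] for s in successors[act]), default=project_end)
        ls[act] = lf[act] - durations[act]

    total_float = {a: ls[a] - es[a] for a in order}
    return es, ef, ls, lf, total_float

# Hypothetical four-activity network (durations in days).
durations = {"A": 3, "B": 4, "C": 2, "D": 2}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
es, ef, ls, lf, total_float = cpm(durations, predecessors)

# The critical path is made of the activities with the least float.
least_float = min(total_float.values())
critical_path = [a for a in durations if total_float[a] == least_float]
print(total_float)     # {'A': 0, 'B': 0, 'C': 2, 'D': 0}
print(critical_path)   # ['A', 'B', 'D']
```

In this toy network, activity C can slip by two days without moving the finish date, while A, B, and D cannot slip at all.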
The ways in which we handle risk in the two scheduling methodologies are quite different. In CPM, we tend to use what is called the beta-PERT distribution. Since this distribution is non-normal, we cannot calculate the variance in the usual way, as a function of the distance of each data point from the mean; instead, we sneak in the back door of the equation. Treating this non-normal distribution as normal, we simply look at the range of the data (Pessimistic – Optimistic) and divide by six to get a standard deviation, which we then apply to the central tendency. We can then calculate the variance by squaring the standard deviation. There are a few major issues with this:
- The three estimates collected for calculations are almost always spread out asymmetrically. If my regular commute to work is 60 minutes, a severe amount of traffic may double that amount of time to 120 minutes. Could a total absence of traffic reduce the regular commute by an equal amount, all the way down to 0 minutes? Not likely. The delta between Optimistic and Most-Likely and the delta between Most-Likely and Pessimistic are not equal. This leads to a beta-PERT distribution with a positive skew. The central tendency tends to land near the mode (as PERT is weighted towards the Most-Likely), and the standard deviation treats it as a normal distribution regardless of how the data is truly distributed.
- Once we calculate this faux-variance, which is based upon the range of the estimates rather than on how the data is actually distributed, what do we do with it? Many times, it is added in the form of a contingency reserve at the end of the project. This is no different from assuming that you might hit traffic and have a 120-minute commute, so you should always leave 120 minutes early. While this may be justified depending on your risk exposure (think about a big meeting or interview!), wasting 60 minutes every morning can get old fast.
- The original estimates were made up by someone. I’m not saying that this is a bad thing. The fact of the matter is that all estimates come from someone at some point; we just need to ensure their veracity. If it is an estimate about work that has never been done before, a Bayesian model may prove useful.
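To make the range-divided-by-six shortcut concrete, here is a small sketch using the commute. The 60 and 120 minute figures come from the example above; the 50 minute best case is an assumption added purely for illustration, and `pert_estimate` is a made-up helper name. The mean uses the standard PERT weighting of four on the Most-Likely value.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Classic three-point (PERT) summary of one activity's duration."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6   # range divided by six
    variance = std_dev ** 2
    return mean, std_dev, variance

# Commute in minutes: 60 and 120 come from the text; 50 is an assumed best case.
optimistic, most_likely, pessimistic = 50, 60, 120
mean, std_dev, variance = pert_estimate(optimistic, most_likely, pessimistic)

# The two deltas are unequal, which is the positive skew described above.
lower_delta = most_likely - optimistic     # 10
upper_delta = pessimistic - most_likely    # 60

print(f"mean={mean:.1f} sd={std_dev:.1f} var={variance:.1f}")
# mean=68.3 sd=11.7 var=136.1
```

Note that the standard deviation would come out the same for any Most-Likely value between 50 and 120, which is exactly the weakness of treating the range as if it described a normal distribution.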
The Critical Chain Method calls for a series of buffers to be placed within the schedule to account for the fact that it has no wiggle room whatsoever. The largest tends to be called the project buffer, which sits at the end of the critical chain before it hits the end of the project. To protect the critical chain, other chains have feeding buffers to account for their uncertainty prior to merging. While CCM certainly has its merits, it also has its dangers:
- Buffers are arbitrarily established. With CPM we used some type of equation to calculate the likely variance that we could encounter, but with CCM we are not utilizing the estimates provided to us.
- By not using realistic activity durations, we may need to increase the amount of buffer above whatever amount would have been necessary for a schedule that accounted for realistic durations and variance.
- Practitioners may become accustomed to the manufactured sense of urgency and begin to pad their estimates, thereby negating the positive effects of CCM. This has a secondary effect of adding risk to the schedule.
- Since the bulk of the buffer sits at the end of the project, the management team has little ability to address risks as they occur throughout the project work, other than by delaying subsequent tasks.
An interesting solution to these issues is the Event Chain Method (ECM). This is definitely not a ‘cure-all’ that should be applied to every project, but it should be considered for those projects where two major preconditions exist: the possibility of project failure due to schedule slippage, and an abundance of schedule risk. ECM examines activities and their uncertainties in a manner different from CPM or CCM.
The Intaver Institute does a great job describing how ECM addresses the two basic types of uncertainty, aleatory and epistemic. When we think about how our commute is normally 60 minutes, but sometimes it is 58 minutes, or 61 minutes, that is what we would typically call common cause variation, or aleatory uncertainty. If you got a new job at a new location, how could you give a good estimate if you have never made the commute? Here we are dealing with epistemic uncertainty due to a lack of knowledge. ECM addresses these types of uncertainty through the regular use of Monte Carlo simulation and Bayesian Belief Networks.
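As a rough illustration of what Monte Carlo simulation adds, the sketch below draws many commute durations from a beta-PERT distribution and reads off percentiles instead of a single number. It is a generic simulation, not the ECM tooling itself. The 50/60/120 minute estimates (50 being an assumed best case), the sample count, the seed, and both helper names are assumptions for the example; the shape weight of 4 is the usual beta-PERT convention.

```python
import random

def sample_beta_pert(rng, optimistic, most_likely, pessimistic):
    """Draw one duration from a beta-PERT distribution (shape weight 4)."""
    span = pessimistic - optimistic
    alpha = 1 + 4 * (most_likely - optimistic) / span
    beta = 1 + 4 * (pessimistic - most_likely) / span
    return optimistic + span * rng.betavariate(alpha, beta)

def percentile(sorted_values, q):
    """Simple rank-based percentile for q in [0, 1]."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = sorted(sample_beta_pert(rng, 50, 60, 120) for _ in range(20_000))

mean = sum(samples) / len(samples)
p50, p80, p95 = (percentile(samples, q) for q in (0.50, 0.80, 0.95))
print(f"mean={mean:.1f} P50={p50:.1f} P80={p80:.1f} P95={p95:.1f}")
```

Because the distribution is positively skewed, the median falls below the mean, and the P80 or P95 value gives a more defensible departure time than either "leave 60 minutes early" or "always leave 120 minutes early".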
Virine and Trumper wrote at ProjectDecisions.org that ECM has six principles (I will only focus on the first two):
- Moment of event and excitation status
- Event chains
- Event chain diagrams and state tables
- Monte Carlo analysis
- Critical event chains and event cost
- Project performance measurement with event and event chains
The first principle refers to what is typically called risk responses. When the environment changes, so will our actions. By assigning a ground state and various levels of excited states for different possible triggering events and related outcomes, we can visually represent the different possibilities on the same schedule model instance, typically a Gantt chart. Another consideration under this principle is when the risk may occur relative to the timeline of an activity, including while the activity is under way. Too often, we treat risks as binary events that either happen or not, normally before an activity begins. I know I’m not the only one guilty of going to work and leaving my umbrella in the car because it’s not raining, but then it starts raining later and I get soaked!
The second principle concerns event chains, which are sometimes called second- and third-order effects, or the law of unintended consequences. When there is a problem on a project, most project managers will have a knee-jerk reaction and cause more harm than good. By offering built-in responses, in the form of the aforementioned excited states, we can offer immediate actions without unplanned workarounds becoming necessary.
When it comes to project scheduling, there is no one-size-fits-all. Different methods will work well in different situations. ECM is one that can work well in a project that is facing potential failure through a large amount of schedule risk.
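The first two principles can be illustrated with a toy simulation of the commute. This is a loose sketch of the ideas, not an implementation of ECM as Virine and Trumper define it. Every probability and delay is invented for the example, as is the `simulate_commute` function. Rain can begin at any moment during the drive (the moment of the event), it puts the remainder of the drive into a slower excited state, and it makes a second event, a traffic jam, more likely (the event chain).

```python
import random

GROUND_MINUTES = 60.0  # normal commute from the text (ground state)

def simulate_commute(rng):
    """One trial; returns (total_minutes, rained)."""
    total = GROUND_MINUTES
    rained = rng.random() < 0.30            # assumed chance of rain
    if rained:
        # Moment of event: rain can start at any point during the drive.
        moment = rng.uniform(0.0, GROUND_MINUTES)
        # Excited state: the rest of the drive takes 25% longer (assumed).
        total += 0.25 * (GROUND_MINUTES - moment)
    # Event chain: rain raises the chance of a jam (assumed 40% vs 5%).
    jam_probability = 0.40 if rained else 0.05
    if rng.random() < jam_probability:
        total += 30.0                       # assumed jam delay in minutes
    return total, rained

rng = random.Random(7)  # fixed seed so the run is repeatable
trials = [simulate_commute(rng) for _ in range(10_000)]
wet = [minutes for minutes, rained in trials if rained]
dry = [minutes for minutes, rained in trials if not rained]
print(f"dry mean={sum(dry) / len(dry):.1f}  wet mean={sum(wet) / len(wet):.1f}")
```

The point of the sketch is that the impact of the rain depends on when it starts, and that its knock-on effect on the jam is modeled up front rather than discovered as an unplanned workaround.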