Schedule Risk: Event Chain Methodology

I want you to think about time estimates for a moment. Specifically, let’s talk about something we all enjoy: our commute. The last significant commute I had was 70 miles each way through two metropolitan traffic patterns. Yikes! Does anybody want to guess how long it took me to get to work? On a highway with a speed limit of 65mph, and assuming I never went below that speed from my driveway to my parking space at work, it would take just over an hour. But what about traffic signals, stop signs, and traffic congestion? How can we account for these uncertainties?

Out of the many types of risk that we discuss, I believe that schedule risk is one of the most difficult to deal with through reserves. There are two scheduling methods that we will discuss. Both rely upon a network diagram that is created through the use of a diagramming method, typically the Precedence Diagramming Method.

  • Critical Path Method (CPM): This method establishes the longest duration sequence of activities from the start to the finish of the project through the use of a forward and backward pass. The Critical Path is that path which has the least amount of float, usually zero, but possibly a positive or negative value. Typically, the individual activity estimates are calculated as a function of optimistic (best-case), most-likely, and pessimistic (worst-case) scenario estimates.
  • Critical Chain Method (CCM): This method strips activity duration down to the bare minimum of what could be considered reasonable. By forcing a sense of urgency, it counteracts a natural tendency towards Parkinson’s Law.
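For readers who have not worked a forward and backward pass by hand, here is a minimal Python sketch of the idea. The four activities, their durations, and the function name `critical_path` are made up for illustration, and the sketch assumes the activities are listed with predecessors first; a real scheduling tool handles far more than this.

```python
# Hypothetical network: activity -> (duration in days, predecessors).
activities = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}

def critical_path(activities):
    """Forward and backward pass; assumes predecessors are listed before successors."""
    early_start, early_finish = {}, {}
    # Forward pass: an activity starts when its latest predecessor finishes.
    for name, (duration, preds) in activities.items():
        early_start[name] = max((early_finish[p] for p in preds), default=0)
        early_finish[name] = early_start[name] + duration
    project_end = max(early_finish.values())

    late_start, late_finish = {}, {}
    # Backward pass: an activity must finish before its earliest successor's late start.
    for name in reversed(list(activities)):
        successors = [s for s, (_, preds) in activities.items() if name in preds]
        late_finish[name] = min((late_start[s] for s in successors), default=project_end)
        late_start[name] = late_finish[name] - activities[name][0]

    # Total float is the slack between the late and early start of each activity.
    total_float = {name: late_start[name] - early_start[name] for name in activities}
    path = [name for name in activities if total_float[name] == 0]
    return project_end, total_float, path

project_end, total_float, path = critical_path(activities)
print(project_end, total_float, path)
```

In this toy network, activity B is the only one with float, so the critical path runs through A, C and D.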

The methods with which we handle risk in the two scheduling methodologies are quite different. In CPM, we tend to use what is called the beta-PERT distribution. Since this distribution is non-normal, we cannot calculate the variance as a function of the distance of each data point from the mean; instead, we sneak in the back door of the equation. Treating this non-normal distribution as normal, we simply look at the range of the data (Pessimistic – Optimistic) and divide by six to get a standard deviation, which we then apply to the central tendency. We can then calculate the variance by squaring the standard deviation. There are a few major issues with this:

  • The three estimates collected for calculations are almost always spread out in a specific manner. If my regular commute to work is 60 minutes, a severe amount of traffic may double that amount of time to 120 minutes. Could a total absence of traffic reduce the regular commute by an equal amount, all the way down to 0 minutes? Not likely. The delta between Optimistic and Most-Likely and the delta between Most-Likely and Pessimistic are not equal. This leads to a beta-PERT distribution with a positive skew. The central tendency tends to land near the mode (as PERT is weighted towards the Most-Likely), and the standard deviation treats it as a normal distribution regardless of how the data is truly distributed.
  • Once we calculate this faux-variance that is based upon the range of the data rather than its spread, what do we do with it? Many times, it is added in the form of a contingency reserve at the end of the project. This is no different than assuming that you might hit traffic and have a 120 minute commute, so you should always leave 120 minutes early. While this may be true depending on your risk exposure (think about a big meeting or interview!), wasting 60 minutes every morning can get old fast.
  • The original estimates were made up by someone. I’m not saying that this is a bad thing. The fact of the matter is that all estimates come from someone at some point, we just need to ensure their veracity. If it is an estimate about work that has never been done before, a Bayesian model may prove useful.
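To put numbers on the PERT shortcut described above, here is a small Python sketch. It uses the standard PERT weighted mean, (O + 4M + P) / 6, together with the range-divided-by-six standard deviation. The 60 and 120 minute figures come from the commute example; the 50 minute optimistic value and the function name `pert_estimate` are my own assumptions for illustration.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Return (mean, standard deviation, variance) using the classic PERT shortcuts."""
    # Weighted mean: the most-likely estimate counts four times.
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    # Range divided by six, which treats the distribution as if it were normal.
    std_dev = (pessimistic - optimistic) / 6
    variance = std_dev ** 2
    return mean, std_dev, variance

# Commute in minutes; the optimistic value of 50 is an assumed figure.
mean, std_dev, variance = pert_estimate(50, 60, 120)
print(f"mean={mean:.1f} min, std dev={std_dev:.1f} min, variance={variance:.1f}")
```

Notice that the long pessimistic tail pulls the mean above the 60 minute mode, while the standard deviation is still applied symmetrically on either side of it.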

The Critical Chain Method calls for a series of buffers to be placed within it to account for the fact that it has no wiggle room whatsoever. The largest tends to be called the project buffer, which sits at the end of the critical chain before it hits the end of the project. To protect the critical chain, other chains have feeding buffers to account for their uncertainty prior to merging. While CCM certainly has its merits, it also has its dangers:

  • Buffers are arbitrarily established. With CPM we used some type of equation to calculate the likely variance that we could encounter, but with CCM we are not utilizing the estimates provided to us. 
  • By not using realistic activity duration, we may need to increase the amount of buffer above whatever amount would have been necessary for a schedule that accounted for realistic duration and variance.
  • Practitioners may become accustomed to the manufactured sense of urgency and begin to pad their estimates, thereby negating the positive effects of CCM. This has a secondary effect of adding risk to the schedule.
  • Since all of the buffer is at the end of the project, there is little ability for the management team to address risks as they occur throughout the project work outside of delaying subsequent tasks.

An interesting solution to these issues is the Event Chain Method (ECM). This is definitely not a ‘cure-all’ that should be applied to every project, but it should be considered for those projects where two major preconditions exist: the possibility of project failure due to schedule slippage, and an abundance of schedule risk. ECM examines activities and their uncertainties in a manner different from CPM or CCM.

The Intaver Institute does a great job describing how ECM addresses the two basic types of risk, aleatory and epistemic. When we think about how our commute is normally 60 minutes, but sometimes it is 58 minutes, or 61 minutes, that is what we would typically call common cause variation, or aleatory uncertainty. If you got a new job at a new location, how can you give a good estimate if you have never made the commute? Here we are dealing with epistemic uncertainty due to a lack of knowledge. ECM addresses these types of uncertainty through the regular use of Monte Carlo simulation and Bayesian Belief Networks.

Virine and Trumper wrote that ECM has six principles (I will only focus on the first two):

  1. Moment of event and excitation status
  2. Event chains
  3. Event chain diagrams and state tables
  4. Monte Carlo analysis
  5. Critical event chains and event cost
  6. Project performance measurement with event and event chains

The first principle refers to what is typically called risk responses. When the environment changes, so will our actions. By assigning a ground state and various levels of excited states for different possible triggering events and related outcomes, we can visually represent the different possibilities on the same schedule model instance – typically a Gantt chart. Another consideration on this principle is when the risk may occur in relationship to the timeline of an activity, even as it occurs. Too often, we look at risk as binary events of either happening or not, normally before an activity begins. I know I’m not the only one guilty of going to work and leaving my umbrella in the car because it’s not raining, but then it starts raining later and I get soaked!

The second principle is of event chains, and what are sometimes called second and third order effects, or the law of unintended consequences. When there is a problem on a project, most project managers will have a knee-jerk reaction and cause more harm than good. By offering built in responses, in the form of the aforementioned excited states, we can offer immediate actions without unplanned workarounds becoming necessary.
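The first two principles can be illustrated with a rough Monte Carlo sketch in Python. This is not Virine and Trumper's model; it is a toy version under my own assumptions. Every number and name here (`simulate_commute`, the 20% event probability, the slowdown factor, the chained 10 minute delay) is hypothetical. An event can fire at a random moment during the activity, which moves it into an excited state for the remaining work, and that event may in turn trigger a second event.

```python
import random
import statistics

def simulate_commute(rng, base=60.0, p_event=0.2, slowdown=2.0,
                     p_chained=0.5, chained_delay=10.0):
    """One trial of a single activity that starts in its ground state."""
    duration = base
    if rng.random() < p_event:
        # Moment of event: the risk fires partway through the activity.
        moment = rng.uniform(0.0, base)
        # Excited state: only the work remaining after the event is slowed down.
        duration = moment + (base - moment) * slowdown
        # Event chain: the first event can trigger a second one.
        if rng.random() < p_chained:
            duration += chained_delay
    return duration

rng = random.Random(42)  # fixed seed so the run is repeatable
trials = sorted(simulate_commute(rng) for _ in range(10_000))
mean_duration = statistics.mean(trials)
p80 = trials[int(0.8 * len(trials))]
print(f"mean={mean_duration:.1f} min, 80th percentile={p80:.1f} min, worst={trials[-1]:.1f} min")
```

The point of the sketch is that an event late in the activity costs far less than the same event early on, which a binary "it happens or it doesn't" view of risk cannot show.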

When it comes to project scheduling, there is no one-size-fits-all. Different methods will work well in different situations. This is one that can work well in a project that is facing potential failure through a large amount of risk.

Karl Cheney
For more posts, check out our blog at:
Castlebar Solutions

Subjectivity and Risk Scoring

One common misconception that I was guilty of subscribing to is that while qualitative risk analysis always has some degree of subjectivity, quantitative risk analysis should remain strictly objective. My school of thought has since shifted: there is no such thing as truly objective data with which to conduct such an analysis because, if nothing else, even if the measurements are truly reproducible and repeatable, there is an observer bias that is injected into the mix. Rather than fighting the subjectivity, it’s much easier to just accept it. I know it sounds like a terrible idea, but bear with me.

In a previous post, I described pulling marbles from a bag using a Bayesian technique to develop an estimate. I actually had a few people get in touch with me about how ridiculous it is to just automatically assume that there is an equiprobability of marbles without any prior knowledge as to the contents. Lacking any prior knowledge or experience, we are using the Bayesian concept of an uninformative prior, which gives us a general starting point. This concept demands indifference, and until we find out otherwise we will assume that there is an equiprobability.

But what if we had prior knowledge or experience? If I spent my youth playing marbles, perhaps I could tell you that a bag of that size and weight likely contains about 20 marbles. Could we use this to our advantage? Absolutely. We can now reasonably assume that the red marble will be drawn at least 5% of the time (1 marble in 20). This is what is often referred to as an a priori, or informative, prior. But that’s just a minimum; we could be facing a significantly higher percentage. However, an incomplete data set is what drove us towards a Bayesian technique in the first place. We now have the option of using either the informative or the uninformative prior for our first round of testing.

This is something that pains people to discuss, because we are now possibly considering 5% likelihood as the low end of the spectrum based upon my personal experience. Before we grab the pitchforks and torches, let’s remember that expert judgment is regularly called upon throughout planning various aspects of a project and that human input remains important. This is not to say that we should not debate the veracity of this estimate, because we should! But let’s not debate the source, let’s not argue solely because it was based upon a person’s opinion rather than empirical data.

At the end of the first test, when we draw the first marble and record the result, we have our first posterior. However, when we go to run our test again, we will change the equation that we use to calculate probability. We will now have a new likelihood for drawing a red marble, as the posterior from the previous experiment becomes the next experiment’s prior.
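Here is a minimal Python sketch of that posterior-becomes-prior loop. It carries each belief as a pair of counts, (red, total), and assumes every marble is returned to the bag after it is drawn. Two things are my own assumptions: encoding the uninformative prior as 1 red out of 5 (indifference across five colors) and the informative prior as 1 red out of 20, and the sequence of draws itself, which is invented.

```python
from fractions import Fraction

def update(prior, drew_red):
    """Return the posterior (red_count, total_count) after one draw."""
    red, total = prior
    return (red + (1 if drew_red else 0), total + 1)

def probability_red(belief):
    """Current estimate of drawing red, as an exact fraction."""
    red, total = belief
    return Fraction(red, total)

# Hypothetical draws: True means the marble was red.
draws = [False, True, False, False, True, False, False, False]

uninformative = (1, 5)   # assumed encoding: indifference across five colors
informative = (1, 20)    # assumed encoding: at least 1 red in about 20 marbles

for drew_red in draws:
    # The posterior from this draw is the prior for the next one.
    uninformative = update(uninformative, drew_red)
    informative = update(informative, drew_red)

print(float(probability_red(uninformative)), float(probability_red(informative)))
```

Both starting points are pulled toward the observed data with every draw, which is part of why the choice of prior is worth debating but not worth a fight.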

As further data is collected, our calculations will continue to evolve, and with this additional data we will develop refined probabilities. Herein lies the argument that people have against Bayesian methods for risk management in a predictive life cycle project: if planning is done upfront, how can appropriate plans ever be completed if the results of risk management activities continue to change? The real problem is attempting to conduct planning in a vacuum, not the methodology we are using for risk.

When the facts change, I change my mind. What do you do, sir? – John Maynard Keynes

It goes without any great argument that planning within a project is iterative and ongoing. In fact, project managers regularly engage in what is called progressive elaboration where a plan is regularly refined as more information becomes available. Risk management is no different. One of the processes invoked is Control Risks, where risks are supposed to be reassessed at a frequency determined within the risk management plan. This reassessment is supposed to determine if a shift in probability and/or impact has occurred since last assessed.

This type of reassessment should be intuitive for most people. I have lived in New Jersey for a number of years, and it gets some great weather – especially in the winter time. When we receive word that we are going to have a major snow storm, I try to check the weather every 3-4 hours to see if it’s still coming towards us and how much snow we are going to get. It’s a running joke that if they call for the storm of the century, we’ll get flurries; but the opposite holds true, as well. Think about how unreasonable it would be for me to watch the news once and tell the kids that they don’t have school next week!

Risk is uncertainty. The thing that hurts most projects is that we try and turn that uncertainty into certainty, which just does not work. I embrace everything in terms of likelihood of occurrence based upon probability developed from data. There’s an 80% chance of snow? I like to tell my kids there’s a 20% chance that they’re going to school. Risks can be managed, and we can do our due diligence to gather as much data as possible.

Bayesian Risk Management

One area of project management that stumps a lot of people is how we come up with the probability and impact data for quantitative analysis. This is something that is not discussed with any depth in the PMBoK, or even PMI’s Practice Standard for Project Risk Management. In fact, it is summed up succinctly in two paragraphs under Data Gathering and Representation techniques as basically “collect the data through interviews” and then “create a probability distribution”. For the record, I am definitely not criticizing PMI for this approach, as entire books are written, and fields of study based, upon what is described in those two paragraphs. Also, just a friendly reminder this is ONE way to prepare estimates for probability and impact.

If you’ve never heard of Bayes, you’re in for a treat. If I had to sum up the work of Thomas Bayes in one sentence, I would say that it allows you to make inferences with incomplete data. His work has evolved into fields of study from Game Theory to Statistics. Right now, I want to concentrate solely on Bayesian Statistics.

We can all agree that the initial period of a project life-cycle is when uncertainty is highest. This uncertainty is inherent in any project that is stood up, and it invariably decreases as planning is conducted. Project Risk is negatively correlated with Planning. This is not to say that planning can eliminate all risk, because that is impossible – but we can reduce uncertainty through concerted planning.

Risk management should begin at the start of a project. When I would find myself assigned to a project, one of the first things I sought to identify was what I needed to look into. What uncertainties are out there that I must address? Of course, the identification is the easy part! Relative estimation through qualitative risk analysis is the next step, and can be a fun exercise by ranking risk using animal names. Personally, I like to use chickens, horses and elephants. But what about when we get to quantitative analysis? Now we are no longer comparing one risk against another, but trying to determine numeric values for probability and impact.

Quantitative analysis can be especially difficult to do with any degree of accuracy if your organization has no historical experience in this type of work, or if the solution’s technological maturity is lacking. How can we make estimates about uncertainty, when we’re so uncertain about that with which we are uncertain? Management Reserves, per the PMBoK, are set aside for unidentified and unforeseeable risks – so it’s too late, we’ve already identified it, we own it, and it would be irresponsible to not plan for it.

Complexity and technical risk are not new challenges during quantitative analysis. I have read many papers on the topic, but I’m quite fond of a RAND Corp Working Paper by Lionel Galway which addresses the level of uncertainty inherent in complex projects, stating:

One argument against quantitative methods in advanced-technology projects is that there simply is not enough information with which to make judgments about time and cost. There may not even be enough information to specify the tasks and components.

I’m inclined to agree with Lionel that it is very difficult to make judgments with any degree of certainty when we’re lacking solid information. Risk data quality assessments are something called for by the PMBoK to test the veracity of the data we use. So how can we move forward?

Scott Ferson gives us a road map in a great article about Bayesian Methods in Risk Assessment. If you’d like to see the math side of this, please check out the article – I’m staying strictly conceptual. In this article, he used a scenario that described these concepts quite well: a bag of marbles. You have a cloth bag full of marbles. Well, you think it’s just marbles in there – but you don’t know and you can’t peek inside. Ferson is kind enough to tell us that there are five colors inside, including red – so we know red is possible but we don’t know if there is equal representation for all five colors. If we pull out one marble, what are the odds it will be red? This scenario has incomplete data, just like what most project teams have at the beginning of a project.

This comic does a great job introducing the two schools of thought for statistics that we’ll examine, and pretty quickly you will see why I am a fan of the Bayesian approach. This is not to totally discount frequentist probability, as I use it on a regular basis while conducting Six Sigma initiatives; however, it just does not work for our bag of marbles.

Determining a frequentist probability would require first establishing a key population parameter, its size: how many marbles are in the bag? Next, we would calculate a sample size based upon population size, desired confidence level, precision level and the fact that we are working with discrete data. If the P value is too high, we can increase the sample size to increase our confidence that the results are not by chance. Based upon the sample, we can make statistical inferences about the population, and eventually we could establish the probability of drawing a red marble.
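To give a feel for the sample size step, here is a rough Python sketch using Cochran's formula for a proportion with a finite population correction. That is one common textbook approach, not necessarily the calculation any particular tool uses. The function name `sample_size`, the 95% confidence level, the 5% margin of error and the 20 marble population are all assumed values for illustration.

```python
import math
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion, corrected for a finite population."""
    # Two-sided z score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Finite population correction shrinks the sample for small populations.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Assumed population of 20 marbles.
print(sample_size(20))
```

With these assumed settings, a 20 marble bag calls for sampling essentially the whole bag, which hints at how poorly this approach suits small or unknown populations.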

Just a couple problems here… I told you that you have a bag of marbles, but don’t forget that you’re not allowed to peek. You just have to tell me what the odds are of you drawing a red marble. But you don’t know the population size and you cannot draw a sample. The first marble you will see will be the one for which you were supposed to determine probability. Lacking any data makes frequentist probability calculations an impossibility, and having incomplete data severely inhibits its effectiveness. So let’s look at another method. Since it was developed to deal with incomplete data, Bayesian statistics allows us to approach everything very differently.

Bayesian statistics becomes more accurate as more information becomes available. The first marble will have the least accurate estimate, with the estimates getting better with every subsequent drawing. A simplified version of the formula becomes (n+1)/(N+c), where n = the number of red marbles we’ve seen so far, N = the total number of marbles sampled, and c = the number of colors possible. So for the first marble, the probability is calculated as (0+1)/(0+5) = 0.20. While it may or may not be correct, it is a starting point. For every marble sampled and returned to the bag, the inputs to the formula will change and the accuracy of future estimates will improve. Glickman and Vandyk build upon this usage by moving into the application of multilevel models with the use of Monte Carlo analysis.
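The simplified formula is easy to try out in Python. The function name `probability_of_red` and the sequence of colors drawn below are invented for illustration; only the formula itself and the five possible colors come from the example above.

```python
def probability_of_red(red_seen, total_sampled, colors=5):
    """Simplified estimate (n + 1) / (N + c) of drawing a red marble."""
    return (red_seen + 1) / (total_sampled + colors)

# Hypothetical draws, each marble returned to the bag afterwards.
draws = ["blue", "red", "green", "red", "red"]

estimates = [probability_of_red(0, 0)]  # before any marble is drawn
red_seen = 0
for total_sampled, color in enumerate(draws, start=1):
    if color == "red":
        red_seen += 1
    estimates.append(probability_of_red(red_seen, total_sampled))

print([round(e, 3) for e in estimates])
```

The estimate starts at 0.20 with no data, dips when the first marble is not red, and climbs as red marbles keep turning up.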

I can’t ignore the very human aspects of Bayesian statistics, which were captured well by Haas et al. as they described three pillars of Bayesian Risk Management: Softcore BRM, Bayesian Due Diligence, and Hardcore BRM. Softcore relies upon subjective interpretation of uncertainty; think about consulting your subject matter experts. Hardcore leans on mathematical approaches and statistical inference, while the Due Diligence mitigates the triplet of opacity by ensuring that facts do not override the expertise of authoritative and learned people.

The reality is that the majority of people working on projects use software to determine their quantitative estimates for specific risks that have been identified. However, I’m not a fan of answering someone’s question by stating “don’t worry, there’s software for that!” While you may never have to calculate risks in an analogue manner, it is worth knowing that if you are dealing with unfamiliar risk, a Bayesian approach makes more sense than a frequentist approach. If you have historical information available and a well-defined population, by all means, collect a sample. But keep in mind, as I often tell people when I work Six Sigma, “don’t make the data fit the test, select a test that fits the data”.
