Impact of using Planned vs Completed
“Tell me how you measure me, and I will tell you how I will behave. If you measure me in an illogical way… do not complain about illogical behavior…” These are thoughtful words from Eli Goldratt in his book “The Haystack Syndrome.”
The reason I recall this quote is a recent conversation about measuring the performance of a Scrum Team. The most common approach I observe nowadays is to measure the planned-to-completed ratio. It is usually backed by the rationale that although each team's velocity is different and cannot be compared (and a particular velocity cannot even be demanded), the planned-to-completed ratio at least gives predictability. It is commonly the first lever managers ask to move when working with a new team.
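To make the metric concrete, here is a minimal sketch of how such a planned-to-completed ratio is typically computed per Sprint. The function name and the Sprint data are invented for illustration; they are not from any particular tool.

```python
# Hypothetical sketch: "planned vs. completed" ratio per Sprint,
# using invented Story Point numbers.

def plan_to_complete_ratio(planned_points: int, completed_points: int) -> float:
    """Share of the planned Story Points that were completed in a Sprint."""
    if planned_points == 0:
        return 0.0
    return completed_points / planned_points

# Invented Sprint data: (planned, completed) Story Points.
sprints = [(30, 27), (32, 32), (28, 21)]
ratios = [round(plan_to_complete_ratio(p, c), 2) for p, c in sprints]
print(ratios)  # [0.9, 1.0, 0.75]
```

A team is then typically judged on how close these ratios stay to 1.0, which is exactly the pressure point the rest of this article examines.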
Let’s put the “output vs. outcome,” “effort vs. value,” and “lagging, leading, and vanity metrics” discussions aside for a moment and try to understand what the goal is. Defining a goal for measurement is not a new concept; Peter Drucker introduced it as “management by objectives and self-control.” He said that each manager should develop and set the objectives for his unit himself. The famous OKR (Objectives and Key Results) framework builds on this; however, instead of being purely top-down, OKR balances top-down and bottom-up.
In this case, the goal is to make velocity predictable; in other words, to make the team’s effort predictable. The first time I saw this concept was as “measuring release predictability,” introduced by Dean Leffingwell in his book, which later became one of the Scaled Agile Framework’s primary sources. Dean never proposed measuring stories planned vs. completed; rather, he measures business value planned vs. actual. Besides, it is not the developers who determine the business value.
One of the regular complementary practices is calculating the team’s velocity by measuring the work done in previous Sprints. The measuring is usually done by estimating Product Backlog Items with another complementary practice, “Story Points.” The first thing to remember is that the Sprint Backlog is a plan to accomplish the Sprint Goal. Making “planned vs. completed” predictable therefore means, in other words, making sure our plan is predictable. But what if we discover something during the Sprint? Shouldn’t we update our plan? If we don’t inspect and adapt our plan, we are doing just a mini-waterfall; on the other hand, if we update the plan but not the Sprint Backlog, our Sprint Backlog is no longer transparent. To put it differently, “planned vs. completed” means sticking with the original plan even when there is a need to inspect and adapt.
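The velocity side of this practice can be sketched just as simply: an average of Story Points completed over the most recent Sprints. Again, the function name, the window size, and the numbers are assumptions made for illustration only.

```python
# Hypothetical sketch: velocity as a rolling average of Story Points
# completed in recent Sprints. All numbers are invented.

def velocity(completed_points_per_sprint: list[int], window: int = 3) -> float:
    """Average Story Points completed over the last `window` Sprints."""
    recent = completed_points_per_sprint[-window:]
    return sum(recent) / len(recent)

history = [21, 27, 32, 24]   # completed points, oldest Sprint first
print(velocity(history))     # average of the last three Sprints
```

Note that this number only summarizes past effort; nothing in it says whether the work delivered value, which is the point the next paragraph makes.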
Besides this, it measures only the effort, not the value. Let’s try to connect this with a few examples. How much do you care, as a passenger, about an airplane’s fuel burn rate? Or about the oven temperature while eating dinner in a restaurant? How useful is the number of commits in a new cell phone OS update to you as a consumer? Or the number of sales-pitch phone calls a salesman made in a month to your business?
Let’s assume we still want this as a first intermediate step and will move on to more meaningful metrics after some time. What is the problem with this approach? Let me introduce two more concepts before giving some examples. The first is the Hawthorne Effect, which describes the inclination of individuals to change or improve their behavior in response to their awareness of being observed, named after Elton Mayo’s observations during the Hawthorne experiments. In simple words, individuals modify their behavior to obtain a favorable perception when under the pressure of being observed, such as performing better in front of an audience than during practice.
The other concept is Goodhart’s law, proposed by Charles Goodhart: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” It was popularized by Marilyn Strathern in simpler words: “When a measure becomes a target, it ceases to be a good measure.”
Together, the Hawthorne Effect and Goodhart’s law produce unintended consequences for the team. Here is a list of behaviors I have observed in the past when this approach was used.
· Teams intentionally pull less work than their capacity to make their data look good at the end of the Sprint, which makes their Sprint Planning non-transparent and lacking in the values of openness, courage, and respect.
· Teams focus more on PBIs/Stories than on the Sprint Goal (usually they don’t even have one), which shows a lack of focus, commitment, and respect for the stakeholders.
· Teams focus more on Story Points/Stories/Velocity/PBIs than on providing value to the stakeholders.
· Teams mark PBIs/Stories complete to make their data look good, although they are not done per the “Definition of Done.” As a result, the Increment is not transparent.
· Teams inflate estimations and add buffers to avoid bad Sprint-end data; this makes the estimation non-transparent.
· Teams work extra hours, sometimes over the weekend, to complete the work and make their data look good; this makes their estimation non-transparent and creates an unsustainable pace for the team.
· Teams create some stories, usually for technical debt, design, or spikes, only to claim them done at the end of the Sprint and make their reports look good.
· Teams first complete the work, then pull it into the Sprint Backlog to make sure they get credit at the end of the Sprint.
· Individuals on the team are inclined to pick only the PBIs/Stories they have expertise in to make their data look good at the end of the Sprint, creating silos, local optimization, and a lack of focus.
· If a PBI/Story is partially complete, teams split it into two and mark one complete to claim credit for the partially completed work and make their data look good. This may make inspection and adaptation of the Increment suffer.
· Teams introduce stories for maintenance or similar work every Sprint and mark them complete at the end of the Sprint to make their data look good, reducing transparency.
· Teams break PBIs/Stories into design, implement, test, and similar steps to claim credit for the individual steps, causing a mini-waterfall with no inspection and adaptation.
I want to conclude with a beautiful sentence written by Ralph Jocham and Don McGreal: “If you punish bad news, you will only get good news — or, more accurately, camouflaged bad news made to look good.” I understand and appreciate the good intention behind this metric, but be careful of the unintended consequence of creating a “Cobra effect,” a term coined by Horst Siebert. It is based on the story of the cobras of Delhi. The British government offered a bounty for every dead cobra to decrease the cobra population, which initially worked. However, people started breeding cobras to generate income, so the government stopped the program. The cobra breeders then released their now-worthless cobras, causing the population to increase. The result was even worse than the initial situation, despite the good intentions.
1. The Haystack Syndrome by Eli Goldratt
2. The Practice of Management by Peter F. Drucker
3. Measure What Matters by John Doerr
4. Agile Software Requirements by Dean Leffingwell
5. Problems of Monetary Management by Charles Goodhart
6. ‘Improving Ratings’: Audit in the British University System by Marilyn Strathern
7. The Professional Product Owner by Ralph Jocham and Don McGreal