Going off a Hunch

Room for Improvement

As I reflect on our previous sprint here at InsightSquared, there is a hunch burning in the back of my mind telling me that we could develop and iterate on our stories more efficiently. As a scrum master, one of my responsibilities is to ensure that all stories my team is working on have momentum – frequent activity, both from the engineer working the story and the product manager (PM).

I have noticed that our stories, upon being delivered by an engineer, sometimes become stagnant while waiting for a PM to act on them. Minutes, hours, sometimes days go by before the assigned PM accepts or rejects the story. Typically, a PM accepts the story and the team moves forward. However, when a story gets rejected I think to myself:

“If we knew this story was rejected sooner, would development the second time around have been faster?”

I need data to answer this question so I open an editor and start coding.

Pivotal Tracker, the project management tool we use, has an API I have played with but never used for statistical analysis. I need data and lots of it. Specifically, I need to get the history of every story that has been rejected and a timeline of state changes. To solve this problem, I opt for an entirely front-end approach for pulling, processing, and visualizing the data.

Fetch the Data

I start by leafing through Pivotal’s API docs, which are an incredibly thorough spec. Using cross-origin resource sharing (CORS) we can pull all the data we need via jQuery like so:

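  var API_TOKEN = 'blah123'; // use your own API token
  var next_url = '...'; // stories endpoint for your project, including the state filter and limit
  var response = $.ajax({
      url: next_url,
      async: false, // block until the response arrives
      timeout: 5000, // their API is beta, just in case something hangs
      beforeSend: // attach the API token to every request
        function(xhr) {xhr.setRequestHeader('X-TrackerToken',API_TOKEN);}
  });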

This simple request to the tracker polls for the first 100 stories for project id 12345 that have been rejected or accepted and are “done” (not in the current Sprint). Don’t forget to include the API token in the request header or your CORS request will fail.

The request returns an organized array of one hundred stories, neatly tucked away in a JSON data model. We can wrap the request in a function and use it to recursively poll the endpoint for data until we have everything. Since there is a hard limit on how many results the tracker API will return per request, we need some pagination logic. This example limits results to 100 at a time.
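The pagination logic can be sketched roughly as follows. This is a minimal sketch and not the code from the project: fetchAllStories and the fetchPage(offset, limit) callback are hypothetical names, and offset-based paging is an assumption about how the endpoint is walked. In the browser, fetchPage would wrap the jQuery request shown earlier; here a stand-in replaces it so the example runs without a network call.

```javascript
// Hypothetical helper: collect every story by requesting one page at a time.
// fetchPage(offset, limit) is assumed to return an array of at most `limit` stories.
function fetchAllStories(fetchPage, limit, offset, collected) {
  offset = offset || 0;
  collected = collected || [];
  var page = fetchPage(offset, limit);
  collected = collected.concat(page);
  // A short (or empty) page means there is nothing left to fetch.
  if (page.length < limit) {
    return collected;
  }
  return fetchAllStories(fetchPage, limit, offset + limit, collected);
}

// Stand-in for the real request, serving 250 made-up stories.
var fakeStories = [];
for (var i = 0; i < 250; i++) {
  fakeStories.push({ id: i });
}
function fakeFetchPage(offset, limit) {
  return fakeStories.slice(offset, offset + limit);
}

var allStories = fetchAllStories(fakeFetchPage, 100);
console.log(allStories.length); // 250
```

The recursion stops on the first page that comes back shorter than the limit, so a project with an exact multiple of 100 stories costs one extra, empty request.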

After the fetching portion of the code is complete, the next step is to retrieve the data. Polling all four scrum teams for feature and bug stories, I can also fetch each activity for every story and filter out the stories that have not experienced a “Rejected” state in their history. To avoid having to poll the tracker API every time I want data, I also store all story activity data in a file locally.

  "51326669": [
    {
      "kind": "story_update_activity",
      "highlight": "rejected",
      "occurred_at": "2013-07-05T01:15:46Z"
    },
    ...
  ]

A snippet of story history, specifically an update including rejection
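Filtering out the stories that were never rejected can be sketched like this. It assumes the activity data is an object keyed by story id, as in the snippet above; the function name keepRejectedStories, the second story id, and the "accepted" highlight value are illustrative assumptions.

```javascript
// Hypothetical helper: keep only the stories whose history contains a rejection.
// activityByStory maps a story id to its list of activity entries.
function keepRejectedStories(activityByStory) {
  var rejected = {};
  Object.keys(activityByStory).forEach(function (storyId) {
    var wasRejected = activityByStory[storyId].some(function (activity) {
      return activity.highlight === 'rejected';
    });
    if (wasRejected) {
      rejected[storyId] = activityByStory[storyId];
    }
  });
  return rejected;
}

// Made-up sample: only the first story has a rejection in its history.
var sample = {
  '51326669': [
    { kind: 'story_update_activity', highlight: 'rejected', occurred_at: '2013-07-05T01:15:46Z' }
  ],
  '10000001': [
    { kind: 'story_update_activity', highlight: 'accepted', occurred_at: '2013-07-06T09:00:00Z' }
  ]
};

console.log(Object.keys(keepRejectedStories(sample))); // [ '51326669' ]
```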

Process the Data

At this point, we need to process the activity history for each story and determine the duration each story is “alive” and “dead,” from the engineer’s perspective. Alive means an engineer is actively working on it – from the “Start” state to the “Deliver” state. Dead means an engineer has delivered the story and is awaiting an action from the PM (Accept/Reject). See diagram for more details.

Next, I populate an array of objects denoting alive and dead durations (in seconds) that looks like:

  [{"alive":[314089,14386],"dead":[11268]}, ...]

This can be interpreted as a story being alive for ~3.6 days (314089 seconds), then delivered and dead for ~3 hours (11268 seconds), then rejected, then finally alive again for ~4 hours (14386 seconds). Each alive and dead array is ordered chronologically, i.e. the first alive element is the first time it is alive, the second alive element is the second time it is alive, etc. I only add data for corresponding alive and dead durations. That is, each dead duration directly follows an alive duration.

Here is my logic for dealing with some interesting, non-linear alive & dead durations:

Start → Deliver → Start → Deliver → Reject → Start → Deliver

∴ Combine the first two alive durations into one, record first dead duration

Start → Deliver → Reject → Deliver

∴ No 2nd start, ignore it all

Start → Deliver → Accept → Reject → Start → Deliver

∴ We have two independent alive durations; the dead duration includes accepted state
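One way to turn those three rules into code is sketched below. It reflects my reading of the rules rather than the exact implementation: the function name toDurations is hypothetical, and the highlight values "started", "delivered", and "accepted" are assumed to mirror the "rejected" value seen in the activity snippet.

```javascript
// Hypothetical helper: turn one story's chronological activity list into
// { alive: [...], dead: [...] } durations in seconds, or null if the story is ignored.
function toDurations(activities) {
  var alive = [];
  var dead = [];
  var pendingAlive = 0;   // alive seconds accumulated since the last rejection
  var startedAt = null;   // time of the latest start not yet delivered
  var deliveredAt = null; // time of the latest delivery not yet rejected

  for (var i = 0; i < activities.length; i++) {
    var t = Date.parse(activities[i].occurred_at) / 1000;
    switch (activities[i].highlight) {
      case 'started':
        startedAt = t;
        break;
      case 'delivered':
        if (startedAt === null) return null; // delivery with no start: ignore it all
        pendingAlive += t - startedAt;       // repeated start/deliver pairs are combined
        startedAt = null;
        deliveredAt = t;
        break;
      case 'rejected':
        if (deliveredAt === null) return null; // rejection with no delivery: ignore it all
        alive.push(pendingAlive);
        dead.push(t - deliveredAt);            // spans any "accepted" state in between
        pendingAlive = 0;
        deliveredAt = null;
        break;
    }
  }
  if (pendingAlive > 0) alive.push(pendingAlive); // the pass after the last rejection
  return { alive: alive, dead: dead };
}

// Third case from the list: Start → Deliver → Accept → Reject → Start → Deliver
var durations = toDurations([
  { highlight: 'started',   occurred_at: '2013-07-01T00:00:00Z' },
  { highlight: 'delivered', occurred_at: '2013-07-01T02:00:00Z' },
  { highlight: 'accepted',  occurred_at: '2013-07-01T02:30:00Z' },
  { highlight: 'rejected',  occurred_at: '2013-07-01T03:00:00Z' },
  { highlight: 'started',   occurred_at: '2013-07-01T04:00:00Z' },
  { highlight: 'delivered', occurred_at: '2013-07-01T05:00:00Z' }
]);
console.log(JSON.stringify(durations)); // {"alive":[7200,3600],"dead":[3600]}
```

Returning null for the "ignore it all" case keeps the caller simple: anything that is not an object is skipped before the statistics are computed.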

Draw Conclusions

Iterating through all of the alive and dead durations of stories, I can calculate the mean alive duration after rejection. Initially, we have no takeaway; this is simply telling us the mean time it takes an engineer to take another stab at a story after it is rejected. To address the original question, I introduce a “rejection threshold” variable, a duration of time where a story is dead, to use as a standard.

For each story, analyze the dead duration to determine if it is above or below the threshold and bucket the alive (second-time around) duration into this threshold category. Find the mean of each category and compare to see how the time it takes to reject a story impacts development time on the second pass. Setting the threshold to 1 day, we get our first observation:

  Stories that take over 1 day to reject take 26% less time to deliver afterwards.
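The threshold comparison can be sketched as below. The names mean and compareByThreshold are hypothetical, the input uses the alive/dead shape described earlier, and the sample durations are made up purely to show the shape of the result. They are not the team's data.

```javascript
// Arithmetic mean of a list of numbers (NaN for an empty list).
function mean(values) {
  if (values.length === 0) return NaN;
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i];
  return total / values.length;
}

// Hypothetical helper: bucket each second-pass alive duration by how long the
// preceding rejection took, then compare the means of the two buckets.
function compareByThreshold(stories, thresholdSeconds) {
  var fast = []; // alive durations following a rejection within the threshold
  var slow = []; // alive durations following a rejection beyond the threshold
  stories.forEach(function (story) {
    for (var i = 0; i < story.dead.length; i++) {
      var nextAlive = story.alive[i + 1];
      if (nextAlive === undefined) continue; // rejected but never redelivered
      (story.dead[i] > thresholdSeconds ? slow : fast).push(nextAlive);
    }
  });
  var fastMean = mean(fast);
  var slowMean = mean(slow);
  return {
    fastMean: fastMean,
    slowMean: slowMean,
    // Positive means slow rejections are followed by longer development.
    percentChange: (slowMean - fastMean) / fastMean * 100
  };
}

// Made-up durations in seconds.
var ONE_DAY = 24 * 60 * 60;
var result = compareByThreshold([
  { alive: [5000, 1000], dead: [3600] },
  { alive: [8000, 3000], dead: [7200] },
  { alive: [4000, 3000], dead: [2 * ONE_DAY] }
], ONE_DAY);
console.log(result.percentChange); // 50
```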

Not exactly the expected results, but just to be sure let’s do a bit more analysis. Looking at some of the alive durations, I notice some outliers – durations of 1 second, 3 seconds, 11 days, 14 days. Something is clearly fishy here, so I filter the data to disregard durations less than 30 seconds and greater than 1 week. The result?

  Stories that take over 1 day to reject take 55% longer to deliver afterwards.
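The outlier filter itself can be very small. In this sketch the name filterOutliers is hypothetical, and treating both bounds as inclusive is an assumption; the sample values are the durations mentioned in this post.

```javascript
var MIN_SECONDS = 30;               // anything shorter is treated as noise
var MAX_SECONDS = 7 * 24 * 60 * 60; // 1 week

// Hypothetical helper: drop durations shorter than 30 seconds or longer than 1 week.
function filterOutliers(durations) {
  return durations.filter(function (seconds) {
    return seconds >= MIN_SECONDS && seconds <= MAX_SECONDS;
  });
}

var DAY = 24 * 60 * 60;
var cleaned = filterOutliers([1, 3, 14386, 314089, 11 * DAY, 14 * DAY]);
console.log(cleaned); // [ 14386, 314089 ]
```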

This is a serious takeaway in the realm of productivity. We now have hard statistical data indicating a correlation between quickly responding to stories and the engineering resources needed to complete and redeliver the story after review.

With one promising result in hand, the next thing to find is the “Goldilocks” of rejection thresholds that tells us the longest a PM should idle on a story before further delay hinders productivity the second time around. The answer? 3 hours. From a rejection threshold of 2 hours to 3, our mean second-time development durations increase from 1% longer to 20% longer.
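Searching for that threshold amounts to repeating the comparison over a range of candidate values. The sketch below assumes the data has been flattened into pairs of a dead duration and the alive duration that follows it; percentChangeAt and sweepThresholds are hypothetical names, and the pairs are made up for illustration.

```javascript
// pairs: [{ dead: seconds awaiting review, alive: seconds of rework afterwards }]
function percentChangeAt(pairs, thresholdSeconds) {
  var fast = { total: 0, count: 0 };
  var slow = { total: 0, count: 0 };
  pairs.forEach(function (pair) {
    var bucket = pair.dead > thresholdSeconds ? slow : fast;
    bucket.total += pair.alive;
    bucket.count += 1;
  });
  if (fast.count === 0 || slow.count === 0) return null; // nothing to compare
  var fastMean = fast.total / fast.count;
  var slowMean = slow.total / slow.count;
  return (slowMean - fastMean) / fastMean * 100;
}

// Hypothetical helper: evaluate the comparison at each candidate threshold (in hours).
function sweepThresholds(pairs, hoursList) {
  return hoursList.map(function (hours) {
    return { hours: hours, percentChange: percentChangeAt(pairs, hours * 3600) };
  });
}

// Made-up pairs, in seconds.
var pairs = [
  { dead: 1800,  alive: 1000 }, // rejected after 0.5 h
  { dead: 5400,  alive: 1000 }, // 1.5 h
  { dead: 9000,  alive: 1000 }, // 2.5 h
  { dead: 14400, alive: 2000 }, // 4 h
  { dead: 18000, alive: 2000 }  // 5 h
];
var sweep = sweepThresholds(pairs, [1, 2, 3]);
console.log(sweep);
```

Reading the sweep for the point where the percentage jumps is then a matter of eyeballing the list or plotting it.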

All in all, this is a truly fascinating discovery in our project management process. We can leverage these results to change the way we react to stories, improving utilization of engineering time. I encourage you to look at the source code attached if you’re interested in the Pivotal API, statistical analysis, and all the JavaScript in between. Most importantly, if you have a hunch, follow it. If your hunch doesn’t prove true, keep digging; you’ll find something.

Feel free to dive into the source code, zipped and available for download here.