Optimal TFS Gated Checkin Batching

July 2015

Problem Statement

In source control/build systems, gated checkins are a useful way to make sure your build stays stable. When you go to check in your code, it actually queues a build on some build machine and makes sure that your project can build, and possibly run a set of tests. If these all succeed, then the change goes in, otherwise it does not. Usually, only one gated build runs at a time, which can become a bottleneck if you have a lot of checkins.

In Microsoft's Team Foundation Server 2012, an option to batch gated builds was added. The way this works is that it will first build the next \(b\) checkins (the value of \(b\) is configurable). If that succeeds, then it will move on. If that fails, however, it then goes back to the "old" way and does a gated build on each checkin individually.

So the question is: given the probability that an individual build will succeed, what is the optimal number for \(b\)?

(note that this assumes there are always at least \(b\) checkins pending, and ignores things like merge conflicts)

To more easily compare different values of \(b\), let's consider the build ratio, which is the expected number of builds to do one checkin. Clearly we can get a build ratio of 1 by doing no batching, so we'd like to see how low we can make it!

One Phase

Let's call the batch gated builds described above (try to build all at once, then if that fails build each one individually) the one phase technique. (foreshadowing!)

Analysis

The analysis here is fairly straightforward. Let \(p\) be the probability of any individual checkin failing, and for convenience let \(q\) be the probability of a checkin succeeding (i.e. \(q = 1-p\)). So there's a \(q^b\) chance that we get to do 1 build for all of the checkins, but a \(1-q^b\) chance that we have to do \(1+b\) builds since we will try to batch them, fail, and then do them individually.

This works out to be $$\frac{q^b*1+(1-q^b)*(1+b)}{b}$$which simplifies down to $$\frac{q^b+(1+b)-(1+b)q^b}{b}$$ and then $$\frac{-bq^b+(1+b)}{b}$$or finally $$\mathbf{1+\frac{1}{b}-q^b}$$

So now it's a simple task of finding the value of \(b\) that minimizes this (assuming \(q\) is constant). Taking the derivative of this with respect to \(b\) and setting it equal to 0, we get \(-q^b\ln q-\frac{1}{b^2}=0\) Fun fact - this is very hard to solve! After spending a little while on this I gave up and looked to good old Wolfram Alpha.

Results

Here are the results, courtesy of Wolfram Alpha: (here's a link for q=.8)

qoptimal value of bbuild ratio
.83.82
.853.72
.94.59
.955.43
.966.38
.976.33
.988.27
.9911.20

This seems like a pretty good strategy - if somewhere around 92% of your checkins succeed you can cut your builds in half!

Interestingly, this article about batched gated builds says:

A potential drawback to this method is that if a batched build fails, you'll end up building the application more times than you would have if you didn't batch at all. Therefore, you should strive to keep the batch size to a small, manageable number, such as five or less.

I'm not sure if that remark was based on math or not, but it seems to hold up - you have to get up to a 96% chance of each build succeeding to make going up to \(b=6\) worth it, which is pretty high!

One of the interesting parts of this is that changing the value of \(b\) really doesn't change the build ratio very much. Here's a table for \(q=.9\):

bbuild ratio
2.69
3.60
4.59
5.61
6.64
7.66
8.69

This kind of makes sense intuitively - as \(b\) increases, you are more likely to have to do individual builds, but if you don't you save a lot of builds.

Two Phase

Here's an interesting idea - what if we used a different technique? If \(b\) is even, we could start by building all \(b\) checkins at once. If it works, great, but if not, then we split it up into two chunks of \(\frac{b}{2}\) and build each of those under the one phase strategy. Would this do any better? Let's see!

Analysis

There are three possibilities here:

Adding all these together and dividing by \(b\) to get the build ratio, we get $$\frac{q^b*1+2q^\frac{b}{2}(1-q^\frac{b}{2})*(3+\frac{b}{2})+(1-q^\frac{b}{2})(1-q^\frac{b}{2})*(3+b)}{b}$$which simplifies to$$\frac{q^b+(2q^\frac{b}{2}-2q^b)(3+\frac{b}{2})+(1-2q^\frac{b}{2}+q^b)(3+b)}{b}$$which simplifies to$$\frac{q^b+6q^\frac{b}{2}+bq^\frac{b}{2}-6q^b-bq^b+3+b-6q^\frac{b}{2}-2bq^\frac{b}{2}+3q^b+bq^b}{b}$$which simplifies to$$\frac{3+b-2q^b-bq^\frac{b}{2}}{b}$$which is good enough to feed to Wolfram Alpha!

Results

Here are the results, courtesy of Wolfram Alpha: (here's a link for q=.8) - note that I had to find the best even value of \(b\).

qoptimal value of bbuild ratio
.86.90
.856.76
.96.59
.958.39
.968.35
.978.29
.9810.23
.9914.16

Hmm, I thought this would be better, but it's worse than the one phase strategy until around \(q=.9\) and then it gets better. Oh well!

I'm also curious about the more general \(n\)-phase strategies but I haven't had time to work them out...


Fancy math equations by MathJax.