Archive for May 20th, 2009

Problem 2 – various approaches

May 20, 2009

Allow me to begin with a minor improvement of the standard while-loop solution to problem 2. Noticing that every third Fibonacci number is even, we might as well only calculate F(3n), if possible. Of course, Fibonacci numbers naturally come in adjacent pairs, so perhaps it’d be easy enough to calculate b(n)=F(3n) and c(n)=F(3n+1). With starting values b(0)=0,c(0)=1, it’s easy to determine the recurrence b(n+1)=b(n)+2\cdot c(n) and c(n+1)=2\cdot b(n)+3\cdot c(n). Now we loop over the b(n), until they’re past the upper bound (4000000 in the problem as written). Coded up, I’ve got it as follows (I trimmed some import and timing and print lines out):

b,c = 0,1
sum = 0
bound = int(sys.argv[1])

while(b <= bound):
    sum += b
    b,c = b+2*c, 2*b+3*c

In my mind, I expect this to be marginally faster than the while loop as used in other solutions, because we're essentially doing all the same arithmetic, but only 1/3 as many tests (and fewer multiple-assignments). I'll return to performance comparisons shortly.

The other solution I'd like to give code for is the generalization of Tim's <a href="">mathematical solution</a> to any bound. Again trimming out some import (you need to import math for this to work in pure Python - no Sage), timing, and print statements, I've got:

def geom(b,p):
    """ sum (b^3)^n from n=1 to n=p """
    return (1-(b**3)**(p+1)) / (1-b**3)

bound = int(sys.argv[1])

alpha = (1 + sqrt(5)) / 2
beta = (1 - sqrt(5)) / 2

# what index is the largest even fibonum less than our bound?
# the factor of sqrt(5) on the bound comes from the idea that
# fib(n) ~ alpha^n / sqrt(5)
# and the factor of 3 on alpha is because we want even fibonums
maxn = floor(log(bound * sqrt(5)) / log(alpha**3))

# formula for sum of the even fibonums up to the bound
sum = int((1/sqrt(5)) * (geom(alpha,maxn) - geom(beta,maxn)))

No loops at all. Nice. Of course, there’s a lot of floating point operations going on there. How does that affect performance?

The following chart compares my solution (called “evens” in the chart), with the solution calculating all Fibonacci numbers (called “all”), and Tim’s (“noloop”). The tickmark n along the x-axis indicates that the upper bound (maximum even Fibonacci number to sum to) is 4\cdot 10^n (why the 4? the original problem had one, is all). Times are measure in milliseconds on the y-axis.

We see that, indeed, the “noloop” solution looks approximately constant-time, and that the “even” loop slightly beats the “all” loop. What strikes me as more interesting, at first anyway, is that both looping solutions beat the noloop solution at the bound given in the problem (the tickmark 6 on the x-axis). After some thought, it almost makes sense, because there are only 11 even Fibonacci numbers less than 4000000, so it’s not like the loops are taking long.

I’m also intrigued by the sudden jump in both looping solutions between x=8 and x=9. These correspond to a bound of 4\cdot 10^8 and 4\cdot 10^9, respectively. Taking log base 2 of these, we get \approx 28.57 and \approx 31.89, right around 32, a noteable bound for my computer. In fact, the next even Fibonacci number after 4\cdot 10^9 is F(48)=4,807,526,976, which happens to be bigger than 2^{32}=4,294,967,296. And this is the number we’d make before ending the while loop.

I’m not quite done. Tim commented that for large numbers, \alpha was the dominating term in the expression for F(n). I wondered if the solution I coded up above, based on his math, would give different answers from the looping solutions for some particular bounds. Running a few loops, I noticed that the solutions disagree, unsurprisingly enough, at (some of the) bounds that are even Fibonacci numbers (which, generalizing the original problem, we should include). Here’s a chart using even Fibonacci numbers as bounds:

       Bound        Loop         NoLoop
           2           2           2
           8          10           2
          34          44          44
         144         188          44
         610         798         798
        2584        3382         798
       10946       14328       14328
       46368       60696       14328
      196418      257114      257114
      832040     1089154      257114
     3524578     4613732     4613732
    14930352    19544084    19544084
    63245986    82790070    82790070
   267914296   350704366   350704366
  1134903170  1485607536  1485607536
  4807526976  6293134512  6293134512

For the first several, the non-looping solution (as coded above) alternates between correct, and incorrect. It seems this comes from the \beta term of F(n) being negative, so that even or odd powers have some affect. It looks like the last incorrect value is at the bound 832040.

At this point, I think trying to make the non-looping solution work correctly for all even bounds is most easily patched up by just storing a table of the first several, and then computing it for any higher bounds. Or switching between a loop and non-loop solution based on the bound being big enough.

Or, Sage does symbolic manipulations, let’s take the above non-looping solution, and drop it into Sage with an extra if-switch to see if the upper bound is an even Fibonacci number. I coded that as

if(bound == ((1/sqrt(5))*(alpha**(3*(maxn + 1)) - beta**(3*(maxn + 1))))):
     maxn += 1

and inserted that just after the “maxn = ” line above (so, at line 15). Running this, with the bound an even Fibonacci number, now always seems to agree with the looping solutions.

Ok, I think that’s all I have to say about problem 2.