## Problem 1 – nah

by

Ok, I know problem 1 has been solved several times. But I’ve continued messing with it, and various versions of my programs, and comparing run-times, and so I thought I’d share.

To run my python scripts from the command line, and do performance testing, it’s helpful to have the scripts take arguments. The natural arguments for problem 1 are the upper bound and the list of divisors. So I’d call my program as “python script.py 1000 3 5”. This lets me quickly change the upper bound, or the list of divisors, outside of the code. So, how do I get at those variables in python? I found out from this page at diveintopython.org (which I should probably poke around more).

The first solution I’d like to test uses the idea of my first solution: make a set representing all the multiples, and then simply sum the set. My code is:

```import sys, time
from math import ceil

t = time.time()

# the first argument is the upper bound (excluded from the sum)
# python treats all the arguments as strings, so we have to convert to ints
bound = int(sys.argv[1]) - 1

# the remaining arguments are the divisors, convert them all to ints using map
# i like the "slice notation" python uses for sub-lists
divs = map(int, sys.argv[2:])

multiples = set()

for n in divs:
s = set(n*i for i in xrange(1, bound//n + 1))
multiples |= s # faster than = multiples | s

t = time.time() - t

print sum(multiples), "in", t, "sec"
```

I should mention that Chris’ recent code pointed out to me that you can just subtract one from the upper bound, and then do no any extra checking about including it or not.

Next up, we use the idea Chris has explained about directly computing the sum, instead of messing about making lists of multiples. We should expect this to be an essentially constant-time algorithm (in the size of the upper bound, anyway, not the number of divisors) since there is no looping involving the bound. Since I want to allow any number of divisors, I’ll have to iterate over subsets of the divisors, which are conveniently in 1-1 correspondence with with binary strings with length number-of-divisors. Anyway, here’s what I came up with:

import time,sys
t = time.time()

# reduce(gcd,list) to get gcd of more than 2 numbers
def gcd(a,b):
if(b == 0):
return a
return gcd(b,a % b)

# reduce(lcm,list) to get lcm of more than 2 numbers
def lcm(a,b):
return a*b / gcd(a,b)

# 1 + 2 + … + n
def sumton(n):
return n * (n + 1) // 2

# sum all multiples of n up to and including b
def summultiples(b,n):
return n * sumton(b // n)

# convert int to binary string, from
# http://code.activestate.com/recipes/219300/
# sweet lambda function
bstr_ = lambda n: n>0 and bstr_(n>>1)+str(n&1) or ”
# convert n to binary string, with width at least w
# padded on the left with 0s
def tobin(n, w):
base = bstr_(n)
spaces = w – len(base)
if(spaces < 0):          spaces = 0      return spaces*"0" + base # nice syntax... spaces copies of "0" # first argument is the upper bound (excluded) bound = int(sys.argv[1]) - 1 # the remaining arguments are the list of divisors divs = map(int,sys.argv[2:]) sum = 0 # for every non-empty subset of divisors for n in xrange(1,2**len(divs)):      bstr = tobin(n, len(divs))      height = bstr.count('1')      sign = (-1)**(height - 1) # appropriate minus signs      thesedivs = [] # represents this subset of divisors      for p in xrange(1,len(divs) + 1): # let's play with negative string indexing # making the 2^p digit correspond to the p-th divisor in divs if(bstr[-p] == '1'): thesedivs.append(divs[p - 1]) sum += (sign * summultiples(bound,reduce(lcm,thesedivs))) t = time.time() - t print sum, "in", t, "sec" [/sourcecode] This entire solution, by the way, can be done in very few lines with sage, since it has more built-in commands. More on that another day though, I guess. Anyway, I then did some performance testing, for grins. For bounds between 1000 and 10000, in multiples of 500 (so, 1000, 1500, 2000, ... , 10000), I ran each of the above scripts 50 times, and calculated the average runtime. I'd post my shell script here, but it's nasty. If I continue trying performance testing, and clean it up a bit, perhaps I'll write a post. In the mean time... So I took all the numbers out, and played with the Google Charts API for a bit, and came up with a decent graph:

Performance Test Comparison

I’m tired of messing with it, so I’ll just tell you here that the scale on the $x$-axis is thousands, and the scale on the $y$-axis is milliseconds. Now that I’ve done this once, it might go quicker in the future, but don’t count on seeing too many more of these from me.

One final word – a wordpress tip I just stumbled upon accidentally. After doing the “sourcecode” tag, and copying your program in, if you click the “HTML” button toward the top of the editor box, and then click back on “Visual”, things work out more nicely, in terms of editing the sourcecode. At least, they did for me.

Update 20090518: Corrected the lack of links to Chris’ post.

Advertisements

Tags: ,

### 2 Responses to “Problem 1 – nah”

1. Performance Testing « Leonhard Euler’s Flying Circus Says:

[…] Each of the scripts (that I want to test anyway) is set up to take command line arguments (see my other post for how to do that). The script below assumes that the argument that is changing comes first, and […]

2. sumidiot Says:

In the second solution above, instead of getting height = bstr.count(1), and then calculating the sign (lines 46-47), it might more sense to determine sign as (-1)**(len(thesedivs)-1) just after the if statement in the for loop (so, between lines 53 and 54). This isn’t a big difference, but it seems like since you’re already looping through the string looking for 1s, you’re basically redoing the ‘bstr.count’ calculation.