Posts Tagged ‘prob1’

Problem 1 and Sage

May 17, 2009

In my previous post I mentioned that much of the work I did by hand in the second solution for problem 1, taking all subset of divisors and computing the appropriate sums without constructing sets of multiples, could be done behind the scenes in Sage. I’d like to share that solution now.

# 1 + 2 + ... + n
def sumton(n):
    return n * (n+1) / 2

# sum of multiples of n up to and including b
def sumults(b,n):
    return n*sumton(b//n)

bound = 1000 - 1 # subtract 1 to not include the bound
divs = Set([3,5]) # add as many divisors as you want here
sum = 0

for ss in powerset(divs):
    if(len(ss) > 0):
        sign = (-1)**(len(ss) - 1)
        sum += (sign * sumults(bound,lcm(ss)))

print sum

Notice that lcm is built-in in Sage, so you can save some writing with that. The main improvement comes from the powerset function, so you don’t need to mess with binary strings. According to the documentation, the subsets show up in no particular order, but when I run it the empty set is first. It seems there should be some cute way to not include the empty set at all, and thus remove the if-switch in the for-loop. I can’t quite see what that method would be though, and I guess we wouldn’t technically be allowed to assume the empty set shows up first, based on the documentation. I expect some playing around with iterators, and assuming the empty set comes first, could do it.

I thought I’d also share a helpful website: sagenb.org. There, you can sign up for a free account and have access to sage (and any saved work you want) online. Pretty handy. Also saves you installing it.

Advertisements

Problem 1 – nah

May 17, 2009

Ok, I know problem 1 has been solved several times. But I’ve continued messing with it, and various versions of my programs, and comparing run-times, and so I thought I’d share.

To run my python scripts from the command line, and do performance testing, it’s helpful to have the scripts take arguments. The natural arguments for problem 1 are the upper bound and the list of divisors. So I’d call my program as “python script.py 1000 3 5”. This lets me quickly change the upper bound, or the list of divisors, outside of the code. So, how do I get at those variables in python? I found out from this page at diveintopython.org (which I should probably poke around more).

The first solution I’d like to test uses the idea of my first solution: make a set representing all the multiples, and then simply sum the set. My code is:

import sys, time
from math import ceil

t = time.time()

# the first argument is the upper bound (excluded from the sum)
# python treats all the arguments as strings, so we have to convert to ints
bound = int(sys.argv[1]) - 1

# the remaining arguments are the divisors, convert them all to ints using map
# i like the "slice notation" python uses for sub-lists
divs = map(int, sys.argv[2:])

multiples = set()

for n in divs:
     s = set(n*i for i in xrange(1, bound//n + 1))
     multiples |= s # faster than = multiples | s

t = time.time() - t

print sum(multiples), "in", t, "sec"

I should mention that Chris’ recent code pointed out to me that you can just subtract one from the upper bound, and then do no any extra checking about including it or not.

Next up, we use the idea Chris has explained about directly computing the sum, instead of messing about making lists of multiples. We should expect this to be an essentially constant-time algorithm (in the size of the upper bound, anyway, not the number of divisors) since there is no looping involving the bound. Since I want to allow any number of divisors, I’ll have to iterate over subsets of the divisors, which are conveniently in 1-1 correspondence with with binary strings with length number-of-divisors. Anyway, here’s what I came up with:

import time,sys
t = time.time()

# reduce(gcd,list) to get gcd of more than 2 numbers
def gcd(a,b):
    if(b == 0):
        return a
    return gcd(b,a % b)

# reduce(lcm,list) to get lcm of more than 2 numbers
def lcm(a,b):
    return a*b / gcd(a,b)

# 1 + 2 + … + n
def sumton(n):
    return n * (n + 1) // 2

# sum all multiples of n up to and including b
def summultiples(b,n):
    return n * sumton(b // n)

# convert int to binary string, from
# http://code.activestate.com/recipes/219300/
# sweet lambda function
bstr_ = lambda n: n>0 and bstr_(n>>1)+str(n&1) or ”
# convert n to binary string, with width at least w
# padded on the left with 0s
def tobin(n, w):
     base = bstr_(n)
     spaces = w – len(base)
     if(spaces < 0):          spaces = 0      return spaces*"0" + base # nice syntax... spaces copies of "0" # first argument is the upper bound (excluded) bound = int(sys.argv[1]) - 1 # the remaining arguments are the list of divisors divs = map(int,sys.argv[2:]) sum = 0 # for every non-empty subset of divisors for n in xrange(1,2**len(divs)):      bstr = tobin(n, len(divs))      height = bstr.count('1')      sign = (-1)**(height - 1) # appropriate minus signs      thesedivs = [] # represents this subset of divisors      for p in xrange(1,len(divs) + 1): # let's play with negative string indexing # making the 2^p digit correspond to the p-th divisor in divs if(bstr[-p] == '1'): thesedivs.append(divs[p - 1]) sum += (sign * summultiples(bound,reduce(lcm,thesedivs))) t = time.time() - t print sum, "in", t, "sec" [/sourcecode] This entire solution, by the way, can be done in very few lines with sage, since it has more built-in commands. More on that another day though, I guess. Anyway, I then did some performance testing, for grins. For bounds between 1000 and 10000, in multiples of 500 (so, 1000, 1500, 2000, ... , 10000), I ran each of the above scripts 50 times, and calculated the average runtime. I'd post my shell script here, but it's nasty. If I continue trying performance testing, and clean it up a bit, perhaps I'll write a post. In the mean time... So I took all the numbers out, and played with the Google Charts API for a bit, and came up with a decent graph:

Performance Test Comparison

Performance Test Comparison

I’m tired of messing with it, so I’ll just tell you here that the scale on the x-axis is thousands, and the scale on the y-axis is milliseconds. Now that I’ve done this once, it might go quicker in the future, but don’t count on seeing too many more of these from me.

One final word – a wordpress tip I just stumbled upon accidentally. After doing the “sourcecode” tag, and copying your program in, if you click the “HTML” button toward the top of the editor box, and then click back on “Visual”, things work out more nicely, in terms of editing the sourcecode. At least, they did for me.

Update 20090518: Corrected the lack of links to Chris’ post.

n(n+1)/2

May 15, 2009

My new approach to Problem 1 is different from Eric’s. We all know the formula

1+2+3+\cdots+n = \dfrac{n(n+1)}{2}.

We can use this formula to very quickly add the multiples of 3 between 1 and 999, namely, add all the integers between 1 and 333, and then multiply by 3. Similarly, we can add the integers between 1 and 195 and then multiply by 5 to get the sum of all multiples of 5 between 1 and 999. Putting those together, we’ve double-counted the multiples of 15, so we subtract the multiples of 15 to get the sum of all multiples of 3 or 5 between 1 and 999.

In general, if you had more divisors, you’d have to use some sort of alternating sum/inclusion-exclusion principle argument to get rid of the stuff you double/triple/quadruple/etc.-counted.

a=3
b=5
c=999
d=a*b
e=c//a
f=c//b
g=c//d
a*e*(e+1)/2+b*f*(f+1)/2-d*g*(g+1)/2

I did not write code to handle more divisors. I suppose you could just stick my equation into Eric’s code, and that would work.

Problem 1 Rehash

May 15, 2009

Well Nick mentioned earlier today in the office that he had thought of a quicker way to accomplish the first problem.  I’m not sure if this is what he meant, but what I came up with is roughly this: you don’t need to run past all the multiples of each number (3 and 5 in our case) since you can run over the lcm instead.  There will be a fixed amount added each time you pass a multiple of the lcm.

So for example with the numbers 3 and 5 chosen as the divisors we’re interested in we could organize the multiples into a list as follows:

  • 3 + 5 + 6 + 9 + 10 + 12 +15 = 60
  • 18 + 20 + 21 + 24 + 25 + 27 + 30 = 165
  • 33 + 35 + 36 + 39 + 40 + 42 + 45 = 270
  • . . .

You’ll notice that the difference between each sum here is 105 = 7 * 15 which is the number of divisible numbers below the lcm times the lcm.  Well, you probably get the idea now . . . the only snag is the fudge factor: since 15*66 = 990 we need to pick up the last couple multiples above this guy before we get to 1000 (namely 993,996,999).

It seemed like a good idea to allow as many different dividing factors as you like at the same time, so I rewrote both mine and Nick’s solution to take an arbitrary list of numbers as an argument.  For low values like 1000, the lcm idea doesn’t add much.  But for higher maximum values and more factors, it’s a considerable speedup.

Here are some timing results for 3 and 5:

  • Max = 1000, Lcm = .00083, All = .00081
  • Max = 100000 Lcm = .12, All = .12
  • Max = 10000000 Lcm = 3.1, All = 6.3

Now for 3,5 and 7:

  • Max = 1000, Lcm = .00014, All = .001
  • Max = 100000, Lcm = .012, All = .14
  • Max = 10000000, Lcm = .51, All = 9.5

You can see that the Lcm method actually improves with more factors since we’re skipping ahead faster.  Here’s the (kind of messy) code:

def solution2(divisors = [3,5,7], m_val = max_val):
   nums_lcm = lcm(divisors)

   base_set = set([nums_lcm])

   for j in range(len(divisors)):
     cur_div = divisors[j]
     base_set |= set( cur_div * i for i in range(1, nums_lcm / cur_div) )

   # sum of all numbers below least common multiple
   base_sum = sum(base_set)

   # of summands in the previous example
   num_summands = len(base_set)

   reps = max_val//nums_lcm
   scale = num_summands * nums_lcm

   big_skip = nums_lcm * reps
   fudge_set = set( i + big_skip for i in base_set if i + big_skip < max_val )    fudge = sum(fudge_set)    t = time()    my_set = set( base_sum + (i*scale) for i in xrange(0,reps))    print_answer(sum(my_set) + fudge, time()-t) [/sourcecode] Note that I don't start the clock until after doing the setup calculations since technically this information would be given to us as part of specifying which factors to use. Finally, here is some number magic: the answers for higher maximum values in the case of the factors 3 and 5 are as follows:

  • 1000 => 233168
  • 10000 => 23331668
  • 100000 => 2333316668
  • 1000000 => 233333166668

Holy shit, no?

More solutions…

May 14, 2009

So mine are almost identical to Tim’s but I’ll post them anyway along with the time they take…

Problem 1:

sum = 0
for i in range(1000):
  if (i%3 == 0 or i%5 == 0):
    sum = sum + i
print sum

Time: 0.000477 sec

Problem 2:

fibsum = 0
a, b = 0, 1
while b < 4000000:    if b%2 == 0:      fibsum = fibsum + b    a, b = b, (a+b) print fibsum [/sourcecode] Time: 0.000148 sec

Mathematica

May 14, 2009

I tried the problem first in Mathematica, and then in Sage. Here is my Mathematica code:

j = 0;
For[i = 1, i < 1000, i++, j += If[Divisible[i, 3], i, If[Divisible[i, 5], i, 0]]]; j [/sourcecode] Mathematica claims a running time of 0.005759 seconds. Now here is the code I ran in Sage: [sourcecode language='python'] j=0; def is_divisible_by(number,divisor): return number%divisor == 0 for i in range(1,1000): if is_divisible_by(i,3): j+=i if is_divisible_by(i,3)==false: if is_divisible_by(i,5): j+=i print(j) [/sourcecode] Sage tells me that it took 0.02s to run that code. Then I saw some of your previouly posted solutions, and amended my code so that the for loop looked like [sourcecode language='python'] for i in range(1,1000): if is_divisible_by(i,3) or is_divisible_by(i,5): j+=i [/sourcecode] (i.e., I made my code almost exactly the same as some of the previously posted solutions). Now Sage says it takes 0.01s to compute the answer.

Rustic Solutions

May 14, 2009

Here are my solutions.  I don’t have any comments yet, but I thought I’d at least post once to see how this all works.  I have very little experience with programming, so my solutions aren’t particularly sophisticated.  Please let me know if I do something horribly wrong…

These gave me the right answers when I ran them with Python, but I might have distorted things when I posted them.  The spacing and indentation have changed.

a,b=0,0

while a< 1000:      if a%3==0 or a%5==0:      b=a+b a=a+1 print b [/sourcecode] [sourcecode language='python'] a,b,s=0,1,0 while a< 4000000:      if a%2==0:      s=s+a a,b=b,a+b print s [/sourcecode]

T.P.

Problem 1 – nah

May 13, 2009

Eric wins for the first solution. I think I can beat his execution time by approximately a factor of 2 though (on my computer, after running each a few times). The built-in set data structure also has a built-in union function, which I found in chapter 5 of the tutorial. So, my go at the problem is:

 from time import time

 t = time()

 threes = set(3*n for n in range(1,1000/3 + 1)) # multiples of 3
 fives = set(5*n for n in range(1,1000/5)) # multiples of 5

 either = threes | fives # set union

 print sum(either), "in", time()-t, "sec"

I also included the code to calculate execution time that I learned about from the blog “Numerical Recipes“.

Anybody wanna do up a more general solution? Maybe see how the different approaches scale to numbers higher than 1000, or with other choices of divisors (allowing more of them, perhaps)?

Have another approach? How does it compare, speed-wise?

Just a side-note: I also took my solution and put it all into one line, and avoided saving the sets as variables. This technique seemed to take, on average, just slightly longer to run.

Out of the gates in a hurry!

May 13, 2009

Well, I had already tried a bunch of these guys, and I want to try out this posting deal, so here’s my solutions to the first two.


#
#  problem1.py
#

nums = set(i for i in range(1,1000) if ((i%3 == 0) or (i%5 == 0)))

print sum(nums)

#
#  problem2.py
#

fib_last = 1
fib_cur = 1

ev_sum = 0

def isEven(num):
return (num % 2 == 0)

while fib_cur <= 4000000: if isEven(fib_cur): ev_sum += fib_cur temp = fib_cur fib_cur += fib_last fib_last = temp print ev_sum [/sourcecode]