Posts Tagged ‘prob23’

Problem 23 – Sums of Abundant Numbers

September 6, 2009

In problem 23, we are asked to find the sum of all of the numbers that are not the sum of two abundant numbers. We are given an upper bound on the numbers we’ll be summing, which is pretty convenient.

To solve this problem, I’ll first make a list of all the abundant numbers up to the bound, using a function I’ll call listOfAbundants. Then, I’ll tease out the numbers that aren’t sums of two abundant numbers as follows:

def solve():
    bunds = listOfAbundants(28123) # hardcoded upper bound
    bsums = set([i+j for i in bunds for j in bunds]) # numbers that are sums
    nonsums = set([k for k in xrange(1,28123) if not k in bsums]) # everybody else
    largest = reduce(lambda x,y:x>y and x or y, nonsums) # better upper bound?
    return sum(nonsums)

List comprehension is awesome. I think, possibly in a newer version of Python or something, one can avoid the “set([])” notation and use “{}”, known as “set builder notation”. But it didn’t work the first time I tried it on my computer, so I moved on (how’s that for persistence?). I have that line for the variable “largest” to see what the actual upper bound is, since the problem hints that 28123 isn’t a strict bound. It seems to work out that largest is 20161.

Now, that leaves the listOfAbundants function. I have a factor method I’ve used before, and I figure with a factorization it should be easy to sum divisors. As a reminder, the “factor” method I have returns an array of pairs (p,e), and the product of all of the p^e is the number. If I’ve got a method to sum divisors, call it sumDivs, then I can do:

def listOfAbundants(bound):
   """ Return a list of all abundant numbers less than bound """
   # we can start at 12, since that's the smallest abundant number
   return [k for k in xrange(12,bound) if sumDivs(k) > 2*k]

My first way to code how to find the sum of the divisors of a number is not very efficient, but I love how it uses list comprehensions, in concert with map/reduce. My idea was to first convert every pair (p,e) to the list [1,p,p^2,...,p^e]. Then, if a=[a_1,a_2,\ldots] and b=[b_1,b_2,\ldots] are lists of numbers, then sort of take the cartesian product a\times b, the map a\times b to integers by multiplication. So, e.g., [1,2,4] and [1,5] combine to [1,2,4,5,10,20]. Doing this iteratively (via reduce) and then suming the resultant list gives the sum of the divisors of the starting number:

def sumDivs(n):
    """ Returns the sum of the divisors of n """
    f = factor(n)
    # find all of the prime power divisors
    ppowdivs = map(lambda pe:[pe[0]**i for i in xrange(0,pe[1]+1))
    # combine them
    return sum(reduce(lambda x,y:[i*j for i in x for j in y], ppowdivs))

Of course, I’m using the property that says that the sum of the divisors of n, denoted \sigma(n), is a multiplicative function – meaning that if m and n are relatively prime then \sigma(m,n)=\sigma(m)\cdot \sigma(n). So if I know how to quickly compute \sigma(p^e), I can simplify the above code. I guess the idea has perhaps come up in a previous problem, but I also ran across it in my reading of Hardy and Wright’s “Introduction to the Theory of Numbers” book. Wherever you get it, \sigma(p^e)=\frac{p^{e+1}-1}{p-1}, the sum of the geometric series 1+p+p^2+\cdots +p^e. So, sumDivs simplifies a bit:

def sumDivs(n):
    """ Returns the sum of the divisors of n """
    f = factor(n)
    return reduce(lambda x,y:x*y,
                  map(lambda pe:(pe[0]**(pe[1]+1)-1)//(pe[0]-1), f))

So that’ll pretty much do it. But it feels wrong (not incorrect… just immoral or something). We’re factoring every single number up to the bound. Instead, we should be able to use some recursion, or some sieving, or something, to more quickly factor the whole interval of numbers we’re interested in.

And we don’t really need the whole factorization. If we know n=p^eq, where p is prime and p^e is the largest power of p dividing n, then \sigma(n)=\sigma(p^e)\cdot\sigma(q) (by multiplicativity). We have a formula (used above) for \sigma(p^e), and q<n so inductively we’ll have computed \sigma(q).

I’ve put together the following code to try to use this idea:

def sumDivsTo(bound):
    """ Returns an array, a, with a[i] = sumDivs(i), for i in [0,bound) """
    ret = [0 for i in xrange(0,bound)]
    ret[1] = 1
    for idx in xrange(2,bound):
        if(ret[idx] == 0): # idx is prime
            ret[idx] = idx+1 # 1 and idx divide idx
            m = idx # the smallest multiple we need to consider
            while(idx * m < bound):
                # tell our future selves (when idx=idx*m) that idx divides idx*m
                ret&#91;idx * m&#93; = idx
                m += 1
        else: # ret&#91;idx&#93; is the smallest prime dividing idx
            p = ret&#91;idx&#93;
            e = 1
            while(idx % p**e == 0):
                e += 1
            e -= 1 # our loop went one too far
            q = idx // p**e
            # now idx = p^e * q, (p,q)=1, p prime
            ret&#91;idx&#93; = (p**(e+1)-1)//(p-1) * ret&#91;q&#93;
    return ret

I think there is still room for improvement here. When we run across a prime, instead of just noting that it divides all of it's multiples, I feel like there should be some clever way to note what power of that prime divides those multiples. Then we wouldn't have to find that power in the loop of the else clause. I guess this is mostly just shifting around when the loop happens, so the improvement might not be vast.

I do want to point out that this same idea can be used to produce all of the factorizations for numbers in some interval [1,b] without factoring each individually. Which is probably helpful at some point.

Now that we've got a more efficient way to find the sum of divisors for all numbers up to our bound, we can re-factor the solve method. The listOfAbundants method never really did much, we can wrap it up into solve in a line. What I'm envisioning now is to get rid of some of that ending code of solve, where we build all of the sums of two abundant numbers, and then basically take the set complement. Instead of doing all that, perhaps there is some improvement to be gained in looping through the numbers in our range, subtracting abundant numbers and seeing if the difference is still abundant. My solve method becomes:

&#91;sourcecode language="python"&#93;
def solve():
    bound = 28123 # the given upper bound
    sds = sumDivsTo(bound) # record all of the sums of divisors
    bunds = &#91;k for k in xrange(12, bound) if sds&#91;k&#93; > 2*k] # the abundants
    ret = 0
    for n in xrange(1,28123):
        i = 0
        isSum = False
        while(bunds[i] < n and not isSum):
            d = n-bunds&#91;i&#93;
            if(sds&#91;d&#93; > 2*d): # is n-bunds[i] abundant?
                isSum = True
            i += 1
        if not isSum:
            ret += n
    return [ret,t]

We could, of course, start our loop at 25, with a starting ret value 1+2+\cdots+23, since 24 is the first number that is the sum of two abundants (since 12 is the first abundant).

This code runs much faster than my first version, taking roughly one-fifth the time. With this faster solve, avoiding the set nonsense, the first version seems to take about 133% as much time as this new version. That is, the big improvement in this problem came from the better solve method, not the better sumDivsTo method.