## Posts Tagged ‘python’

### Problem 30 – Sums of Digit Powers

October 1, 2009

In problem 30 we are asked to find all the numbers that are the sum of the $n$-th powers of their digits. Well, the sum of those numbers. And $n=5$. You can just about do it in one line…

Most of this problem is fairly easy: Given a number, compute the sum of the powers of its digits, and see if you get the number you started with. If you wanna find all such numbers, just loop for a while. The trouble is how long? If you’ve got some range in mind you want to look at, you can do the following:

def solve(pow, low, hi):
    """ Find the sum of all numbers that are the sum of their pow-th powers of digits in the interval [low,hi) """
    return sum(filter(lambda n:n==sum(map(lambda d:int(d)**pow,str(n))), xrange(low, hi)))


Ok, sure, it’s a longish line, but it’s all there. Python’s syntactic sugar for handling all the little looping is awesome.

The remaining trick is to figure out those bounds. The problem mentions not using 1 as one of the numbers, so we could start at 2. However, that means we could start at 2**pow, because that’ll be the lowest digit sum we’d make. On the high end, the biggest digit-power-sum will come from digits being 9. So I figure that to find an upper bound, you can think about digit sums for longer and longer strings of 9. If you think about $k$ 9s in a row, that’ll give a digit sum of $k\cdot 9^p$ (if $p$ is the specified power). Find the largest $k$ where this value is still a $k$-digit number (or bigger), and you’ve got yourself an upper bound to loop to. Here’s some code to find the bound:

def bound(pow):
    """ Find bound on nums that can be sum of their pow-th powers of digits """
    if pow <= 1:
        return 9
    nine = 9**pow
    coeff = 1
    while(len(str(coeff*nine)) >= coeff):
        coeff += 1
    return (coeff - 1)*nine


Now, these bounds I’ve set up are pretty poor. In the case where the power is 4, as in the example problem, we’d loop from 2**4=16 to 32805, when the actual numbers that have the property we’re hoping for are between 1000 and 10000. We’ve overshot by something like 3 times too many values.

So, we could try to be more clever and come up with a smaller set to search in. But we could still use that one-line solver to pick from that set the appropriate values. And that makes me happy.
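
For anyone following along in Python 3 (the code above is Python 2), here is a minimal sketch that puts the solver and the bound together. The names `digit_power_sum` and `upper_bound` are mine, made up for this illustration, and the bounds are the same rough ones described above, not anything cleverer.

```python
def digit_power_sum(n, power):
    """Sum of the power-th powers of the decimal digits of n."""
    return sum(int(d) ** power for d in str(n))


def upper_bound(power):
    """Largest k * 9**power that still has at least k digits."""
    nine = 9 ** power
    k = 1
    while len(str((k + 1) * nine)) >= k + 1:
        k += 1
    return k * nine


def solve(power):
    """Sum of all n that equal the sum of the power-th powers of their digits.

    Starts at 2**power, which skips 1 as the problem asks.
    """
    low, high = 2 ** power, upper_bound(power)
    return sum(n for n in range(low, high + 1)
               if n == digit_power_sum(n, power))


print(solve(4))  # 1634 + 8208 + 9474 = 19316
```

The fourth-power case is the example worked in the problem statement, so it makes a handy sanity check before asking for fifth powers.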

### Problem 29 – One Liner

September 29, 2009

In problem 29 we are supposed to determine how many unique values there are for the value $a^b$ where $a_0\leq a\leq a_1$, $b_0\leq b\leq b_1$ (all integers). I know there is clever counting one can do to determine this, but I thought for comparison I’d also just do up the brute force approach. Find all the values, get rid of duplicates, and count how many are left.

Python has a built-in type “set”, which will eliminate duplicates as I go. And you can ask for the length of the set. And list comprehension is awesome. So you can solve this in one line:

def solve(lowa, hia, lowb, hib):
    return len(set([a**b for a in xrange(lowa, hia+1) for b in xrange(lowb, hib+1)]))


Having wrapped up solution code into a function, I also like to use command line arguments to pass in… well, arguments. In this case, I’d like to allow two different options for command line options. The stated problem has the bounds 2 to 100 for both $a$ and $b$. So if I take one number in, I want to assume it is the upper bound, and that 2 is the lower bound, for both variables. Alternatively, I could take all four bounds in as arguments. I know the following is crude, and doesn’t fail gracefully, but it’s got a lot of nice things going for it:

import sys

if __name__ == "__main__":
    if len(sys.argv) == 2:
        print solve(*(2*[2,int(sys.argv[1])]))
    else:
        print solve(*map(int, sys.argv[1:]))


I love this. Using * to change a list into the form my solve function expects, with 4 arguments… that’s awesome. Using [1:] to grab the arguments besides the called command… nice. 2*list to do a list concatenated with itself… delightful. And, of course, map. Who could forget map?

Anyway, I may return to the more mathematical, counting based solution for this. At the same time, though, this code is quick (to write and run (for the stated bounds)) and easily checked. Maybe I’ll leave the more mathematical solution to the other authors on this blog. I seem to recall that there were some…
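
As a quick check on the brute-force idea, here is the same thing as a Python 3 sketch, using a set comprehension. The function name `distinct_powers` is just something I picked for the example. The problem statement works the small case $2\leq a,b\leq 5$ and gets 15 distinct terms, which is a nice thing to test against.

```python
def distinct_powers(low_a, high_a, low_b, high_b):
    """Count distinct values of a**b over the two inclusive integer ranges."""
    return len({a ** b
                for a in range(low_a, high_a + 1)
                for b in range(low_b, high_b + 1)})


print(distinct_powers(2, 5, 2, 5))  # the small case from the problem statement: 15
```

Python's integers are arbitrary precision, so values like $100^{100}$ go into the set exactly, with no floating point involved.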

### Prime Oracle

September 27, 2009

I sat down to work on problem 27, and realized it might be handy to have a class that’ll tell me when numbers are prime. Something that’ll keep some memory of tested values, and sieve when it needs to know about bigger numbers than have been tested. Here’s what I threw together:

""" A class that'll help you decide if things are prime

Make an object of this class, I'll call it oracle, and
oracle[n] is True or False depending on if n is prime or not

Assumes negatives are not prime
"""

class PrimeOracle:
    # Takes a starting maximum to sieve to
    def __init__(self, startmax = 2000):
        # self.isprime[n] = True if 2n+1 is prime
        self.isprime = [True for n in xrange(0,startmax//2 + 1)]
        self.isprime[0] = False # index 0 is the number 1, which is not prime
        # run the sieve
        for val in xrange(3, startmax + 1, 2):
            if(self.isprime[val // 2]):
                for mul in xrange(val*val, startmax + 1, 2*val):
                    self.isprime[mul // 2] = False

    # is n prime?
    def __getitem__(self, n):
        if n == 2:
            return True
        if n<2 or n % 2 == 0:
            return False
        nidx = n//2
        while(nidx >= len(self.isprime)):
            # extend the sieve
            # currently, we know the primality of 2*(len-1)+1=2*len-1
            # we can extend that to its square, 4*len^2 - 4len + 1
            # which would be index (4*len^2-4*len+1) // 2 = 4len(len-1)+1 // 2
            # so the new length should be one more than this
            # we already have length len, so we need to add (prev line) - len
            curlen = len(self.isprime)
            newlen = (4*curlen*(curlen-1) + 1) // 2
            self.isprime += [True for n in xrange(0, newlen+1-curlen)]
            maxknown = 2*curlen - 1
            maxtoknow = 2*newlen - 1
            for idx in xrange(1, curlen):
                if(self.isprime[idx]):
                    val = 2*idx + 1
                    # start crossing off multiples larger than maxknown
                    sfact = 1 + (maxknown // val)
                    if(sfact % 2 == 0):
                        sfact += 1 # otherwise our indexing is wrong
                    startfact = max(val, sfact)
                    for mul in xrange(startfact * val, maxtoknow + 1, 2*val):
                        self.isprime[mul // 2] = False
        return self.isprime[nidx]

    # the largest index we know about
    def __len__(self):
        return 2*len(self.isprime) - 1

    def __str__(self):
        return "\n".join(["%d: %s" % (2*n+1, self.isprime[n])
                          for n in xrange(0,len(self.isprime))])


It’s basically just the usual sieve, and then it extends itself whenever necessary.

There’s room for improvement. We could remember the highest value we’ve sieved to, and then take our array and just tease out the primes from it. Then we only store the list of primes. When we go to extend, we make a new array to hold whatever the new values would be, sieve appropriately, and then just append to our list of primes. Here’s my modified __getitem__ method:

def __getitem__(self, n):
    if n == 2:
        return True
    if n<2 or n % 2 == 0:
        return False
    while(n > self.sievedto):
        # we have sieved to sst = self.sievedto (assume odd)
        # we could sieve to sst^2 (will also be odd)
        # that'd be [sst+1 .. sst^2]
        # but really we might as well do [sst+2 .. sst^2]
        # sst=2s+1 => [2s+3 .. 4s^2+4s+1] //2 is [s+1 .. 2s(s+1)]
        # which has length 2s^2+s
        # and index i corresponds to 2(s+1+i)+1
        # so that an odd integer m is at m//2 - (s+1)
        s = self.sievedto // 2
        newsst = self.sievedto * self.sievedto
        # exactly 2s^2+s entries; one more would admit sst^2+2 unsieved
        isprime = [True for idx in xrange(0,2*s*s+s)]
        for pr in self.primes.keys():
            sfact = 1 + (self.sievedto // pr)
            if(sfact % 2 == 0):
                sfact += 1
            for mul in xrange(max(pr, sfact) * pr, newsst + 1, 2 * pr):
                isprime[mul//2 - (s+1)] = False
        for m in [2*(s+1+i)+1 for i in xrange(0,2*s*s+s) if isprime[i]]:
            self.primes[m] = True
        self.sievedto = newsst
    return self.primes.has_key(n)


My first guess was that the second version would be better, because it would be storing less. However, it seems to be a little slower. I guess filtering out the primes at each extension has its cost. Or my implementation could use some tweaking? I wish I knew a good way to compare the size requirements of these scripts, in addition to how long they take to run.

By the way, I would be a little shocked to find that I’m not off by one somewhere (many somewheres) in the above code. Little tests seem to work out. But use at your own peril.
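
Since off-by-one errors are the worry, one cheap safeguard is to compare an oracle against plain trial division over a small range. Here is a minimal Python 3 sketch (the code above is Python 2). All three names, `is_prime_slow`, `SimpleOracle`, and `first_mismatch`, are made up for this illustration. `SimpleOracle` is a deliberately naive stand-in that re-sieves from scratch and doubles its limit, not the class above; the checker only needs `oracle[n]` to work, so any oracle with that interface could be dropped in.

```python
def is_prime_slow(n):
    """Trial division: slow, but easy to trust."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


class SimpleOracle:
    """Hypothetical stand-in: re-runs a full sieve, doubling its limit as needed."""

    def __init__(self, limit=10):
        self._sieve(limit)

    def _sieve(self, limit):
        # flags[n] is True exactly when n is prime, for 0 <= n <= limit
        self.flags = [True] * (limit + 1)
        self.flags[0:2] = [False, False]
        for p in range(2, int(limit ** 0.5) + 1):
            if self.flags[p]:
                for m in range(p * p, limit + 1, p):
                    self.flags[m] = False

    def __getitem__(self, n):
        if n < 2:
            return False
        limit = len(self.flags) - 1
        while n > limit:
            limit *= 2
        if limit + 1 > len(self.flags):
            self._sieve(limit)
        return self.flags[n]


def first_mismatch(oracle, upto):
    """Smallest n in [-5, upto) where the oracle disagrees, or None."""
    for n in range(-5, upto):
        if oracle[n] != is_prime_slow(n):
            return n
    return None


print(first_mismatch(SimpleOracle(), 5000))  # None means no disagreement found
```

Starting the oracle with a tiny limit forces several extensions during the check, which is where the indexing mistakes tend to hide.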

### Problem 28 – Spirals

September 27, 2009

In problem 28 we construct a spiral starting with 1 in the center, and build our way outwards clockwise. Then we are supposed to find the sum of the diagonals.

I figured there should be some formula for the sum of the diagonal entries, depending on the size of the grid you want to stop at. If we only compute diagonal elements when we have a square, the square will always have sides of odd length. So we should take a number $n$, and compute the appropriate sum of the elements in the $2n+1$ by $2n+1$ grid. I’ll write down the sum for the case $n=3$ in the following manner:

  1^2 + (1^2+2) + (1^2+2*2) + (1^2+3*2)
+ 3^2 + (3^2+4) + (3^2+2*4) + (3^2+3*4)
+ 5^2 + (5^2+6) + (5^2+2*6) + (5^2+3*6)
+ 7^2


That seems to make a pattern pretty clear. We’re adding up odd squares (the rising diagonal coming off of the central 1) and then some offsets of those squares. The formula for the sum of squares is $\displaystyle \sum_{i=1}^{m} i^2 = \frac{m(m+1)(2m+1)}{6}$

and we can write the sum of odd squares as $\displaystyle \sum_{k=1}^{m} (2k-1)^2 = \sum_{k=1}^{2m}k^2 - \sum_{k=1}^{m} (2k)^2 = \sum_{k=1}^{2m} k^2 - 4\sum_{k=1}^{m}k^2.$

Pushing some symbols around, I got down to the following formula for the sum of the diagonal elements in a $2n+1$ by $2n+1$ square: $\frac{16}{3}n^3 + 10n^2 + \frac{26}{3}n + 1.$

Of course, those thirds make me nervous. The answer is clearly supposed to be an integer. But it’s ok. Gathering up those thirds we get $\frac{2}{3}n(8n^2+13)+(10n^2+1)$

for the sum. Now, if $n$ is divisible by 3, we’ll get an integer in that first term. And if $n$ is not divisible by 3, it is either 1 or 2 mod 3, and so $8n^2+13\equiv -1(1)+1=0\pmod{3}$. So that third isn’t going to cause problems.

It’s pretty easy to code up the solution when you’ve got such a formula. By using an if-switch, I took into account the mod-3 thing above so that I was never working with floats. Here’s my code:

def solve(n):
    ret = 10*n*n + 1
    factor = 8*n*n+13
    if(n % 3 == 0):
        ret += 2*(n//3)*factor
    else:
        ret += 2*n*(factor//3)
    return ret
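
A formula obtained by pushing symbols around deserves a sanity check, so here is a small Python 3 sketch that compares it against adding up the four corners of each ring directly. Both function names are mine. The ring version uses the fact that the ring with side $2k+1$ has corners $(2k+1)^2 - 2kj$ for $j=0,1,2,3$, which is the same pattern as the sum written out above, just read from the other end of each ring.

```python
def diagonal_sum_formula(n):
    """(2/3) n (8 n^2 + 13) + 10 n^2 + 1; the division by 3 is always exact."""
    return 2 * n * (8 * n * n + 13) // 3 + 10 * n * n + 1


def diagonal_sum_rings(n):
    """Add the four corners of each ring around the central 1."""
    total = 1
    for k in range(1, n + 1):
        side = 2 * k + 1
        # corners: side^2, side^2 - 2k, side^2 - 4k, side^2 - 6k
        total += 4 * side * side - 12 * k
    return total


print(diagonal_sum_formula(2), diagonal_sum_rings(2))  # the 5 by 5 example: 101 101
```

Because $2n(8n^2+13)$ is always a multiple of 3, a single integer division works here without splitting into cases.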


Then I decided it would be a fun little exercise to actually cook up some code to generate the spiral. I knew this would have slower run-time, and require more resources, but I wanted to write down the code anyway.

The way I thought about it, I’ll keep track of my current coordinates $(posx,posy)$, starting at $(0,0)$. To go to the next place, I look to my right, and see if that spot is filled. If it is, I just continue on in whichever direction I was going; otherwise, I move to that spot to my right. This means I need to keep track of which direction I’m facing. There are only 4 directions, and they cycle, so I’ll think of direction as an integer mod 4. If I make 0,1,2,3 correspond to Up, Right, Down, Left, then (my direction)+1 is the direction to my right. After writing down a little grid of input directions, and what they meant about where to move next, I came up with some little formulas, and was good to go.

I had originally thought I’d wrap up an implementation into a class, so that you could index via $[x][y]$, but then I realized that wouldn’t quite work, and you’d need to just index as $[(x,y)]$. At that point, I figured I might as well just use a dictionary type. And away I went:

def dx(dir):
    """ interpret dir as 0,1,2,3 = N,E,S,W, return what dx is in that dir """
    return (dir%2) * (2-dir)

def dy(dir):
    """ dir as 0,1,2,3 = N,E,S,W return what dy is in that dir """
    return ((dir+1)%2) * (1-dir)

def solve(n):
    """ Build up a 2n+1 by 2n+1 grid, then sum diagonals """
    grid = {}
    posx, posy = 0, 0
    idx = 1
    dir = 0 # dir in 0,1,2,3 = N,E,S,W
    while(posx <= n):
        grid[(posx,posy)] = idx
        idx += 1
        # find the position to my right
        rsposx,rsposy = posx + dx((dir+1)%4), posy + dy((dir+1)%4)
        if(grid.has_key((rsposx,rsposy))):
            # keep going the direction I was going
            posx,posy = posx + dx(dir), posy + dy(dir)
        else:
            # turn right
            posx,posy = rsposx, rsposy
            dir = (dir+1)%4
    # sum the diagonals
    ret = sum([grid[(x,x)] for x in xrange(-n,n+1)])
    ret += sum([grid[(x,-x)] for x in xrange(-n,n+1)])
    ret -= 1 # the origin got counted twice
    return ret


Only two days behind my artificial deadline too.

### Problem 25 – More Fibonacci Numbers

September 19, 2009

In problem 25 we are asked to find the first Fibonacci number with a given number of digits. Actually, we are only asked to find which Fibonacci number it is (its index in the sequence of Fibonacci numbers), instead of the number itself. We’ve already visited Fibonacci numbers in problem 2, and had useful work there. For this problem, it will be helpful to note that the number of digits of $n$ can be found as $\lfloor \log_{10} n\rfloor + 1$, where $\lfloor x\rfloor$ denotes the “floor” function, equal to the greatest integer less than or equal to $x$ (on a not-particularly-related note, this isn’t the formula I would have used a week ago, glad I got that cleared up).

I decided to do some performance comparisons with this problem, with a few different versions of code. [GRR. WordPress seems to not be doing the sourcecode “tag” like I thought it should below. Or I’m messing something up…]

In version 1, I’ll compute each Fibonacci number successively, using the recurrence relation, then calculate the number of digits until I’ve found a big enough number. My function looks like:

def solve(n):
    ret = 1
    prev, cur = 0, 1
    while(cur < 10**(n-1)):
        prev, cur = cur, prev + cur
        ret += 1
    return ret

In the second version, I’ll compute the $n$-th Fibonacci number using the formula $F_n = \frac{1}{\sqrt{5}}\left(\left(\frac{1+\sqrt{5}}{2}\right)^n-\left(\frac{1-\sqrt{5}}{2}\right)^n\right)$
that was used last time we saw Fibonacci numbers. In fact, I’ll wrap that up into its own function, “digs_fib” that takes in an “n” and gives the number of digits of the appropriate Fibonacci number. Now my code is basically:

import math

def fib(n):
    """ The nth fibonacci number """
    alpha = (1 + 5**.5) / 2
    beta = (1 - 5**.5) / 2
    return (1/5**.5) * (alpha**n - beta**n)

def digs_fib(n):
    """ Number of digits of fib(n) """
    return int(math.log(fib(n), 10)) + 1

def solve(n):
    ret = 1
    while(digs_fib(ret) < n):
        ret += 1
    return ret

Of course, that second term in the formula is smallish, and raised to big powers it gets smaller, so we could maybe ignore it. And then we’re taking the log of products and powers, so we can use log rules to re-write the expression for the number of digits, without directly calculating the Fibonacci number. My “digs_fib” function changes to the following:

def digs_fib(n):
    """ The approximate number of digits of the nth fibonum """
    alpha = (1+5**.5)/2
    return int(n*math.log(alpha,10)-.5*math.log(5, 10))+1

Finally, given a target number of digits $d$, we could just about solve for $n$ in $d \approx \lfloor n\log_{10}((1+\sqrt{5})/2) - \frac{1}{2}\log_{10}(5)\rfloor + 1$
and obtain $n\approx \dfrac{d-1+\frac{1}{2}\log_{10}(5)}{\log_{10}((1+\sqrt{5})/2)}$.
This means that given $d$ we can get an approximation to which Fibonacci number gives us that many digits in basically constant time. Now my “solve” is just

def solve(n):
    num = n - 1 + .5*math.log(5, 10)
    num /= math.log((1+5**.5)/2, 10)
    return int(num+1) # round up, approximately

I ran each of these through my little performance evaluation script and came up with a reasonable graph. I could only go up to asking for 300 digits, because after that the second version fails, since the numbers are too big. The other versions have no problem at all going up to 1000 digits (and well beyond). But anyway, here’s my graph: You can barely see that constant time algorithm there along the $x$-axis. The scale of the $y$-axis is some linear change in actual time taken, but the shape of these graphs is what’s important anyway. Right?
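
To make the comparison concrete, here is a Python 3 sketch of the two extremes side by side: the plain recurrence and the constant-time estimate. The function names are mine. I use `math.ceil` for the rounding up, and the estimate is only meant for two or more digits, since the closed form is an approximation and floating point could in principle misbehave if the ratio landed extremely close to an integer.

```python
import math


def first_index_iterative(digits):
    """Index of the first Fibonacci number with at least `digits` digits."""
    index, prev, cur = 1, 0, 1
    threshold = 10 ** (digits - 1)
    while cur < threshold:
        prev, cur = cur, prev + cur
        index += 1
    return index


def first_index_estimate(digits):
    """Constant-time estimate from F_n ~ phi^n / sqrt(5); intended for digits >= 2."""
    phi = (1 + math.sqrt(5)) / 2
    return math.ceil((digits - 1 + 0.5 * math.log10(5)) / math.log10(phi))


print(first_index_iterative(3), first_index_estimate(3))  # F_12 = 144, so: 12 12
```

The iterative version never leaves integer arithmetic, which is why it has no trouble with very large digit counts.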


### Problem 26 – Recurring Decimals

September 19, 2009

In problem 26, we are asked to find the integer less than a given bound whose reciprocal has the longest decimal period among such integers. That is, among all fractions $1/d$ where $d$ is less than some bound, which fraction has the longest periodic part in its decimal representation?

According to Hardy and Wright, An Introduction to the Theory of Numbers (6th ed, Theorem 135),

The decimal for a rational number $p/q$ between 0 and 1 is terminating or recurring, and any terminating or recurring decimal is equal to a rational number. If $(p,q)=1$, $q=2^{\alpha}5^{\beta}$, and $\max(\alpha,\beta)=\mu$, then the decimal terminates after $\mu$ digits. If $(p,q)=1$, $q=2^{\alpha}5^{\beta}Q$, where $Q>1$, $(Q,10)=1$, and $\nu$ is the order of 10 (mod $Q$), then the decimal contains $\mu$ non-recurring and $\nu$ recurring digits.

Let’s trim that down a bit. Our $p=1$, so automatically $(p,q)=1$ (this notation is for the greatest common divisor, as usual). If the denominator, $q$, is divisible by either 2 or 5, then either (1) the decimal terminates, or (2) the decimal repeats, with the same length recurring part as the reciprocal of a smaller integer. In either case, we will suppose that this $q$ is not the answer. Either (1) its recurring part has length 1, being a string of 0s, or (2) we could return the smaller integer with the same length recurring part.

So, suppose our $q$ is not divisible by 2 or 5. Then according to the theorem above, we may compute the length of the repeating part of the decimal for $1/q$ by finding the “order of 10 (mod $q$)”. That is, find the smallest $p$ so that $10^p\equiv 1\pmod{q}$. This is easy enough to find, just keep raising 10 to bigger powers, and taking the answer mod $q$, until you find 1.

Here’s my code

def solve(n):
    """ Return the d with largest decimal period, d<n """
    max_d, max_p = 2, 1
    for d in xrange(3,n,2):
        if(d % 5 != 0):
            # now find the order of 10 mod Q
            # that is, the smallest p for which 10^p == 1 mod Q
            p = 1
            tmd = 10 % d # tmd = 10 mod d
            while(tmd != 1):
                p += 1
                tmd = (tmd * 10) % d
            if (p > max_p):
                max_d, max_p = d, p
    return max_d

Pretty straightforward, I think.
There’s almost certainly more math we could do here. But this seems to work well enough for now.
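
Here is the same idea as a Python 3 sketch, with the order computation pulled out into its own function so it can be checked on a familiar case: $1/7 = 0.\overline{142857}$ has period 6. The names `period_length` and `longest_period` are mine, and the sketch assumes the bound is bigger than 3 so that there is at least one candidate.

```python
def period_length(d):
    """Order of 10 mod d, i.e. the period of 1/d, for d > 1 coprime to 10."""
    power, remainder = 1, 10 % d
    while remainder != 1:
        remainder = (remainder * 10) % d
        power += 1
    return power


def longest_period(bound):
    """The d < bound (coprime to 10) whose reciprocal has the longest period."""
    candidates = [d for d in range(3, bound, 2) if d % 5 != 0]
    return max(candidates, key=period_length)


print(period_length(7))    # 6
print(longest_period(10))  # 7
```

Only remainders mod $d$ are ever stored, so the numbers stay small no matter how long the period gets.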


### Problem 24 – Permutations

September 10, 2009

In problem 24 we are asked to consider the permutations of the digits 0 through 9 in lexicographic order, and then pick out a certain element of that list.

I wrote down the permutations, in order, of 0 through 2, and also 0 through 3. Looking at the list, it’s sort of easy to pick out some patterns. My first solution was to compute the desired permutation recursively, by first picking out the leading digit. For the digits 0 through 3, the first digit is 0 for (4-1)! terms, then 1 for (4-1)! terms, and so on. So we can compute the leading digit of the n-th element of the list of permutations by finding n/(4-1)! (well, the integer part). We then have one fewer elements to consider for permuting, and want the n%(4-1)!-th element of that list of permutations. Actually, this discussion assumes that the list is indexed starting at 0, so we have to be careful about that somewhere.

In code:

def factorial(n):
    return reduce(lambda x,y:x*y, xrange(1,n+1), 1)

def solve(n, ar):
    """ Find the (n+1)th permutation of the elements of ar
    Assumes that n >= 0, and ar[i] < ar[i+1] """
    if(n == 0):
        return reduce(lambda s,x:s+str(x), ar, "")
    fact = factorial(len(ar) - 1)
    fidx = n // fact
    oidx = n % fact
    return str(ar[fidx]) + solve(oidx, ar[0:fidx] + ar[fidx + 1:])

This seems to be a pretty efficient solution, as we sort of expect. It has to recur no more than the number of digits you are permuting.

Looking at the list of permutations of 0 through 3, it is entertaining to try to think about the algorithm to determine the next element in the list, given some chosen index. If we had such an algorithm, we could increment a counter, doing this as many times as necessary until we get up to the desired permutation. So how do we get from one element to the next?

Staring at the list for a while, I came up with the following: Look at whatever string you are at, starting at the right end. Move left in this string, until the digit you are looking at is smaller than the one you just looked at. So, for example, looking at 2130 we would pick out the 1. Now consider all the digits from the 1 on, in this case 130. We need to make the next bigger number than this, from these same digits. There is a bigger number, because we know the second digit, 3, is bigger than the first, 1, so switching them would make a bigger number. But we want the smallest number bigger than the one we are looking at, 130. Find the smallest digit bigger than 1, move it to the front, and put all of the other digits afterwards, in ascending order. So we'd get 301 in the running example. With the 2 we had left out for the moment, this makes 2301 the successor of 2130. Code:

def solve(n, ar):
    tostr = lambda ar:reduce(lambda s,x:s+str(x), ar, "")
    if(len(ar) == 1):
        return str(ar[0])
    cur = 0
    while(cur < n):
        i = len(ar) - 2
        while(i>=0 and ar[i] > ar[i+1]):
            i -= 1
        if(i < 0): # ar is the biggest string, assume n=len(ar)!-1
            return tostr(ar)
        # right now, ar[i] < ar[i+1], for our variable i
        head,tail = ar[0:i],ar[i:]
        nsidx, tidx = 1, 2
        while(tidx < len(tail)):
            if(tail[tidx] > tail[0] and tail[tidx] < tail[nsidx]):
                nsidx = tidx
            tidx += 1
        # tail[nsidx] is the smallest number in the tail bigger than tail[0]
        remtail = tail[0:nsidx] + tail[nsidx + 1:]
        remtail.sort()
        ar = head + [tail[nsidx]] + remtail
        cur += 1
    return tostr(ar)

This method is, as you should expect, slower for the stated problem. However, I started wondering about generating all of the permutations. I rearranged the two above blocks a little bit to aim for this goal. In the first version, I just did [solve(n,ar) for n in xrange(0,factorial(len(ar)))], and in the second, instead of keeping track of indices, I just stored the ar after each iteration of the loop. With these changes, the iterative script actually seems to run faster.
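
The "pick the leading digit by dividing by a factorial" idea can also be written as a loop. Here is a Python 3 sketch; `nth_permutation` is a name I made up, the index is 0-based (so the millionth permutation is index 999999), and the items are assumed to be given in ascending order.

```python
from math import factorial


def nth_permutation(n, items):
    """The permutation at 0-based index n, in lexicographic order."""
    remaining = list(items)
    out = []
    while remaining:
        # each choice of leading item accounts for this many permutations
        block = factorial(len(remaining) - 1)
        idx, n = divmod(n, block)
        out.append(remaining.pop(idx))
    return "".join(str(x) for x in out)


print([nth_permutation(i, [0, 1, 2]) for i in range(6)])
# ['012', '021', '102', '120', '201', '210']
```

As a check on the successor discussion, 2130 sits at index 15 among the permutations of 0 through 3, and index 16 gives 2301.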

### Problem 23 – Sums of Abundant Numbers

September 6, 2009

In problem 23, we are asked to find the sum of all of the numbers that are not the sum of two abundant numbers. We are given an upper bound on the numbers we’ll be summing, which is pretty convenient.

To solve this problem, I’ll first make a list of all the abundant numbers up to the bound, using a function I’ll call listOfAbundants. Then, I’ll tease out the numbers that aren’t sums of two abundant numbers as follows:

def solve():
    bunds = listOfAbundants(28123) # hardcoded upper bound
    bsums = set([i+j for i in bunds for j in bunds]) # numbers that are sums
    nonsums = set([k for k in xrange(1,28123) if not k in bsums]) # everybody else
    largest = reduce(lambda x,y:x>y and x or y, nonsums) # better upper bound?
    return sum(nonsums)


List comprehension is awesome. I think, possibly in a newer version of Python or something, one can avoid the “set([])” notation and use “{}”, known as “set builder notation”. But it didn’t work the first time I tried it on my computer, so I moved on (how’s that for persistence?). I have that line for the variable “largest” to see what the actual upper bound is, since the problem hints that 28123 isn’t a strict bound. It seems to work out that largest is 20161.

Now, that leaves the listOfAbundants function. I have a factor method I’ve used before, and I figure with a factorization it should be easy to sum divisors. As a reminder, the “factor” method I have returns an array of pairs $(p,e)$, and the product of all of the $p^e$ is the number. If I’ve got a method to sum divisors, call it sumDivs, then I can do:

def listOfAbundants(bound):
    """ Return a list of all abundant numbers less than bound """
    # we can start at 12, since that's the smallest abundant number
    return [k for k in xrange(12,bound) if sumDivs(k) > 2*k]


My first way to code how to find the sum of the divisors of a number is not very efficient, but I love how it uses list comprehensions, in concert with map/reduce. My idea was to first convert every pair $(p,e)$ to the list $[1,p,p^2,...,p^e]$. Then, if $a=[a_1,a_2,\ldots]$ and $b=[b_1,b_2,\ldots]$ are lists of numbers, sort of take the cartesian product $a\times b$, then map $a\times b$ to integers by multiplication. So, e.g., $[1,2,4]$ and $[1,5]$ combine to $[1,2,4,5,10,20]$. Doing this iteratively (via reduce) and then summing the resultant list gives the sum of the divisors of the starting number:

def sumDivs(n):
    """ Returns the sum of the divisors of n """
    f = factor(n)
    # find all of the prime power divisors
    ppowdivs = map(lambda pe:[pe[0]**i for i in xrange(0,pe[1]+1)], f)
    # combine them
    return sum(reduce(lambda x,y:[i*j for i in x for j in y], ppowdivs))


Of course, I’m using the property that says that the sum of the divisors of $n$, denoted $\sigma(n)$, is a multiplicative function – meaning that if $m$ and $n$ are relatively prime then $\sigma(mn)=\sigma(m)\cdot \sigma(n)$. So if I know how to quickly compute $\sigma(p^e)$, I can simplify the above code. I guess the idea has perhaps come up in a previous problem, but I also ran across it in my reading of Hardy and Wright’s “Introduction to the Theory of Numbers” book. Wherever you get it, $\sigma(p^e)=\frac{p^{e+1}-1}{p-1}$, the sum of the geometric series $1+p+p^2+\cdots +p^e$. So, sumDivs simplifies a bit:

def sumDivs(n):
    """ Returns the sum of the divisors of n """
    f = factor(n)
    return reduce(lambda x,y:x*y,
                  map(lambda pe:(pe[0]**(pe[1]+1)-1)//(pe[0]-1), f))


So that’ll pretty much do it. But it feels wrong (not incorrect… just immoral or something). We’re factoring every single number up to the bound. Instead, we should be able to use some recursion, or some sieving, or something, to more quickly factor the whole interval of numbers we’re interested in.

And we don’t really need the whole factorization. If we know $n=p^eq$, where $p$ is prime and $p^e$ is the largest power of $p$ dividing $n$, then $\sigma(n)=\sigma(p^e)\cdot\sigma(q)$ (by multiplicativity). We have a formula (used above) for $\sigma(p^e)$, and $q<n$, so inductively we’ll have computed $\sigma(q)$.

I’ve put together the following code to try to use this idea:

def sumDivsTo(bound):
    """ Returns an array, a, with a[i] = sumDivs(i), for i in [0,bound) """
    ret = [0 for i in xrange(0,bound)]
    ret[1] = 1
    for idx in xrange(2,bound):
        if(ret[idx] == 0): # idx is prime
            ret[idx] = idx+1 # 1 and idx divide idx
            m = idx # the smallest multiple we need to consider
            while(idx * m < bound):
                # tell our future selves (when idx=idx*m) that idx divides idx*m
                ret[idx * m] = idx
                m += 1
        else: # ret[idx] is a prime dividing idx
            p = ret[idx]
            e = 1
            while(idx % p**e == 0):
                e += 1
            e -= 1 # our loop went one too far
            q = idx // p**e
            # now idx = p^e * q, (p,q)=1, p prime
            ret[idx] = (p**(e+1)-1)//(p-1) * ret[q]
    return ret

I think there is still room for improvement here. When we run across a prime, instead of just noting that it divides all of its multiples, I feel like there should be some clever way to note what power of that prime divides those multiples. Then we wouldn't have to find that power in the loop of the else clause. I guess this is mostly just shifting around when the loop happens, so the improvement might not be vast.

I do want to point out that this same idea can be used to produce all of the factorizations for numbers in some interval $[1,b]$ without factoring each individually. Which is probably helpful at some point.

Now that we've got a more efficient way to find the sum of divisors for all numbers up to our bound, we can re-factor the solve method. The listOfAbundants method never really did much, we can wrap it up into solve in a line. What I'm envisioning now is to get rid of some of that ending code of solve, where we build all of the sums of two abundant numbers, and then basically take the set complement. Instead of doing all that, perhaps there is some improvement to be gained in looping through the numbers in our range, subtracting abundant numbers and seeing if the difference is still abundant. My solve method becomes:

def solve():
    bound = 28123 # the given upper bound
    sds = sumDivsTo(bound) # record all of the sums of divisors
    bunds = [k for k in xrange(12, bound) if sds[k] > 2*k] # the abundants
    ret = 0
    for n in xrange(1,28123):
        i = 0
        isSum = False
        while(bunds[i] < n and not isSum):
            d = n-bunds[i]
            if(sds[d] > 2*d): # is n-bunds[i] abundant?
                isSum = True
            i += 1
        if not isSum:
            ret += n
    return ret


We could, of course, start our loop at 25, with a starting ret value $1+2+\cdots+23$, since 24 is the first number that is the sum of two abundants (since 12 is the first abundant).

This code runs much faster than my first version, taking roughly one-fifth the time. With this faster solve, avoiding the set nonsense, the first version seems to take about 133% as much time as this new version. That is, the big improvement in this problem came from the better solve method, not the better sumDivsTo method.
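
For readers on Python 3, here is a compact sketch of the same "subtract an abundant number and see if what's left is abundant" idea. To keep it short it swaps in a plain additive divisor sieve (add $d$ to every multiple of $d$) rather than the multiplicative approach of sumDivsTo, so it says nothing about the timings above. The function names are mine.

```python
def sum_divisors_to(bound):
    """sigma[i] = sum of all divisors of i (including i), for 0 < i < bound."""
    sigma = [0] * bound
    for d in range(1, bound):
        for multiple in range(d, bound, d):
            sigma[multiple] += d
    return sigma


def is_sum_of_two(n, abundants, is_abundant):
    """True if n = a + b with a and b both abundant (abundants is ascending)."""
    for a in abundants:
        if a > n // 2:
            return False  # any remaining pair would just repeat an earlier one
        if is_abundant[n - a]:
            return True
    return False


def non_abundant_sum(bound=28123):
    """Sum of the n in [1, bound) that are not a sum of two abundant numbers."""
    sigma = sum_divisors_to(bound)
    is_abundant = [i > 0 and sigma[i] > 2 * i for i in range(bound)]
    abundants = [i for i in range(bound) if is_abundant[i]]
    return sum(n for n in range(1, bound)
               if not is_sum_of_two(n, abundants, is_abundant))


print(sum_divisors_to(13)[12])  # 1 + 2 + 3 + 4 + 6 + 12 = 28
```

Stopping at $a > n/2$ is a small extra saving: if $n = a + b$ then one of the two is at most $n/2$, so the other half of the list never needs checking.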

### Whoops! 11 Redux

September 4, 2009

For some reason, this week I decided to actually sign up for an account on projecteuler.net, our source of programming exercises. Then I went back and entered the answers I’d gotten for my past solutions, and did pretty well. However, I messed up number 11 (in 4 versions!). I thought perhaps I should post some corrected code.

I had two things incorrect. First, my inner loop wasn’t going far enough (shame!). Second, I was only accounting for one type of diagonal (shame!). Here’s my corrected (hopefully!) code:

import time

def solve(grid):
    t = time.time()
    print grid
    ret = 0
    diffs = ( (1,0), (0,1), (1,1), (1,-1) )
    for i in xrange(0,len(grid)):
        for j in xrange(0,len(grid)):
            for dt in diffs:
                dx,dy = dt
                try:
                    p = 1
                    for k in xrange(0,4):
                        if j+dy*k < 0:
                            # a negative index would wrap around to the end of the row
                            raise IndexError
                        p *= grid[i+dx*k][j+dy*k]
                    ret = max(ret, p)
                except IndexError:
                    pass
    t = time.time() - t
    return (ret, t)


I feel like I’m cheating a little when I use try-catch blocks like that. I feel like I should just index into the array to places I know exist, instead of relying on try to sort it out otherwise. But I also like not worrying about indexing (it was the source of one of my errors, after all). I also like looping over the “diffs”, to compute the product across, or down, or diagonally (either diagonal).
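
One thing to watch with the try-based approach is that Python's negative indices wrap around to the other end of a list rather than raising, so the $(1,-1)$ diagonal needs some care near the left edge. For comparison, here is a Python 3 sketch that checks bounds explicitly instead. The name `max_line_product` and the `run` parameter are mine, and it assumes a rectangular grid of non-negative numbers (so starting the maximum at 0 is safe).

```python
def max_line_product(grid, run=4):
    """Largest product of `run` adjacent entries: down, across, or either diagonal."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    steps = ((1, 0), (0, 1), (1, 1), (1, -1))
    best = 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in steps:
                end_i = i + di * (run - 1)
                end_j = j + dj * (run - 1)
                if not (0 <= end_i < rows and 0 <= end_j < cols):
                    continue  # the run would leave the grid
                product = 1
                for k in range(run):
                    product *= grid[i + di * k][j + dj * k]
                best = max(best, product)
    return best


# The 10s here line up only if column indices are allowed to wrap around.
tricky = [[10, 1, 1, 1],
          [1, 1, 1, 10],
          [1, 1, 10, 1],
          [1, 10, 1, 1]]
print(max_line_product(tricky))  # 100, from the main diagonal
```

Checking only the last cell of the run is enough, since the start is inside the grid and the cells in between lie on a straight line between the two.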

Now, on to new problems! I hope to post a solution to at least one this evening, before the (well-extended) deadline.

### Problem 22

July 19, 2009

Just got back from dinner, depressed because a friend wanted to have a conversation about politics and things. Finally decided maybe some programming would pull me out of it. I’m not convinced it did, but whatever. It had the best shot of doing so.

In Problem 22 we’re supposed to read in a file, pull out the words between quotes, sort them, do some computation on each string to convert it to an int, and then take the sum of all of those words times their index in the list. I thought I’d see how little actual work I could do, which seems to translate to using lots of maps and lambda functions and such.

After a “from __future__ import with_statement”, it’s pretty easy to read in files:

def readfile(filename):
    ret = []
    with open(filename) as file:
        for line in file:
            ret += line.split(",")
    return ret


The given file has all of the names on one line, so the loop doesn’t last long. Next up, convert that list to a bunch of numbers, and sum them up, with an appropriate product:

import string

def process(names):
    names.sort()
    scores = map(lambda s:sum(map(lambda x:string.ascii_uppercase.index(x)+1,s[1:-1].upper())), names)
    return sum([(i+1)*n for i,n in enumerate(scores)])


Perhaps some explanation is in order here. The first line, sort(), is pretty clear, I hope. In the next line, s[1:-1] strips the quote marks off of the name, and then the call to upper() converts all the characters to uppercase. This is probably superfluous, because the names all seem to be in uppercase to begin with, but better safe than sorry. Then I use the ascii_uppercase string provided by the string module, as documented here. This lets me easily convert letters to their position in the alphabet (remembering to add one because indices in strings start at 0). So my “scores” list above contains what the problem instructions call the “alphanumeric value” of each of the names. In the last line, I use the enumerate function (some documentation) to give me pairs: (index, value) in the list of scores (remembering, again, to add one to the index).
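
Here is roughly the same pipeline as a Python 3 sketch, split into small pieces so each can be checked on its own. The function names are mine, and it uses `ord` arithmetic in place of `string.ascii_uppercase.index`, which assumes the names contain only the letters A to Z. The problem statement's example is COLIN, worth 3 + 15 + 12 + 9 + 14 = 53.

```python
def parse_names(text):
    """Split a line like '"MARY","PATRICIA"' into bare names."""
    return [field.strip().strip('"')
            for field in text.split(",") if field.strip()]


def name_score(name):
    """Alphabetical value: A=1, ..., Z=26, summed over the letters."""
    return sum(ord(c) - ord("A") + 1 for c in name.upper())


def total_score(names):
    """Sum of (1-based position in sorted order) * (alphabetical value)."""
    return sum(pos * name_score(name)
               for pos, name in enumerate(sorted(names), start=1))


print(name_score("COLIN"))                    # 53
print(total_score(parse_names('"BOB","AL"')))  # AL: 1*13, BOB: 2*19, total 51
```

Passing `start=1` to `enumerate` takes care of the "add one to the index" step.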

So that was fun.