Problem 198 – Ambiguous Numbers

December 10, 2009 by sumidiot

In problem 198, a number, x, is defined to be ambiguous for the denominator d if x has two best approximations by rationals with denominators no bigger than d. And then an x is ambiguous if it is ambiguous for some denominator. The problem is to determine how many ambiguous rationals there are in a certain interval, with a certain bound on the denominator.

Yes, I’m getting to this problem a little out of order. Jaime told me to. I was talking about Neighbors in Farey Sequences, on another blog, and he said this problem would fit well with what I was doing. In fact, if you don’t know about Farey sequences, you should probably go to that post (or Wikipedia) and read up, because I’m going to assume you know about them.

Once you have read a little about Farey sequences, you realize a number is ambiguous for a denominator if the number is the midpoint of neighbors in the appropriate Farey sequence. Moreover, the midpoint of Farey neighbors is always reduced, so you don’t have to worry about finding gcds. Also, Farey sequences can be easily generated inductively, by inserting the “mediant” between neighbors, when appropriate.

So a first go at this problem would be to use that idea. My first solutions are always ones that are completely stupid, but I’m pretty convinced are correct. So they’re slow, but good for testing optimizations or other solutions (for small inputs, anyway). Below is my third revision, approximately two orders of magnitude quicker than my first solution, an improvement gained by being slightly careful how far you iterate and what order you do things in. Oh, and I’m representing rationals as pairs of integers, to remain in the world of integer arithmetic. I worry about floating point operations when my denominators are allowed to be as big as 10^8.

def midpoint(lhs, rhs):
    """ Midpoint of lhs and rhs """
    return (lhs[0]*rhs[1] + lhs[1]*rhs[0], 2*lhs[1]*rhs[1])

def solve(r, D):
    """ Find ambiguous rationals

    Find ambiguous p/q with:
    1. p/q < 1/r (i.e. r*p < q)
    2. q <= D
    """
    seq = [(0,1), (1,1)] # F_1, the first Farey sequence
    den = 1
    ret = r < 2 and 1 or 0 # if 1/2 is included or not
    while den < D:
        # loop invariant: seq is the Farey sequence F_{den}
        # at least up to 1/r

        idx = 0 # our location in F_den
        # iterate to the end of F_den or until seq[idx] > 1/r
        while idx < len(seq)-1 and r*seq[idx][0] < seq[idx][1]:
            nextidx = idx + 1
            mediantden = seq[idx][1] + seq[idx+1][1]
            if(mediantden == den + 1): # new element of F_{den+1}
                mediant = (seq[idx][0]+seq[idx+1][0],mediantden)
                seq.insert(idx + 1, mediant)
                nextidx = idx + 2
                # possible new ambiguous numbers
                midleft = midpoint(seq[idx], mediant)
                midright = midpoint(mediant,seq[idx+2])
                if r*midleft[0] < midleft[1] and midleft[1] <= D:
                    ret += 1
                if r*midright[0] < midright[1] and midright[1] <= D:
                    ret += 1
            idx = nextidx
        den += 1
    return ret

This is still quite slow. I’m pretty sure it’d basically never solve the stated problem.

I spent some time thinking about ways to change the algorithm around and get the necessary speed improvements. I had spent some time thinking about neighbors in Farey sequences, and the extended Euclidean algorithm. I thought I could translate it into code that would iterate over reduced fractions, find neighbors for each, and count ambiguous numbers that way. It was a little messy, and then I realized I was over counting. And there seemed to be some boundary cases to worry about. I would still like to get a nice solution going along these lines.

But yesterday I came up with a recursive solution that runs quickly enough to solve the stated problem in about 10 minutes on my computer. That’s still 10 (or 100 :) ) times longer than I usually shoot for on these problems, but I’ll take it (for now). It’s a pretty clean solution, I think. The idea is to loop through all the intervals [h/k, H/K] between neighbors in Farey sequences where (1) the interval contains points smaller than the specified bound, and (2) the product kK isn’t too big, for the specified denominator bound. Once you’ve got one such interval you can split the interval into two, at the mediant, and then run the same things on those two smaller intervals.

While the solution is essentially recursive, for whatever reason I ended coding it up as a loop. Here’s what I came up with:

def solve(r, D):
    """ assumes D >= 2 """

    # pairs will be my list of integrals to check
    # an element is (h,k), (H,K), neighbors in a Farey sequence
    # along with the denominator, 2*k*K, of the midpoint
    pairs = [ [(0,1),(1,1), 2] ]
    ret = 0
    while len(pairs):
        pair = pairs.pop()
        midden = pair[2] # we already know this is <= D
        midnum = pair[0][0]*pair[1][1] + pair[0][1]*pair[1][0]
        if r * midnum < midden: # midpoint is small enough, ambiguous
            ret += 1
        # construct the mediant
        mednum = pair[0][0] + pair[1][0]
        medden = pair[0][1] + pair[1][1]
        mediant = (mednum, medden)
        # and the denominators of the next midpoints
        leftden = 2 * pair[0][1] * medden
        rightden = 2 * pair[1][1] * medden
        # include each subinterval in the list of pairs
        # as long as the midpoint denominator isn't too big
        if leftden <= D:
            pairs.append( [ pair[0], mediant, leftden ] )
        if rightden <= D:
            pairs.append( [ mediant, pair[1], rightden ] )
    return ret

You can optimize this a bit. If you re-organize a little, don’t bother with tuples, and store some computed values, you can cut the running time nearly in half. That’s as much of an improvement as I’ve been able to make so far though. I look forward to doing better… but I do like the cleanliness of this solution. It’s 18 lines of code, and with some syntactic sugar (multiple assignments) could be reduced to or below 15, without harming readability much (perhaps depending on how readable you think it is now).

Problem 35 – Circular Primes

November 1, 2009 by sumidiot

In problem 35 we are asked to find the number of “circular primes” below a bound. A prime is circular if all of the numbers obtained by cyclic permutation of the original number are also prime.

In my first solution, I’ll use my PrimeOracle from a while ago:

from PrimeOracle import PrimeOracle

def iscircularprime(num, isprime):
    if not isprime[num]:
        return False
    numstr = str(num)
    cyclics = [ int(numstr[i:] + numstr[:i]) for i in range(0,len(numstr)) ]
    return reduce(lambda a,b:a and b, map(lambda n: isprime[n], cyclics))

def solve(bound):
    """ Find how many circular primes there are below bound """
    isprime = PrimeOracle(bound)
    return len(set([n for n in range(2,bound) if iscircularprime(n, isprime)]))

This is pretty straightforward. I like how we can make all of the cyclic permutations using that one line (line 7), using the slice notation. And testing if all elements in a list are “True”, using reduce, is nice. I had hoped that maybe instead of the lambda expression I wrote, I could just pass in “and” as the first argument to that reduce. No dice though. Anybody know a nice way? I’ve also wondered about passing in the identity function in that first spot (in a filter, not a reduce), but don’t know how (without a lambda function or so).

Anyway, I think it’s good to have a solution like this that is easy to write and easy to read (hopefully) and, therefore, easy to check for correctness. Then I like to try to make it faster. The code above checks each cyclic permutation of each number many times, which is pointless.

Here’s a slightly faster solution, which doesn’t invoke the PrimeOracle:

def solve(bound):
    """ Find how many circular primes there are below bound """
    # isprime[n] if 2n+3 is prime, eventually
    # so to test if m is prime, check isprime[(m-3)/2]
    isprime = [True for n in range(0, bound//2)]
    for val in xrange(3, bound, 2):
        if isprime[(val-3)//2]:
            for mul in xrange(val*val, bound, 2*val):
                isprime[(mul-3)//2] = False

    ret = 1 # 2 isn't included in isprime, but it is circular
            # we've assumed bound > 2

    # let isprime[n] mean 2n+3 is a prime we haven't checked for circular primality

    And = lambda a,b: a and b # since we'll use it twice

    for val in xrange(3, bound, 2):
        if isprime[(val-3)//2]:
            valstr = str(val)
            circs = [ int(valstr[i:] + valstr[:i])
                      for i in range(0,len(valstr))]
            if reduce(And, map(lambda d:d%2, map(int, valstr))): # no even digits
                if reduce(And, map(lambda n:isprime[(n-3)//2], circs)): # all circs are prime
                    ret += len(set(circs))
            for n in circs:
                isprime[(n-3)//2] = False

    return ret

In line 25, where we increment ret, my original thought was to add len(valstr), since that’s how many cyclic permutations there are. However, these cyclic permutations need not be distinct, so I switched to making a set and getting its length.

What do you think of lines 23 and 24, an if within an if, and not much else? Would it be better (more readable?) to have simply “if () and ()”, even though that’d look best on two lines anyway? I guess python wouldn’t have to make one more nested context for that inner if, if we used 1 instead of 2…

It has, after writing all of this, occured to me that there’s some issues if “bound” isn’t a power of 10. Should we consider only primes such that all cyclic permutations are also primes less than the bound? We’d also have to think about how big to make our isprime array. So it would probably be better to re-write the above code accepting the bound as a number of digits.

Problem 34 – Factorials of Digits

October 31, 2009 by sumidiot

In problem 34 we are supposed to find the sum of all numbers that are the sum of the factorials of their digits.

My first thought was to just loop through all the possible numbers, compute sums of factorials of digits, and see what numbers worked. To do this, we have to know a stopping point. Well, 9! is 362880, so the largest digit factorial sum (DFS) that could be made with n digits is 362880n. An n digit number is always at least as big as 10^{n-1}, so if 362880n<10^{n-1}, then no number with n digits (or more) will be equal to its DFS. It turns out that 8 digits is too many. So we could stop looking once we got to 7\cdot 9!=2540160. We could clearly lower this bound some more, but let’s go with it for now.

Suppose we’ve got a function “fact” that returns the factorial of whatever number you give it. Then we can compute DFS using the following function:

def dfs(n):
    """ Compute the sum of the factorials of the digits of n """
    return sum(map(fact, map(int, str(n))))

I think that’s fun. Convert n to a string, then change the string to a list of digits (as integers), apply “fact” to each, and then sum. I’m sure I’ve used similar lines in other problems here.

Now, given this, we could, in theory, solve this problem with one more line:

print sum([n for n in xrange(10,7*fact(9)) if dfs(n) == n])

This certainly has the advantage of being readable. But I think it could be improved a bit, as it’s fairly slow.

The problem with this code is that we can quickly ignore lots of numbers, without computing the digit sum for each. If a number starts with a bunch of 1s, the DFS will only be so big. On the flip side, there’s no point in considering any 5 digit numbers with a 9 in them. Another issue with the code above, I expect, is that the dfs calculation used is not particularly fast.

Here’s some more efficient code:

fact = [1,1,2,6,24,120,720,5040,40320,362880]
digs = range(0,10)

def solvedigs(k, head = [], headval = 0, sumhead = 0):
    """ All solutions with k digits, starting with head

    assumes:
      len(head) <= k
    """
    if len(head) == k: # we have a k digit number
        if headval == sumhead:
            return headval
        else:
            return 0
    rem = k-len(head) # number of digits remaining to append
    smallval = headval*(10**rem) # smallest value that could be made
    smallsum = sumhead + rem*1 # smallest digit sum possible
    largeval = smallval + 10**rem - 1 # largest value that could be made
    largesum = sumhead + rem*fact[9] # largest digit sum possible
    if(largeval < smallsum or smallval > largesum):
        return 0
    return sum([solvedigs(k, head + [d], 10*headval + d, sumhead + fact[d])
                for d in digs
                if len(head) or d>0]) # no 0 leading digits!

if __name__ == "__main__":
    print sum([solvedigs(k) for k in xrange(2,8)])

I know I cheated with the global variable fact, but it is quicker than computing factorials every time we want to know them. And with problems of this size, having global variables really shouldn’t be an issue.

In the code above I pass around the digits of the number I’m considering, the value that digit string represents, and the sum of the factorials of those digits. Never involving strings (more precisely, string to/from integer conversions) seems like a good idea for efficiency in this problem.

Problem 33 – Cancelling Fractions

October 30, 2009 by sumidiot

49/98 = 4/8 if you cancel the 9s. This happens for few others fractions n/d where 0<n<d<100, and we’ll also ignore (n'\cdot 10)/(d'\cdot 10) for digits n',d'. Now take all these fractions that cancel like this, consider their product as a reduced fraction, and find the denominator. This is the challenge of problem 33.

Here’s what I came up with:

def gcd(a,b):
    if b == 0:
        return a
    return gcd(b,a%b)

def solve():
    fracs = [ ]
    conv = lambda t:int("%s%s" % t)
    for bothdig in xrange(1,10): # the digit that cancels
        for numdig in xrange(1,10): # digit left in numerator
            for dendig in xrange(1,10): # digit left in denom
            nums = map(conv, [(numdig,bothdig),(bothdig,numdig)]) # consider bother orderings
            dens = map(conv, [(dendig,bothdig),(bothdig,dendig)])
            for v in [(n,d) for n in nums for d in dens if n<d]:
                if v[0]*dendig == v[1]*numdig: # v[0]/v[1] = numdig/dendig
                    fracs += [ v ]
    p = reduce(lambda l,r:(l[0]*r[0],l[1]*r[1]),fracs) # multiply together solutions
    return p[1] / gcd(*p) # reduce the fraction, return denom

Certainly room for improvement, but it works.

Problem 32 – Pandigital Products

October 30, 2009 by sumidiot

We say that n is a pandigital product if n=a\cdot b and the string “abn” obtained by concatenating the base-10 representations of a, b, and n contains all of the digits 1-9 exactly once (with no 0s). In problem 32 we are asked to find the sum of all pandigital products.

I ended up being not particularly fancy about this problem. Loop through enough values for n, and, for each n, loop through enough values for a, and see if you get a pandigital product with n, a, and n/a. We know that it is enough for a to venture up to \sqrt{n}, but what about bounds on n? If n has 5 digits, then to be a pandigital product, with factors a and b, the factors must be either 1 and 3 digits, or 2 and 2 digits. But 9*999 and 99*99 aren’t 5 digit numbers, so we know we can stop looking for n once we get to 10000. Similar reasoning shows that n must have more than 3 digits, and so, in fact, to be a pandigital product, n must have 4 digits (and since they must not be 0, and must be distinct, 1234 is the smallest possibility).

The following code seems to get the job done:

def ispandigprod(a,b,n):
    """ assumes a*b==n """
    # consider the digits in the string "abn"
    # remove 0s and duplicates (with filter, set)
    # if you've still got 9 elements, you're pandigital
    return len(set(filter(lambda t:t, map(int, "%s%s%s" % (a,b,n))))) == 9

def solve():
    """ return the sum of the pandigital products """
    ret = 0
    for n in xrange(1234,9876+1):
        a = 1
        included = False
        while(a < n**.5 + 1 and not included):
            if(n % a == 0):
                b = n//a
                if(ispandigprod(a,b,n)):
                    ret += n
                    included = True
            a += 1
    return ret

I’d honestly like a nicer solution for this problem. The loop I have is pretty stupid, and loops through far more choices than is worthwhile. In particular, I’d like to not think about any n or a that have a digit that is 0. To ignore n with a 0 digit, one could replace the “included = False” line (apparently line 10) with

  included = str(n).count("0") > 0

This seems to cut the run-time down by about 20% (or more), so is maybe worthwhile. I still expect this code could be improved by a bit, but I think I'll stop here for now.

Current Problems

October 25, 2009 by sumidiot

To date, there has been a widget in the right-hand column with the “current problems”. The original intention here was to have two per week. Of course, the original intention also involved some other authors.

I’m going to remove the “current problems” widget. I’ll still hope to do at least one problem per week.

Problem 31 – Coins

October 19, 2009 by sumidiot

In problem 31, we are given a list of coin values and a target value, and asked how many ways we can make that target value with coins of the given denominations. I think this is probably a pretty standard programming exercise, and I’m embarassed it took me as long as it did. Anyway, here’s what I came up with, nothing too special:

import sys, time

def worker(target, coins):
    """ assume coins are sorted decreasing """
    if target == 0:
        return 1 # there's only one way to not have money?

    # get rid of coins bigger than the target
    while(len(coins) and target < coins[0]):
        coins = coins[1:]

    # some easy cases
    if not len(coins): # all coins are > target
        return 0
    if len(coins) == 1:
        return (target % coins[0] == 0) and 1 or 0

    # there are >=2 types of coins, all <= target
    return sum([worker(target - v, coins[1:])
                for v in xrange(0, target + 1, coins[0])])

def solve(target, coins):
    t = time.time()
    coins.sort()
    coins.reverse()
    ret = worker(target, coins)
    t = time.time() - t
    return [ret, t]

if __name__ == "__main__":
    print solve(int(sys.argv[1]), map(int,sys.argv[2:]))

I also had fun trying this another way. Instead of breaking down the target amount and doing things recursively, I thought about having a huge amount of coins around. The example problem had a target of 200, with denominations 1, 2, 5, 10, 20, 50, 100, 200. Suppose I had 200/1=200 of the 1s, 200/2 = 100 of the 2s, 200/5 of the 5s, etc. I could then take any number of my 1s, along with any number of my 2s, etc, see how much I have, and see if the total is 200. So I’ve got a product space [0,200/1]\times [0,200/2]\times [0,200/5]\times\cdots\times [0,200/200] of possible combinations, and I want to pick out all those that have a value of 200. This presents a fun application of the map/reduce functions, which I can’t seem to get enough of.

Suppose I start with the list of coins available, [1,2,5,10,20,50,100,200], and my target value in mind. I want to replace each item in this list, say the value c, with the list of values I could make using only $c$-valued coins (and only enough to get to 200). I’m going to do something a little silly, and store each value as it’s own list (this’ll make the next step easier, I think). So I first do

target = 200
coins = [1, 2, 5, 10, 20, 50, 100, 200]
var = map(lambda c: [ [v] for v in xrange(0,target+1,c)], coins)

Next, I want to make the cartesian product of all of these list of lists. I’ll build up my product, working my way through var. That is, I think about L_1\times L_2\times \cdots \times L_k as (((L_1\times L_2)\times L_3)\times \cdots )\times L_k, where each L_i here is a list. This is the sort of thing reduce is for. Keeping in mind that all of my lists are lists of one-element lists (so “+” below is really “concatenate”), I can build the cartesian product via

var = reduce(lambda l,r: [ x+y for x in l for y in r ], var)

Now each element of var is a list with as many values as there are coins, e.g. [7, 6, 0, 40] corresponds to 7 in 1s, 6 in 2s, 0 in 5s, 40 in 10s (of course, in the running example, I need more values in the list, but I hope the point comes across). To find the combined value of the coins in each list, I just need the sum, so I do

var = map(sum, var)

Finally, I only want the values where the sum is my target value, so I filter. We’re asked for the number of ways to make the target value, so I take the length:

var = filter(lambda t: t == target, var)
print len(var)

This is not particularly efficient. In the stated problem, the product of my lists has 1.28 billion elements (significantly larger than the actual answer, which is less than 100,000). That’s rather a lot of space to be taking up. I tried running this code on a target value of 100, and gave up on it after about 20 minutes. For contrast, the first solution above, the recursive solution, takes about half a second on my computer. Still, I like this exercise of getting the cartesian product of a bunch of lists.

Problem 30 – Sums of Digit Powers

October 1, 2009 by sumidiot

In problem 30 we are asked to find all the numbers that are the sum of the n-th powers of their digits. Well, the sum of those numbers. And n=5. You can just about do it in one line…

Most of this problem is fairly easy: Given a number, compute the sum of the powers of its digits, and see if you get the number you started with. If you wanna find all such numbers, just loop for a while. The trouble is how long? If you’ve got some range in mind you want to look at, you can do the following:

def solve(pow, low, hi):
    """ Find the sum of all numbers that are the sum of their pow-th powers of digits in the interval [low,hi) """
    return sum(filter(lambda n:n==sum(map(lambda d:int(d)**pow,str(n))) xrange(low, hi)))

Ok, sure, it’s a longish line, but it’s all there. Python’s syntactic sugar for handling all the little looping is awesome.

The remaining trick is to figure out those bounds. The problem mentions not using 1 as one of the numbers, so we could start at 2. However, that means we could start at 2**pow, because that’ll be the lowest digit sum we’d make. On the high end, the biggest digit-power-sum will come from digits being 9. So I figure that to find an upper bound, you can think about digit sums for longer and longer strings of 9. If you think about k 9s in a row, that’ll give a digit sum of k\cdot 9^p (if p is the specified power). Find the largest k where this value is still a k-digit number (or bigger), and you’ve got yourself an upper bound to loop to. Here’s some code to find the bound:

def bound(pow):
    """ Find bound on nums that can be sum of their pow-th powers of digits """
    if pow <= 1:
        return 9
    nine = 9**pow
    coeff = 1
    while(len(str(coeff*nine)) >= coeff):
        coeff += 1
    return (coeff - 1)*nine

Now, these bounds I’ve set up are pretty poor. In the case where the power is 4, as in the example problem, we’d loop from 2**4=16 to 32805, when the actual numbers that have the property we’re hoping for are between 1000 and 10000. We’ve overshot by something like 3 times too many values.

So, we could try to be more clever and come up with a smaller set to search in. But we could still use that one-line solver to pick from that set the appropriate values. And that makes me happy.

Problem 29 – One Liner

September 29, 2009 by sumidiot

In problem 29 we are supposed to determine how many unique values there are for the value a^b where a_0\leq a\leq a_1, b_0\leq b\leq b_1 (all integers). I know there is clever counting one can do to determine this, but I thought for comparison I’d also just do up the brute force approach. Find all the values, get rid of duplicates, and count how many are left.

Python has a built-in type “set”, which will eliminate duplicates as I go. And you can ask for the length of the set. And list comprehension is awesome. So you can solve this in one line:

def solve(lowa, hia, lowb, hib):
    return len(set([a**b for a in xrange(lowa, hia+1) for b in xrange(lowb, hib+1)]))

Having wrapped up solution code into a function, I also like to use command line arguments to pass in… well, arguments. In this case, I’d like to allow two different options for command line options. The stated problem has the bounds 2 to 100 for both a and b. So if I take one number in, I want to assume it is the upper bound, and that 2 is the lower bound, for both variables. Alternatively, I could take all four bounds in as arguments. I know the following is crude, and doesn’t fail gracefully, but it’s got a lot of nice things going for it:

if __name__ == "__main__":
    if len(sys.argv) == 2:
        print solve(*(2*[2,int(sys.argv[1])]))
    else:
        print solve(*map(int, sys.argv[1:]))

I love this. Using * to change a list into the form my solve function expects, with 4 arguments… that’s awesome. Using [1:] to grab the arguments besides the called command… nice. 2*list to do a list concatenated with itself… delightful. And, of course, map. Who could forget map?

Anyway, I may return to the more mathematical, counting based solution for this. At the same time, though, this code is quick (to write and run (for the stated bounds)) and easily checked. Maybe I’ll leave the more mathematical solution to the other authors on this blog. I seem to recall that there were some…

Problem 27 – Prime Generating Quadratics

September 27, 2009 by sumidiot

With my freshly minted PrimeOracle, I’m ready to attack problem 27, in which we are asked to find polynomials that generate long strings of consecutive primes starting when you plug in 0.

Basically my idea for this problem was to iterate through all of the choices for the coefficients a and b in n^2+an+b, and then for each polynomial plug in values until we hit a composite. Record maximums as appropriate.

There are a few easy optimizations to make. First, b itself has to be prime, because we test n=0. Next we test n=1, so we need 1+a+b\geq 2, that is a\geq 1-b. Those are the only optimizations I made, though I’m sure there are some more advanced ones out there.

So here’s my code:

# assume PrimeOracle.py is in the same directory as this file
from PrimeOracle import PrimeOracle

def solve(n):
    # seed our oracle to 2*n+2, since n=1 gives 1+a+b
    isprime = PrimeOracle(2*n + 2)
    maxp, maxn = 41, 40
    for b in [m for m in xrange(2, n) if isprime[m] ]:
        for a in xrange(1-b, n):
            idx = 1 # we already tesetd 0 with isprime[b]
            while(isprime[idx*idx+idx*a+b]):
                idx += 1
            if(idx - 1 > maxn):
                maxp, maxn = a*b, idx - 1
    return maxp

I had my oracle also tell me what the biggest number it was asked about was, and the answer was up around 12000. So we could change the starting seed in the oracle to be big enough to never have to extend, and probably save some time.