Archive for July, 2009

Problem 22

July 19, 2009

Just got back from dinner, depressed because a friend wanted to have a conversation about politics and things. Finally decided maybe some programming would pull me out of it. I’m not convinced it did, but whatever. It had the best shot of doing so.

In Problem 22 we’re supposed to read in a file, pull out the words between quotes, sort them, do some computation on each string to convert it to an int, and then take the sum of all of those words times their index in the list. I thought I’d see how little actual work I could do, which seems to translate to using lots of maps and lambda functions and such.

After a “from __future__ import with_statement”, it’s pretty easy to read in files:

def readfile(filename):
    ret = []
    with open(filename) as file:
        for line in file:
            ret += line.split(",")
    return ret

The given file has all of the names on one line, so the loop doesn’t last long. Next up, convert that list to a bunch of numbers, and sum them up, with an appropriate product:

def process(names):
    names.sort()
    scores = map(lambda s:sum(map(lambda x:string.ascii_uppercase.index(x)+1,s[1:-1].upper())), names)
    return sum([(i+1)*n for i,n in enumerate(scores)])

Perhaps some explanation is in order here. The first line, sort(), is pretty clear, I hope. In the next line, s[1:-1] strips the quote marks off of the name, and then the call to upper() converts all the characters to uppercase. This is probably superfluous, because the names all seem to be in uppercase to begin with, but better safe than sorry. Then I use the ascii_uppercase string provided by the string module, as documented here. This lets me easily convert letters to their position in the alphabet (remembering to add one because indices in strings start at 0). So my “scores” list above contains what the problem instructions call the “alphanumeric value” of each of the names. In the last line, I use the enumerate function (some documentation) to give me pairs: (index, value) in the list of scores (remembering, again, to add one to the index).

So that was fun.

Advertisements

Problem 20 – One More Line

July 17, 2009

I love one-liners. The sum of the digits of n factorial:

print sum(map(int, str(reduce(lambda x,y:x*y,xrange(1,n+1),1))))

This problem is pretty similar to Problem 16, which I also wrote about. There are probably improvements to be made. Like, you might as well start the xrange at 2, instead of 1. I can’t think of a solution that would be faster to code though.

Problem 18 (and 67) – Bottoms Up

July 10, 2009

My first idea in simplifying problem 18 (finding a path with maximum sum through a given triangle) was to note that going left then right (LR), you’d end at the same place as going right then left (RL). So if you want maximum sums from some point, you should process the sums LL, LR=RL, and RR separately. I wrote up my first solution pretty quickly, and it gave the right answer for the small sample input. Then I wanted to test its answer on the larger input, so I wrote the brute-force solution that checks all paths. It turned out I made an error in my initial solution (it would never try LLR), and thinking about it a little more turned up what I believe is the correct (and quick) solution.

Instead of working from the top down, work from the bottom up, inductively. At any point of the bottom row, you know the best path to the bottom (since you’re already there). Moving up a row, at any index of the second row from the bottom, you should pick the max of going left or going right, and that’ll be the best path to the bottom. Inductively, if you know the best path starting at any point of some row, then you can easily compute the best path for each index in the next row up. Here’s my code:

from __future__ import with_statement

import sys, time

def readfile(filename):
    """ Read in a file, return a list of (lists of) integers """
    ret = []
    with open(filename) as file:
        for line in file:
            ret.append(map(int,line.split()))
    return ret

def solve(grid):
    """ Find the path with the maximum sum
    Returns: the sum, the path string of L and R, time taken
    """
    t = time.time()

    lowerrow = map(lambda n:(n,""), grid[-1])
    rowidx = len(grid) - 2

    while(rowidx >= 0):
        currow = []
        for col in xrange(0, len(grid[rowidx])):
            if(lowerrow[col][0] >= lowerrow[col + 1][0]):
                lowsum, lowstr = lowerrow[col]
                currow.append((lowsum + grid[rowidx][col], "L" + lowstr))
            else:
                lowsum, lowstr = lowerrow[col + 1]
                currow.append((lowsum + grid[rowidx][col], "R" + lowstr))
        lowerrow = currow
        rowidx -= 1

    rsum, rpath = lowerrow[0]

    t = time.time() - t
    return (rsum, rpath, t)

if __name__ == "__main__":
    print solve(readfile(sys.argv[1]))

Problem 15 – Counting

July 3, 2009

When I first read problem 15, about counting the number of paths from one corner of a grid to the opposite corner, I knew there was some clever counting way to do it. I couldn’t quite remember what the clever way was, but I thought about it for a while, and remembered it.

Any path from the upper left to lower right corner consists of a sequence of moves right or down. We can denote any path as a sequence of Rs and Ds. In an n\times m grid, to start in one corner and go to the other, there must be exactly n Rs, and m Ds. So we can think of a path as a string of Rs and Ds, whose length is n+m, and so that exactly m of the letters are D (equivalently, exactly n are Rs). So we have n+m “spots” in which to put m Ds, meaning there are \binom{n+m}{m}=\frac{(n+m)!}{n!m!}=\frac{(m+1)\cdots(m+n)}{n!} paths. This gives us a constant-time (essentially) algorithm for computing the number of paths.

I didn’t give this problem too much more thought after that. Moving and finishing teaching a summer class occupied my time. But then Friday rolled around, and nobody else has written about this problem, so I thought I should. As I started coding, I realized there were some interesting things to say. Well, interesting when you’ve been grading Calculus exams for a few hours.

Here’s the code I wrote that computes the number of paths using the second formula above (which has fewer multiplications than the first):

import sys,time

def solve(n, m):
    """ Number of paths in through a grid 

    Return (number of paths, time to compute)
    """
    t = time.time()
    prod = lambda x,y: x*y
    prange = lambda x,y: reduce(prod,xrange(x,y+1),1)
    fact = lambda y: prange(1,y)
    ret = prange(m+1,m+n) // fact(n)
    t = time.time() - t
    return (ret, t)

if __name__ == "__main__":
    args = map(int, sys.argv[1:3])
    print solve(*args)

There’s a few things to note here. The first is apparent in the code, the “*args” in the last line. What this does is “unpack” a tuple. The line above it makes a list (of length two), and I want to pass this to my solver function. But my solver function takes two arguments, not a list of length two. So I use the “*” to unpack my list. Pretty nice syntactic sugar. For (slightly) more on this topic, see the tutorial.

The other thing to notice only becomes apparent when you run the code. The value that gets printed as the answer (not the time part, but just the number of paths part), ends with an extra “L”. This indicates that the answer is a long integer. This happens, with this code, even if the final answer isn’t really that big. The in-between steps are big enough to cause the long integer to show up.

Next, I thought I’d see how easy it was to code up a solution that didn’t use this counting argument, but just counted the number of paths recursively. If you have an n\times m block, and are in the upper left, you can move either right (making an (n-1)\times m block), or down (making an n\times (m-1) block). So the total number of paths is the sum of the number of paths of these smaller blocks. There’s a base case when one of the parameters is 1. Then you have a 1\times m block, say, and there are m+1 routes through it (m+1 choices for when to move down). Here’s the essential bits of the code I wrote:

def worker(n,m):
    if n == 1 or m == 1:
        return max(n,m) + 1
    return worker(n-1,m) + worker(n,m-1)

But this take a loong time to run. The solution using the counting argument takes less than 0.0001 seconds to run on my computer (on the 20×20 input of the stated problem). This second solution takes… minutes (enough time to write, or at least edit, a post). Thinking about why, while the program continues to churn, we realize some improvements can be made. For starters, \text{worker(n,m)}=\text{worker(m,n)}. To use this, I figure I’ll store results as I compute them, so that I can just use them (instead of re-calculating them) if I need them again later (rather like my solution for problem 14). Here’s what I rigged up:

class Worker:
    def __init__(self):
        self.mem = {}
    def __getitem__(self,t):
        sm,lg = min(t), max(t)
        if (sm,lg) in self.mem:
            return self.mem[(sm,lg)]
        if sm == 1:
            self.mem[(1,lg)] = lg + 1
            return lg + 1
        ret = self[(sm-1,lg)] + self[(sm,lg-1)]
        self.mem[(sm,lg)] = ret
        return ret

def solve(n, m):
    w = Worker()
    return w[(n,m)]

This is still slower than the counting-trick solution, by more than a factor of 10 for the 20×20 input. Still, that’s fractions of a second, a big improvement on the code without any memory. Plus it lets me play with defining classes, and operator overloading. The function “__getitem__” in my class allows me to use the [] notation on an object of my class. I think I kinda like the way c++ does this better, where you define a function “operator[]”. That way you don’t have to remember the name of the function. But whatever, this works. Now that I’ve done it once, I imagine I’ll be able to remember the name.

Problem 16 – One Liner

July 1, 2009

I spent a little while trying to come up with some inductive (or other clever) way to compute the sum of the digits of 2^n, without just implementing multiplication by 2. Mostly I did this because I figure 2^{1000} is pretty big, and maybe having a script compute that number would cause some sort of issue. Not coming up with anything particularly clever, I thought I’d try just a naive approach: compute 2^{n} directly and then sum the digits. Here’s the one-liner (not counting “import sys”) I came up with:

print sum(map(int,str(int(sys.argv[1]))))

Delightful. Even better – my initial concern that something would go wrong (take a while, use up too much memory) was apparently unfounded. This script ran without any noticeable difficulty in just a fraction of a second. It also did fine with a power of 10,000 and 100,000 (taking a little more than a second here). For 1,000,000, it took close to 2 minutes. This still makes me wonder if there’s a clever way to solve this problem quickly for big powers.

I thought I’d see what the function f(n)= the sum of the digits of 2^n looked like. Here’s a sampler:

Kinda fun, I think. Then I thought: how about just g(n)= sum of the digits of n. I encourage you to spend up to 10 seconds thinking what this graph should look like.