I sat down to work on problem 27, and realized it might be handy to have a class that’ll tell me when numbers are prime. Something that’ll keep some memory of tested values, and sieve when it needs to know about bigger numbers than have been tested. Here’s what I threw together:

""" A class that'll help you decide if things are prime Make an object of this class, I'll call it oracle, and oracle[n] is True or False depending on if n is prime or not Assumes negatives are not prime """ class PrimeOracle: # Takes a starting maximum to sieve to def __init__(self, startmax = 2000): # self.isprime[n] = True if 2n+1 is prime self.isprime = [True for n in xrange(0,startmax//2 + 1)] self.isprime[0] = False # run the sieve for val in xrange(3, startmax + 1, 2): if(self.isprime[val // 2]): for mul in xrange(val*val, startmax + 1, 2*val): self.isprime[mul // 2] = False # is n prime? def __getitem__(self, n): if n == 2: return True if n<2 or n % 2 == 0: return False nidx = n//2 while(nidx >= len(self.isprime)): # extend the sieve # currently, we know the primality of 2*(len-1)+1=2*len-1 # we can extend that to its square, 4*len^2 - 4len + 1 # which would be index (4*len^2-4*len+1) // 2 = 4len(len-1)+1 // 2 # so the new length should be one more than this # we already have length len, so we need to add (prev line) - len curlen = len(self.isprime) newlen = (4*curlen*(curlen-1) + 1) // 2 self.isprime += [True for n in xrange(0, newlen+1-curlen)] maxknown = 2*curlen - 1 maxtoknow = 2*newlen - 1 for idx in xrange(1, curlen): if(self.isprime[idx]): val = 2*idx + 1 # start crossing off multiples larger than maxknown sfact = 1 + (maxknown // val) if(sfact % 2 == 0): sfact += 1 # otherwise our indexing is wrong startfact = max(val, sfact) for mul in xrange(startfact * val, maxtoknow + 1, 2*val): self.isprime[mul // 2] = False return self.isprime[nidx] # the largest index we know about def __len__(self): return 2*len(self.isprime) - 1 def __str__(self): return "\n".join(["%d: %s" % (2*n+1, self.isprime[n]) for n in xrange(0,len(self.isprime))])

It’s basically just the usual sieve, and then it extends itself whenever necessary.

There’s room for improvement. We could remember the highest value we’ve sieved to, and then take our array and just tease out the primes from it. Then we only store the list of primes. When we go to extend, we make a new array to hold whatever the new values would be, sieve appropriately, and then just append to our list of primes. Here’s my modified __getitem__ method:

def __getitem__(self, n): if n == 2: return True if n<2 or n % 2 == 0: return False while(n > self.sievedto): # we have sieved to sst = self.sievedto (assume odd) # we could sieve to sst^2 (will also be odd) # that'd be [sst+1 .. sst^2] # but really we might as well do [sst+2 .. sst^2] # sst=2s+1 => [2s+3 .. 4s^2+4s+1] //2 is [s+1 .. 2s(s+1)] # which has length 2s^2+s # and index i corresponds to 2(s+1+i)+1 # so that an odd integer m is at m//2 - (s+1) s = self.sievedto // 2 newsst = self.sievedto * self.sievedto isprime = [True for idx in xrange(0,2*s*s+s+1)] for pr in self.primes.keys(): sfact = 1 + (self.sievedto // pr) if(sfact % 2 == 0): sfact += 1 for mul in xrange(max(pr, sfact) * pr, newsst + 1, 2 * pr): isprime[mul//2 - (s+1)] = False for m in [2*(s+1+i)+1 for i in xrange(0,2*s*s+s+1) if isprime[i]]: self.primes[m] = True self.sievedto = newsst return self.primes.has_key(n)

My first guess was that the second version would be better, because it would be storing less. However, it seems to be a little slower. I guess filtering out the primes at each extension has its cost. Or my implementation could use some tweaking? I wish I knew a good way to compare the size requirements of these scripts, in addition to how long they take to run.

By the way, I would be a little shocked to find that I’m not off by one somewhere (many somewheres) in the above code. Little tests seem to work out. But use are your own peril.