Posts Tagged ‘sieve’

Prime Oracle

September 27, 2009

I sat down to work on problem 27, and realized it might be handy to have a class that’ll tell me when numbers are prime. Something that’ll keep some memory of tested values, and sieve when it needs to know about bigger numbers than have been tested. Here’s what I threw together:

""" A class that'll help you decide if things are prime

Make an object of this class, I'll call it oracle, and
oracle[n] is True or False depending on if n is prime or not

Assumes negatives are not prime

class PrimeOracle:
    # Takes a starting maximum to sieve to
    def __init__(self, startmax = 2000):
        # self.isprime[n] = True if 2n+1 is prime
        self.isprime = [True for n in xrange(0,startmax//2 + 1)]
        self.isprime[0] = False
        # run the sieve
        for val in xrange(3, startmax + 1, 2):
            if(self.isprime[val // 2]):
                for mul in xrange(val*val, startmax + 1, 2*val):
                    self.isprime[mul // 2] = False

    # is n prime?
    def __getitem__(self, n):
        if n == 2:
            return True
        if n<2 or n % 2 == 0:
            return False
        nidx = n//2
        while(nidx >= len(self.isprime)):
            # extend the sieve
            # currently, we know the primality of 2*(len-1)+1=2*len-1
            # we can extend that to its square, 4*len^2 - 4len + 1
            # which would be index (4*len^2-4*len+1) // 2 = 4len(len-1)+1 // 2
            # so the new length should be one more than this
            # we already have length len, so we need to add (prev line) - len
            curlen = len(self.isprime)
            newlen = (4*curlen*(curlen-1) + 1) // 2
            self.isprime += [True for n in xrange(0, newlen+1-curlen)]
            maxknown = 2*curlen - 1
            maxtoknow = 2*newlen - 1
            for idx in xrange(1, curlen):
                    val = 2*idx + 1
                    # start crossing off multiples larger than maxknown
                    sfact = 1 + (maxknown // val)
                    if(sfact % 2 == 0):
                        sfact += 1 # otherwise our indexing is wrong
                    startfact = max(val, sfact)
                    for mul in xrange(startfact * val, maxtoknow + 1, 2*val):
                        self.isprime[mul // 2] = False
        return self.isprime[nidx]

    # the largest index we know about
    def __len__(self):
        return 2*len(self.isprime) - 1

    def __str__(self):
        return "\n".join(["%d: %s" % (2*n+1, self.isprime[n])
                          for n in xrange(0,len(self.isprime))])

It’s basically just the usual sieve, and then it extends itself whenever necessary.

There’s room for improvement. We could remember the highest value we’ve sieved to, and then take our array and just tease out the primes from it. Then we only store the list of primes. When we go to extend, we make a new array to hold whatever the new values would be, sieve appropriately, and then just append to our list of primes. Here’s my modified __getitem__ method:

def __getitem__(self, n):
    if n == 2:
        return True
    if n<2 or n % 2 == 0:
        return False
    while(n > self.sievedto):
        # we have sieved to sst = self.sievedto (assume odd)
        # we could sieve to sst^2 (will also be odd)
        # that'd be [sst+1 .. sst^2]
        # but really we might as well do [sst+2 .. sst^2]
        # sst=2s+1 => [2s+3 .. 4s^2+4s+1] //2 is [s+1 .. 2s(s+1)]
        # which has length 2s^2+s
        # and index i corresponds to 2(s+1+i)+1
        # so that an odd integer m is at m//2 - (s+1)
        s = self.sievedto // 2
        newsst = self.sievedto * self.sievedto
        isprime = [True for idx in xrange(0,2*s*s+s+1)]
        for pr in self.primes.keys():
            sfact = 1 + (self.sievedto // pr)
            if(sfact % 2 == 0):
                sfact += 1
            for mul in xrange(max(pr, sfact) * pr, newsst + 1, 2 * pr):
                isprime[mul//2 - (s+1)] = False
        for m in [2*(s+1+i)+1 for i in xrange(0,2*s*s+s+1) if isprime[i]]:
            self.primes[m] = True
        self.sievedto = newsst
    return self.primes.has_key(n)

My first guess was that the second version would be better, because it would be storing less. However, it seems to be a little slower. I guess filtering out the primes at each extension has its cost. Or my implementation could use some tweaking? I wish I knew a good way to compare the size requirements of these scripts, in addition to how long they take to run.

By the way, I would be a little shocked to find that I’m not off by one somewhere (many somewheres) in the above code. Little tests seem to work out. But use are your own peril.