这不是作业,我只是好奇。
INFINITE 是这里的关键词。
我希望将其用作for p in primes()
. 我相信这是 Haskell 中的一个内置函数。
所以,答案不能像“做个筛子”那样天真。
首先,你不知道会消耗多少个连续素数。好吧,假设你一次可以炮制 100 个。您会使用相同的筛法以及素数频率公式吗?
我更喜欢非并发方法。
感谢您阅读(和写作;))!
The erat2
function from the cookbook can be further sped up (by about 20-25%):
import itertools as it
def erat2a( ):
D = { }
yield 2
for q in it.islice(it.count(3), 0, None, 2):
p = D.pop(q, None)
if p is None:
D[q*q] = q
yield q
else:
# old code here:
# x = p + q
# while x in D or not (x&1):
# x += p
# changed into:
x = q + 2*p
while x in D:
x += 2*p
D[x] = p
The not (x&1)
check verifies that x
is odd. However, since both q
and p
are odd, by adding 2*p
half of the steps are avoided along with the test for oddity.
If one doesn't mind a little extra fanciness, erat2
can be sped up by 35-40% with the following changes (NB: needs Python 2.7+ or Python 3+ because of the itertools.compress
function):
import itertools as it
def erat3( ):
D = { 9: 3, 25: 5 }
yield 2
yield 3
yield 5
MASK= 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0,
MODULOS= frozenset( (1, 7, 11, 13, 17, 19, 23, 29) )
for q in it.compress(
it.islice(it.count(7), 0, None, 2),
it.cycle(MASK)):
p = D.pop(q, None)
if p is None:
D[q*q] = q
yield q
else:
x = q + 2*p
while x in D or (x%30) not in MODULOS:
x += 2*p
D[x] = p
The erat3
function takes advantage of the fact that all primes (except for 2, 3, 5) modulo 30 result to only eight numbers: the ones included in the MODULOS
frozenset. Thus, after yielding the initial three primes, we start from 7 and work only with the candidates.
The candidate filtering uses the itertools.compress
function; the “magic” is in the MASK
sequence; MASK
has 15 elements (there are 15 odd numbers in every 30 numbers, as chosen by the itertools.islice
function) with a 1
for every possible candidate, starting from 7. The cycle repeats as specified by the itertools.cycle
function.
The introduction of the candidate filtering needs another modification: the or (x%30) not in MODULOS
check. The erat2
algorithm processed all odd numbers; now that the erat3
algorithm processes only r30 candidates, we need to make sure that all D.keys()
can only be such —false— candidates.
On an Atom 330 Ubuntu 9.10 server, versions 2.6.4 and 3.1.1+:
$ testit
up to 8192
==== python2 erat2 ====
100 loops, best of 3: 18.6 msec per loop
==== python2 erat2a ====
100 loops, best of 3: 14.5 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
100 loops, best of 3: 19.2 msec per loop
==== python3 erat2a ====
100 loops, best of 3: 14.1 msec per loop
==== python3 erat3 ====
100 loops, best of 3: 11.7 msec per loop
On an AMD Geode LX Gentoo home server, Python 2.6.5 and 3.1.2:
$ testit
up to 8192
==== python2 erat2 ====
10 loops, best of 3: 104 msec per loop
==== python2 erat2a ====
10 loops, best of 3: 81 msec per loop
==== python2 erat3 ====
Traceback (most recent call last):
…
AttributeError: 'module' object has no attribute 'compress'
==== python3 erat2 ====
10 loops, best of 3: 116 msec per loop
==== python3 erat2a ====
10 loops, best of 3: 82 msec per loop
==== python3 erat3 ====
10 loops, best of 3: 66 msec per loop
A primegen.py
module contains the erat2
, erat2a
and erat3
functions. Here follows the testing script:
#!/bin/sh
max_num=${1:-8192}
echo up to $max_num
for python_version in python2 python3
do
for function in erat2 erat2a erat3
do
echo "==== $python_version $function ===="
$python_version -O -m timeit -c \
-s "import itertools as it, functools as ft, operator as op, primegen; cmp= ft.partial(op.ge, $max_num)" \
"next(it.dropwhile(cmp, primegen.$function()))"
done
done
由于 OP 要求有效的实现,这是 David Eppstein/Alex Martelli 对活动状态 2002 代码的重大改进(在他的回答中看到):不要在字典中记录素数的信息,直到它的正方形出现在候选人。对于产生的 n 个素数 ( π(sqrt(n log n)) ~ 2 sqrt(n log n) / log(n log n) ~将空间复杂度降低到O(sqrt(n))而不是O(n)以下2 sqrt(n / log n))。因此,时间复杂度也得到了提高,即它运行得更快。
创建一个“滑动筛”作为每个基本素数的当前倍数(即低于当前生产点的 sqrt)的字典,以及它们的步长值:
from itertools import count
# ideone.com/aVndFM
def postponed_sieve(): # postponed sieve, by Will Ness
yield 2; yield 3; yield 5; yield 7; # original code David Eppstein,
sieve = {} # Alex Martelli, ActiveState Recipe 2002
ps = postponed_sieve() # a separate base Primes Supply:
p = next(ps) and next(ps) # (3) a Prime to add to dict
q = p*p # (9) its sQuare
for c in count(9,2): # the Candidate
if c in sieve: # c's a multiple of some base prime
s = sieve.pop(c) # i.e. a composite ; or
elif c < q:
yield c # a prime
continue
else: # (c==q): # or the next base prime's square:
s=count(q+2*p,2*p) # (9+6, by 6 : 15,21,27,33,...)
p=next(ps) # (5)
q=p*p # (25)
for m in s: # the next multiple
if m not in sieve: # no duplicates
break
sieve[m] = s # original test entry: ideone.com/WFv4f
(此处较旧的原始代码已被编辑以合并更改,如下面的Tim Peters的答案所示)。另请参阅此以获取相关讨论。
类似的基于2-3-5-7 轮子的代码运行速度快了 ~ 2.15 倍(这非常接近 的理论改进3/2 * 5/4 * 7/6 = 2.1875
)。
对于后人来说,这是Will Ness为 Python 3 编写的漂亮算法的重写。需要进行一些更改(迭代器不再具有.next()
方法,但有一个新的next()
内置函数)。其他更改是为了好玩(使用 newyield from <iterable>
替换原始语句中的四个yield
语句。更多是为了可读性(我不喜欢过度使用 ;-) 1 字母变量名称)。
它比原来的要快得多,但不是出于算法原因。加速主要是由于删除了原始add()
函数,而是内联。
def psieve():
import itertools
yield from (2, 3, 5, 7)
D = {}
ps = psieve()
next(ps)
p = next(ps)
assert p == 3
psq = p*p
for i in itertools.count(9, 2):
if i in D: # composite
step = D.pop(i)
elif i < psq: # prime
yield i
continue
else: # composite, = p*p
assert i == psq
step = 2*p
p = next(ps)
psq = p*p
i += step
while i in D:
i += step
D[i] = step
这最初不是我的代码,但是,值得发布。原文可以在这里找到:http ://code.activestate.com/recipes/117119/
def gen_primes():
D = {}
q = 2 # first integer to test for primality.
while True:
if q not in D:
# not marked composite, must be prime
yield q
#first multiple of q not already marked
D[q * q] = [q]
else:
for p in D[q]:
D.setdefault(p + q, []).append(p)
# no longer need D[q], free memory
del D[q]
q += 1
它是一个生成器,所以像其他任何东西一样使用它。
primes = gen_primes()
for p in primes:
print p
在我的桌面上生成并放入一组 100 万个素数需要 1.62 秒。
Do a segmented sieve, where the size of a segment is determined by available memory or the maximal size of a bitset.
For each segment represent the numbers in some interval [n; n + segment_size) as a bit set and sieve with all prime numbers below the square root of the upper bound.
Using a bit set uses less memory than a hash table or tree data structure, because you are working with dense sets of numbers.
另一个答案,比我erat3
在这里的答案更节省内存:
import heapq
def heapprimegen():
hp= []
yield 2
yield 3
cn= 3
nn, inc= 3, 6
while 1:
while cn < nn:
yield cn
heapq.heappush(hp, (3*cn, 2*cn))
cn+= 2
cn= nn+2
nn, inc= heapq.heappushpop(hp, (nn+inc, inc))
它维护一个质数倍数的堆(列表)而不是字典。显然,它失去了一些速度。
另一种方法:
import itertools
def primeseq():
prime = [2]
num = 0
yield 2
for i in itertools.count(3, 2):
is_prime = True
for num in prime:
if i % num == 0:
is_prime = False
break
elif num ** 2 > i:
break
if is_prime:
prime.append(i)
yield i
这是一个非常快的无限生成器,用 Python2 编写,但很容易适应 Python3。要使用它来将素数相加到 10**9,请使用以下命令:
from itertools import takewhile
from functools import partial
from operator import gt
print (sum(takewhile(partial(gt, 10**9), prime_gen_inf())))
它是一种分段筛,速度更快,但显然不如 Will Ness 的算法那么优雅。
from operator import mul
from functools import reduce
def prod(x): return reduce(mul, x, 1)
def build_sieve(wheel):
w = prod(wheel)
w_phi = prod([p-1 for p in wheel])
rems = [a for a in range(w) if all(a % p for p in wheel)]
assert len(rems) == w_phi
inv = {a:pow(a, w_phi - 1, w) for a in rems}
try:
known_p = wheel + rems[1 : rems.index(rems[1]*rems[1])]
except ValueError:
known_p = wheel + rems[1:]
return wheel, w, w_phi, rems, inv, known_p
#Adjust the chunk variable based on your computer's architecture.
#
#Adjust the line with #! if you don't need "true" infinite. If you don't need
#primes larger than 1<<32, use array('H', []), if 1<<64 use 'L', if 1<<128 (in
#Python3) use 'Q', otherwise use empty list [].
#To save memory, comment out the lines with #*, and uncomment the commented-out
#lines
import itertools
from itertools import islice, count, compress, izip
chain_f = itertools.chain.from_iterable
from array import array
def prime_gen_inf(chunk=250000, sieve_info = build_sieve([2,3,5,7])):
""" Indefinitely yields primes """
wheel, w, w_phi, rems, inv, known_p = sieve_info
for p in known_p: yield p
new_n = 0;
while True:
size = min(chunk, (p * p - new_n) / w)
sieve = bytearray([1]) * size * w_phi
n, new_n = new_n, new_n + size * w
if not n:
zero = bytearray([0])
seen = len(known_p) - len(wheel) + 1
sieve[:seen:1] = zero * seen
p_gen = islice(prime_gen_inf(), len(wheel), None)
new_p = next(p_gen)
ps = [] #! array('H', [])
p_invs = bytearray([]) #*
while new_p * new_p < new_n:
ps.append(new_p)
p_invs.append(inv[new_p % w]) #*
new_p = next(p_gen)
for p, p_inv, modp in izip(ps, p_invs, [-n % p for p in ps]): #*
s = [(modp + p * (p_inv * (r - modp) % w)) / w for r in rems] #*
#for p in ps:
# s = [(-n%p + p * (inv[p%w] * (r - -n%p) % w)) / w for r in rems]
for i, start in enumerate(s):
slice_size = ((size - start - 1) / p + 1)
sieve[i + start * w_phi :: p * w_phi] = zero * slice_size
for p in compress(chain_f(izip(*[count(n+r, w) for r in rems])), sieve):
yield p
这是一个复杂的基于堆的实现,它并不比其他基于堆的实现快多少(参见我的另一个答案中的速度比较),但它使用的内存要少得多。
此实现使用两个堆(tu 和 wv),它们包含相同的数字元素。每个元素都是一个 int 对。为了找到直到q**2
(其中q
是素数)的所有素数,每个堆将最多包含2*pi(q-1)
元素,其中pi(x)
正素数的数量不大于x
。所以整数的总数最多为4*pi(floor(sqrt(n)))
。(我们可以通过将一半的东西推送到堆中来获得 2 的内存因子,但这会使算法变慢。)
上面的其他基于 dict 和堆的方法(例如,erat2b 和 heap_prime_gen_squares 和 heapprimegen)存储大约 '2*pi(n)' 整数,因为它们每次找到素数时都会扩展它们的堆或 dict。作为比较:要找到 1_000_000 个素数,此实现存储少于 4141 个整数,其他实现存储多于 1_000_000 个整数。
import heapq
def heap_prime_gen_smallmem():
yield 2
yield 3
f = 5
fmar3 = 2
q = 7
q6 = 7 * 6
qmar3 = 4
tu = [(25, 30), (35, 30)]
vw = [(25, 30), (35, 30)]
while True:
qmar3 += 2
if qmar3 == 6:
qb = q + 4
q6b = q6 + 24
qmar3 = 2
else:
qb = q + 2
q6b = q6 + 12
if q < tu[0][0]:
d = q * q
while f < d:
a, b = vw[0]
if f < a:
yield f
else:
a, b = vw[0]
heapq.heapreplace(vw, (a + b, b))
a, b = vw[0]
while f >= a:
heapq.heapreplace(vw, (a + b, b))
a, b = vw[0]
fmar3 += 2
if fmar3 == 6:
f += 4
fmar3 = 2
else:
f += 2
c = q * qb
heapq.heappush(tu, (d, q6))
heapq.heappush(tu, (c, q6))
heapq.heappush(vw, (d, q6))
heapq.heappush(vw, (c, q6))
else:
a, b = tu[0]
heapq.heapreplace(tu, (a + b, b))
a, b = tu[0]
while q >= a:
heapq.heapreplace(tu, (a + b, b))
a, b = tu[0]
q = qb
q6 = q6b
这是一个使用堆而不是字典的简单但不是非常慢的方法:
import heapq
def heap_prime_gen_squares():
yield 2
yield 3
h = [(9, 6)]
n = 5
while True:
a, b = h[0]
while n < a:
yield n
heapq.heappush(h, (n * n, n << 1))
n += 2
heapq.heapreplace(h, (a + b, b)) # Replace h[0], which is still (a, b).
我对前 100 万个素数的用户时间的速度测量(数字越小越好):
所以基于字典的方法似乎是最快的。
这是一个更真实的生成器,它在 Haskell 中是如何完成的:过滤已知素数的组合,然后将剩余的素数添加到列表中。
def gen_primes():
primes = []
i = 2
while True:
prime = True
for p in primes:
if not (i % p):
prime = False
break
if prime:
yield i
primes.append(i)
i += 1
我知道帖子很旧,但我遇到了这个问题......以下代码基于一个非常简单的想法:Eratosthenes 的增长筛。虽然这个解决方案比这里最好的解决方案慢,但它很容易掌握并且设计成可读...
我使用整数来存储筛子的结果。在二进制格式中,整数是0
s 和1
s的列表0
,i
如果它i
不是素数,1
如果它可能是素数。必要的无穷大是 Python 3 整数无界的结果。
def primes():
container, size = 1 << 2, 3 # we start with 0b100 (from right to left: 0 and 1 are not primes, 2 is
last_prime = 1
while True:
prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None) # find the next prime
while not prime:
container, size = expand(container, size, 2**16) # add 65536 cells and sieve the container
prime = next((j for j in range(last_prime+1, size) if container & 1 << j), None)
yield prime
last_prime = prime
如何扩展容器?1
只需在容器左侧(二进制格式)添加一堆s 并筛选它们。这与标准筛相同,略有不同。在标准筛子中,如果我们找到一个素数i
,我们开始在 处穿过细胞i*i
,步长为i
。
在这里,这可能是针对容器的第一部分完成的。如果容器的新部分比i*i
.
def expand(container, size, n):
new_size = size + n
container += (1 << (new_size + 1) - 1) - (1 << size) # add n 1's
for i in range(2, new_size):
if container & (1 << i): # i is a prime
t = sum(1 << j for j in range(max(i, size // i)*i, new_size, i)) # set 1 for all mutiple
container &= ~t # cross the cells
return container, new_size
测试一百万个素数:
import itertools
assert 78498 == len(list(itertools.takewhile(lambda p: p<1000000, primes())))