python - 找到给出超过 65 个素数的最低 collatz 序列

Question

我有一个任务，我需要在 Python 中找到包含超过 65 个素数的最低 Collatz 序列。

例如，19 的 Collatz 序列是：

19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1

该序列包含 7 个素数。

我还需要使用记忆，所以它不必运行一个“年”来找到它。我找到了用于记忆 Collatz 序列的代码，但是当我只需要素数时，我无法弄清楚如何让它工作。

这是我找到的 Collatz 记忆代码：

lookup = {}
def countTerms(n):
    if n not in lookup:
        if n == 1:
            lookup[n] = 1
        elif not n % 2:
            lookup[n] = countTerms(n / 2)[0] + 1
        else:
            lookup[n] = countTerms(n*3 + 1)[0] + 1

    return lookup[n], n

这是我的素数测试仪：

def is_prime(a):
    for i in xrange(2,a):
        if a%i==0:
            #print a, " is not a prime number"
            return False
    if a==1:
        return False
    else:
        return True

score 3 · Accepted Answer

您现有的代码缩进不正确。我假设这个任务是一个家庭作业，所以我不会发布一个完整的工作解决方案，但我会给你一些有用的片段。

首先，这是一个稍微高效的素性测试仪。它不是测试是否所有小于的数a都是的因数a，而是测试到的平方根a。

def is_prime(a):
    for i in xrange(2, int(1 + a ** 0.5)):
        if a % i == 0:
            return False
    return True

请注意，此函数返回True. a = 1没关系，因为您不需要测试 1：您可以将其预加载到lookup字典中：

lookup = {1:0}

您的countTerms函数需要稍作修改，以便仅lookup在当前n为素数时将计数加一。在 Python 中，False数值为 0，True数值为 1。这在这里非常方便：

def count_prime_terms(n):
    ''' Count the number of primes terms in a Collatz sequence '''
    if n not in lookup:
        if n % 2:
            next_n = n * 3 + 1
        else:
            next_n = n // 2

        lookup[n] = count_prime_terms(next_n) + is_prime(n)
    return lookup[n]

我已将函数名称更改为更 Pythonic。

FWIW，第一个包含 65 个或更多素数的 Collatz 序列实际上包含 67 个素数。它的种子数超过 180 万，在检查直到该种子的所有序列时需要素数测试的最高数是 151629574372。完成时，lookupdict 包含 3920492 个条目。

作为对 James Mills 关于递归的评论的回应，我编写了一个非递归版本，并且为了便于查看迭代和递归版本都产生相同的结果，我发布了一个完整的工作程序。我在上面说过我不打算这样做，但我认为现在应该可以这样做，因为 spørreren 已经使用我在原始答案中提供的信息编写了他们的程序。

我完全同意避免递归是一件好事，除非在适合问题域的情况下（例如，树遍历）。Python 不鼓励递归——它不能优化尾调用递归并且它施加了递归深度限制（尽管如果需要，可以修改该限制）。

这种 Collatz 序列素数计数算法自然是递归地陈述的，但迭代地做并不难——我们只需要一个列表来临时保存序列，同时确定其所有成员的素数。诚然，这个列表占用了 RAM，但它（可能）在空间方面比递归版本所需的堆栈帧要求更有效。

在解决 OP 中的问题时，递归版本的递归深度达到了 343。这在默认限制范围内，但仍然不是很好，如果您想搜索包含更多素数的序列，您将达到该限制。

迭代和递归版本的运行速度大致相同（至少，它们在我的机器上运行）。为了解决 OP 中所述的问题，它们都需要不到 2 分钟的时间。这比我原来的解决方案要快得多，主要是由于素性测试的优化。

基本的 Collatz 序列生成步骤已经需要确定一个数字是奇数还是偶数。显然，如果我们已经知道一个数字是偶数，那么就没有必要测试它是否是素数。:) 我们还可以消除is_prime函数中偶数因素的测试。lookup我们可以通过简单地将 2 的结果加载到缓存中来处理 2 是素数的事实。

在相关的说明中，当搜索包含给定数量的素数的第一个序列时，我们不需要测试任何以偶数开头的序列。偶数（除了 2）不会增加序列的素数，并且由于此类序列中的第一个奇数将低于我们当前的数字lookup，因此假设我们从3. 如果我们不从 3 开始搜索，我们只需要确保我们的起始种子足够低，这样我们就不会意外错过包含所需素数数量的第一个序列。采用这种策略不仅减少了所需的时间，还减少了查找缓存中的条目数。

#!/usr/bin/env python

''' Find the 1st Collatz sequence containing a given number of prime terms

    From http://stackoverflow.com/q/29920691/4014959

    Written by PM 2Ring 2015.04.29

    [Seed == 1805311, prime count == 67]
'''

import sys

def is_prime(a):
    ''' Test if odd `a` >= 3 is prime '''
    for i in xrange(3, int(1 + a ** 0.5), 2):
        if not a % i:
            return 0
    return 1


#Track the highest number generated so far; use a list
# so we don't have to declare it as global...
hi = [2]

#Cache for sequence prime counts. The key is the sequence seed,
# the value is the number of primes in that sequence.
lookup = {1:0, 2:1}

def count_prime_terms_iterative(n):
    ''' Count the number of primes terms in a Collatz sequence 
        Iterative version '''
    seq = []
    while n not in lookup:
        if n > hi[0]:
            hi[0] = n

        if n % 2:
            seq.append((n, is_prime(n)))
            n = n * 3 + 1
        else:
            seq.append((n, 0))
            n = n // 2

    count = lookup[n]
    for n, isprime in reversed(seq):
        count += isprime
        lookup[n] = count

    return count

def count_prime_terms_recursive(n):
    ''' Count the number of primes terms in a Collatz sequence
        Recursive version '''
    if n not in lookup:
        if n > hi[0]:
            hi[0] = n

        if n % 2:
            next_n = n * 3 + 1
            isprime = is_prime(n)
        else:
            next_n = n // 2
            isprime = 0
        lookup[n] = count_prime_terms(next_n) + isprime

    return lookup[n]


def find_seed(numprimes, start):
    ''' Find the seed of the 1st Collatz sequence containing
        `numprimes` primes, starting from odd seed `start` '''
    i = start
    mcount = 0

    print 'seed, prime count, highest term, dict size'
    while mcount < numprimes:
        count = count_prime_terms(i)
        if count > mcount:
            mcount = count
            print i, count, hi[0], len(lookup)
        i += 2


#count_prime_terms = count_prime_terms_recursive
count_prime_terms = count_prime_terms_iterative

def main():
    if len(sys.argv) > 1:
        numprimes = int(sys.argv[1])
    else:
        print 'Usage: %s numprimes [start]' % sys.argv[0]
        exit()

    start = int(sys.argv[2]) if len(sys.argv) > 2 else 3 

    #Round `start` up to the next odd number
    if start % 2 == 0:
        start += 1

    find_seed(numprimes, start)


if __name__ == '__main__':
    main()

当运行时

$ ./CollatzPrimes.py 65

输出是

seed, prime count, highest term, dict size
3 3 16 8
7 6 52 18
19 7 160 35
27 25 9232 136
97 26 9232 230
171 28 9232 354
231 29 9232 459
487 30 39364 933
763 32 250504 1626
1071 36 250504 2197
4011 37 1276936 8009
6171 43 8153620 12297
10971 44 27114424 21969
17647 48 27114424 35232
47059 50 121012864 94058
99151 51 1570824736 198927
117511 52 2482111348 235686
202471 53 17202377752 405273
260847 55 17202377752 522704
481959 59 24648077896 966011
963919 61 56991483520 1929199
1564063 62 151629574372 3136009
1805311 67 151629574372 3619607

score 1 · Accepted Answer

让我们从一个判断一个数是否为素数的函数开始；我们将使用 Miller-Rabin 算法，对于我们将要处理的数字大小，它比试除法更快：

from random import randint

def isPrime(n, k=5): # miller-rabin
    if n < 2: return False
    for p in [2,3,5,7,11,13,17,19,23,29]:
        if n % p == 0: return n == p
    s, d = 0, n-1
    while d % 2 == 0:
        s, d = s+1, d/2
    for i in range(k):
        x = pow(randint(2, n-1), d, n)
        if x == 1 or x == n-1: continue
        for r in range(1, s):
            x = (x * x) % n
            if x == 1: return False
            if x == n-1: break
        else: return False
    return True

由于我们想找到满足标准的第一个数字，我们的策略是从 2 开始，然后逐步向上，边走边存储结果。我们使用 Collatz 序列中从 0 到 2 的素数计数来初始化缓存（这是一个糟糕的双关语，抱歉）：

primeCount = [0,0,1]

函数pCount(n)计算 Collatz 序列中n的素数。一旦序列k的当前值低于n，我们就会在缓存中查找结果。在那之前，我们会测试 Collatz 序列中每个奇数的素数，并在适当的情况下增加素数p 。当我们有n的质数时，我们将它添加到缓存中并返回它。

def pCount(n):
    k, p = n, 0
    while k > 0:
        if k < n:
            t = p + primeCount[k]
            primeCount.append(t)
            return t
        elif k % 2 == 0:
            k = k / 2
        elif isPrime(k):
            p = p + 1
            k = 3*k + 1
        else:
            k = 3*k + 1

现在只需计算每个n的素数，从 3 开始，当素数超过 65 时停止：

n = 3
t = pCount(n)
while t < 65:
    n = n + 1
    t = pCount(n)

这不会花很长时间，在我的电脑上不到一分钟。结果如下：

print n

结果中有 67 个素数。如果您想查看它们，这里有一个简单的函数，可以打印给定n的 Collatz 序列：

def collatz(n):
    cs = []
    while n != 1:
        cs.append(n)
        n = 3*n+1 if n&1 else n//2
    cs.append(1)
    return cs

这是素数列表：

filter(isPrime,collatz(n))

多么有趣的休闲数学题啊！

编辑：由于人们问起米勒-拉宾素性测试仪，让我展示这个基于 2、3、5 轮的简单素数测试仪；它会尝试除以 2、3 和 5 以及不是 2、3 或 5 的倍数的数字，其中包括一些复合数，因此用素数试除的效率有点低，但没有必要预先计算和存储素数，所以它更容易使用。

def isPrime(n): # 2,3,5-wheel
    if n < 2: return False
    wheel = [1,2,2,4,2,4,2,4,6,2,6]
    w, next = 2, 0
    while w * w <= n:
        if n % w == 0: return False
        w = w + wheel[next]
        next = next + 1
        if next > 10: next = 3
    return True

Sayingfilter(isPrime,range(1000000))在几秒钟内识别出 78498 个少于一百万的素数。您可能希望将此素性测试仪的时序与 Miller-Rabin 测试仪进行比较，并根据运行时效率确定交叉点的位置；我没有做任何正式的计时，但它似乎与 Miller-Rabin 测试仪几乎同时解决了 65-collatz-prime 问题。或者，您可能希望将试验划分与 2、3、5 轮组合到某个限制，例如 1000 或 10000 或 25000，然后对幸存者运行 Miller-Rabin；这可以快速消除大多数复合材料，因此它可以非常快速地运行。

python - 找到给出超过 65 个素数的最低 collat​​z 序列

2 回答 2

Related

Reference

python - 找到给出超过 65 个素数的最低 collatz 序列