c - Not getting correct output from an implementation of Pollard's rho algorithm

Question

I don't know what I am doing wrong in trying to compute prime factorizations with Pollard's rho algorithm.

#include<stdio.h>
#define f(x)  x*x-1

int pollard( int );
int gcd( int, int);

int main( void ) {
    int n;
    scanf( "%d",&n );
    pollard( n );
    return 0;  
}

int pollard( int n ) {
    int i=1,x,y,k=2,d;
    x = rand()%n;
    y = x;

    while(1) {
        i++;
        x = f( x ) % n;
        d = gcd( y-x, n);

        if(d!=1 && d!=n)
            printf( "%d\n", d);

        if(i == k) {
            y = x;
            k = 2 * k;
        }
    }
}   
int gcd( int a, int b ) {

    if( b == 0) 
        return a;
    else 
        return gcd( b, a % b);
}

score 6 · Accepted Answer

One immediate problem, as Peter de Rivaz suspected, is the macro

#define f(x)  x*x-1

As a result, the line

x = f(x)%n;

becomes

x = x*x-1%n;

and % has higher precedence than -, so the expression is implicitly parenthesized as

x = (x*x) - (1%n);

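That is equivalent to x = x*x - 1; (assuming n > 1; in any case it is x = x*x - constant;), and if you start with a value x >= 2, the computation overflows before you have a realistic chance of finding a factor: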

2 -> 2*2-1 = 3 -> 3*3 - 1 = 8 -> 8*8 - 1 = 63 -> 3968 -> 15745023 -> overflow if int is 32 bits
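A minimal sketch of the precedence fix, assuming the intent is to reduce the whole of x*x - 1 modulo n. The names F_BROKEN, F_FIXED, step_broken and step_fixed are illustrative and do not appear in the question.

```c
/* The question's macro: the expansion is not parenthesized. */
#define F_BROKEN(x) x*x-1

/* Parenthesize each use of the parameter and the whole expansion. */
#define F_FIXED(x) ((x)*(x)-1)

/* Expands to x*x-1 % n, which the compiler parses as (x*x) - (1 % n). */
static int step_broken(int x, int n)
{
    return F_BROKEN(x) % n;
}

/* Expands to ((x)*(x)-1) % n, so the complete value is reduced. */
static int step_fixed(int x, int n)
{
    return F_FIXED(x) % n;
}
```

With x = 3 and n = 5 the broken step yields 8 while the fixed step yields 3. The fixed version still multiplies in int, so it does not address the overflow discussed here.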

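That does not immediately make it impossible for gcd(y-x,n) to be a factor, though. It just makes it likely that, at a stage where you would in theory have found a factor, the overflow destroys the common factor that mathematically exists; that is more likely than overflow introducing a common factor.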

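Overflow of signed integers is undefined behaviour, so there are no guarantees about how the program behaves. In practice it usually behaves consistently, so iterating f still produces a well-defined sequence to which the algorithm in principle applies.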

Another problem is that y-x will frequently be negative, and then the computed gcd can also be negative, often -1. In that case you print -1.
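One way to avoid the negative result, sketched under the assumption that the arguments stay of type int and are never INT_MIN (where abs is undefined). The name gcd_nonneg is hypothetical.

```c
#include <stdlib.h>

/* Euclid's algorithm on the magnitudes, so the result is never negative. */
static int gcd_nonneg(int a, int b)
{
    a = abs(a);
    b = abs(b);
    while (b != 0) {
        int t = a % b;   /* remainder step of Euclid's algorithm */
        a = b;
        b = t;
    }
    return a;
}
```

Calling gcd_nonneg(y - x, n) then gives the same answer whichever of x and y is larger.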

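Then, it is not rare for iterating f from a given starting value to fail to detect a common factor, because the cycles modulo the two prime factors (taking n to be a product of two distinct primes, for example) have equal length and are entered at the same time. You make no attempt to detect such a case; whenever gcd(|y-x|, n) == n, any further work on that sequence is pointless, so you should break out of the loop when d == n.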
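A sketch of the question's loop with that exit added. It keeps the question's scheme of saving y whenever i reaches a power of two, but as assumptions of this sketch it uses unsigned long long, f(x) = x*x + 1 mod n instead of x*x - 1, and a caller-supplied start value x0 in place of rand(). The names gcd_ull and pollard_run are illustrative.

```c
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
    while (b != 0) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* One run for 1 < n < 2^32 (so x*x + 1 fits in 64 bits).
   Returns a nontrivial factor, or 0 when the run reaches d == n
   and has to be abandoned. */
static unsigned long long pollard_run(unsigned long long n,
                                      unsigned long long x0)
{
    unsigned long long x = x0 % n, y = x, i = 1, k = 2;
    for (;;) {
        i++;
        x = (x * x + 1) % n;
        unsigned long long diff = x > y ? x - y : y - x;
        unsigned long long d = gcd_ull(diff, n);
        if (d == n)
            return 0;        /* sequence closed without a factor */
        if (d != 1)
            return d;        /* nontrivial factor found */
        if (i == k) {        /* save y at i = 2, 4, 8, ... */
            y = x;
            k *= 2;
        }
    }
}
```

A caller that gets 0 back can retry with a different x0 or a different function f.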

Also, you never check whether n is prime, in which case trying to find a factor is futile from the start.
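For the small n that fit in a 32-bit int, such a check could be as simple as trial division. This is only a sketch, and the name is_prime_small is hypothetical.

```c
#include <stdbool.h>

/* Trial division by 2 and by odd numbers up to sqrt(n). */
static bool is_prime_small(unsigned long long n)
{
    if (n < 2)
        return false;
    if (n % 2 == 0)
        return n == 2;
    /* p <= n / p is the same as p * p <= n but cannot overflow. */
    for (unsigned long long p = 3; p <= n / p; p += 2)
        if (n % p == 0)
            return false;
    return true;
}
```

Calling this before the rho loop lets the program report a prime input instead of searching for a factor that does not exist.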

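Furthermore, after fixing f(x) so that the % n applies to the complete result of f(x), x*x still overflows for relatively small x (with standard signed 32-bit ints, for x >= 46341), so factoring larger n may fail due to overflow. At the very least you should use unsigned long long for the computations, so that overflow is avoided for n < 2^32. However, numbers that small are usually factored more efficiently by trial division. Pollard's rho method and other advanced factoring algorithms are meant for larger numbers, where trial division is no longer efficient or even feasible.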
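Going one step beyond plain unsigned long long, the product itself can be reduced as it is built, so that x*x mod n stays exact for moduli larger than 2^32. The following is a sketch of binary double-and-add multiplication; the helper name mulmod and the bound n <= 2^63 are assumptions of this example, not something the answer prescribes.

```c
/* (a * b) mod n without a 128-bit intermediate.
   Assumes 0 < n <= 2^63, so that adding two residues cannot wrap. */
static unsigned long long mulmod(unsigned long long a, unsigned long long b,
                                 unsigned long long n)
{
    unsigned long long r = 0;
    a %= n;
    b %= n;
    while (b != 0) {
        if (b & 1) {          /* add the current multiple of a */
            r += a;
            if (r >= n)
                r -= n;
        }
        a += a;               /* double a modulo n */
        if (a >= n)
            a -= n;
        b >>= 1;
    }
    return r;
}
```

The price is up to 64 loop iterations per multiplication, which is slow compared with a single hardware multiply.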

score 3 · Accepted Answer

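I am only a novice at C++, and I am new to Stack Overflow, so some of what I have written may look sloppy, but this should get you going in the right direction. The program posted here should generally find and return one non-trivial factor of the number you enter at the prompt, or apologize if it cannot find such a factor.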

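I tested it with a few semiprimes, and it worked for me. For 371156167103, it found 607619 with no detectable delay after I hit the enter key. I did not check it with numbers larger than this. I used unsigned long long variables, but if possible, you should get and use a library that provides larger integer types.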

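Edited to add: the single call to the function f for x and the two such calls for y are intentional and match the way the algorithm works. I thought about nesting the call for y inside another such call to keep it on one line, but I decided to do it this way so it is easier to follow.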

#include "stdafx.h"  // Visual Studio precompiled header; remove this line on other toolchains
#include <stdio.h>
#include <iostream>
typedef unsigned long long ULL;

ULL pollard(ULL numberToFactor);
ULL gcd(ULL differenceBetweenCongruentFunctions, ULL numberToFactor);
ULL f(ULL x, ULL numberToFactor);

int main(void)
{
    ULL factor;
    ULL n;
    std::cout<<"Enter the number for which you want a prime factor: ";
    std::cin>>n;
    factor = pollard(n);
    if (factor == 0) std::cout<<"No factor found.  Your number may be prime, but it is not certain.\n\n";
    else std::cout<<"One factor is: "<<factor<<"\n\n";
}

ULL pollard(ULL n)
{
    ULL x = 2ULL;
    ULL y = 2ULL;
    ULL d = 1ULL;

    while(d==1||d==n)
    {
        x = f(x,n);
        y = f(y,n);
        y = f(y,n);
        if (y>x)
        {
            d = gcd(y-x, n);
        }
        else
        {
            d = gcd(x-y, n);
        }
    }

    return d;

}


ULL gcd(ULL a, ULL b)
{
    if (a==b||a==0)
        return 0;   // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
                    // a factor.  I think this is not proof of primality, so the process could be repeated with a new function.
                    // For example, by replacing x*x+1 with x*x+2, and so on.  If many such functions fail, primality is likely.

    ULL currentGCD = 1;
    while (currentGCD!=0) // This while loop is based on Euclid's algorithm
    {
        currentGCD = b % a;
        b=a;
        a=currentGCD;
    }

    return b;
}

ULL f(ULL x, ULL n)
{
    // Note: x * x wraps modulo 2^64 once x >= 2^32, so the result is only exact for n <= 2^32.
    return (x * x + 1) % n;
}

score 1 · Accepted Answer

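I apologize for the long delay in getting back to this. As I mentioned in my first answer, I am a novice at C++, which will be evident in my excessive use of global variables, my excessive use of BigIntegers and BigUnsigned where other types might be better, the lack of error checking, and other programming habits that a more skilled person might not exhibit. That being said, let me explain what I did, and then I will post the code.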

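I am doing this in a second answer because the first answer is useful as a very simple demonstration of how easy Pollard's rho algorithm is to implement once you understand what it does. What it does is first take two variables, call them x and y, and assign them the starting value 2. Then it runs x through a function, usually (x^2+1)%n, where n is the number you want to factor. It runs y through the same function twice in each cycle. Then the difference between x and y is calculated, and finally the greatest common divisor of this difference and n is found. If that number is 1, you run x and y through the function again.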

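Continue this process until the GCD is not 1 or until x and y are equal again. If a GCD other than 1 is found, that GCD is a non-trivial factor of n. If x and y become equal, the (x^2+1)%n function has failed. In that case you should try again with another function, perhaps (x^2+2)%n, and so on.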
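The cycle just described can be sketched in plain C for numbers below 2^32, where unsigned long long is wide enough for x*x + c. The names gcd_ull and rho_once are illustrative, and the constant c is passed in so that a caller can retry with (x^2+2)%n and so on.

```c
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
    while (b != 0) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* One pass with f(x) = (x*x + c) % n, for 1 < n < 2^32 and c < n.
   Returns a non-trivial factor, or 0 if x and y meet first. */
static unsigned long long rho_once(unsigned long long n, unsigned long long c)
{
    unsigned long long x = 2, y = 2, d = 1;
    while (d == 1) {
        x = (x * x + c) % n;      /* x advances one step */
        y = (y * y + c) % n;      /* y advances two steps */
        y = (y * y + c) % n;
        d = gcd_ull(x > y ? x - y : y - x, n);
    }
    return d == n ? 0 : d;        /* d == n means x == y: this c failed */
}
```

For example, rho_once(8051, 1) returns 97 on the third cycle, and a prime such as 13 makes it return 0.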

Here is an example. Take 35, whose prime factors we know are 5 and 7. I will walk through Pollard's rho and show you how it finds a non-trivial factor.

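Cycle #1: x starts at 2. Using the function (x^2+1)%n, (2^2+1)%35, we get 5 for x. y also starts at 2, and after one run through the function it also has the value 5. But y always goes through the function twice, so the second run is (5^2+1)%35, or 26. The difference between x and y is 21. The GCD of 21 (the difference) and 35 (n) is 7. We have already found a prime factor of 35! Note that the GCD of any two numbers, even ones with extremely large exponents, can be found very quickly using Euclid's algorithm, and that is what the program I post here does.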
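Written out as a worked equation, including the Euclidean division steps behind that GCD (the subscript notation is added here for illustration and is not part of the original answer):

```latex
\begin{aligned}
x_1 &= (2^2 + 1) \bmod 35 = 5, \qquad
y_1 = (5^2 + 1) \bmod 35 = 26, \qquad
|x_1 - y_1| = 21,\\
35 &= 1 \cdot 21 + 14,\\
21 &= 1 \cdot 14 + 7,\\
14 &= 2 \cdot 7 + 0 \;\Longrightarrow\; \gcd(21, 35) = 7.
\end{aligned}
```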

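About the GCD function: I am using a library I downloaded for this program, one that lets me use BigIntegers and BigUnsigned. That library also has a GCD function built in, and I could have used it. But I decided to stay with the hand-written GCD function for instructional purposes. If you want to improve the program's execution time, it might be a good idea to use the library's GCD function, because there are faster methods than Euclid's, and the library may be written to use one of them.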

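Another side note. The .Net 4.5 library also supports BigIntegers and BigUnsigned. I decided not to use it for this program because I wanted to write the whole thing in C++, not C++/CLI. You might get better performance from the .Net library, or you might not. I don't know, but I wanted to share that it is also an option.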

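I am jumping around a bit here, so let me first explain in broad strokes what the program does, and at the end I will explain how to set it up on your computer if you use Visual Studio 11 (also called Visual Studio 2012).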

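The program allocates 3 arrays to store the factors of whatever number you give it to process. These arrays are 1000 elements wide, which may be excessive, but it ensures that any number with 1000 or fewer prime factors will fit.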

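When you enter a number at the prompt, the program assumes it is composite and puts it in the first element of the compositeFactors array. It then goes through some admittedly inefficient while loops, which use Miller-Rabin to check whether the number is composite. Note that this test can say a number is composite with 100% confidence, or it can say the number is prime with extremely high (but not 100%) confidence. The confidence is adjustable through the variable confidenceFactor in the program. The program makes one check for every value between 2 and confidenceFactor, inclusive, so the total number of checks is one less than the value of confidenceFactor itself.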

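The setting I have for confidenceFactor is 101, which makes 100 checks. If the test says a number is prime, the odds that it is really composite are 1 in 4^100, the same as the odds of correctly calling a coin flip 200 times in a row. In short, if it says the number is prime, it probably is, but confidenceFactor can be increased to gain more confidence at the cost of speed.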
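For readers without the BigInteger library, the same scheme (try every base from 2 up to a bound) can be sketched with 64-bit integers. This is an illustration under stated assumptions, not the program's code: the names u64, mulmod, powmod and is_probable_prime are hypothetical, max_base plays the role of confidenceFactor, and n must not exceed 2^63.

```c
#include <stdbool.h>

typedef unsigned long long u64;

/* (a * b) mod n by double-and-add; assumes 0 < n <= 2^63. */
static u64 mulmod(u64 a, u64 b, u64 n)
{
    u64 r = 0;
    a %= n;
    b %= n;
    while (b != 0) {
        if (b & 1) { r += a; if (r >= n) r -= n; }
        a += a;
        if (a >= n) a -= n;
        b >>= 1;
    }
    return r;
}

/* base^exp mod n by square-and-multiply. */
static u64 powmod(u64 base, u64 exp, u64 n)
{
    u64 result = 1 % n;
    base %= n;
    while (exp != 0) {
        if (exp & 1) result = mulmod(result, base, n);
        base = mulmod(base, base, n);
        exp >>= 1;
    }
    return result;
}

/* Miller-Rabin with bases 2..max_base. Returns false when n is proven
   composite, true when no tested base proves it composite. */
static bool is_probable_prime(u64 n, u64 max_base)
{
    if (n < 2) return false;
    if (n < 4) return true;                 /* 2 and 3 */
    if (n % 2 == 0) return false;

    u64 d = n - 1;                          /* n - 1 = d * 2^s, d odd */
    unsigned s = 0;
    while (d % 2 == 0) { d /= 2; s++; }

    for (u64 a = 2; a <= max_base && a <= n - 2; a++) {
        u64 x = powmod(a, d, n);
        if (x == 1 || x == n - 1) continue; /* base a proves nothing */
        bool witness = true;                /* judged separately per base */
        for (unsigned r = 1; r < s; r++) {
            x = mulmod(x, x, n);
            if (x == n - 1) { witness = false; break; }
        }
        if (witness) return false;          /* a proves n composite */
    }
    return true;
}
```

With max_base = 101 this performs up to 100 checks, matching the setting described above.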

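This is probably as good a place as any to mention that, while Pollard's rho algorithm can be quite effective at factoring smaller numbers of type long long, the Miller-Rabin test for whether a number is composite would be more or less useless without the BigInteger and BigUnsigned types. A BigInteger library is pretty much a requirement for reliably factoring large numbers all the way down to their prime factors like this.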

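When Miller-Rabin says a factor is composite, it is factored, the factor found is stored in a temp array, and the original entry in the composites array is divided by that factor. When numbers are identified as probably prime, they are moved into the prime factors array and output to the screen. This process continues until no composite factors remain. The factors tend to come out in ascending order, but that is coincidental. The program makes no effort to list them in ascending order; it only lists them as it finds them.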
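The same split-until-prime workflow can be sketched compactly with a single work stack instead of three arrays. Everything here is an assumption made for illustration: inputs are limited to n < 2^32, trial division stands in for Miller-Rabin, and the names u64, gcd_u64, is_prime_td, split and factorize are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long u64;

static u64 gcd_u64(u64 a, u64 b)
{
    while (b != 0) { u64 t = a % b; a = b; b = t; }
    return a;
}

/* Stand-in primality test by trial division. */
static bool is_prime_td(u64 n)
{
    if (n < 2) return false;
    for (u64 p = 2; p <= n / p; p++)
        if (n % p == 0) return false;
    return true;
}

/* Returns some non-trivial factor of a composite n < 2^32. */
static u64 split(u64 n)
{
    if (n % 2 == 0) return 2;
    for (u64 c = 1; c <= 20; c++) {        /* new constant on each retry */
        u64 x = 2, y = 2, d = 1;
        while (d == 1) {
            x = (x * x + c) % n;
            y = (y * y + c) % n;
            y = (y * y + c) % n;
            d = gcd_u64(x > y ? x - y : y - x, n);
        }
        if (d != n) return d;
    }
    for (u64 p = 3; ; p += 2)              /* fallback: trial division */
        if (n % p == 0) return p;
}

/* Writes up to cap prime factors of n (with multiplicity, in the order
   found) into primes and returns how many prime factors n has. */
static size_t factorize(u64 n, u64 *primes, size_t cap)
{
    u64 stack[64];                         /* n < 2^32 has at most 31 factors */
    size_t top = 0, count = 0;
    if (n < 2) return 0;
    stack[top++] = n;
    while (top > 0) {
        u64 m = stack[--top];
        if (is_prime_td(m)) {
            if (count < cap) primes[count] = m;
            count++;
            continue;
        }
        u64 d = split(m);                  /* m is composite here */
        stack[top++] = d;
        stack[top++] = m / d;
    }
    return count;
}
```

As in the program described above, the factors come out in the order they are found, not sorted.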

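Note that I could not find any function (x^2+c)%n that will factor the number 4, no matter what value I gave c. Pollard's rho seems to have a hard time with all perfect squares, but 4 is the only composite number I found that is completely impervious to it with functions of the format described. I therefore added a check inside the pollard method for an n of 4, returning 2 immediately if so.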
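That observation can be checked by brute force for the starting value 2 used here. The sketch below counts how many constants c in [0, n) let a single pass find a factor; the names u64, gcd_u64, rho_once and count_working_c are illustrative, and the code assumes n < 2^32.

```c
typedef unsigned long long u64;

static u64 gcd_u64(u64 a, u64 b)
{
    while (b != 0) { u64 t = a % b; a = b; b = t; }
    return a;
}

/* One pass from x = y = 2 with f(x) = (x*x + c) % n.
   Returns a non-trivial factor, or 0 if x and y meet first. */
static u64 rho_once(u64 n, u64 c)
{
    u64 x = 2, y = 2, d = 1;
    while (d == 1) {
        x = (x * x + c) % n;
        y = (y * y + c) % n;
        y = (y * y + c) % n;
        d = gcd_u64(x > y ? x - y : y - x, n);
    }
    return d == n ? 0 : d;
}

/* Number of constants c in [0, n) for which rho_once finds a factor. */
static u64 count_working_c(u64 n)
{
    u64 hits = 0;
    for (u64 c = 0; c < n; c++)
        if (rho_once(n, c) != 0)
            hits++;
    return hits;
}
```

count_working_c(4) is 0, while odd squares such as 9 and 25 do have working constants. Since c only matters modulo n, checking c in [0, n) covers every constant for this starting value.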

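So, to set this program up, this is what you should do. Go to https://mattmccutchen.net/bigint/ and download bigint-2010.04.30.zip. Unzip it and put all of the .hh files and all of the C++ source files, excluding the Sample and C++ Testsuite source files, in your ~\Program Files\Microsoft Visual Studio 11.0\VC\include directory. Then, in Visual Studio, create an empty project. In the Solution Explorer, right click the Resource Files folder and select Add... Existing Item. Add all of the C++ source files in the directory I just mentioned. Then, also in the Solution Explorer, right click the Source Files folder and add a New Item, select C++ File, name it, and paste the source code below into it. It should then work for you.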

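Not to be overly flattering, but there are people on Stack Overflow who know a great deal more about C++ than I do, and if they modify the code below to make it better, that would be fantastic. Even if not, the code works as-is, and it should help illustrate the principles involved in programmatically finding the prime factors of medium-sized numbers. It will not threaten the general number field sieve, but it can factor numbers with 12 - 14 digit prime factors in a reasonably short time, even on an old Core2 Duo computer like the one I am using.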

The code follows. Good luck.

#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"

typedef BigInteger BI;
typedef BigUnsigned BU;

using std::string;
using std::cin;
using std::cout;

BU pollard(BU numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU x, BU numberToFactor, int increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU num);

BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;

int main ()
{
    while(1)
    {
        primeIndex=0;
        compositeIndex=0;
        tempIndex=0;
        initializeArrays();
        compositeFactors[0] = getNumberToFactor();
        cout<<"\n\n";
        if (compositeFactors[0] == 0) return 0;
        numberOfCompositeFactors = 1;
        factorComposites();
    }
}

void initializeArrays()
{
    for (int i = 0; i<1000;i++)
    {
        primeFactors[i] = 0;
        compositeFactors[i]=0;
        tempFactors[i]=0;
    }
}

BU getNumberToFactor ()
{
    std::string s;
    std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
    std::cin>>s;
    return stringToBigUnsigned(s);
}

void factorComposites()
{
    while (numberOfCompositeFactors!=0)
    {
        compositeIndex = 0;
        tempIndex = 0;

        // This while loop finds non-zero values in compositeFactors.
        // If they are composite, it factors them and puts one factor in tempFactors,
        // then divides the element in compositeFactors by the same amount.
        // If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
        while (compositeIndex < 1000)
        {
            if(compositeFactors[compositeIndex] == 0)
            {
                compositeIndex++;
                continue;
            }
            if(testForComposite(compositeFactors[compositeIndex]) == false)
            {
                tempFactors[tempIndex] = compositeFactors[compositeIndex];
                compositeFactors[compositeIndex] = 0;
                tempIndex++;
                compositeIndex++;
            }
            else
            {
                tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
                compositeFactors[compositeIndex] /= tempFactors[tempIndex];
                tempIndex++;
                compositeIndex++;
            }
        }
        compositeIndex = 0;

        // This while loop moves all remaining non-zero values from compositeFactors into tempFactors
        // When it is done, compositeFactors should be all 0 value elements
        while (compositeIndex < 1000)
        {
            if (compositeFactors[compositeIndex] != 0)
            {
                tempFactors[tempIndex] = compositeFactors[compositeIndex];
                compositeFactors[compositeIndex] = 0;
                tempIndex++;
                compositeIndex++;
            }
            else compositeIndex++;
        }
        compositeIndex = 0;
        tempIndex = 0;

        // This while loop checks all non-zero elements in tempIndex.
        // Those that are prime are shown on screen and moved to primeFactors
        // Those that are composite are moved to compositeFactors
        // When this is done, all elements in tempFactors should be 0
        while (tempIndex<1000)
        {
            if(tempFactors[tempIndex] == 0)
            {
                tempIndex++;
                continue;
            }
            if(testForComposite(tempFactors[tempIndex]) == false)
            {
                primeFactors[primeIndex] = tempFactors[tempIndex];
                cout<<primeFactors[primeIndex]<<"\n";
                tempFactors[tempIndex]=0;
                primeIndex++;
                tempIndex++;
            }
            else
            {
                compositeFactors[compositeIndex] = tempFactors[tempIndex];
                tempFactors[tempIndex]=0;
                compositeIndex++;
                tempIndex++;
            }
        }
        compositeIndex=0;
        numberOfCompositeFactors=0;

        // This while loop just checks to be sure there are still one or more composite factors.
        // As long as there are, the outer while loop will repeat
        while(compositeIndex<1000)
        {
            if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
            compositeIndex ++;
        }
    }
    return;
}

// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime

bool testForComposite (BU num)
{
    BU confidenceFactor = 101;
    if (confidenceFactor >= num) confidenceFactor = num-1;
    BU a,d,s, nMinusOne;
    nMinusOne=num-1;
    d=nMinusOne;
    s=0;
    while(modexp(d,1,2)==0)
    {
        d /= 2;
        s++;
    }
    for (BI i = 2 ; i<=confidenceFactor;i++)
    {
        allJTestsShowComposite = true; // reset for every base i: assume composite until this base shows otherwise
        if (modexp(i,d,num) == 1) 
            continue;  // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
        if (modexp(i,d,num) == nMinusOne)
        {
            allJTestsShowComposite = false;
            continue;
        }
        BU exponent(1);     
        for (BU j(0); j.toInt()<=s.toInt()-1;j++)
        {
            exponent *= 2;
            if (modexp(i,exponent*d,num) == nMinusOne)
            {
                // if the modulus is not right for even a single j, then break and increment i.
                allJTestsShowComposite = false;
                continue;
            }
        }
        if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
    }
    return false;
    /* not proven composite in any test, so assume prime with a possibility of error =
    (1/4)^(number of different values of i tested).  This will be equal to the value of the
    confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
    2 through the value of confidenceFactor.

    Note that this makes this primality test cryptographically less secure than it could be.  It is theoretically possible,
    if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
    detect that it is composite.  A safer way is to generate random integers in the outer "for" loop and use those in place of
    the variable i.  Better still if those random numbers are checked to ensure no duplicates are generated.
    */
}

BU pollard(BU n)
{
    if (n == 4) return 2;
    BU x = 2;
    BU y = 2;
    BU d = 1;
    int increment = 1;

    while(d==1||d==n||d==0)
    {
        x = f(x,n, increment);
        y = f(y,n, increment);
        y = f(y,n, increment);
        if (y>x)
        {
            d = gcda(y-x, n);
        }
        else
        {
            d = gcda(x-y, n);
        }
        if (d==0) 
        {
            x = 2;
            y = 2;
            d = 1;
            increment++; // This changes the pseudorandom function we use to increment x and y
        }
    }
    return d;
}


BU gcda(BU a, BU b)
{
    if (a==b||a==0)
        return 0;   // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
                    // a factor.  I think this is not proof of primality, so the process could be repeated with a new function.
                    // For example, by replacing x*x+1 with x*x+2, and so on.  If many such functions fail, primality is likely.

    BU currentGCD = 1;
    while (currentGCD!=0) // This while loop is based on Euclid's algorithm
    {
        currentGCD = b % a;
        b=a;
        a=currentGCD;
    }
    return b;
}

BU f(BU x, BU n, int increment)
{
    return (x * x + increment) % n;
}

score 0 · Accepted Answer

As far as I know, Pollard's rho normally uses (x*x+1) for f(x) (for example, in these lecture notes).

Your choice of x*x-1 appears not to be as good, since it often seems to get stuck in a loop:

 x=0
 f(x)=-1
 f(f(x))=0
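The two-step loop above can be reproduced in modular arithmetic, where -1 shows up as n - 1. This is a small sketch; the name f_minus is hypothetical and the code assumes 0 < n < 2^32.

```c
/* f(x) = x*x - 1 (mod n), kept non-negative by adding n before reducing. */
static unsigned long long f_minus(unsigned long long x, unsigned long long n)
{
    x %= n;
    return (x * x + n - 1) % n;
}
```

Starting from 0 the sequence goes 0, n - 1, 0, n - 1, and so on, because (n - 1)^2 - 1 = n^2 - 2n is a multiple of n. A start value of 1 lands in the same loop after one step.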
