3

到目前为止的答案

所以这里是代码分解。

//Time: ~7s (linear loop algorithm)
//100,000! (456,574 decimal digits)
BigInteger bigIntVar = computeFactorial(100000);

//The first three here are just for comparison and are not actually Base 10.
bigIntVar.ToBase64String() //Time: 00.001s | Base 64 | Tetrasexagesimal
bigIntVar.ToString("x")    //Time: 00.016s | Base 16 | Hexadecimal
bigIntVar.ToBinaryString() //Time: 00.026s | Base 02 | Binary
bigIntVar.ToQuickString()  //Time: 11.200s | Base 10 | String Version
bigIntVar.ToQuickString()  //Time: 12.500s | Base 10 | StringBuilder Version
bigIntVar.ToString()       //Time: 13.300s | Base 10 | Original

原始问题

我在这方面花了很多时间,所以我需要你的帮助。

这是用于计算巨大阶乘的个人项目(例如 100,000!)

这是我的代码:

using (var stream = new StreamWriter(fileName + ".txt", false))
{
    stream.WriteLine(header);

    var timer = new Stopwatch();    
    timer.Restart();
    //This is the huge BigInteger holding the answer to 100,000!
    stream.WriteLine(saveFactorial.Output.ToString());         
    //Let me be clear: ToString() is directly causing the the 13sec time delay.
    //Not the stream.
    timer.Stop();                   
}

time = (timer.ElapsedMilliseconds / 1000.0).ToString() + "s"; 

MessageBox.Show(time);

100,000!在我的机器上计算大约需要 7 秒(线性循环算法)。

然而,使用此标准 IO 代码需要 13 秒才能保存。

所以换句话说,保存工作比适度计算它需要更长的时间。

所以我想也许我可以使用:

BigInteger.ToByteArray();

尽管它运行得非常快,但我不知道如何将其保存为可读文本。

您可以使用上述方法将二进制字符串写入具有此自制扩展名的文本文件:

ToBinaryString

//Usage: string bigIntBinary = bigIntVar.ToBinaryString();
public static string ToBinaryString(this BigInteger source)
{
    //If you lookup the ToByteArray() method...
    //It actually stores the bytes in reverse order.
    var bigIntBytes = source.ToByteArray().Reverse();

    StringBuilder bigIntBinary = new StringBuilder();

    foreach (var bigIntByte in bigIntBytes)
    {
       bigIntBinary.Append(Convert.ToString(bigIntByte, 2).PadLeft(8, '0'));
    }

    return bigIntBinary.ToString();
}

ToBase64String

    ////Usage: string bigIntBase64 = bigIntVar.ToBase64String();
    public static string ToBase64String(this BigInteger source)
    {
        var bigIntBytes = source.ToByteArray().Reverse().ToArray();

        return Convert.ToBase64String(bigIntBytes);
    }

我还尝试了数学方法(mod 10 等...)来获取每个数字,但这需要比 ToString() 多 TON 的时间。

我在这里做错了什么?


这段代码是我根据下面的答案提出的。这比 ToString() 快,但只快了几秒钟。

ToQuickString

//Usage: string bigIntString = bigIntVar.ToQuickString()
public static String ToQuickString(this BigInteger source)
{
    powersOfTen = new List<BigInteger>();

    powersOfTen.Add(1);

    for (BigInteger i = 10; i < source; i *= i)
    {
        powersOfTen.Add(i);
    }

    return BuildString(source, powersOfTen.Count - 1).ToString().TrimStart('0');
}

private static List<BigInteger> powersOfTen;

private static string BuildString(BigInteger n, int m)
{
    if (m == 0)
        return n.ToString();

    BigInteger remainder;
    BigInteger quotient = BigInteger.DivRem(n, powersOfTen[m], out remainder);

    return BuildString(quotient, m - 1) + BuildString(remainder, m - 1);
}
4

2 回答 2

2

首先,我会计算所有10^(2^m)小于n. 然后我会使用DivRem其中最大的一个来将问题分成两个子问题。以递归方式重复该操作,直到您减少到单个数字。

var powersOfTen=new List<BigInteger>();
powersOfTen.Add(1);
for(BigInteger i=10;i<n;i=i*i)
  powersOfTen.Add(i);

string ToString(BigInteger n, int m)
{
  if(m==0)
    return n.ToString();
  quotient = DivRem(n,powersOfTen[m], remainder)
  return ToString(quotient, m-1)+ToString(remainder, m-1)
}

您还可以通过直接写入字符数组来完全优化字符串连接。


或者,您可以考虑在所有计算过程中使用基数 1000'000'000。这样,您最终根本不需要基本转换。对于阶乘计算,这可能要快得多。

List<int> multiply(List<int> f1, int f2)
{
  int carry=0;
  for(int i=0;i<f1.Count;i++)
  {
    var product=(Int64)f1[i]*(Int64)f2;
    carry=product/1000000000;
    result.Add(product%1000000000);
  }
  if(carry!=0)
    result.Add(carry);
}

现在转换为以 10 为底的字符串既简单又便宜。

于 2012-10-31T09:26:38.143 回答
1

以二进制或十六进制格式保存 BigInteger 数据。它对计算机和足够敬业的人来说是可读的。;>

花费额外的精力使输出“人类可读”是浪费时间。没有人能够理解 450,000 个数字,无论它们是以 10 为底、以 16 为底、以 2 为底还是其他任何数字。

更新

更仔细地研究 Base 10 转换,可以在多核系统上使用多线程将 ToString 的基线性能降低几乎一半。主要障碍是整个十进制化过程中最大的时间消耗是对原始 450k 数字的第一次除法运算。

Stats on my quad core P7: 
Generating a 500k digit random number using power and multiply: 5 seconds
Dividing that big number by anything just once: 11 seconds
ToString(): 22 seconds
ToQuickString: 18 seconds
ToStringMT: 12.9 seconds

.

public static class BigIntExtensions
{
    private static List<BigInteger> powersOfTen;

    // Must be called before ToStringMt()
    public static void InitPowersOfTen(BigInteger n)
    {
        powersOfTen = new List<BigInteger>();

        powersOfTen.Add(1);

        for (BigInteger i = 10; i < n; i *= i)
            powersOfTen.Add(i);
    }

    public static string ToStringMT(this BigInteger n)
    {
        // compute the index into the powersOfTen table for the given parameter. This is very fast.
        var m = (int)Math.Ceiling(Math.Log(BigInteger.Log10(n), 2));

        BigInteger r1;
        // the largest amount of execution time happens right here:
        BigInteger q1 = BigInteger.DivRem(n, BigIntExtensions.powersOfTen[m], out r1);

        // split the remaining work across 4 threads - 3 new threads plus the current thread
        var t1 = Task.Factory.StartNew<string>(() =>
        {
            BigInteger r1r2;
            BigInteger r1q2 = BigInteger.DivRem(r1, BigIntExtensions.powersOfTen[m - 1], out r1r2);
            var t2 = Task.Factory.StartNew<string>(() => BuildString(r1r2, m - 2));
            return BuildString(r1q2, m - 2) + t2.Result;
        });
        BigInteger q1r2;
        BigInteger q1q2 = BigInteger.DivRem(q1, BigIntExtensions.powersOfTen[m - 1], out q1r2);
        var t3 = Task.Factory.StartNew<string>(() => BuildString(q1r2, m - 2));
        var sb = new StringBuilder();
        sb.Append(BuildString(q1q2, m - 2));
        sb.Append(t3.Result);
        sb.Append(t1.Result);
        return sb.ToString();
    }

    // same as ToQuickString, but bails out before m == 0 to reduce call overhead.
    // BigInteger.ToString() is faster than DivRem for smallish numbers.
    private static string BuildString(BigInteger n, int m)
    {
        if (m <= 8)
            return n.ToString();

        BigInteger remainder;
        BigInteger quotient = BigInteger.DivRem(n, powersOfTen[m], out remainder);
        return BuildString(quotient, m - 1) + BuildString(remainder, m - 1);
    }
}

对于 ToQuickString() 和 ToStringMT(),需要在使用这些函数之前初始化 10 次方数组。初始化此数组不应包含在函数执行时间测量中,因为该数组可以在后续调用中重复使用,因此它的初始化成本在程序的生命周期内摊销,而不是单个函数调用。

对于生产系统,我会设置更自动的初始化,例如在类静态构造函数中初始化合理数量的条目,然后检查 ToQuickString() 或 ToStringMT() 以查看表中是否有足够的条目来处理给定大整数。如果没有,请向表中添加足够的条目来处理当前的 BigInteger,然后继续操作。

This ToStringMT function constructs the worker tasks manually to spread the remaining work out across 4 threads on the available execution cores in a multi core CPU. You could instead just make the original ToQuickString() function spin off half of its work into another thread on each recursion, but this quickly creates too many tasks and gets bogged down in task scheduling overhead. The recursion drills all the way down to individual decimal digits. I modified the BuildString() function to bail out earlier (m <= 8 instead of m == 0) because BigInteger.ToString() is faster than DivRem for smallish numbers.

90% of ToStringMt()'s execution time is taken up by the first DivRem call. It converges very quickly after that, but the first one is really painful.

于 2012-11-02T17:26:32.440 回答