2

Can someone help me compute (A*B)%C, where 1 <= A, B, C <= 10^18, in C++? No big-number library, just a mathematical approach.

4

2 Answers

6

Off the top of my head (not extensively tested):

typedef unsigned long long BIG;
BIG mod_multiply( BIG A, BIG B, BIG C )
{
    BIG mod_product = 0;
    A %= C;

    while (A) {
        B %= C;
        if (A & 1) mod_product = (mod_product + B) % C;
        A >>= 1;
        B <<= 1;
    }

    return mod_product;
}

This takes O(log A) iterations. You can probably replace most of the % operations with conditional subtraction for better performance.

typedef unsigned long long BIG;
BIG mod_multiply( BIG A, BIG B, BIG C )
{
    BIG mod_product = 0;
    // A %= C; may or may not help performance
    B %= C;

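    // Assumes C < 2^63, so that doubling a value below C cannot overflow.
    while (A) {
        if (A & 1) {
            mod_product += B;
            // >= rather than >: a sum equal to C must reduce to 0
            if (mod_product >= C) mod_product -= C;
        }
        A >>= 1;
        B <<= 1;
        // >= rather than >: keeps the invariant B < C
        if (B >= C) B -= C;
    }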

    return mod_product;
}

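This version performs only a single long integer modulo. It may even be faster than the large-chunk method, depending on how your processor implements integer modulo.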
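Since the code above is described as not extensively tested, one way to gain some confidence is to compare it against a 128-bit reference. The following is a minimal sketch, assuming a compiler that provides the non-standard `unsigned __int128` extension (GCC and Clang do). The names `mod_multiply_reference` and `agrees_with_reference` are hypothetical helpers, not part of the answer. The loop is restated with `>=` in the conditional subtractions so that a value equal to C reduces to 0.

```cpp
#include <cassert>
#include <cstdint>

// Shift-and-add modular multiplication, restated from the answer above.
// Assumes 1 <= C < 2^63 so that doubling a value below C cannot overflow.
std::uint64_t mod_multiply(std::uint64_t A, std::uint64_t B, std::uint64_t C)
{
    std::uint64_t mod_product = 0;
    B %= C;
    while (A) {
        if (A & 1) {
            mod_product += B;                       // < 2C, fits in 64 bits
            if (mod_product >= C) mod_product -= C;
        }
        A >>= 1;
        B <<= 1;                                    // < 2C, fits in 64 bits
        if (B >= C) B -= C;
    }
    return mod_product;
}

// Hypothetical reference: relies on the unsigned __int128 compiler
// extension (an assumption; it is not standard C++).
std::uint64_t mod_multiply_reference(std::uint64_t A, std::uint64_t B, std::uint64_t C)
{
    return static_cast<std::uint64_t>(static_cast<unsigned __int128>(A) * B % C);
}

// Hypothetical helper: true if both routines agree on one input triple.
bool agrees_with_reference(std::uint64_t A, std::uint64_t B, std::uint64_t C)
{
    return mod_multiply(A, B, C) == mod_multiply_reference(A, B, C);
}
```

Calling `agrees_with_reference` on random triples within the question's range (values up to 10^18) gives a quick sanity check. It does not prove correctness.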

  • Live demo: https://ideone.com/1pTldb (same results as Yakk's).
answered 2013-06-24T19:47:50.703
0

An implementation of an earlier Stack Overflow answer:

#include <stdint.h>
#include <tuple>
#include <iostream>

typedef std::tuple< uint32_t, uint32_t > split_t;
split_t split( uint64_t a )
{
  static const uint32_t mask = -1;
  auto retval = std::make_tuple( mask&a, ( a >> 32 ) );
  // std::cout << "(" << std::get<0>(retval) << "," << std::get<1>(retval) << ")\n";
  return retval;
}

typedef std::tuple< uint64_t, uint64_t, uint64_t, uint64_t > cross_t;
template<typename Lambda>
cross_t cross( split_t lhs, split_t rhs, Lambda&& op )
{
  return std::make_tuple( 
    op(std::get<0>(lhs), std::get<0>(rhs)),
    op(std::get<1>(lhs), std::get<0>(rhs)),
    op(std::get<0>(lhs), std::get<1>(rhs)),
    op(std::get<1>(lhs), std::get<1>(rhs))
  );
}

// c must have high bit unset:
uint64_t a_times_2_k_mod_c( uint64_t a, unsigned k, uint64_t c )
{
  a %= c;
  for (unsigned i = 0; i < k; ++i)
  {
    a <<= 1;
    a %= c;
  }
  return a;
}

// c must have about 2 high bits unset:
uint64_t a_times_b_mod_c( uint64_t a, uint64_t b, uint64_t c )
{
  // ensure a and b are < c:
  a %= c;
  b %= c;
  
  auto Z = cross( split(a), split(b), [](uint32_t lhs, uint32_t rhs)->uint64_t {
    return (uint64_t)lhs * (uint64_t)rhs;
  } );
  
  uint64_t to_the_0;
  uint64_t to_the_32_a;
  uint64_t to_the_32_b;
  uint64_t to_the_64;
  std::tie( to_the_0, to_the_32_a, to_the_32_b, to_the_64 ) = Z;
  
  // std::cout << to_the_0 << "+ 2^32 *(" << to_the_32_a << "+" << to_the_32_b << ") + 2^64 * " << to_the_64 << "\n";
  
  // this line is the one that requires 2 high bits in c to be clear
  // if you just add 2 of them then do a %c, then add the third and do
  // a %c, you can relax the requirement to "one high bit must be unset".
  // to_the_0 is reduced first: it can be close to 2^64, so adding the
  // other two terms to it unreduced could overflow.
  return
    (to_the_0 % c
    + a_times_2_k_mod_c(to_the_32_a+to_the_32_b, 32, c) // + will not overflow!
    + a_times_2_k_mod_c(to_the_64, 64, c) )
  %c;
}

int main()
{
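  // NOTE: 19010000000000000000 is larger than 2^64 - 1 (18446744073709551615),
  // so it does not fit in uint64_t; the question only needs values up to 10^18.
  uint64_t retval = a_times_b_mod_c( 19010000000000000000, 1011000000000000, 1231231231231211 );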
  std::cout << retval << "\n";
}

The idea here is to split your 64-bit integers into pairs of 32-bit integers, which are safe to multiply in the 64-bit domain.

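We express a*b as (a_high * 2^32 + a_low) * (b_high * 2^32 + b_low) and do the four multiplications (keeping track of the 2^32 factors without storing them in our bits). Then note that a * 2^k % c can be computed by k repetitions of the pattern ((a*2 %c) *2 %c)... So we can take this 3-to-4-term polynomial in 2^32, whose coefficients are 64-bit integers, and reduce it without having to worry about overflow.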
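Written out, the expansion looks like this. The mapping to the variables in the code (`to_the_0`, `to_the_32_a`, `to_the_32_b`, `to_the_64`) is my reading of the `cross` call, so treat it as a sketch rather than the author's own derivation:

```latex
\begin{aligned}
a \cdot b
  &= (a_{high} \cdot 2^{32} + a_{low})(b_{high} \cdot 2^{32} + b_{low}) \\
  &= \underbrace{a_{high}\, b_{high}}_{\texttt{to\_the\_64}} \cdot 2^{64}
   + \bigl(\underbrace{a_{high}\, b_{low}}_{\texttt{to\_the\_32\_a}}
         + \underbrace{a_{low}\, b_{high}}_{\texttt{to\_the\_32\_b}}\bigr) \cdot 2^{32}
   + \underbrace{a_{low}\, b_{low}}_{\texttt{to\_the\_0}} \\[4pt]
a \cdot b \bmod c
  &= \Bigl( (a_{low}\, b_{low} \bmod c)
   + \bigl((a_{high}\, b_{low} + a_{low}\, b_{high}) \cdot 2^{32} \bmod c\bigr)
   + \bigl(a_{high}\, b_{high} \cdot 2^{64} \bmod c\bigr) \Bigr) \bmod c
\end{aligned}
```

Each of the four partial products is a product of two 32-bit values, so it fits in 64 bits. The factors 2^32 and 2^64 are the ones applied by repeated doubling modulo c.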

The expensive part is the a_times_2_k_mod_c function (the only loop).

You can make it many times faster if you know that c has more than one high bit clear.
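One possible reading of that remark, as a rough sketch: if c has n leading zero bits, any value below c can be shifted left by n bits without overflowing, so the loop can advance n bits per modulo instead of one. The names `leading_zeros` and `a_times_2_k_mod_c_chunked` are hypothetical, and this is not necessarily what the author had in mind.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading zero bits of a nonzero 64-bit value (portable loop).
unsigned leading_zeros(std::uint64_t x)
{
    unsigned n = 0;
    while (!(x >> 63)) { x <<= 1; ++n; }
    return n;
}

// Computes (a * 2^k) % c, shifting by several bits per step.
// Assumes c != 0 and c < 2^63 (at least one high bit clear).
std::uint64_t a_times_2_k_mod_c_chunked(std::uint64_t a, unsigned k, std::uint64_t c)
{
    assert(c != 0 && (c >> 63) == 0);
    // a < c < 2^(64 - step), so a << step still fits in 64 bits.
    const unsigned step = leading_zeros(c);
    a %= c;
    while (k > 0) {
        const unsigned s = k < step ? k : step;  // do not shift past k
        a <<= s;
        a %= c;
        k -= s;
    }
    return a;
}
```

With c up to 10^18 (below 2^60) this gives at least 4 bits per step, and more for smaller moduli.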

Alternatively, you could replace the a %= c inside the loop with a subtraction: a -= (a>=c)*c;
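A minimal sketch of that substitution in the one-bit-per-step loop. The function name `a_times_2_k_mod_c_sub` is made up for illustration. It relies on a being below c before each doubling, so a single subtraction is always enough.

```cpp
#include <cassert>
#include <cstdint>

// Computes (a * 2^k) % c with one % up front; every doubling is then
// reduced by a single branch-free subtraction.
// Assumes c != 0 and c < 2^63 (high bit of c unset).
std::uint64_t a_times_2_k_mod_c_sub(std::uint64_t a, unsigned k, std::uint64_t c)
{
    assert(c != 0 && (c >> 63) == 0);
    a %= c;                      // now a < c
    for (unsigned i = 0; i < k; ++i) {
        a <<= 1;                 // a < 2c < 2^64, no overflow
        a -= (a >= c) * c;       // subtract c once if needed, so a < c again
    }
    return a;
}
```

Whether this beats the hardware modulo depends on the processor and compiler, so it is worth measuring.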

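Doing both at once isn't all that practical: after a multi-bit shift, a single conditional subtraction is no longer enough to bring a back below c.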

Live example

answered 2013-06-24T20:49:46.373