algorithm - 找到不小于 N 的最小正则数

Question

正则数是 60 次幂的整数。例如，60 ² = 3600 = 48 × 75，因此 48 和 75 都是 60 次幂的除数。因此，它们也是正则数。

这是向上取整到 2 的下一个幂的扩展。

我有一个整数值N，它可能包含大的素因数，我想将它四舍五入为仅由小素因数（2、3 和 5）组成的数字

例子：

f(18) == 18 == 2¹ * 3²
f(19) == 20 == 2² * 5¹
f(257) == 270 == 2¹ * 3³ * 5¹

找到满足此要求的最小数字的有效方法是什么？

涉及的值可能很大，所以我想避免从 1 开始枚举所有常规数字或维护所有可能值的数组。

score 6 · Accepted Answer

通过直接枚举三元组，我们可以及时地在第 n 个成员周围产生任意细的汉明序列切片，例如.~ n^(2/3)(i,j,k)N = 2^i * 3^j * 5^k

该算法从log2(N) = i+j*log2(3)+k*log2(5); 枚举所有可能k的 s 并且对于每个可能j的 s，找到顶部i，因此找到三元组(k,j,i)，如果在给定的“宽度”内低于给定的高对数最大值（当width< 1 时，最多可以有一个这样i的 ) 然后按它们的对数对它们进行排序。

WP 这么说，n ~ (log N)^3即运行时间~ (log N)^2。这里我们不关心找到的三元组在序列中的确切位置，因此可以丢弃原始代码中的所有计数计算：

slice hi w = sortBy (compare `on` fst) b where       -- hi>log2(N) is a top value
  lb5=logBase 2 5 ; lb3=logBase 2 3                  -- w<1 (NB!) is log2(width)
  b  = concat                                        -- the slice
      [ [ (r,(i,j,k)) | frac < w ]                   -- store it, if inside width
        | k <- [ 0 .. floor ( hi   /lb5) ],  let p = fromIntegral k*lb5,
          j <- [ 0 .. floor ((hi-p)/lb3) ],  let q = fromIntegral j*lb3 + p,
          let (i,frac)=properFraction(hi-q) ;    r = hi - frac ]   -- r = i + q
                    -- properFraction 12.7 == (12, 0.7)

-- update: in pseudocode:
def slice(hi, w):
    lb5, lb3 = logBase(2, 5), logBase(2, 3)  -- logs base 2 of 5 and 3
    for k from 0 step 1 to floor(hi/lb5) inclusive:
        p = k*lb5
        for j from 0 step 1 to floor((hi-p)/lb3) inclusive:
           q = j*lb3 + p
           i = floor(hi-q)
           frac = hi-q-i                     -- frac < 1 , always
           r = hi - frac                     -- r == i + q
           if frac < w:
              place (r,(i,j,k)) into the output array
   sort the output array's entries by their "r" component
        in ascending order, and return thus sorted array

枚举切片中的三元组后，排序和搜索就很简单了，实际上需要O(1)时间（对于任意薄切片）来找到上面的第一个三元组N。(i,j,k)好吧，实际上，对于恒定宽度（对数），切片中的数字数量（平面下方 - 空间中的“上地壳”成员log(N)）再次出现m ~ n^2/3 ~ (log N)^2，并且排序需要m log m时间（因此搜索，即使是线性的，也需要~ m运行那么时间）。N但是，根据一些经验观察，可以将 s 越大的宽度越小；无论如何，三元组枚举的常数因子远高于随后的排序。

即使宽度恒定（对数），它也运行得非常快，可以立即计算汉明序列中的第 1,000,000 个值，并在 0.05 秒内计算出第 10亿个值。

正如我在 2008 年关于 DDJ 博客讨论的帖子中所引用的那样，“三重奏顶级乐队”的最初想法是由 Louis Klauder 提出的。

更新：正如GordonBGood在评论中指出的那样，不需要整个波段，而只需在目标上方和下方大约一两个值。该算法很容易修改为该效果。在继续算法之前，还应测试输入本身是否为汉明数，以避免双精度舍入问题。比较预先知道不同的汉明数的对数没有四舍五入问题（尽管在序列中达到万亿分之一的对数值使用大约 14 位有效数字，只剩下 1-2 位备用，所以那里的情况实际上可能变得不确定；但对于十亿分之一，我们只需要 11 个有效数字）。

update2：结果是对数的双精度将其限制为大约 20,000 到 40,000 个十进制数字以下的数字（即第 10 万亿到第 100 万亿汉明数）。如果确实需要对如此大的数字进行此操作，则可以将算法切换回使用整数值本身而不是它们的对数，这会更慢。

score 5 · Accepted Answer

好的，希望第三次在这里有魅力。用于 p 的初始输入的递归分支算法，其中 N 是在每个线程中“构建”的数字。NB 3a-c 这里作为单独的线程启动或以其他方式（准）异步完成。

计算 p 之后的 2 的次大幂，称为 R。N = p。
N > R？退出这个话题。p 是否仅由小的素因数组成？你完成了。否则，转到步骤 3。
在 3a-c 中的任何一个之后，转到步骤 4。

a) 将 p 向上舍入到 2 的最接近的倍数。这个数可以表示为 m * 2。
b) 将 p 向上舍入到 3 的最接近的倍数。这个数可以表示为 m * 3。
c) 将 p 向上舍入到最接近 5 的倍数。这个数可以表示为 m * 5。
转到步骤 2，p = m。

我省略了关于跟踪 N 的簿记，但我认为这相当简单。

编辑：忘记 6，谢谢 ypercube。

编辑 2：如果达到 30，（5、6、10、15、30）意识到这是不必要的，就把它拿出来。

编辑 3：（我保证的最后一个！）添加了 30 次幂检查，这有助于防止该算法耗尽所有 RAM。

编辑 4：根据 finnw 的观察，将 30 的幂更改为 2 的幂。

score 5 · Accepted Answer

这是 Python 中的一个解决方案，基于Will Ness 的回答，但采用了一些捷径并使用纯整数数学来避免遇到对数空间数值精度错误：

import math

def next_regular(target):
    """
    Find the next regular number greater than or equal to target.
    """
    # Check if it's already a power of 2 (or a non-integer)
    try:
        if not (target & (target-1)):
            return target
    except TypeError:
        # Convert floats/decimals for further processing
        target = int(math.ceil(target))

    if target <= 6:
        return target

    match = float('inf') # Anything found will be smaller
    p5 = 1
    while p5 < target:
        p35 = p5
        while p35 < target:
            # Ceiling integer division, avoiding conversion to float
            # (quotient = ceil(target / p35))
            # From https://stackoverflow.com/a/17511341/125507
            quotient = -(-target // p35)

            # Quickly find next power of 2 >= quotient
            # See https://stackoverflow.com/a/19164783/125507
            try:
                p2 = 2**((quotient - 1).bit_length())
            except AttributeError:
                # Fallback for Python <2.7
                p2 = 2**(len(bin(quotient - 1)) - 2)

            N = p2 * p35
            if N == target:
                return N
            elif N < match:
                match = N
            p35 *= 3
            if p35 == target:
                return p35
        if p35 < match:
            match = p35
        p5 *= 5
        if p5 == target:
            return p5
    if p5 < match:
        match = p5
    return match

英文：遍历 5s 和 3s 的每一个组合，快速找到每对 2 >= target 的下一个幂并保持最小的结果。（如果只有一个是正确的，那么遍历每个可能的 2 的倍数是浪费时间）。如果它发现目标已经是一个常规数字，它也会提前返回，尽管这不是绝对必要的。

我已经对它进行了非常彻底的测试，测试了从 0 到 51200000 的每个整数，并与 OEIS http://oeis.org/A051037上的列表以及许多与常规数字相差±1的大数等进行了比较。现在是在 SciPy 中可用fftpack.helper.next_fast_len，以找到 FFT 的最佳大小（源代码）。

我不确定 log 方法是否更快，因为我无法让它足够可靠地工作以对其进行测试。我认为它有类似数量的操作，虽然？我不确定，但这相当快。^{计算 2 142} × 3 ⁸⁰ × 5 ^{444是 2}² × 3 ⁴⁵⁴ × 5 ²⁴⁹ +1 （第 100,000,000 个常规数字，有 392 位）之上的下一个常规数字需要 <3 秒（或 gmpy 为 0.7 秒）

score 3 · Accepted Answer

您想找到最小的数字，m即all 。m >= Nm = 2^i * 3^j * 5^ki,j,k >= 0

取对数，方程可以改写为：

 log m >= log N
 log m = i*log2 + j*log3 + k*log5

您可以计算log2、log3和log5至logN（足够高，取决于的大小N）精度。那么这个问题看起来像一个整数线性规划问题，你可以尝试使用一种已知的算法来解决这个 NP-hard 问题。

score 1 · Accepted Answer

编辑/更正： 更正了代码以通过scipy 测试：

Here's an answer based on endolith's answer，但几乎消除了通过使用 float64 对数表示进行基本比较以找到通过标准的三重值的长多精度整数计算，只有在对数有机会时才诉诸全精度比较值可能不够准确，仅当目标非常接近前一个或下一个常规数字时才会发生：

import math

def next_regulary(target):
    """
    Find the next regular number greater than or equal to target.
    """
    if target < 2: return ( 0, 0, 0 )
    log2hi = 0
    mant = 0
    # Check if it's already a power of 2 (or a non-integer)
    try:
        mant = target & (target - 1)
        target = int(target) # take care of case where not int/float/decimal
    except TypeError:
        # Convert floats/decimals for further processing
        target = int(math.ceil(target))
        mant = target & (target - 1)

    # Quickly find next power of 2 >= target
    # See https://stackoverflow.com/a/19164783/125507
    try:
        log2hi = target.bit_length()
    except AttributeError:
        # Fallback for Python <2.7
        log2hi = len(bin(target)) - 2

    # exit if this is a power of two already...
    if not mant: return ( log2hi - 1, 0, 0 )

    # take care of trivial cases...
    if target < 9:
        if target < 4: return ( 0, 1, 0 )
        elif target < 6: return ( 0, 0, 1 )
        elif target < 7: return ( 1, 1, 0 )
        else: return ( 3, 0, 0 )

    # find log of target, which may exceed the float64 limit...
    if log2hi < 53: mant = target << (53 - log2hi)
    else: mant = target >> (log2hi - 53)
    log2target = log2hi + math.log2(float(mant) / (1 << 53))

    # log2 constants
    log2of2 = 1.0; log2of3 = math.log2(3); log2of5 = math.log2(5)

    # calculate range of log2 values close to target;
    # desired number has a logarithm of log2target <= x <= top...
    fctr = 6 * log2of3 * log2of5
    top = (log2target**3 + 2 * fctr)**(1/3) # for up to 2 numbers higher
    btm = 2 * log2target - top # or up to 2 numbers lower

    match = log2hi # Anything found will be smaller
    result = ( log2hi, 0, 0 ) # placeholder for eventual matches
    count = 0 # only used for debugging counting band
    fives = 0; fiveslmt = int(math.ceil(top / log2of5))
    while fives < fiveslmt:
        log2p = top - fives * log2of5
        threes = 0; threeslmt = int(math.ceil(log2p / log2of3))
        while threes < threeslmt:
            log2q = log2p - threes * log2of3
            twos = int(math.floor(log2q)); log2this = top - log2q + twos

            if log2this >= btm: count += 1 # only used for counting band
            if log2this >= btm and log2this < match:
                # logarithm precision may not be enough to differential between
                # the next lower regular number and the target, so do
                # a full resolution comparison to eliminate this case...
                if (2**twos * 3**threes * 5**fives) >= target:
                    match = log2this; result = ( twos, threes, fives )
            threes += 1
        fives += 1

    return result

print(next_regular(2**2 * 3**454 * 5**249 + 1)) # prints (142, 80, 444)

由于消除了大多数长的多精度计算，因此不需要 gmpy，并且在 IDEOne 上，对于 endolith 的解决方案，上述代码需要 0.11 秒而不是 0.48 秒来找到下一个大于百万分之一的常规数字，如图所示；(761,572,489)找到十亿之后的下一个常规数字（下一个是过去）需要 0.49 秒而不是 5.48 秒(1334,335,404) + 1，并且随着范围的增加，与 endolith 版本相比，多精度计算变得越来越长，差异会变得更大这里几乎没有。因此，这个版本可以在 IDEOne 上大约 50 秒内从序列中的万亿分之一计算出下一个常规数字，而 endolith 版本可能需要一个多小时。

该算法的英文描述与 endolith 版本几乎相同，不同之处如下： 1）计算参数目标值的浮点对数估计（我们不能使用内置log直接函数，因为范围可能太大而无法表示为 64 位浮点数），2) 比较日志表示值以确定在估计范围内的限定值，高于和低于大约两个或三个数字的目标值（取决于在四舍五入时），3）仅在上述定义的窄带内比较多精度值，4）输出三重索引而不是完整的长多精度整数（对于十亿分之一，将是大约 840 个十进制数字，是万亿分之一的十倍），然后可以根据需要轻松转换为长多精度值。

除了用于可能非常大的多精度整数目标值、大约相同大小的中间评估比较值以及三元组的输出扩展（如果需要）之外，该算法几乎不使用任何内存。该算法是对 endolith 版本的改进，它成功地将对数值用于大多数比较，尽管它们缺乏精度，并且它将比较数字的范围缩小到只有几个。

该算法适用于略高于 10 万亿的参数范围（在 IDEOne 速率下只需几分钟的计算时间），因为根据@WillNess 的讨论，由于日志表示值缺乏精度，它不再正确；为了解决这个问题，我们可以将日志表示更改为由固定长度整数组成的“roll-your-own”对数表示（124 位大约是指数范围的两倍，适用于超过十万位的目标，如果一个愿意等待）；由于较小的多精度整数运算比 float64 运算慢，所以这会慢一点，但由于大小有限（可能慢三倍左右），所以速度不会慢很多。

现在这些 Python 实现（不使用 C 或 Cython 或 PyPy 或其他东西）都不是特别快，因为它们比用编译语言实现慢大约一百倍。作为参考，这里是一个 Haskell 版本：

{-# OPTIONS_GHC -O3 #-}

import Data.Word
import Data.Bits

nextRegular :: Integer -> ( Word32, Word32, Word32 )
nextRegular target
  | target < 2                   = ( 0, 0, 0 )
  | target .&. (target - 1) == 0 = ( fromIntegral lg2hi - 1, 0, 0 )
  | target < 9                   = case target of
                                     3 -> ( 0, 1, 0 )
                                     5 -> ( 0, 0, 1 )
                                     6 -> ( 1, 1, 0 )
                                     _ -> ( 3, 0, 0 )
  | otherwise                    = match
 where
  lg3 = logBase 2 3 :: Double; lg5 = logBase 2 5 :: Double
  lg2hi = let cntplcs v cnt =
                let nv = v `shiftR` 31 in
                if nv <= 0 then
                  let cntbts x c =
                        if x <= 0 then c else
                        case c + 1 of
                          nc -> nc `seq` cntbts (x `shiftR` 1) nc in
                  cntbts (fromIntegral v :: Word32) cnt
                else case cnt + 31 of ncnt -> ncnt `seq` cntplcs nv ncnt
          in cntplcs target 0
  lg2tgt = let mant = if lg2hi <= 53 then target `shiftL` (53 - lg2hi)
                      else target `shiftR` (lg2hi - 53)
           in fromIntegral lg2hi +
                logBase 2 (fromIntegral mant / 2^53 :: Double)
  lg2top = (lg2tgt^3 + 2 * 6 * lg3 * lg5)**(1/3) -- for 2 numbers or so higher
  lg2btm = 2* lg2tgt - lg2top -- or two numbers or so lower
  match =
    let klmt = floor (lg2top / lg5)
        loopk k mtchlgk mtchtplk =
          if k > klmt then mtchtplk else
          let p = lg2top - fromIntegral k * lg5
              jlmt = fromIntegral $ floor (p / lg3)
              loopj j mtchlgj mtchtplj =
                if j > jlmt then loopk (k + 1) mtchlgj mtchtplj else
                let q = p - fromIntegral j * lg3
                    ( i, frac ) = properFraction q; r = lg2top - frac
                    ( nmtchlg, nmtchtpl ) =
                      if r < lg2btm || r >= mtchlgj then
                        ( mtchlgj, mtchtplj ) else
                      if 2^i * 3^j * 5^k >= target then
                        ( r, ( i, j, k ) ) else ( mtchlgj, mtchtplj )
                in nmtchlg `seq` nmtchtpl `seq` loopj (j + 1) nmtchlg nmtchtpl
          in loopj 0 mtchlgk mtchtplk
    in loopk 0 (fromIntegral lg2hi) ( fromIntegral lg2hi, 0, 0 )


trival :: ( Word32, Word32, Word32 ) -> Integer
trival (i,j,k) = 2^i * 3^j * 5^k

main = putStrLn $ show $ nextRegular $ (trival (1334,335,404)) + 1 -- (1126,16930,40)

此代码在 IDEOne 上计算十亿分之一后的下一个常规数字，时间太短而无法测量，并在 IDEOne 上在 0.69 秒内计算万亿分之一（并且可能运行得更快，除非 IDEOne 不支持 LLVM）。在 JIT 编译的“热身”之后，甚至 Julia 也会以类似 Haskell 的速度运行。

EDIT_ADD： Julia 代码如下：

function nextregular(target :: BigInt) :: Tuple{ UInt32, UInt32, UInt32 }
    # trivial case of first value or anything less...
    target < 2 && return ( 0, 0, 0 )

    # Check if it's already a power of 2 (or a non-integer)
    mant = target & (target - 1)

    # Quickly find next power of 2 >= target
    log2hi :: UInt32 = 0
    test = target
    while true
        next = test & 0x7FFFFFFF
        test >>>= 31; log2hi += 31
        test <= 0 && (log2hi -= leading_zeros(UInt32(next)) - 1; break)
    end

    # exit if this is a power of two already...
    mant == 0 && return ( log2hi - 1, 0, 0 )

    # take care of trivial cases...
    if target < 9
        target < 4 && return ( 0, 1, 0 )
        target < 6 && return ( 0, 0, 1 )
        target < 7 && return ( 1, 1, 0 )
        return ( 3, 0, 0 )
    end

    # find log of target, which may exceed the Float64 limit...
    if log2hi < 53 mant = target << (53 - log2hi)
    else mant = target >>> (log2hi - 53) end
    log2target = log2hi + log(2, Float64(mant) / (1 << 53))

    # log2 constants
    log2of2 = 1.0; log2of3 = log(2, 3); log2of5 = log(2, 5)

    # calculate range of log2 values close to target;
    # desired number has a logarithm of log2target <= x <= top...
    fctr = 6 * log2of3 * log2of5
    top = (log2target^3 + 2 * fctr)^(1/3) # for 2 numbers or so higher
    btm = 2 * log2target - top # or 2 numbers or so lower

    # scan for values in the given narrow range that satisfy the criteria...
    match = log2hi # Anything found will be smaller
    result :: Tuple{UInt32,UInt32,UInt32} = ( log2hi, 0, 0 ) # placeholder for eventual matches
    fives :: UInt32 = 0; fiveslmt = UInt32(ceil(top / log2of5))
    while fives < fiveslmt
        log2p = top - fives * log2of5
        threes :: UInt32 = 0; threeslmt = UInt32(ceil(log2p / log2of3))
        while threes < threeslmt
            log2q = log2p - threes * log2of3
            twos = UInt32(floor(log2q)); log2this = top - log2q + twos

            if log2this >= btm && log2this < match
                # logarithm precision may not be enough to differential between
                # the next lower regular number and the target, so do
                # a full resolution comparison to eliminate this case...
                if (big(2)^twos * big(3)^threes * big(5)^fives) >= target
                    match = log2this; result = ( twos, threes, fives )
                end
            end
            threes += 1
        end
        fives += 1
    end
    result
end

score 0 · Accepted Answer

这是我刚刚想到的另一种可能性：

如果N是X位长，那么最小的规则数R ≥ N将在范围内
[2^X-1, 2^X]

例如，如果N = 257（二进制100000001），那么我们知道R是，除非R完全等于 2 的下一个幂（512）1xxxxxxxx

要生成此范围内的所有正则数，我们可以先生成奇数正则数（即 3 和 5 的幂的倍数），然后取每个值并根据需要乘以 2（通过位移），得到它进入这个范围。

在 Python 中：

from itertools import ifilter, takewhile
from Queue import PriorityQueue

def nextPowerOf2(n):
    p = max(1, n)
    while p != (p & -p):
        p += p & -p
    return p

# Generate multiples of powers of 3, 5
def oddRegulars():
    q = PriorityQueue()
    q.put(1)
    prev = None
    while not q.empty():
        n = q.get()
        if n != prev:
            prev = n
            yield n
            if n % 3 == 0:
                q.put(n // 3 * 5)
            q.put(n * 3)

# Generate regular numbers with the same number of bits as n
def regularsCloseTo(n):
    p = nextPowerOf2(n)
    numBits = len(bin(n))
    for i in takewhile(lambda x: x <= p, oddRegulars()):
        yield i << max(0, numBits - len(bin(i)))

def nextRegular(n):
    bigEnough = ifilter(lambda x: x >= n, regularsCloseTo(n))
    return min(bigEnough)

score -1 · Accepted Answer

我写了一个小c#程序来解决这个问题。它不是很优化，但它是一个开始。对于大至 11 位的数字，此解决方案非常快。

private long GetRegularNumber(long n)
{
    long result = n - 1;
    long quotient = result;

    while (quotient > 1)
    {
        result++;
        quotient = result;

        quotient = RemoveFactor(quotient, 2);
        quotient = RemoveFactor(quotient, 3);
        quotient = RemoveFactor(quotient, 5);
    }

    return result;
}

private static long RemoveFactor(long dividend, long divisor)
{
    long remainder = 0;
    long quotient = dividend;
    while (remainder == 0)
    {
        dividend = quotient;
        quotient = Math.DivRem(dividend, divisor, out remainder);
    }
    return dividend;
}

score -1 · Accepted Answer

你知道吗？我会把钱放在“哑”算法实际上是最快的命题上。这是基于观察到下一个常规数字通常不会比给定输入大得多。因此，只需开始计数，每次递增后，重构并查看是否找到了一个常规数字。但是为您拥有的每个可用核心创建一个处理线程，并且对于 N 个核心，让每个线程检查每个第 N 个数字。当每个线程找到一个数字或超过 2 次幂阈值时，比较结果（保持运行最佳数字）就可以了。

algorithm - 找到不小于 N 的最小正则数

8 回答 8

Related

Reference