ruby - Decoding a base45 string

Question

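We are trying to validate the new EU coronavirus test/vaccination certificates, but we cannot get the base45 decoding to work.

The specification is here: https://datatracker.ietf.org/doc/draft-faltstrom-base45/

Our class is almost finished, but we sometimes get wrong values.

The goal is this: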

Encoding example 1: The string "AB" is the byte sequence [65 66].
The 16 bit value is 65 * 256 + 66 = 16706. 16706 equals 11 + 45 * 11
+ 45 * 45 * 8 so the sequence in base 45 is [11 11 8].  By looking up
these values in the Table 1 we get the encoded string "BB8".

Encoding example 2: The string "Hello!!" as ASCII is the byte
sequence [72 101 108 108 111 33 33].  If we look at each 16 bit
value, it is [18533 27756 28449 33].  Note the 33 for the last byte.
When looking at the values modulo 45, we get [[38 6 9] [36 31 13] [9
2 14] [33 0]] where the last byte is represented by two.  By looking
up these values in the Table 1 we get the encoded string "%69
VD92EX0".

Encoding example 3: The string "base-45" as ASCII is the byte
sequence [98 97 115 101 45 52 53].  If we look at each 16 bit value,
it is [25185 29541 11572 53].  Note the 53 for the last byte.  When
looking at the values modulo 45, we get [[30 19 12] [21 26 14] [7 32
5] [8 1]] where the last byte is represented by two.  By looking up
these values in the Table 1 we get the encoded string "UJCLQE7W581".
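
The arithmetic in these examples can be checked directly in Ruby. A minimal sketch, assuming a made-up helper named `base45_digits` (it is not part of the draft or of the class below) that splits a value into base-45 digits, least significant first:

```ruby
# Hypothetical helper: split a value into `count` base-45 digits,
# least significant digit first, as the draft's examples list them.
def base45_digits(value, count)
  Array.new(count) do
    value, digit = value.divmod(45)
    digit
  end
end

hi, lo = "AB".bytes
value  = hi * 256 + lo            # => 16706
digits = base45_digits(value, 3)  # => [11, 11, 8]

# A trailing single byte always uses exactly two digits, zero included.
tail = base45_digits(33, 2)       # => [33, 0]
```

Note that the digit count is fixed (three per byte pair, two for a trailing byte), so zero digits have to be kept rather than skipped.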

Here is my current code, which produces wrong values:

class Base45

  ALPHABET = {
    "00" => "0",
    "01" => "1",
    "02" => "2",
    "03" => "3",
    "04" => "4",
    "05" => "5",
    "06" => "6",
    "07" => "7",
    "08" => "8",
    "09" => "9",
    "10" => "A",
    "11" => "B",
    "12" => "C",
    "13" => "D",
    "14" => "E",
    "15" => "F",
    "16" => "G",
    "17" => "H",
    "18" => "I",
    "19" => "J",
    "20" => "K",
    "21" => "L",
    "22" => "M",
    "23" => "N",
    "24" => "O",
    "25" => "P",
    "26" => "Q",
    "27" => "R",
    "28" => "S",
    "29" => "T",
    "30" => "U",
    "31" => "V",
    "32" => "W",
    "33" => "X",
    "34" => "Y",
    "35" => "Z",
    "36" => " ",
    "37" => "$",
    "38" => "%",
    "39" => "*",
    "40" => "+",
    "41" => "-",
    "42" => ".",
    "43" => "/",
    "44" => ":"
  }.freeze

  def self.encode_base45(text)
    restsumme = text.unpack('S>*')

    # not sure what this is doing, but without it, it works worse :D
    restsumme << text.bytes[-1] if text.bytes.size > 2 && text.bytes[-1] < 256

    bytearr = restsumme.map do |bytes|
      arr = []
      multiplier, rest = bytes.divmod(45**2)
      arr << multiplier if multiplier > 0

      multiplier, rest = rest.divmod(45)
      arr << multiplier if multiplier > 0
      arr << rest if rest > 0
      arr.reverse
    end
    return bytearr.flatten.map{|a| ALPHABET[a.to_s.rjust(2, "0")]}.join
  end

  def self.decode_base45(text)
    arr = text.split("").map do |char|
      ALPHABET.invert[char]
    end
    textarr = arr.each_slice(3).to_a.map do |group|
      subarr = group.map.with_index do |val, index|
        val.to_i * (45**index)
      end
      ap subarr
      subarr.sum
    end

    return textarr.pack("S>*") # returns wrong values
  end
end

Results:

Base45.encode_base45("AB")
=> "BB8" # works
Base45.decode_base45("BB8")
=> "AB" # works

Base45.encode_base45("Hello!!")
=> "%69 VD92EX" # works
Base45.decode_base45("%69 VD92EX0")
=> "Hello!\x00!" # wrong \x00


Base45.encode_base45("base-45")
=> "UJCLQE7W581" # works
Base45.decode_base45("UJCLQE7W581")
=> "base-4\x005" # wrong \x00

Any hints are appreciated :(

score 0 · Accepted Answer

If you want a quick bodge to get this working:

return textarr.map{|x| x<256 ? [x].pack("C*") : [x].pack("n*") }.join

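Looking at the scheme, it feels like an odd way to encode things, given that we are dealing with numbers... If it were me, I would start at the tail of the string and work towards the head, but that is because we are working with numbers.

Anyway, the reason my bodge works is that it treats small elements/numbers as 8-bit unsigned instead of 16-bit unsigned.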
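
To illustrate the difference in width, here is a small sketch using the value 33 from the final two-character group of the "Hello!!" example. Only `Array#pack` is used; the variable names are chosen for illustration:

```ruby
# 33 is the value decoded from the last group ("X0") of "%69 VD92EX0".
as_16_bit = [33].pack("S>")  # big-endian unsigned 16-bit: two bytes
as_8_bit  = [33].pack("C")   # unsigned 8-bit: one byte

as_16_bit.bytes  # => [0, 33]
as_8_bit.bytes   # => [33]
```

One caveat with the `x < 256` test: it cannot tell a trailing single byte apart from a full byte pair whose first byte is zero. The pair [0, 65], for instance, also decodes to the value 65, so it would come out as one byte instead of two.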

...

Slightly easier on the eye, but probably not much better:

def self.decode_base45(text)
  arr = text.split("").map do |char|
    ALPHABET.invert[char]
  end
  textarr = arr.each_slice(3).to_a.map do |group|
    subarr = group.map.with_index do |val, index|
      val.to_i * (45**index)
    end
    # Split each group's value into a high byte and a low byte.
    subarr.sum.divmod(256)
  # Dropping zeros removes the padding byte, but also any genuine zero byte.
  end.flatten.reject(&:zero?)

  return textarr.pack("C*")
end

score 0 · Accepted Answer

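After struggling with the other answers, I wrote my own method based on your question and this snippet. The answer above works in most cases, but not in all of them, especially when the string length mod 3 = 2.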

class Base45
  ALPHABET = {
    0 => "0",
    1 => "1",
    2 => "2",
    3 => "3",
    4 => "4",
    5 => "5",
    6 => "6",
    7 => "7",
    8 => "8",
    9 => "9",
    10 => "A",
    11 => "B",
    12 => "C",
    13 => "D",
    14 => "E",
    15 => "F",
    16 => "G",
    17 => "H",
    18 => "I",
    19 => "J",
    20 => "K",
    21 => "L",
    22 => "M",
    23 => "N",
    24 => "O",
    25 => "P",
    26 => "Q",
    27 => "R",
    28 => "S",
    29 => "T",
    30 => "U",
    31 => "V",
    32 => "W",
    33 => "X",
    34 => "Y",
    35 => "Z",
    36 => " ",
    37 => "$",
    38 => "%",
    39 => "*",
    40 => "+",
    41 => "-",
    42 => ".",
    43 => "/",
    44 => ":"
  }.freeze

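  def self.decode_base45(text)
    # A valid base45 string has length 3n or 3n + 2, never 3n + 1.
    raise ArgumentError, "invalid base45 string" if text.size % 3 == 1

    # Build the character => value lookup once instead of once per character.
    values = ALPHABET.invert
    arr = text.each_char.map do |char|
      values.fetch(char) { raise ArgumentError, "invalid base45 string" }
    end

    arr.each_slice(3).map do |group|
      if group.size == 3
        # Three characters decode to a 16-bit value, i.e. two bytes.
        x = group[0] + group[1] * 45 + group[2] * 45 * 45
        raise ArgumentError, "invalid base45 string" if x > 0xFFFF
        x.divmod(256)
      else
        # Two trailing characters decode to a single byte.
        x = group[0] + group[1] * 45
        raise ArgumentError, "invalid base45 string" if x > 0xFF
        x
      end
    end.flatten.pack("C*")
  end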
end
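
This answer only covers decoding. For completeness, a matching encoder might look roughly like the sketch below. The method name `encode_base45_sketch` and the `charset` string are made up for illustration and are not part of the answer; the approach simply follows the draft's rule of three characters per byte pair and two for a trailing byte:

```ruby
# Hypothetical companion encoder, written as a standalone method.
def encode_base45_sketch(data)
  # The 45-character alphabet from Table 1 of the draft, indexed by value.
  charset = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

  data.bytes.each_slice(2).map do |pair|
    if pair.size == 2
      # Two bytes form a 16-bit value, written as three base-45 digits.
      value = pair[0] * 256 + pair[1]
      c, rest = value.divmod(45 * 45)
      b, a = rest.divmod(45)
      [a, b, c]
    else
      # A trailing single byte is written as two base-45 digits.
      b, a = pair[0].divmod(45)
      [a, b]
    end
  end.flatten.map { |digit| charset[digit] }.join
end

encode_base45_sketch("AB")       # => "BB8"
encode_base45_sketch("Hello!!")  # => "%69 VD92EX0"
encode_base45_sketch("base-45")  # => "UJCLQE7W581"
```

Unlike the encoder in the question, this one never skips zero digits, which is why the "Hello!!" result keeps its trailing "0".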

score 0 · Accepted Answer

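This is probably not the proper solution to the problem here.

But adding textarr.pack("S>*").gsub(/\x00/, "") fixes the given decoding examples. Also, oddly, your version of encode does not work well for me (it gives a wrong result for the first example).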
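
A short sketch of why stripping zero bytes is only safe for text-like input. It assumes a payload of the three bytes [65, 0, 66], for which a decoder like the one in the question would recover the values below; the variable names are illustrative:

```ruby
# Values recovered for the bytes [65, 0, 66]:
# one full pair (65 * 256 + 0) and one trailing byte (66).
textarr = [65 * 256 + 0, 66]

packed   = textarr.pack("S>*")      # trailing byte is padded to two bytes
stripped = packed.gsub(/\x00/, "")  # removes the padding and the real zero

packed.bytes    # => [65, 0, 0, 66]
stripped.bytes  # => [65, 66]
```

The genuine zero byte in the middle is lost along with the padding, so binary payloads would come out corrupted.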

Anyway, this thread prompted me to contribute something by turning it into a gem.
