r - Rcpp and int64 NA values

Question

How can I pass an NA value from Rcpp to R in a 64-bit integer vector?

My first approach was:

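#include <Rcpp.h>
#include <cstdint>  // int64_t
#include <cstring>  // std::memcpy

// [[Rcpp::export]]
Rcpp::NumericVector foo() {
  Rcpp::NumericVector res(2);

  // Copy the raw 64 bits of the integer into the double slot.
  int64_t val = 1234567890123456789;
  std::memcpy(&(res[0]), &(val), sizeof(double));
  // NA_REAL is the NA for doubles, not for integer64.
  res[1] = NA_REAL;

  res.attr("class") = "integer64";
  return res;
}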

But it yields

#> foo()
integer64
[1] 1234567890123456789 9218868437227407266

What I need is

#> foo()
integer64
[1] 1234567890123456789 <NA>

score 6 · Accepted Answer

OK, I think I found an answer... (not pretty, but it works).

Short answer:

#include <Rcpp.h>
#include <cstdint>  // int64_t
#include <cstring>  // std::memcpy

// [[Rcpp::export]]
Rcpp::NumericVector foo() {
  Rcpp::NumericVector res(2);

  int64_t val = 1234567890123456789;
  std::memcpy(&(res[0]), &(val), sizeof(double));

  // This is the magic: only the sign bit is set, which bit64 treats as NA.
  int64_t v = 1ULL << 63;
  std::memcpy(&(res[1]), &(v), sizeof(double));

  res.attr("class") = "integer64";
  return res;
}

This results in

#> foo()
integer64
[1] 1234567890123456789 <NA>

Longer answer

Inspecting how bit64 stores an NA:

# the last value is the max value of a 64 bit number
a <- bit64::as.integer64(c(1, 2, NA, 9223372036854775807))
a
#> integer64
#> [1] 1    2    <NA> <NA>
bit64::as.bitstring(a[3])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"
bit64::as.bitstring(a[4])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"

Created on 2020-04-23 by the reprex package (v0.3.0)

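We see that it is a 1 followed by 63 zeros. This can be recreated in Rcpp with int64_t v = 1ULL << 63;. Using memcpy() instead of a simple assignment with = ensures that no bits are changed!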
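The point about memcpy() can be checked in plain C++ without R. The sketch below is not from the answer, and the helper names store_bits, load_bits, and convert_round_trip are made up for illustration. It copies the 8 bytes of an int64_t into a double and back, and contrasts that with an ordinary numeric conversion, which rounds 1234567890123456789 to the nearest representable double.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::int64_t),
              "both types must be 8 bytes for this sketch");

// Hypothetical helper: copy the raw bytes of an int64_t into a double.
double store_bits(std::int64_t value) {
  double out;
  std::memcpy(&out, &value, sizeof(double));
  return out;
}

// Hypothetical helper: read the raw bytes of a double back as an int64_t.
std::int64_t load_bits(double value) {
  std::int64_t out;
  std::memcpy(&out, &value, sizeof(std::int64_t));
  return out;
}

// Numeric conversion to double and back, which is what a plain `=` would do.
std::int64_t convert_round_trip(std::int64_t value) {
  double as_double = static_cast<double>(value);
  return static_cast<std::int64_t>(as_double);
}
```

Calling load_bits(store_bits(x)) should give back x for any 64-bit integer, whereas convert_round_trip(x) only does so when x happens to be exactly representable as a double.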

score 6 · Accepted Answer

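This is really much simpler. The int64 behavior in R is provided by (several) add-on packages, the best of which is bit64, which gives us the integer64 S3 class and the associated behavior.

Internally, it defines NA as follows: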

#define NA_INTEGER64 LLONG_MIN

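That is all there is to it. R and its packages are first and foremost C code, and LLONG_MIN exists there and goes back (almost) all the way to the founding fathers.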
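As a quick check that the 1ULL << 63 pattern and LLONG_MIN describe the same 64 bits, here is a standalone C++ sketch. It assumes a platform where long long is 64 bits wide. The function names is_na_integer64, bit_string, and sign_bit_only are illustrative and are not part of bit64's API.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>

// Same definition as the one quoted from bit64.
#define NA_INTEGER64 LLONG_MIN

static_assert(sizeof(long long) == 8, "sketch assumes a 64-bit long long");

// Illustrative helper: true when the value is the reserved NA pattern.
bool is_na_integer64(std::int64_t value) {
  return value == NA_INTEGER64;
}

// Illustrative helper: render the 64 bits, most significant bit first.
std::string bit_string(std::int64_t value) {
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  std::string out(64, '0');
  for (int i = 0; i < 64; ++i) {
    if (bits & (std::uint64_t{1} << (63 - i))) out[i] = '1';
  }
  return out;
}

// Illustrative helper: the sign-bit-only pattern, built from an unsigned
// value and copied bit for bit into a signed integer.
std::int64_t sign_bit_only() {
  std::uint64_t bits = std::uint64_t{1} << 63;
  std::int64_t out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}
```

Under these assumptions, is_na_integer64(sign_bit_only()) holds, and bit_string(NA_INTEGER64) is a 1 followed by 63 zeros, the same string that bit64::as.bitstring() reports for an NA.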

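There are two lessons here. The first is that IEEE defines NaN and Inf as extensions for floating-point values. R goes well beyond that and adds an NA for each of its types, pretty much in the way shown above: by reserving one particular bit pattern. (In one case, that pattern is the birthday of one of the two original R creators.)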
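The reserved-pattern idea also explains the odd number in the question. R's NA for doubles is a NaN whose lower 32 bits hold 1954, so its 64 bits are 0x7FF00000000007A2. Read as a signed 64-bit integer, that is 9218868437227407266, which is what foo() printed when NA_REAL was stored in an integer64 vector. The standalone C++ sketch below reproduces the arithmetic without R. The function names are made up, and the constant is written out by hand rather than taken from R's headers.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumed layout of R's NA for doubles: high word 0x7FF00000 (exponent all
// ones), low word 1954. Written by hand here, not read from R's headers.
std::uint64_t r_na_real_bits() {
  return (std::uint64_t{0x7FF00000} << 32) | std::uint64_t{1954};
}

// The same 64 bits reinterpreted as a signed integer, which is how an
// integer64 vector reads the storage of a double.
std::int64_t na_real_as_int64() {
  std::uint64_t bits = r_na_real_bits();
  std::int64_t out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// The same 64 bits reinterpreted as a double are a NaN.
bool na_real_is_nan() {
  std::uint64_t bits = r_na_real_bits();
  double value;
  std::memcpy(&value, &bits, sizeof(value));
  return std::isnan(value);
}
```

So the same storage is a missing value when read as a double and an ordinary large integer when read as integer64, which is why integer64 needs its own NA pattern.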

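The other is to appreciate the enormous amount of work Jens put into the bit64 package, with all the required conversion and operator functions. Seamlessly converting all possible values, including NA, NaN, Inf, ... is no small task.

This is a neat topic that not many people know about. I am glad you asked the question, because we now have a record of it here.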
