c++ - Locating the source of a bug with sscanf

Question

I have been struggling with this for far too long.

Suppose I have this minimal code:

test.cxx

#include <iostream>
#include <cstdio>

int main (int argc, char *argv[])
{
  const char *text = "1.01 foo";  
  float value = 0;  
  char other[8];

  int code = sscanf(text, "%f %7s", &value, other);
  std::cout << code << " | " << text << " | => | " << value << " | " << other << " | " << std::endl;

  return 0;
}

Running $ g++ test.cxx; ./a.out produces this output, as expected:

$ 2 | 1.01 foo | => | 1.01 | foo |

Now I embed these 5 lines in a project with thousands of lines and a lot of includes...

I compile and run it, and now the output is:

$ 2 | 1.01 foo | => | 1 | .01 |

What strategy can I use to locate the source of this inconsistency?

EDIT: export LC_ALL=C (or LC_NUMERIC=C); ./a.out seems to solve my problem

score 2 · Accepted Answer

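This is probably caused by your test program and your target application running under different locales. In a locale whose decimal separator is a comma, sscanf stops parsing the float at the dot, so %f reads "1" and %7s picks up ".01". I was able to reproduce it on coliru:

by using: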

setlocale(LC_ALL, "cs_CZ.utf8");

http://coliru.stacked-crooked.com/a/5a8f2ea7ac330d66

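You can find some solutions in this SO question:

sscanf() and locales. How does one really parse things like "3.14"?

[Edit]

Here is a solution using uselocale. Since you tagged this question C++, though, you could instead use std::stringstream and imbue it with the appropriate locale (see the SO link above).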

http://coliru.stacked-crooked.com/a/dc0fac7d2533d95c

#include <clocale>
#include <cstdio>
#include <iostream>
#include <locale.h>  // newlocale, uselocale, freelocale (POSIX.1-2008)

int main()
{
  const char *text = "1.01 foo";
  float value = 0;
  char other[8];

  // Set for testing: sscanf will now expect a comma, not a dot, as the decimal separator.
  // If this locale is not installed, setlocale returns NULL and the "C" locale stays active.
  setlocale(LC_ALL, "cs_CZ.utf8");

  // Temporarily use the C locale (dot as decimal separator) on the current thread
  locale_t locale = newlocale(LC_NUMERIC_MASK, "C", NULL);
  locale_t old_locale = uselocale(locale);

  int code = sscanf(text, "%f %7s", &value, other);
  std::cout << code << " | " << text << " | => | " << value << " | " << other << " | " << std::endl;

  // Go back to the original locale
  uselocale(old_locale);
  freelocale(locale);

  return 0;
}
