c++ - Custom support for __attribute__((format))

Question

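GCC and Clang both support compile-time checking of the arguments of variadic functions such as printf. These compilers accept syntax like the following: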

extern void dprintf(int dlevel, const char *format, ...)
  __attribute__((format(printf, 2, 3)));  /* 2=format 3=params */
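To make the effect of the attribute concrete, here is a minimal sketch. The function name debug_format and its signature are made up for this illustration (they are not part of the question's framework), and the warnings mentioned in the comments assume GCC or Clang with -Wformat enabled (it is part of -Wall).

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper for this sketch: argument 1 is the format string,
// and the variadic arguments to check start at position 2.
std::string debug_format(const char* format, ...)
    __attribute__((format(printf, 1, 2)));

std::string debug_format(const char* format, ...) {
    char buffer[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    return std::string(buffer);
}

// Small demonstration of a call the compiler accepts without complaint.
void demo() {
    assert(debug_format("%d items", 3) == "3 items");
    // debug_format("%d items", "three");  // -Wformat: %d expects an int
    // debug_format("%@", nullptr);        // -Wformat: '@' is not a printf conversion
}
```

The last commented-out call shows why the built-in printf archetype does not fit a framework with its own conversions: the checker only knows the standard specifiers.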

On OS X, the Cocoa framework also uses an extension of this for NSString:

#define NS_FORMAT_FUNCTION(F,A) __attribute__((format(__NSString__, F, A)))

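At our company we have a custom C++ framework containing a bunch of classes, such as BaseString, all derived from BaseObject. BaseString has several variadic methods similar to sprintf, but with some extensions. For example, "%S" expects an argument of type BaseString*, and "%@" expects a BaseObject* argument.

I would like to check the arguments at compile time in our projects, but because of the extensions, __attribute__((format(printf))) produces a lot of false-positive warnings.

Is there a way to customize the support for __attribute__((format)) in one of the two compilers? If this requires patching the compiler sources, can it be done in a reasonable amount of time? Alternatively, are there other lint-like tools that could perform the check?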

score 5 · Accepted Answer

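With a recent version of GCC (I recommend 4.7 or newer, but you could try GCC 4.6) you can add your own variable and function attributes through a GCC plugin (with the PLUGIN_ATTRIBUTES hook) or a MELT extension. MELT is a domain-specific language for extending GCC (implemented as a [meta-]plugin).

If you use a plugin (e.g. MELT) you do not need to recompile the GCC source code. But you do need a plugin-enabled GCC (check with gcc -v).

As of 2020, MELT is no longer updated (because of lack of funding); however, you could write your own GCC plugin in C++ for GCC 10 to perform such checks.

Note: some Linux distributions do not enable plugins in their gcc; please complain to your distribution vendor. Others provide a package for GCC plugin development, e.g. gcc-4.7-plugin-dev on Debian or Ubuntu.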

score 2 · Accepted Answer

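This is doable, but certainly not easy; part of the problem is that BaseString and BaseObject are user-defined types, so you need to define the format specifiers dynamically. Fortunately gcc at least has support for this, but it would still require patching the compiler.

The magic is in the handle_format_attribute function in gcc/c-family/c-format.c, which calls initialization functions for format specifiers that refer to user-defined types. A good example to base your support on is the gcc_gfc format type, because it defines a format specifier %L for locus *: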

/* This will require a "locus" at runtime.  */
{ "L",   0, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "R", NULL },

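Obviously, though, you would want to base your format_char_info array on print_char_table, since that defines the standard printf specifiers; gcc_gfc is substantially cut down in comparison.

The patch that added gcc_gfc is http://gcc.gnu.org/ml/fortran/2005-07/msg00018.html; it should be fairly clear from that patch how and where you would need to make your additions.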

score 2 · Accepted Answer

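A year and a half after asking this question, I came up with a totally different approach to the real problem: is there any way to statically check the types of the arguments of custom variadic formatting statements?

For completeness, and because it may help other people, here is the solution I finally implemented. It has two advantages over what the original question asked for:

Relatively simple: implemented in less than a day;
Compiler independent: can check C++ code on any platform (Windows, Android, OSX, ...).

A Perl script parses the source code, finds the format strings and decodes the percent modifiers inside them. It then wraps every argument in a call to a template identity function, CheckFormat<>. Example: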

str->appendFormat("%hhu items (%.2f %%) from %S processed", 
    nbItems, 
    nbItems * 100. / totalItems, 
    subject);

becomes:

str->appendFormat("%hhu items (%.2f %%) from %S processed", 
    CheckFormat<CFL::u, CFM::hh>(nbItems  ), 
    CheckFormat<CFL::f, CFM::_>(nbItems * 100. / totalItems  ), 
    CheckFormat<CFL::S, CFM::_, const BaseString*>(subject  ));

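The enumerations CFL and CFM and the template function CheckFormat must be defined in a common header file like this (this is an extract; there are around 24 overloads).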

enum class CFL
{
    c, d, i=d, star=i, u, o=u, x=u, X=u, f, F=f, e=f, E=f, g=f, G=f, p, s, S, P=S, at
};
enum class CFM
{
    hh, h, l, z, ll, L=ll, _
};
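// Primary template: reached only when none of the specializations below matches.
// Initializing a CFL from `value` is deliberately ill-formed for such argument types,
// so a specifier/argument mismatch shows up as a compile error.
template<CFL letter, CFM modifier, typename T> inline T CheckFormat(T value) { CFL test= value; (void)test; return value; }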
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }
template<> inline const BaseObject* CheckFormat<CFL::at, CFM::_, const BaseObject*>(const BaseObject* value) { return value; }
template<> inline const char* CheckFormat<CFL::s, CFM::_, const char*>(const char* value) { return value; }
template<> inline const void* CheckFormat<CFL::p, CFM::_, const void*>(const void* value) { return value; }
template<> inline char CheckFormat<CFL::c, CFM::_, char>(char value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline float CheckFormat<CFL::f, CFM::_, float>(float value) { return value; }
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }

...

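Once the compilation errors have been collected, it is easy to restore the original form by replacing matches of the regular expression CheckFormat<[^<]*>\((.*?) \) with its capture.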
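The restore step can be sketched in C++ with std::regex instead of Perl. This is only an illustration, not the author's script: the function name stripCheckFormat is made up, and it assumes each wrapped argument ends with a space before the closing parenthesis, as in the generated code above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes every CheckFormat<...>(arg ) wrapper from a
// line of source text and keeps only the wrapped argument (capture group 1).
std::string stripCheckFormat(const std::string& source) {
    static const std::regex wrapper(R"(CheckFormat<[^<]*>\((.*?) \))");
    return std::regex_replace(source, wrapper, "$1");
}

// Small demonstration on a single wrapped call.
void demo() {
    const std::string wrapped = "log(CheckFormat<CFL::d, CFM::_>(count ));";
    assert(stripCheckFormat(wrapped) == "log(count);");
}
```

Because the capture is lazy and stops at the first " )" sequence, an argument that itself contains a space followed by a closing parenthesis would be cut short; the sketch inherits that limitation from the regular expression.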

score 2 · Accepted Answer

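With C++11 it is possible to solve this problem by replacing __attribute__ ((format)) with a clever combination of constexpr, decltype and variadic parameter packs. Pass the format string to a constexpr function that extracts all the % specifiers at compile time and verifies that the n-th specifier matches the decltype of the (n+1)-th argument.

Here is a sketch of the solution...

If you have: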

int x = 3;
Foo foo;
my_printf("%d %Q\n", x, foo);

You will need a wrapper macro for my_printf, using the trick described here, to get something like this:

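/* `## __VA_ARGS__` is a GNU extension that drops the comma when no arguments follow fmt. */
/* The do { } while (0) wrapper lets the macro be used like a single statement. */
#define my_printf(fmt, ...) \
do { \
    static_assert(FmtValidator<decltype(makeTypeHolder(__VA_ARGS__))>::check(fmt), \
        "one or more format specifiers do not match their arguments"); \
    my_printf_impl(fmt, ## __VA_ARGS__); \
} while (0)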

You will need to write FmtValidator and makeTypeHolder().

makeTypeHolder looks like this:

    template<typename... Ts> struct TypeHolder {};

    template<typename... Ts>
    TypeHolder<Ts...> makeTypeHolder(const Ts&... args)
    {
        return TypeHolder<Ts...>();
    }

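Its purpose is to create a type that is uniquely determined by the types of the arguments passed to my_printf(). FmtValidator then needs to validate that these types agree with the % specifiers found in fmt.

Next, FmtValidator<T>::check() needs to be written so that it extracts the % specifiers at compile time (i.e. as a constexpr function). This requires some compile-time recursion, something like this: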
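The mapping from arguments to a holder type can be checked directly with static_assert. The sketch below repeats the two definitions from the answer so that it stands alone; Foo is a placeholder type, and the last assertion highlights a detail the answer does not mention: with const Ts& parameters a string literal is recorded as an array type, not as const char*.

```cpp
#include <type_traits>

template<typename... Ts> struct TypeHolder {};

template<typename... Ts>
TypeHolder<Ts...> makeTypeHolder(const Ts&...)
{
    return TypeHolder<Ts...>();
}

struct Foo {};  // placeholder user-defined type for this sketch

// The holder records one type per argument, in order.
static_assert(std::is_same<decltype(makeTypeHolder(3, Foo())),
                           TypeHolder<int, Foo>>::value,
              "int and Foo arguments give TypeHolder<int, Foo>");

// No arguments give the empty holder.
static_assert(std::is_same<decltype(makeTypeHolder()),
                           TypeHolder<>>::value,
              "no arguments give TypeHolder<>");

// A string literal is deduced as an array of char, including the terminator.
static_assert(std::is_same<decltype(makeTypeHolder("abc")),
                           TypeHolder<char[4]>>::value,
              "string literals are recorded as array types");
```

A validator built on this therefore either needs to handle array types for %s or apply std::decay to each argument type first.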

    template<typename... Ts>
    struct FmtValidator;

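    // recursion base case: no argument types left; specialized on TypeHolder<>
    // so that it matches what the recursive case below dispatches to
    template<>
    struct FmtValidator<TypeHolder<>>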
    {
        static constexpr bool check(const char* fmt)
        {
            return *fmt == '\0' ? true :
                    *fmt != '%' ? check(fmt + 1) :
                    fmt[1] == '%' ? check(fmt + 2) : false;
        }
    };

    // recursion
    template<typename T, typename... Ts>
    struct FmtValidator<TypeHolder<T, Ts...>>
    {
        static constexpr bool check(const char* fmt)
        {
            // find the first % specifier in fmt, validate it against T,
            // and then recursively dispatch with Ts... and the remainder of fmt
            ...
        }
    };

To validate an individual type against an individual % specifier, you can use something like this:

    // primary template; only the specializations are defined
    template<typename T>
    struct specmatch;

    template<>
    struct specmatch<int>
    {
        static constexpr bool match(const char* c, const char* cend)
        {
            return strmatches(c, cend, "d") ||
                    strmatches(c, cend, "i");
        }
    };

    // add other specmatch specializations for float, const char*, etc.
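The answer leaves strmatches undefined. One way it could be written as a C++11 constexpr function is sketched below; the signature is inferred from the call sites above, and the specmatch<long> specialization and the kSpec test string are added here purely for illustration.

```cpp
// Compares the character range [c, cend) with a NUL-terminated literal.
// Written as a single return expression so that it is valid C++11 constexpr.
constexpr bool strmatches(const char* c, const char* cend, const char* literal)
{
    return c == cend        ? *literal == '\0'
         : *literal == '\0' ? false
         : *c == *literal && strmatches(c + 1, cend, literal + 1);
}

template<typename T>
struct specmatch;  // primary template; only specializations are defined

// Illustrative specialization: a long argument accepts "ld" or "li".
template<>
struct specmatch<long>
{
    static constexpr bool match(const char* c, const char* cend)
    {
        return strmatches(c, cend, "ld") || strmatches(c, cend, "li");
    }
};

// The specifier text without its leading '%'.
constexpr char kSpec[] = "ld";

static_assert(specmatch<long>::match(kSpec, kSpec + 2), "\"ld\" matches long");
static_assert(!specmatch<long>::match(kSpec, kSpec + 1), "\"l\" alone does not");
```

Both pointers of the range must point into the same array, as they do here, for the c == cend comparison to be usable in a constant expression.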

You are then free to write your own validators for your own custom types.
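Putting the pieces together, a reduced end-to-end version might look like the sketch below. It is a simplification under stated assumptions, not the answer's exact design: specifiers are a single character with no flags, width or length modifiers, specmatch::match takes that one character instead of a range, makeTypeHolder decays its argument types so that string literals become const char*, and the CHECK_FORMAT macro name and the %Q specifier for Foo are placeholders.

```cpp
#include <type_traits>

struct Foo {};  // placeholder user-defined type, printed with "%Q" in this sketch

template <typename... Ts> struct TypeHolder {};

// Decays each argument type, so int& becomes int and "abc" becomes const char*.
template <typename... Ts>
TypeHolder<typename std::decay<Ts>::type...> makeTypeHolder(Ts&&...) {
    return TypeHolder<typename std::decay<Ts>::type...>();
}

// Maps an argument type to the conversion characters it accepts.
template <typename T> struct specmatch;  // undefined: unsupported types fail to compile
template <> struct specmatch<int> {
    static constexpr bool match(char c) { return c == 'd' || c == 'i'; }
};
template <> struct specmatch<double> {
    static constexpr bool match(char c) { return c == 'f' || c == 'e' || c == 'g'; }
};
template <> struct specmatch<const char*> {
    static constexpr bool match(char c) { return c == 's'; }
};
template <> struct specmatch<Foo> {
    static constexpr bool match(char c) { return c == 'Q'; }
};

template <typename Holder> struct FmtValidator;

// Base case: no arguments left, so the rest of fmt may only contain "%%".
template <> struct FmtValidator<TypeHolder<>> {
    static constexpr bool check(const char* fmt) {
        return *fmt == '\0'  ? true
             : *fmt != '%'   ? check(fmt + 1)
             : fmt[1] == '%' ? check(fmt + 2)
             : false;  // a specifier with no argument to consume
    }
};

// Recursive case: match the next specifier against T, then continue with Ts...
template <typename T, typename... Ts>
struct FmtValidator<TypeHolder<T, Ts...>> {
    static constexpr bool check(const char* fmt) {
        return *fmt == '\0'  ? false  // an argument with no specifier left
             : *fmt != '%'   ? check(fmt + 1)
             : fmt[1] == '%' ? check(fmt + 2)
             : specmatch<T>::match(fmt[1]) &&
               FmtValidator<TypeHolder<Ts...>>::check(fmt + 2);
    }
};

// Placeholder macro: only performs the check, it does not print anything.
#define CHECK_FORMAT(fmt, ...) \
    static_assert(FmtValidator<decltype(makeTypeHolder(__VA_ARGS__))>::check(fmt), \
                  "one or more format specifiers do not match their arguments")

void example() {
    int x = 3;
    Foo foo;
    (void)x; (void)foo;                  // only used inside decltype
    CHECK_FORMAT("%d %Q\n", x, foo);     // specifiers and arguments agree
    // CHECK_FORMAT("%Q %d\n", x, foo);  // would trip the static_assert
}
```

Extending this to flags, widths and multi-character specifiers means scanning to the end of each specifier before calling specmatch, which is where a range-based match(c, cend) signature like the one in the answer becomes useful.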
