56

Consider these two functions:

void foo() {}
void bar() {}

Is it guaranteed that &foo != &bar?

Similarly,

template<class T> void foo() { }

Is it guaranteed that &foo<int> != &foo<double>?


I know of two linkers that fold function definitions together.

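MSVC aggressively COMDAT-folds functions, so two functions with identical implementations can be turned into one function. As a side effect, the two functions share the same address. I was under the impression that this was illegal, but I cannot find where in the standard it is made illegal.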

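The Gold linker also folds functions, with both a safe and an all setting. safe means that a function is not folded if its address is taken, while all folds even if the address is taken. So gold's safe folding behaves as if functions have distinct addresses.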

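While folding might be unexpected, and some code relies on distinct (identically implemented) functions having different addresses (so folding can be dangerous), is it actually illegal under the current C++ standard? (C++14 at this point.) (Naturally, as-if safe folding is legal.)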
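A minimal sketch of the check being asked about, so the result can be inspected on a given toolchain. The template is renamed to foo_t (a hypothetical name, not in the question) because overloading foo as both a function and a template would make a bare &foo ambiguous; report() is likewise only an illustrative helper.

```cpp
#include <cstdio>

void foo() {}
void bar() {}

// Renamed from the question's template foo so that &foo stays unambiguous.
template <class T> void foo_t() {}

// True when the two plain functions ended up at different addresses.
bool plain_distinct() { return &foo != &bar; }

// True when the two instantiations ended up at different addresses.
bool instantiations_distinct() { return &foo_t<int> != &foo_t<double>; }

// Prints what this particular compiler and linker combination did.
void report() {
    std::printf("&foo != &bar: %s\n", plain_distinct() ? "true" : "false");
    std::printf("&foo_t<int> != &foo_t<double>: %s\n",
                instantiations_distinct() ? "true" : "false");
}
```

Calling report() from main shows the outcome for one build only; the result may differ between linkers and folding settings, which is exactly what the question is about.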

4 Answers

30

It looks like defect report 1400: Function pointer equality deals with this issue and seems to me to say that it is okay to do this optimization but as comments indicate, there is disagreement. It says (emphasis mine):

According to 5.10 [expr.eq] paragraph 2, two function pointers only compare equal if they point to the same function. However, as an optimization, implementations are currently aliasing functions that have identical definitions. It is not clear whether the Standard needs to deal explicitly with this optimization or not.

and the response was:

The Standard is clear on the requirements, and implementations are free to optimize within the constraints of the “as-if” rule.

The question is asking about two issues:

  • Is it okay for these pointers to be considered equal
  • Is it okay to coalesce the functions

Based on comments I see two interpretations of the response:

  1. This optimization is ok, the standard gives the implementation this freedom under the as-if rule. The as-if rule is covered in section 1.9 and means the implementation only has to emulate the observable behavior with respect to the requirements of the standard. This is still my interpretation of the response.

  2. The issue at hand is completely ignored, and the statement merely says no adjustment to the standard is required because the as-if rule clearly covers this, with the interpretation left as an exercise to the reader. Although I acknowledge that, due to the terseness of the response, I cannot dismiss this view, it ends up being a totally unhelpful response. It also seems inconsistent with the responses to the other NAD issues, which as far as I can tell point out issues if they exist.

What the draft standard says

Since we know we are dealing with the as-if rule, we can start there and note that section 1.8 says:

Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies. Two objects that are not bit-fields may have the same address if one is a subobject of the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they shall have distinct addresses. [footnote 4]

and note 4 says:

Under the “as-if” rule an implementation is allowed to store two objects at the same machine address or not store an object at all if the program cannot observe the difference

but a note from that section says:

A function is not an object, regardless of whether or not it occupies storage in the way that objects do

Although it is not normative, the requirements for an object laid out in paragraph 1 do not make sense in the context of a function, and so it is consistent with this note. So we are explicitly restricted from aliasing objects, with some exceptions, but no such restriction applies to functions.
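To make the object rule concrete, here is a small sketch of the two cases the quoted paragraph distinguishes. All names (first, second, Wrapper) are made up for illustration and do not come from the standard or the question.

```cpp
// Two separate complete objects: the quoted rule requires distinct addresses.
int first = 0;
int second = 0;

// A standard-layout struct and its first member: one object is a subobject
// of the other, so they are allowed to share an address.
struct Wrapper {
    int inner;
};

// Compares the addresses of the two unrelated objects.
bool objects_distinct() { return &first != &second; }

// Compares the address of an object with that of its first member.
bool subobject_shares_address() {
    Wrapper w{0};
    return static_cast<void*>(&w) == static_cast<void*>(&w.inner);
}
```

Nothing comparable is spelled out for functions in that section, which is the gap the answer is pointing at.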

Next we have section 5.10 Equality operators which says (emphasis mine):

[...]Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.

which tells us two pointers are equal if they are:

  • Null pointers
  • Point to the same function
  • Represent the same address

The "or both represent the same address" wording seems to give enough latitude to allow a compiler to alias two different functions, and does not require pointers to different functions to compare unequal.

Observations

Keith Thompson has made some great observations that I feel are worth adding to the answer since they get to core issues involved, he says:

If a program prints the result of &foo == &bar, that's observable behavior; the optimization in question changes the observable behavior.

which I agree with, and if we could show that there is a requirement for the pointers to be unequal, that would indeed violate the as-if rule, but so far we cannot show that.

and:

[...]consider a program that defines empty function and uses their addresses as unique values (think about SIG_DFL, SIG_ERR, and SIG_IGN in <signal.h> / <csignal>). Assigning them the same address would break such a program

As I noted in my comment the C standard requires these macros to generate distinct values, from 7.14 in C11:

[...]which expand to constant expressions with distinct values that have type compatible with the second argument to, and the return value of, the signal function, and whose values compare unequal to the address of any declarable function[...]

So although this case is covered perhaps there are other cases that would make this optimization dangerous.
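A hedged sketch of the kind of code Keith Thompson describes: empty functions whose addresses serve as unique tags. The names Handler, Kind, use_default, ignore, custom and classify are all hypothetical and are not taken from <csignal>.

```cpp
#include <cstdio>

using Handler = void (*)();

// Two empty functions used purely as distinct sentinel values.
void use_default() {}
void ignore() {}

// A handler with a real body; it is not identical to the sentinels.
void custom() { std::puts("custom handler"); }

enum class Kind { Default, Ignore, Custom };

// Classifies a handler by comparing its address against the sentinels.
// If use_default and ignore were folded to one address, &ignore would be
// reported as Kind::Default, because the first comparison would match.
Kind classify(Handler h) {
    if (h == &use_default) return Kind::Default;
    if (h == &ignore) return Kind::Ignore;
    return Kind::Custom;
}
```

The design relies entirely on the two sentinels comparing unequal, so it is only as safe as the toolchain's folding policy.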

Update

Jan Hubička, a gcc developer, wrote a blog post, Link time and inter-procedural optimization improvements in GCC 5; code folding was one of the many topics he covered.

I asked him to comment on whether folding identical functions to the same address was conforming behavior or not and he says it is not conforming behavior and indeed such an optimization would break gcc itself:

It is not conforming to turn two functions to have same address, so MSVC is quite aggressive here. Doing so, for example, breaks GCC itself because to my surprise address compare is done in the precompiled headers code. It works for many other projects, including Firefox.

In hindsight, after months more of reading defect reports and thinking about optimization issues, I am biased towards a more conservative reading of the committee's response. Taking the address of a function is observable behavior and therefore folding identical functions would violate the as-if rule.

Update 2

Also see this llvm-dev discussion: Zero length function pointer equality:

This is a well-known conformance-violating bug in link.exe; LLVM should not be making things worse by introducing a similar bug itself. Smarter linkers (for example, I think both lld and gold) will do identical function combining only if all but one of the function symbols is only used as the target of calls (and not to actually observe the address). And yes, this non-conforming behavior (rarely) breaks things in practice. See this research paper.
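As a rough illustration of the distinction drawn in that quote, the sketch below separates a function that is only ever called from functions whose addresses are observed. This reflects my reading of the quoted description, not the documented behavior of any particular linker, and every name in it is made up.

```cpp
// Three functions with identical bodies.
int twice_a(int x) { return x * 2; }
int twice_b(int x) { return x * 2; }  // only ever used as a call target
int twice_c(int x) { return x * 2; }  // its address escapes below

using Fn = int (*)(int);

// twice_b appears only in a call, so merging it into twice_a could not be
// detected by this code.
int call_only(int x) { return twice_a(x) + twice_b(x); }

// The address of twice_c is handed out to callers.
Fn escaped() { return &twice_c; }

// The address of twice_a is compared here, so merging twice_c into twice_a
// would change the result of observes_address(escaped()).
bool observes_address(Fn f) { return f == &twice_a; }
```

Under this reading, folding twice_b is invisible to the program, while folding twice_c changes the outcome of a comparison.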

answered 2014-10-23T18:44:39.863
11

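Yes. From the standard (§5.10/1): "Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address"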

Once they have been instantiated, foo<int> and foo<double> are two different functions, so the above applies to them as well.

answered 2014-10-23T17:34:22.143
9

So the problematic part is clearly the phrase "or both represent the same address (3.9.2)".

IMO this part is clearly there to define the semantics for object pointer types. And only for object pointer types.

The phrase references section 3.9.2, which means we should look there. 3.9.2 talks (among others) about the addresses that object pointers represent. It does not talk about the addresses that function pointers represent. Which, IMO, leaves just two possible interpretations:

1) The phrase simply does not apply to function pointers. Which leaves just the two null pointers and two pointers to the same function comparing equal, which is what probably most of us expected.

2) The phrase does apply. Since it's referring to 3.9.2, which says nothing about the addresses that function pointers represent, we may make any two function pointers compare equal. Which is very unexpected and of course renders comparing function pointers utterly useless.

So, while technically an argument could be made that (2) is a valid interpretation, IMO it's not a meaningful interpretation and thus should be disregarded. And since not everyone seems to agree on this, I also think that a clarification in the standard is needed.

answered 2014-10-24T18:14:43.807
3

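5.10 Equality operators [expr.eq]

1 The == (equal to) and the != (not equal to) operators group left-to-right. The operands shall have arithmetic, enumeration, pointer, or pointer to member type, or type std::nullptr_t. The operators == and != both yield true or false, i.e., a result of type bool. In each case below, the operands shall have the same type after the specified conversions have been applied.
2 If at least one of the operands is a pointer, pointer conversions (4.10) and qualification conversions (4.4) are performed on both operands to bring them to their composite pointer type (Clause 5). Comparing pointers is defined as follows: Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.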

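Let's take it piece by piece:

  1. Two null pointers compare equal.
    Good for your sanity.
  2. Two pointers to the same function compare equal.
    Anything else would be extremely surprising.
    It also means that only one out-of-line version of any inline function may ever have its address taken, unless you want to make function pointer comparisons prohibitively complicated and expensive.
  3. Both represent the same address.
    Now, this one is what it is all about. Dropping it and reducing "if and only if" to a plain "if" would leave the matter open to interpretation, but as written it is a clear mandate to make any two functions identical, as long as that does not otherwise change the observable behavior of a conforming program.
answered 2014-10-23T18:18:25.430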