PxxxxR0
Define offsetof semantics in the C++ standard

New Proposal,

This version:
https://eisenwave.github.io/cpp-proposals/clmul.html
Author:
Audience:
EWG
Project:
ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
Source:
eisenwave/cpp-proposals

Abstract

The delegation of the offsetof macro semantics to C is problematic. I propose to re-define them in C++ without breaking changes to existing code.

1. Introduction

Currently, C++ delegates the semantics of offsetof to the C standard, as explained in [support.types.layout] paragraph 1:

The macro offsetof(type, member-designator) has the same semantics as the corresponding macro in the C standard library header <stddef.h>, but accepts a restricted set of type arguments in this document. [...]

Further restrictions and clarifications follow. However, even with those additional clarifications, multiple problems arise with the C standard wording when pulled into C++ like this:

1.1. CWG2784. Unclear definition of member-designator for offsetof

[CWG2784] raises the question whether the following code is valid:

struct S {
  int a;
};
int x = offsetof(S, S::a);

C requires from the user (see 7.21 paragraph 3) that for the macro offsetof(type, member-designator), given the declaration static type t;:

the expression &(t. member-designator) evaluates to an address constant.

Since there is no qualified-id in C, it is unclear whether S::a can be used as a member-designator. MSVC and GCC accept this code, but Clang rejects it.

All in all, CWG2784 raises three questions:

EWG is soliciting a paper to thoroughly explore the design space.

1.2. Interaction with overloaded & operator

The C wording is also problematic because as stated above, C requires from the user that

&(t. member-designator) evaluates to an address constant.

It is unclear from normative wording whether the & refers to the non-overloaded & operator in C, or the potentially overloaded & operator in C++.

[support.types.layout] footnote 165 states that

offsetof is required to work as specified even if unary operator& is overloaded for any of the types involved.

However, this doesn’t clearly answer the question of what & in the C wording means when pulled into C++. It also isn’t normative, and seemingly unsupported by any normative wording.
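To make the concern concrete, here is a minimal sketch, assuming a compiler that implements offsetof through a builtin. The types Weird and Holder are hypothetical and exist only for illustration. If & in the C wording meant the overloaded operator, &(t.w) would not yield an address at all.

```cpp
#include <cstddef>

// Hypothetical member type whose unary operator& does not return an address.
struct Weird {
    int value;
    int operator&() const { return 42; }
};

// Hypothetical standard-layout class containing such a member.
struct Holder {
    char tag;
    Weird w;
};

// With an overloaded-aware reading of the C wording, &(t.w) would be the
// int 42 rather than an address constant. A builtin-based offsetof does not
// call Weird::operator& and still yields the byte offset of w.
constexpr std::size_t weird_offset = offsetof(Holder, w);
static_assert(weird_offset >= sizeof(char), "w is placed after tag");
```

The static_assert only checks that w follows tag; the exact offset depends on the alignment of Weird on the target.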

1.3. Interaction with non-public members

As stated above, C requires that

&(t. member-designator) evaluates to an address constant.

If this expression has semantics defined in C (without operator overloading and access control), then a member-designator which designates a private member should be accepted as well.

However, MSVC, GCC, and Clang reject offsetof(type, m) where m is a private member.
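The following sketch illustrates how access control applies in practice, assuming a compiler with a builtin-based offsetof. The class Account and its members are hypothetical. The use inside a member function has access to the private member, whereas the same expression at namespace scope is what compilers reject.

```cpp
#include <cstddef>

// Hypothetical standard-layout class: all non-static data members are private.
class Account {
    int id_ = 0;
    double balance_ = 0.0;

public:
    int id() const { return id_; }
    double balance() const { return balance_; }

    // Inside a member function, balance_ is accessible, so this is accepted.
    static constexpr std::size_t balance_offset() noexcept {
        return offsetof(Account, balance_);
    }
};

// At namespace scope the same expression names an inaccessible member:
// constexpr std::size_t bad = offsetof(Account, balance_); // rejected: private

static_assert(Account::balance_offset() >= sizeof(int),
              "balance_ is placed after id_");
```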

Note: A class can be standard-layout as long as all non-static data members have the same access control.

1.4. Undefined behavior for non-default-constructible classes

As stated above, the C standard describes restrictions given the declaration static type t;. Obviously, this would not work for non-default-constructible types, which suggests that offsetof has undefined behavior for any class type that has no accessible default constructor.
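As a sketch of the situation, consider the hypothetical class Handle below, for which the declaration static Handle t; would be ill-formed. Implementations with a builtin-based offsetof accept the macro regardless, which shows how far existing practice has drifted from the C wording.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical standard-layout class without a default constructor,
// so "static Handle t;" from the C wording cannot be declared.
struct Handle {
    explicit Handle(int descriptor) : fd(descriptor), flags(0) {}
    int fd;
    int flags;
};

static_assert(!std::is_default_constructible<Handle>::value,
              "Handle has no default constructor");

// Accepted in practice, although the C wording gives it no defined meaning.
static_assert(offsetof(Handle, flags) == sizeof(int),
              "flags directly follows fd");
```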

2. Impact

3. Implementation experience

Note that the currently implemented behavior historically originated from the C implementation:

#define offsetof(T,m) ((size_t)&((T*)0)->m) 

However, this doesn’t handle overloaded & operators properly (among other issues), so MSVC, GCC, and Clang now delegate this to __builtin_offsetof(T, m).

offsetof is a core language feature masquerading as a library feature, and when considering design questions (§ 4 Design Considerations) we are not restricted by what can be self-hosted in C++.

4. Design Considerations

The general approach is to re-define offsetof entirely in C++. The C wording is simply unfit to be pulled into C++ as it is now; there are too many open questions resulting from this.

For this proposal, the design is essentially for offsetof(T, m) to give the user the offset of t.m (which must be well-formed), where t is an lvalue of type T, and m is an unqualified-id or qualified-id which designates a non-static data member.

This naturally answers whether a qualified-id is allowed, what role access control plays, and eliminates questions regarding the overloaded & operator.
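The intended meaning, "the offset of t.m", can be sketched by comparing offsetof against the byte distance measured on an actual object. The struct Packet and the helper measured_payload_offset are hypothetical names used only for this illustration; std::addressof is used so that the measurement does not depend on any overloaded & operator.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical standard-layout class.
struct Packet {
    char kind;
    int length;
    double payload;
};

// Byte distance from the first byte of t to the subobject t.payload,
// computed from object addresses at run time.
inline std::size_t measured_payload_offset(const Packet& t) {
    const auto* base =
        reinterpret_cast<const unsigned char*>(std::addressof(t));
    const auto* member =
        reinterpret_cast<const unsigned char*>(std::addressof(t.payload));
    return static_cast<std::size_t>(member - base);
}
```

For any Packet object p, measured_payload_offset(p) is expected to equal offsetof(Packet, payload).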

5. Proposed wording

Modify [support.types.layout] paragraph 1 as follows:

The macro offsetof(type, member-designator) [deleted: has the same semantics as the corresponding macro in the C standard library header <stddef.h>, but accepts a restricted set of type arguments in this document.] [inserted: expands to a prvalue constant expression of type size_t, the value of which is the offset in bytes, to the subobject designated by member-designator, from the first byte of any object of type type.]

member-designator:
    qualified-id
    unqualified-id
    member-designator . qualified-id
    member-designator . unqualified-id
    member-designator [ assignment-expression ]

The expression is well-formed only if type is a type-id which denotes a complete class type and given an lvalue t of type type,
  • t.member-designator is not a bit-field,
  • t.member-designator designates ([expr.ref]) a member subobject of t (directly or indirectly) or array element thereof,
  • for any use of the subscript operator within member-designator, the left operand shall be of array type and shall designate a member subobject of t (directly or indirectly) or array element thereof, and the right operand shall be an integral constant expression ([expr.const]).
The expression offsetof(type, member-designator) [deleted: is never type-dependent and it is value-dependent if and only if type is dependent.] [inserted: is type-dependent or value-dependent when the expression t.member-designator is type-dependent or value-dependent, respectively.] The result of applying the offsetof macro to a static data member or a function member is undefined. No operation invoked by the offsetof macro shall throw an exception and noexcept(offsetof(type, member-designator)) shall be true.
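As an informal illustration (not part of the wording), the member-designator grammar above covers nested members and array elements. The sketch below uses the hypothetical types Inner and Outer and assumes a compiler that already accepts such designators, as builtin-based implementations generally do.

```cpp
#include <cstddef>

// Hypothetical standard-layout classes.
struct Inner {
    int id;
    short samples[4];
};

struct Outer {
    char tag;
    Inner inner;
};

// member-designator . unqualified-id followed by
// member-designator [ assignment-expression ]
constexpr std::size_t nested_offset = offsetof(Outer, inner.samples[2]);

// The nested offset decomposes into the offsets of each step.
static_assert(nested_offset == offsetof(Outer, inner)
                                   + offsetof(Inner, samples)
                                   + 2 * sizeof(short),
              "nested designators compose additively");
```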

In Annex C, modify subclause [diff.offsetof] paragraph 1 as follows:

The macro offsetof, defined in <cstddef>, accepts a restricted set of type arguments in C++[deleted: .][inserted: , and supports member-designators which would not be valid in C.] Subclause [support.types.layout] describes the change.

References

Informative References

[CWG2784]
Corentin Jabot. Unclear definition of member-designator for offsetof. 21 August 2023. open. URL: https://wg21.link/cwg2784