More bitset operations
- Document number:
- P3103R3
- Date:
2025-10-06 - Audience:
- LEWG
- Project:
- ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21
- Reply-to:
- Jan Schultke <janschultke@gmail.com>
- GitHub Issue:
- wg21.link/P3103/github
- Source:
- github.com/eisenwave/cpp-proposals/blob/master/src/more-bitset-operations.cow
member functions should be added,
corresponding to the utility functions in
and to enable efficient iteration over the one-bits or zero-bits.
Contents
Revision history
Changes since R2
Changes since R1
Changes since R0
Introduction
Proposed changes
Design considerations
reverse
In-place reversal vs. reverse-copy
reverse ( bitset )
Common interface with < bit >
Counting overloads
Alternative designs facilitating iteration
Infallible countr_zero
find_next_one
No novel overloads
New overloads for < bit >
?
Impact on existing code
Implementation experience
Proposed wording
References
Appendix A — Compiler Explorer backup
1. Revision history
1.1. Changes since R2
1.2. Changes since R1
- mentioned [P3104R2], now that it has been published
- greatly expanded §4. Design considerations based on feedback from LEWGI
1.3. Changes since R0
-
simplified proposed wording by defining
as an equivalence in terms ofrotl rotr - fixed minor editorial mistakes
2. Introduction
[P0553R4] added the bit manipulation library to C++20, which introduced many useful utility functions.
Some of these already have a counterpart in
(such as
as
),
but not nearly all of them.
This leaves
overall lacking in functionality,
which is unfortunate because
is an undeniably useful container.
At the time of writing, it is used in 119K files on GitHub; see [GitHubCodeSearch].
does not (and should not)
expose the underlying integer sequence of the implementation.
Therefore, it is not possible for the user to implement these operations efficiently themselves.
3. Proposed changes
For each of the functions from the bit manipulation library that are not yet available
in
, add a member function.
Add a small amount of further useful member functions.
function |
Proposed member |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
([P3104R4]) |
|
([P3104R4]) |
|
([P3104R4]) |
N/A |
([P3104R4]) |
N/A |
The additional overloads for the counting functions allow counting from a starting position. This can be useful for iterating over all set bits:
Note:
and
counterparts are not proposed, only functions solely
dedicated to the manipulation of bit sequences.
4. Design considerations
The names and signatures of the proposed member functions is a mixture of
function templates and the existing conventions of
.
Sadly, it is impossible to be consistent with both parts of the standard library.
For example,
functions typically prefer
parameters for offsets,
whereas
uses
virtually everywhere.
also has a very brief convention of
or
,
where
has a
function.
This proposal generally prefers local consistency, i.e. it is more important to
be consistent with
than functions from a different part of the standard
library, even if those distant functions are similar in purpose.
4.1. reverse
is a novel invention, not taken from
.
However, a
counterpart is proposed in [P3104R2] "Bit permutations".
This function is added because the method for reversing integers may be tremendously faster than
doing so bit by bit.
ARM has a dedicated
instruction for reversing bits,
which could be leveraged.
4.1.1. In-place reversal vs. reverse-copy
I propose
to modify a
in-place rather than creating a reversed copy.
The interface of
can already accommodate large
,
and use cases with large
exist.
Compilers also should not be expected to find the optimization from a
with later assignment to an in-place
.
bitset
.
Instead, it allocates 512 bytes of stack space,
performs a reverse-copy, and memcpy
s the result back into x
.
I believe that all operations of a
with large
should remain usable,
and copying the whole set for an operation may be unacceptable due to danger of overflowing the stack
and/or performance considerations.
A perfect optimization would be very difficult due to substantial differences between the
and
algorithms.
4.1.2. reverse ( bitset )
Instead of a
member function, it would also be possible to add an
overload for the
algorithm which accepts a
.
There exists precedent (e.g.
),
although this is not the first design which comes to mind.
4.2. Common interface with < bit >
Another considered alternative is the creation of a common interface with the
function templates of
.
This would mean adding overloads for say,
and other functions
which would accept
in addition to unsigned integers.
This proposal does not pursue this design for multiple reasons:
- The use case of this is unclear. I have personally never needed to write generic code which does bit manipulation on either integers or bitsets, and don't know of any code base where this is done.
-
There is too much friction in the design of
andbitset
, such as the use of< bit >
vs.size_t
, the use of exceptions vs. preconditions, different naming schemes, etc.int -
The functions in
are all pure, but for< bit >
, copying it for the purpose of function calls and returning modified copies becomes impractical for largebitset < N >
. See also [[#in-place-reversal]].N -
A common interface does not eliminate the need for new member functions.
It would be very surprising to users if some functionality
was only available through
(e.g.< bit >
) whereas other functionality was available both as a member function and as acountr_zero
function (e.g.< bit >
/popcount
). Such a design would be inconsistent, and nonsensical when taken out of its historical context.. count ( ) -
A future proposal could still create a common interface for
and< bit >
. The addition of member functions adds no obstacles in this regard.bitset
Also note that most of these issues equally apply to sharing a common interface with
.
4.3. Counting overloads
,
,
, and
are overloaded member functions which take a
argument or nothing.
This is preferable to a single member function with a defaulted argument because the overloads
have different
specifications.
The overloads which take no arguments have a wide contract and can be marked
,
whereas the overloads taking
may throw
.
4.3.1. Alternative designs facilitating iteration
Besides the proposed design for
,
there are at least two alternatives, mainly motivated by simplifying iteration.
As mentioned above, we can iterate over all one-bits of a
as possible.
4.3.1.1. Infallible countr_zero
If
could not throw but returned zero on an index out of bounds,
this could be shortened to:
An infallible
would also cater to users who do not want to use
exceptions, whereas the current proposal perpetuates the existing exception-based design.
4.3.1.2. find_next_one
It would be possible to add a member function which increments and advances to the next bit in a one operation.
A possible issue is that this operation is extremely specialized towards iteration.
As with
, it also needs be considered whether
should handle a bad index by throwing an exception, if LEWG prefers this kind of function.
4.3.1.3. No novel overloads
It is also possible to drop the overloads taking
from this proposal.
Perhaps a future proposal focused on iteration for
could approach the
issue of facilitating iteration.
This was most recently attempted in [[P0161R0]].
However, such functions are well within the scope of this proposal, especially if they share a name with other proposed members, and they do not pose a substantial design/wording hurdle. In short, we may as well do it.
4.3.2. New overloads for < bit >
?
The overloads taking
don't need a counterpart in
because they are very easy to
emulate:
I consider these too trivial to justify adding such convenience functions to
,
even if it would create more symmetry between
and
.
5. Impact on existing code
The semantics of existing
member functions remain unchanged,
and no existing valid code is broken.
This proposal is purely an extension of the functionality of
.
6. Implementation experience
[bitset2] provides a
implementation which supports most proposed features.
There are no obvious obstacles to implementing the new features in common standard library
implementations.
7. Proposed wording
The proposed changes are relative to the working draft of the standard as of [N5014].
8. References
Appendix A — Compiler Explorer backup
In [CompilerExplorer1], the source code is: