What is the performance of STL bitset::count() met

2020-08-17 17:43发布

问题:

I searched around and could not find the performance time specifications for bitset::count(). Does anybody know what it is (O(n) or better) and where to find it?

EDIT By STL I refer only to the Standard Template Library.

回答1:

I read this file (C:\cygwin\lib\gcc\i686-pc-cygwin\3.4.4\include\c++\bitset) on my computer.
See these

/// Returns the number of bits which are set.
size_t
count() const { return this->_M_do_count(); }      

size_t
_M_do_count() const
{
  size_t __result = 0;
  for (size_t __i = 0; __i < _Nw; __i++)
  __result += __builtin_popcountl(_M_w[__i]);
  return __result;
}

BTW, this is where _Nw is specified:

  template<size_t _Nw>
    struct _Base_bitset

Thus it's O(n) in gcc implementation. We conclude the specification doesn't require it better than O(n). And nobody in their right mind will implement it in a way worse than that. We can then safely assume that it's at worst O(n). Possibly better but you can never count on that.



回答2:

I can't be sure what you really mean by "STL" here, due to a prevailing misuse of the term in the C++ community.

  • The C++ Standard (2003) makes no mandate for the performance of std::bitset::count() (or, in fact, any members of std::bitset as far as I can see).

  • I can't find any reference suggesting a mandate for the performance of STL's bitset::count() either.

I think any sane implementation will provide this in constant (or at worst linear) time, though. However, this is merely a feeling. Check yours to find out what you'll actually get.



回答3:

"SGI's reference implementation runs in linear time with respect to the number of bytes needed to store the bits. It does this by creating a static array of 256 integers. The value stored at ith index in the array is the number of bits set in the value i."

http://www.cplusplus.com/forum/general/12486/



回答4:

I'm not sure you're going to find a specification for that, since the STL doesn't typically require a certain level of performance. I've seen hints that it's "fast", around 1 cycle per bit in the set's size. You can of course read your particular implementation's code to find out what to expect.