I am working on a single-producer, single-consumer ring buffer implementation. I have two requirements:
1) Align a single heap allocated instance of a ring buffer to a cache line.
2) Align a field within a ring buffer to a cache line (to prevent false sharing).
My class looks something like:
#include <atomic>
#include <cstdint>
#define CACHE_LINE_SIZE 64 // To be used later.
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
    ....
private:
    std::atomic<int64_t> publisher_sequence_;
    int64_t cached_consumer_sequence_;
    T* events_;
    std::atomic<int64_t> consumer_sequence_; // This needs to be aligned to a cache line.
};
Let me first tackle point 1, i.e. aligning a single heap-allocated instance of the class. There are a few ways:
1) Use the C++11 alignas(..) specifier:
template<typename T, uint64_t num_events>
class alignas(CACHE_LINE_SIZE) RingBuffer {
public:
    ....
private:
    // All the private fields.
};
2) Use posix_memalign(..) + placement new(..) without altering the class definition. The drawback is that posix_memalign(..) is POSIX-only, so this is not platform independent:
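void* buffer = nullptr;
// posix_memalign returns the error number itself; errno is not guaranteed to be set,
// so report the return value with strerror instead of calling perror.
int rc = posix_memalign(&buffer, CACHE_LINE_SIZE, sizeof(processor::RingBuffer<int, kRingBufferSize>));
if (rc != 0) {
    fprintf(stderr, "posix_memalign did not work: %s\n", strerror(rc));
    abort();
}
// Use placement new on a cache aligned buffer.
// The object must later be destroyed explicitly and the storage released with free(buffer).
auto ring_buffer = new(buffer) processor::RingBuffer<int, kRingBufferSize>();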
3) Use the GCC/Clang extension __attribute__ ((aligned(#)))
template<typename T, uint64_t num_events>
class RingBuffer {
public:
    ....
private:
    // All the private fields.
} __attribute__ ((aligned(CACHE_LINE_SIZE)));
4) I tried to use the C11 standardized aligned_alloc(..) function (it only became part of C++ in C++17) instead of posix_memalign(..), but GCC 4.8.1 on Ubuntu 12.04 could not find it in stdlib.h.
Are all of these guaranteed to do the same thing? My goal is cache-line alignment, so any method that has some limit on alignment (say, double word) will not do. Platform independence, which would point to using the standardized alignas(..), is a secondary goal.
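For the alignas(..) option, one thing that can be checked without running anything is whether the compiler accepted the requested alignment. A minimal sketch, assuming a 64-byte cache line and using a hypothetical stand-in type (PaddedCounter, not part of my real code) in place of RingBuffer:

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, matching CACHE_LINE_SIZE.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical stand-in for RingBuffer, used only for illustration.
struct alignas(kCacheLineSize) PaddedCounter {
    std::int64_t value;
};

// The alignment requirement of the type is now a full cache line...
static_assert(alignof(PaddedCounter) == kCacheLineSize,
              "alignas was not applied at the requested value");
// ...and sizeof is padded up to a multiple of it, so array elements stay aligned.
static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "size should be rounded up to one cache line");
```

These assertions only cover the alignment requirement of the type. Whether a heap instance actually lands on a 64-byte boundary also depends on the allocator: before C++17, a plain new expression was not required to honor an alignment stricter than alignof(std::max_align_t).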
I am not clear on whether alignas(..) and __attribute__((aligned(#))) have some limit which could be below the cache line size on the machine. I can no longer reproduce this, but while printing addresses I think I did not always get 64-byte-aligned addresses with alignas(..). By contrast, posix_memalign(..) seemed to always work. Since I cannot reproduce it any more, I may have been making a mistake.
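To make the posix_memalign(..) route less error prone, the allocation, the alignment check and the cleanup could be wrapped together. This is only a sketch under assumptions: a POSIX system, a 64-byte cache line, and helper names (IsCacheAligned, AlignedDeleter, MakeCacheAligned) that I made up for illustration:

```cpp
#include <stdlib.h>   // posix_memalign, free (POSIX, not ISO C++)

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <utility>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLineSize = 64;

// True if p sits on a cache-line boundary.
inline bool IsCacheAligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLineSize == 0;
}

// Deleter that mirrors the allocation: run the destructor, then free().
template <typename T>
struct AlignedDeleter {
    void operator()(T* p) const {
        p->~T();
        free(p);
    }
};

template <typename T>
using AlignedPtr = std::unique_ptr<T, AlignedDeleter<T>>;

// Hypothetical helper: construct a T in cache-line-aligned storage.
template <typename T, typename... Args>
AlignedPtr<T> MakeCacheAligned(Args&&... args) {
    void* raw = nullptr;
    if (posix_memalign(&raw, kCacheLineSize, sizeof(T)) != 0) {
        throw std::bad_alloc();
    }
    assert(IsCacheAligned(raw));
    try {
        return AlignedPtr<T>(new (raw) T(std::forward<Args>(args)...));
    } catch (...) {
        free(raw);  // the constructor threw, so only the storage needs releasing
        throw;
    }
}
```

With this, something like MakeCacheAligned<processor::RingBuffer<int, kRingBufferSize>>() would return an owning pointer whose address can be checked with IsCacheAligned(..) instead of printing it by hand.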
The second aim is to align a field within a class/struct to a cache line. I am doing this to prevent false sharing. I have tried the following ways:
1) Use the C++11 alignas(..) specifier:
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
    ...
private:
    std::atomic<int64_t> publisher_sequence_;
    int64_t cached_consumer_sequence_;
    T* events_;
    std::atomic<int64_t> consumer_sequence_ alignas(CACHE_LINE_SIZE);
};
2) Use the GCC/Clang extension __attribute__ ((aligned(#)))
template<typename T, uint64_t num_events>
class RingBuffer { // This needs to be aligned to a cache line.
public:
    ...
private:
    std::atomic<int64_t> publisher_sequence_;
    int64_t cached_consumer_sequence_;
    T* events_;
    std::atomic<int64_t> consumer_sequence_ __attribute__ ((aligned (CACHE_LINE_SIZE)));
};
Both these methods seem to place consumer_sequence_ at an address 64 bytes after the beginning of the object, so whether consumer_sequence_ is cache aligned depends on whether the object itself is cache aligned. My question here is: are there any better ways to do the same?
EDIT: The reason aligned_alloc did not work on my machine was that I was on eglibc 2.15 (Ubuntu 12.04). It worked on a later version of eglibc.
From the man page: The function aligned_alloc() was added to glibc in version 2.16.
This makes it pretty useless for me since I cannot require such a recent version of eglibc/glibc.
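One fallback that needs neither aligned_alloc(..) nor posix_memalign(..) is to over-allocate with malloc and round the pointer up by hand. This is a sketch of that general technique, not something I have settled on; AlignedMalloc and AlignedFree are hypothetical names, and the size computation is not checked for overflow:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns `size` bytes aligned to `alignment`, or nullptr.
// `alignment` must be a power of two and at least sizeof(void*).
inline void* AlignedMalloc(std::size_t alignment, std::size_t size) {
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
    // Room for the payload, worst-case padding, and one saved pointer.
    void* raw = std::malloc(size + alignment - 1 + sizeof(void*));
    if (raw == nullptr) return nullptr;
    // Leave space for the saved pointer, then round up to the alignment.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    // Stash the pointer malloc returned just below the aligned block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from AlignedMalloc.
inline void AlignedFree(void* p) {
    if (p != nullptr) std::free(reinterpret_cast<void**>(p)[-1]);
}
```

The returned block can be used with placement new in the same way as the posix_memalign(..) buffer, as long as it is released with AlignedFree(..) rather than free(..).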