I am looking for an alternative to Java Bitset implementation. I am implementing a high performance algorithm and seems like using a Bitset object is killing its performance. Any ideas?
相关问题
- Delete Messages from a Topic in Apache Kafka
- Jackson Deserialization not calling deserialize on
- How to maintain order of key-value in DataFrame sa
- StackExchange API - Deserialize Date in JSON Respo
- Difference between Types.INTEGER and Types.NULL in
If you really must squeeze the maximum performance out of this thing, and if memory does not matter, you can try storing each one of your flags in an integer whose bit size is equal to the width of the data bus of your CPU.
You are probably on a 64-bit data bus CPU, so try long integers.
Someone here has compared
boolean[]
toBitSet
and concluded with:If you Google, you can find some alternative implementations as well, like JavaEWAH, used by Apache Hive, Apache Spark and Eclipse JGit. It claims:
Look at Javolution FastBitSet : A high-performance bitset integrated with the collection framework as a set of indices and obeying the collection semantic for methods such as FastSet.size() (cardinality) or FastCollection.equals(java.lang.Object) (same set of indices).
See also http://code.google.com/p/guava-libraries/issues/detail?id=724#c3.
There are a number of compressed alternatives to the BitSet class. EWAH was already mentioned (https://github.com/lemire/javaewah). More recent additions include Roaring bitmaps (https://github.com/RoaringBitmap/RoaringBitmap) that are used by Apache Lucene, Apache Spark, Elastic Search, and so forth.
While searching an answer for my question single byte comparison vs multiple boolean comparison, I found OpenBitSet
They claim to be faster than Java BitSet and direct access to the array of words storing the bits.
I am definitely gonna try that. See if it solve your purpose too.