I want to use a collection that is sorted, but one in which I can access elements by index, i.e. I want something that has characteristics of both a Set and a List. Java.util.TreeSet comes real close to what I need, but doesn't permit access via an index.
I can think of several options:
- I could iterate through a TreeSet every time I needed a particular element.
- I could maintain a TreeSet and generate a List from it when I needed to access a particular element.
- Same as above, only cache the List until the Set changes.
- I could have a List and sort it myself whenever I needed to add an element.
- etc.
There are various trade-offs between the various options. I'm hoping somebody can give me some good advice. To answer the potential questions as to "why would you ever want to do that?", please read about the Apriori algorithm.
I would look into LinkedHashSet. It maintains insertion order of a HashSet.
Perhaps a combination of Treeset and the apache commons collections API CollectionUtils.get() would solve your problem
A couple of points:
Sort of a non-answer, but when I last needed to re-implement a frequent itemset mining algorithm, I went with FP-growth, which has performance on-par (or better) than a priori and, in my opinion, is easier to implement. This technique was developed by Jiawei Han and others, basically has a dedicated chapter in Data Mining: Concepts and Techniques.
There are several open-source tools that take a pretty standardized input (one list of integers per line; integers represent items, lines represent itemsets). Some of them give you a choice of algorithms. Many of them are available here with permissive licenses: http://fimi.ua.ac.be/src/
Keep in mind that using just any
List
implementation doesn't get youO(1)
element access unless you specifically use an array/vector. More likely, you'll get better mileage out of keeping a mostly- or fully sorted array (with binary search for finding elements over a specific limit, and usual indexing for random access).I had the same problem. So I took the source code of java.util.TreeMap and wrote IndexedTreeMap. It implements my own IndexedNavigableMap:
The implementation is based on updating node weights in the red-black tree when it is changed. Weight is the number of child nodes beneath a given node, plus one - self. For example when a tree is rotated to the left:
updateWeight simply updates weights up to the root:
And when we need to find the element by index here is the implementation that uses weights:
Also comes in very handy finding the index of a key:
You can find the result of this work at http://code.google.com/p/indexed-tree-map/