I have a set of points for which I want to construct KD Tree. After some time I want to add few more points to this KDTree periodically. Is there any way to do this in scipy implementation
问题:
回答1:
The problem with k-d-trees is that they are not designed for updates.
While you can somewhat easily insert objects (if you use a pointer based representation, which needs substantially more memory than an array-based tree), and do deletions with tricks such as tombstone messages, doing such changes will degrate the performance of the tree.
I am not aware of a good method for incrementally rebalancing a k-d-tree. For 1-dimensional trees you have red-black-trees, B-trees, B*-trees, B+-trees and such things. These don't obviously work with k-d-trees because of the rotating axes and thus different sorting. So in the end, with a k-d-tree, it may be best to just collect changes, and from time to time do a full tree rebuild. Then at least this part of the tree will be quite good.
However, there exists a similar structure (that in my experiments often outperforms the k-d-tree!): the R*-tree. Instead of performing binary splits, it uses rectangular bounding boxes to collect objects, and a lot of thought was put into making the tree a dynamic data structure. This is also where the R*-tree performs much better than the R-tree: it has a much more clever split for kNN search, and it performs incremental rebalancing to improve its structure.