Algorithm for converting hierarchical flat data (w

2020-06-18 05:51发布

问题:

I have the following structure:

MyClass {
  guid ID
  guid ParentID
  string Name
}

I'd like to create an array which contains the elements in the order they should be displayed in a hierarchy (e.g. according to their "left" values), as well as a hash which maps the guid to the indentation level.

For example:

ID     Name     ParentID
------------------------
1      Cats     2
2      Animal   NULL
3      Tiger    1
4      Book     NULL
5      Airplane NULL

This would essentially produce the following objects:

// Array is an array of all the elements sorted by the way you would see them in a fully expanded tree
Array[0] = "Airplane"
Array[1] = "Animal"
Array[2] = "Cats"
Array[3] = "Tiger"
Array[4] = "Book"

// IndentationLevel is a hash of GUIDs to IndentationLevels.
IndentationLevel["1"] = 1
IndentationLevel["2"] = 0
IndentationLevel["3"] = 2
IndentationLevel["4"] = 0
IndentationLevel["5"] = 0

For clarity, this is what the hierarchy looks like:

Airplane
Animal
  Cats
    Tiger
Book

I'd like to iterate through the items the least amount of times possible. I also don't want to create a hierarchical data structure. I'd prefer to use arrays, hashes, stacks, or queues.

The two objectives are:

  1. Store a hash of the ID to the indentation level.
  2. Sort the list that holds all the objects according to their left values.

When I get the list of elements, they are in no particular order. Siblings should be ordered by their Name property.

Update: This may seem like I haven't tried coming up with a solution myself and simply want others to do the work for me. However, I have tried coming up with three different solutions, and I've gotten stuck on each. One reason might be that I've tried to avoid recursion (maybe wrongly so). I'm not posting the partial solutions I have so far since they are incorrect and may badly influence the solutions of others.

回答1:

I needed a similar algorithm to sort tasks with dependencies (each task could have a parent task that needed to be done first). I found topological sort. Here is an iterative implementation in Python with very detailed comments.

The indent level may be calculated while doing the topological sort. Simply set a node's indent level to its parent node's indent level + 1 as it is added to the topological ordering.

Note there can exist many valid topological orderings. To ensure the resulting topological order groups parent nodes with child nodes, select a topological sort algorithm based on depth-first traversal of the graph produced by the partial ordering information.

Wikipedia gives two more algorithms for topological sort. Note these algorithms aren't as good because the first one is breadth-first traversal, and the second one is recursive.



回答2:

For hierarchical structures you almost certainly will need recursion (if you allow for arbitrary depth). I quickly hacked up some ruby code to illustrate how you might achieve this (though I haven't done the indentation):

# setup the data structure
class S < Struct.new(:id, :name, :parent_id);end

class HierarchySorter

    def initialize(il)
        @initial_list = il
        first_level = @initial_list.select{|a| a.parent_id == nil}.sort_by{|a| a.name }
        @final_array = subsort(first_level, 0)
    end

    #recursive function
    def subsort(list, indent_level)
        result = []
        list.each do |item|
            result << [item, indent_level]
            result += subsort(@initial_list.select{|a| a.parent_id == item.id}.sort_by{|a| a.name }, indent_level + 1)
        end
        result
    end

    def sorted_array
        @final_array.map &:first
    end

    def indent_hash
        # magick to transform array of structs into hash
        Hash[*@final_array.map{|a| [a.first.id, a.last]}.flatten]
    end

end

hs = HierarchySorter.new [S.new(1, "Cats", 2), S.new(2, "Animal", nil), S.new(3, "Tiger", 1), S.new(4, "Book", nil),
    S.new(5, "Airplane", nil)]

puts "Array:"
puts hs.sorted_array.inspect

puts "\nIndentation hash:"
puts hs.indent_hash.inspect

If you don't speak ruby I can re-craft it in something else.

Edit: I updated the code above to output both data-structures.

Outputs:

Array:
[#<struct S id=5, name="Airplane", parent_id=nil>, #<struct S id=2, name="Animal", parent_id=nil>, #<struct S id=1, name="Cats", parent_id=2>, #<struct S id=3, name="Tiger", parent_id=1>, #<struct S id=4, name="Book", parent_id=nil>]

Indentation hash:
{5=>0, 1=>1, 2=>0, 3=>2, 4=>0}


回答3:

Wonsungi's post helped a lot, however that is for a generic graph rather than a tree. So I modified it quite a bit to create an algorithm designed specifically for a tree:

// Data strcutures:
nodeChildren: Dictionary['nodeID'] = List<Children>;
indentLevel: Dictionary['nodeID'] = Integer;
roots: Array of nodes;
sorted: Array of nodes;
nodes: all nodes

// Step #1: Prepare the data structures for building the tree
for each node in nodes
  if node.parentID == NULL
    roots.Append(node);
    indentLevel[node] = 0;
  else
    nodeChildren[node.parentID].append(node);

// Step #2: Add elements to the sorted list
roots.SortByABC();
while roots.IsNotEmpty()
  root = roots.Remove(0);
  rootIndentLevel = indentLevel[root];
  sorted.Append(root);
  children = nodeChildren[root];
  children.SortByABC();
  for each child in children (loop backwards)
    indentLevel[child] = rootIndentLevel + 1
    roots.Prepend(child)