Tree traversal with duplicated subtrees

I have a simple recursive traversal after a tree order that prints out all possible paths. The problem is that it takes a long time.

I would like to use a property of my tree to keep the travel time. the property is that I have duplicated the subtree, in the example below, the subtree with head 1 appears 3 times in the tree.

          10
  /        |        \
 5         15        1
 /\        / \      / \
2  1      1   6    3   8
  / \    / \
 3   8  3   8

      

I'm looking for an improvement for my workaround that skips already processed subitems. The idea is to store every subtree I passed, but I couldn't put it into the after-order algorithm.

Thanks for the help.

+3


source to share


4 answers


Use a hash set to store visited nodes and for each visitor node, check if it has already been visited: if not, add it to the visited set and proceed as usual, otherwise return.



+1


source


I don't know if your program allows this change .... Since we cannot know if a subtree has appeared before you go through, one possible way is to reorganize your tree as the next one to split the duplicate subtree.



          10
  /        |        \
 5         15        1
 /\        / \      / \
2  1 ------   6    3   8
  / \     
 3   8     

      

0


source


which prints all possible paths

I think this is the main problem. The running time of your program is linear in terms of the amount of output data; approximately for each step in the tree, your program must output at least one character. So you have no room to speed up here, as long as you keep the required output the same (i.e. as long as you need all the paths to output), you simply cannot have an algorithm that is faster than O(R)

where R

is the total size of the output you need to produce.

In fact, it's likely that the main bottleneck is thrown out on its own (console or disk performance is usually much worse than just walking in memory), so I think if you profile your program, you'll find that 90% of the time is wasted in output. So not only can you not get a better asymptotic solution, you cannot get a faster solution. (Except if you're optimizing the output itself, that's a different question.)

However, if you don't need to print all the paths, but handle them in some way - for example, count how many of them exist, or find the longest path, etc. - then your solution can greatly benefit from repeating subtrees. What this really means is that you don't have a tree, but a directed acyclic graph, and this usually allows for fairly simple approaches to dynamic programming.

0


source


Assuming each node is a unique subtree based on that node value, and the set of node values ​​is small, you can use it to optimize traversal:

string s[MAX_NODE_VALUE + 1];

string post_traverse(Node root)
{
    if(root == NULL)
        return "";
    if(s[root.value].length() == 0)
    {
        s[root.value] += post_traverse(root.left);
        s[root.value] += post_traverse(root.right);
        s[root.value] += itoa(root.value);
        s[root.value] += ' ';
    }
    cout << s[root.value];
    return s[root.value];
}

      

This will help when you have too many duplicate subtrees and the node's set of values ​​is very small, otherwise it will consume too much space and take longer.

0


source







All Articles