# Performance algorithm - evaluation order - is a tree (data structure) the only solution?

I have a problem that looks simple at first glance, and it is, but I am looking for another (maybe easier) solution:

Expressions:

`V0`, `V1`, `V2`, `V3`, `V4`

`SumA = V1 + V2`

`SumB = SumA + V3`

`SumC = SumB + SumA`

`SumD = SumC + V0`

As we can see here, the "base" variables are `V0`, `V1`, `V2`, `V3` and `V4` (the value of each of them is returned from database queries).

**The user will ask the software to return the results of `V1` and `SumC`.**

Solution I know:

Find all required variables: V1, SumC, SumB, SumA, V3, V2

For performance, I want to compute each variable JUST ONE TIME.

This means that I need to order expressions from "base expressions" to "top variables".

At this point the only solution I see is something like a tree (data structure): get `V1`, `V2` and `V3`, then compute `SumA`, after that `SumB`, and only at the end `SumC`.
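The ordering described above can be sketched as a topological sort over the dependency graph. This is only a minimal illustration, assuming Python 3.9+ for the standard `graphlib` module; the names `EXPRESSIONS`, `required_order` and `evaluate` are made up for this example, and the `base_values` dictionary stands in for the database queries.

```python
from graphlib import TopologicalSorter

# Each derived variable maps to the variables it adds together
# (taken from the expressions listed in the question).
EXPRESSIONS = {
    "SumA": ["V1", "V2"],
    "SumB": ["SumA", "V3"],
    "SumC": ["SumB", "SumA"],
    "SumD": ["SumC", "V0"],
}


def required_order(targets):
    """Return every variable needed for `targets`, dependencies first."""
    graph = {}
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name in graph:
            continue  # already collected
        deps = EXPRESSIONS.get(name, [])  # base variables have no dependencies
        graph[name] = deps
        stack.extend(deps)
    # static_order() yields each node only after all of its dependencies
    return list(TopologicalSorter(graph).static_order())


def evaluate(targets, base_values):
    """Compute each required variable exactly once, in dependency order."""
    values = {}
    for name in required_order(targets):
        if name in EXPRESSIONS:
            values[name] = sum(values[dep] for dep in EXPRESSIONS[name])
        else:
            values[name] = base_values[name]  # hypothetical stand-in for a DB query
    return {target: values[target] for target in targets}


# Made-up sample values for the base variables
result = evaluate(["V1", "SumC"], {"V0": 1, "V1": 2, "V2": 3, "V3": 4, "V4": 5})
print(result)
```

With these sample values, `V0`, `V4` and `SumD` are never touched, because they are not reachable from `V1` or `SumC`.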

**Is there any other way to solve this problem?**

The ultimate goal is to use this algorithm with more complex variables and several "intermediate" variables. So performance is critical: I cannot perform the same math operation more than once.


I'm not sure I fully understand, but I think you mean common subexpression elimination [or something similar], which is a very common compiler optimization.

One common way of doing this optimization is to build a graph [which is actually a DAG] of the expressions in the program and iteratively add new expressions to it. The "sources" in your DAG are all the initial variables [`V0`, `V1`, `V2`, `V3`, `V4` in your example]. You can "know" that an expression is redundant if you have already calculated it, and avoid recalculating it.
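As a rough sketch of that idea (not taken from any particular compiler), the DAG can be built by interning nodes in a lookup table, so that adding an expression that already exists returns the existing node. The class `ExprDag` and its methods are hypothetical names chosen for this example.

```python
class ExprDag:
    """Expression DAG in which structurally identical nodes are shared."""

    def __init__(self):
        self.nodes = []  # node id -> ("var", name) or ("add", left_id, right_id)
        self.index = {}  # node key -> node id, used to spot redundant expressions

    def _intern(self, key):
        if key not in self.index:  # first time this expression is seen
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]  # otherwise reuse the existing node

    def var(self, name):
        return self._intern(("var", name))

    def add(self, left, right):
        return self._intern(("add", left, right))


dag = ExprDag()
v1, v2, v3 = dag.var("V1"), dag.var("V2"), dag.var("V3")
sum_a = dag.add(v1, v2)
# The inner V1 + V2 is recognized as redundant and resolves to sum_a
sum_b = dag.add(dag.add(v1, v2), v3)
print(dag.nodes[sum_b] == ("add", sum_a, v3))
```

This sketch compares operands in order, so `V2 + V1` would not be matched against `V1 + V2`; sorting the operand ids of commutative operations before interning would be one way to handle that.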

These lecture notes seem to give a more detailed explanation (although I admit I haven't read all of them).


First of all, you need to build a tree with all the expressions. Trees are the simplest data structure for this case.

Now, suppose you have the following formulas:

```
SumA = v1 + v2
SumB = v1 + v2 + v3
SumC = ...
```

and the user asks for `SumB` (so you know how to calculate `SumC`, but you don't have to in order to make the user happy).

In memory, it looks like this:

```
SumA = Add( v1, v2 )
SumB = Add( Add( v1, v2 ), v3 )
```

The next step is to define comparison operators that indicate whether two subtrees are the same. By running them, you will notice that `Add( v1, v2 )` appears twice, so you can optimize:

```
SumA = Add( v1, v2 )
SumB = Add( SumA, v3 )
```

This means you can get the results with a minimum of computation. The next step is to add caching to your expression nodes: when someone asks for a node's value, the node caches it so the next call to `getValue()` can return the last result.

This means that evaluating either `SumA` or `SumB` will fill the cache for `SumA`. Since you never ask for the value of `SumC`, it never gets evaluated and therefore costs nothing.
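Put together, the comparison and caching steps might look roughly like the sketch below. The classes `Var` and `Add`, the method `same_as` and the counter `Add.evaluations` are illustrative names, `get_value` plays the role of `getValue()`, and the sample values are made up. The formula for `SumC` is left open in this answer, so the sketch borrows `SumB + SumA` from the question purely as a placeholder.

```python
class Var:
    """Leaf node holding a base value."""

    def __init__(self, name, value):
        self.name, self.value = name, value

    def same_as(self, other):
        return isinstance(other, Var) and other.name == self.name

    def get_value(self):
        return self.value


class Add:
    """Addition node that caches its result after the first evaluation."""

    evaluations = 0  # counts real additions, for demonstration only

    def __init__(self, left, right):
        self.left, self.right = left, right
        self._cache = None

    def same_as(self, other):
        # Two Add subtrees are the same if both operands are the same
        return (isinstance(other, Add)
                and self.left.same_as(other.left)
                and self.right.same_as(other.right))

    def get_value(self):
        if self._cache is None:  # compute only on the first request
            Add.evaluations += 1
            self._cache = self.left.get_value() + self.right.get_value()
        return self._cache


v1, v2, v3 = Var("v1", 2), Var("v2", 3), Var("v3", 4)
sum_a = Add(v1, v2)
sum_b = Add(Add(v1, v2), v3)

# The comparison shows that the left subtree of sum_b duplicates sum_a,
# so replace it with the shared node.
if sum_b.left.same_as(sum_a):
    sum_b.left = sum_a

sum_c = Add(sum_b, sum_a)  # placeholder definition; never requested below

print(sum_b.get_value())  # evaluates sum_a once, then sum_b
print(sum_a.get_value())  # served from the cache
print(Add.evaluations)
```

In this sketch a cached value is never invalidated, which is only acceptable if the base values do not change between requests.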


The only way to speed this up further is serialization (pipelining) at a level that you cannot reach in software unless you build your own hardware. Example:

Please ignore the note in the top right corner, this is stolen from my script :)

Case A: 100 * 4 cycles

Case B: the first result takes 3 cycles, each following one takes only 1 (serialization, like a Ford assembly line): 102 cycles

102 versus 400: about a 4x speedup.
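The arithmetic behind those two numbers can be checked with a few lines. The function names are made up for this example, and the cycle counts are simply the ones quoted above.

```python
def sequential_cycles(n_results, cycles_each):
    """Case A: every result pays the full cost."""
    return n_results * cycles_each


def pipelined_cycles(n_results, first, each_next):
    """Case B: the first result pays the start-up cost, the rest overlap."""
    return first + (n_results - 1) * each_next


case_a = sequential_cycles(100, 4)
case_b = pipelined_cycles(100, 3, 1)
speedup = case_a / case_b
print(case_a, case_b, round(speedup, 2))
```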

Modern processors can do this to some extent automatically, but it's quite difficult to measure. I have heard that ICC (the Intel C++ Compiler) optimizes the generated code to make the most of this, which is perhaps part of the reason it beats everything else on Intel CPUs :)
