Performance Algorithm - Order - Tree (data structure) only solution?

Question

I have a problem that looks simple at first glance, and it really is, but I am looking for some other (maybe easier) solution:

Expressions:

V0
V1
V2
V3
V4
SumA = V1 + V2
SumB = SumA + V3
SumC = SumB + SumA
SumD = SumC + V0

As we can see here, the "base" variables are V0, V1, V2, V3 and V4 (the value of each of them is returned from database queries).

The user will ask the software to return the results for V1 and SumC.

Solution I know:

Find all required variables: V1, SumC, SumB, SumA, V3, V2

For performance, I want to compute each variable JUST ONE TIME.

This means that I need to order expressions from "base expressions" to "top variables".

At this point I only see a solution like a "Tree (data structure)": get V1, V2 and V3, then get SumA, after that get SumB, and only finally get SumC.

Is there any other way to solve this problem?

The ultimate goal of this algorithm is to handle more complex variables and a few "intermediate variables". So performance is critical: I cannot perform the same math operation more than once.

+3

algorithm

Dryadwoods 16 Mar 12 at 12:10

4 answers

First of all, you need to build a tree with all the expressions. Trees are the simplest data structure for this case.

Now, suppose you have the following formulas:

SumA = v1 + v2
SumB = v1 + v2 + v3
SumC = ...

and the user asks for SumB (so you know how to calculate SumC, but you don't have to in order to make the user happy).

In memory, it looks like this:

SumA = Add( v1, v2 )
SumB = Add( Add( v1, v2 ), v3 )

The next step is to define comparison operators that indicate whether two subtrees are the same. By running them, you will notice that Add( v1, v2 ) appears twice, so you can optimize:

SumA = Add( v1, v2 )
SumB = Add( SumA, v3 )

This means you can get results with a minimum of computation. The next step is to add caching to your operators: when someone asks for their value, they cache it so the next call to getValue() can return the last result.

This means that evaluating SumA or SumB will fill the cache for SumA. Since you never ask for the value of SumC, it never gets evaluated and therefore costs nothing.
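
To make the caching idea concrete, here is a minimal Python sketch of it, not the answerer's actual code. The class names `Var` and `Add`, the `get_value()` method, the `evaluations` counter and the lambda "queries" with their sample values are all assumptions made up for illustration; the lambdas stand in for the database queries from the question.

```python
class Var:
    """Leaf node. `fetch` is a hypothetical stand-in for a database query."""

    def __init__(self, name, fetch):
        self.name = name
        self._fetch = fetch
        self._has_value = False
        self._cached = None

    def get_value(self):
        # Run the query only on the first request, then reuse the result.
        if not self._has_value:
            self._cached = self._fetch()
            self._has_value = True
        return self._cached


class Add:
    """Inner node that adds two sub-expressions and caches the sum."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self._has_value = False
        self._cached = None
        self.evaluations = 0  # counts how often the addition really ran

    def get_value(self):
        if not self._has_value:
            self.evaluations += 1
            self._cached = self.left.get_value() + self.right.get_value()
            self._has_value = True
        return self._cached


# Sample values are invented; real ones would come from the database.
v0 = Var("V0", lambda: 5)
v1 = Var("V1", lambda: 1)
v2 = Var("V2", lambda: 2)
v3 = Var("V3", lambda: 3)

# The same sum_a object is shared by sum_b and sum_c, so it is computed once.
sum_a = Add(v1, v2)
sum_b = Add(sum_a, v3)
sum_c = Add(sum_b, sum_a)
sum_d = Add(sum_c, v0)

result = sum_c.get_value()
```

Because `sum_a` is a shared node rather than a copied subtree, asking for `sum_c` performs the `v1 + v2` addition a single time, and `sum_d` is never touched.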

+1

Aaron digulla 16 Mar at 13:01

The only way to speed this up is pipelining (what I call serialization here) at a level that you cannot reach programmatically unless you build your own hardware. Example: [image not available]

Please ignore the note in the top right corner, this is stolen from my script :)

Case A: 100 * 4 cycles

Case B: the first result takes 3 cycles, each subsequent one takes only 1 (serialization, like a Ford assembly line): 102 cycles

102 versus 400: about a 4x speedup.
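
The arithmetic behind these two cases can be reproduced with a short Python sketch. The function names and the simple cost model (a fixed cost per operation in case A, a fill time plus one cycle per further result in case B) are assumptions chosen to match the numbers quoted above, not a model of any real CPU.

```python
def sequential_cycles(n_ops, cycles_per_op):
    # Case A: every operation runs start to finish before the next begins.
    return n_ops * cycles_per_op


def pipelined_cycles(n_ops, first_result_cycles):
    # Case B: after the first result, one more result completes each cycle.
    return first_result_cycles + (n_ops - 1)


case_a = sequential_cycles(100, 4)
case_b = pipelined_cycles(100, 3)
speedup = case_a / case_b
```

With these inputs `case_a` is 400 and `case_b` is 102, so the ratio is a little under 4.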

Modern processors can do this to some extent automatically, but it is quite difficult to measure. I have heard that ICC (the Intel C Compiler) optimizes the generated code to make the most of this, which is perhaps part of why it beats everything else on Intel CPUs :)

0

AoeAoe 16 Mar 12 at 12:18

Perhaps you could simplify it and eliminate the middle step:

SumA = (V1 + V2)*2
SumC = V3 + SumA
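
A quick way to convince yourself that this shortcut gives the same SumC as the original chain is to compare both forms on sample inputs. The Python sketch below does that; the function names and the random test values are made up for illustration.

```python
import random


def sum_c_stepwise(v1, v2, v3):
    # The chain from the question: SumA, then SumB, then SumC.
    sum_a = v1 + v2
    sum_b = sum_a + v3
    return sum_b + sum_a


def sum_c_simplified(v1, v2, v3):
    # The shortcut: double (V1 + V2) once, then add V3.
    doubled = (v1 + v2) * 2
    return v3 + doubled


rng = random.Random(0)  # fixed seed so the check is repeatable
samples = [tuple(rng.randint(-1000, 1000) for _ in range(3)) for _ in range(1000)]
all_equal = all(sum_c_stepwise(*s) == sum_c_simplified(*s) for s in samples)
```

Note that in the shortcut the name SumA holds twice the original SumA, so this only helps when the user does not also ask for SumA or SumB themselves.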

0

Nikhil 16 Mar '12 at 12:30

amit · Accepted Answer · 2012-03-16T12:29:25+0000

I'm not sure I fully understand, but I think you mean common subexpression elimination [or something similar], which is a very common compiler optimization.

One common way of doing this optimization is to build a graph [which is actually a DAG] of the expressions in the program and iteratively add new expressions. The "sources" in your DAG are all the initial variables [V0, V1, V2, V3, V4 in your example]. You can "know" which expression is redundant if you have already calculated it, and avoid recalculating it.

These lecture notes seem to give a more detailed explanation (although I admit I haven't read all of them).
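
As a rough illustration of the DAG idea applied to the question's example, the Python sketch below collects only the nodes the requested results depend on and evaluates them in topological order, so each node is computed exactly once. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+). The dictionary layout, the function names and the sample base values are assumptions for illustration, and this is a plain dependency-ordered evaluation, not a full compiler-style common subexpression elimination pass.

```python
from graphlib import TopologicalSorter

# Each derived variable maps to the two operands it adds (from the question).
EXPRESSIONS = {
    "SumA": ("V1", "V2"),
    "SumB": ("SumA", "V3"),
    "SumC": ("SumB", "SumA"),
    "SumD": ("SumC", "V0"),
}

# Invented sample values; in the real system these come from database queries.
BASE_VALUES = {"V0": 5, "V1": 1, "V2": 2, "V3": 3, "V4": 4}


def required_nodes(targets):
    """Return every variable the targets depend on, directly or indirectly."""
    needed, stack = set(), list(targets)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(EXPRESSIONS.get(name, ()))
    return needed


def evaluate(targets):
    """Compute the targets, touching each required node exactly once."""
    needed = required_nodes(targets)
    # TopologicalSorter expects {node: predecessors}; base variables have none.
    graph = {name: set(EXPRESSIONS.get(name, ())) for name in needed}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        if name in EXPRESSIONS:
            left, right = EXPRESSIONS[name]
            values[name] = values[left] + values[right]
        else:
            values[name] = BASE_VALUES[name]
    return {t: values[t] for t in targets}, values


results, computed = evaluate(["V1", "SumC"])
```

For the request "V1 and SumC", `computed` ends up holding V1, V2, V3, SumA, SumB and SumC, which is the same required set listed in the question; V0, V4 and SumD are never evaluated.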
