What should be my AST for simple transformations?

I have a minimal javascript-like game language. I am creating an AST to try out some optimization techniques such as escape analysis, type inference. I've tried several approaches like generalizing operator tokens instead of class / function for each, keeping the type information on each node ... But I still don't feel like I'm going somewhere. It quickly becomes cumbersome to work on ...

I've studied lua5, neko, v8, but ... well ... I'm sure not one of the brightest.

Does anyone have experience designing AST and implementing transformations in AST that are easy to work with? I would appreciate tips and tricks that make your life easier?

(Please don't tell me I'm reading the book of dragons. I already have it.)

+1


source to share


4 answers


As Alan mentioned, Appel's books are great. I had a modern compiler implementation in ML for an undergraduate compilers course.

I would personally avoid doing a lot of conversions in AST simply because of the number of different constructs you can have, and the number of ways the same thing can be expressed. You often have to write code that handles a lot of cases and subnets, and as you said, it gets very fast.



It is better to convert the AST to a more minimal representation, such as basic blocks in a control flow graph. Then each operation can be written as a simple statement in the basic block. The set of possible operations should be small. Just make sure you have enough information that you can still do any necessary conversions (in particular, don't throw away the types). You can also use the Single Static Assignment form, where each program variable is assigned only once. This provides an invariant that simplifies a lot of transformations and analyzes.

+3


source


I am using ANLTR.org which I found very easily.



0


source


Ok, I won't recommend the Dragon book since you already have it. May I Recommend Appel Books? There is even some source code that demonstrates concepts, among which the Visitor pattern is used to translate abstract syntax trees into concrete syntax trees. Good luck and enjoy!

0


source


AST represent the structure of the program. For difficult languages, your ACT is bound to be difficult. So don't assume this should be "easy".

Many people assume that life will be easier with AST. But analysis is only the foothills of the Himalayas. ACTs do not provide general conclusions such as the value of the identifier, which operator is performing next, where this data is consumed. If you don't have all of these capabilities, you are not going to be able to do much with real language, let alone do it easily.

It is best to make these output pins cached or explicit: symbol tables, control flows, data streams, ...

You can add langauges matching templates to enable syntax recognition or even rule entry conversion:

optimize_increment(x:exp):statement
= " \x=\x+1; " -->  " \x++ " if no_side_effects(x);

      

Such rules should be based on cached outputs (for example, "side_effects").

Trying to build it all is very difficult. One way to make this practical is to amortize the cost of infrastructure across many transformation and transformation applications. Our DMS Software Reengineering Toolkit has all of this hardware to varying degrees for a wide variety of languages ​​including C, C ++, Java, C # and COBOL.

0


source







All Articles