How do I build a compiler for a dynamically typed language?

I have a university project in which I have to build a compiler for a language selected by the teachers, using Bison and Flex in C++.

The language is an object-oriented, dynamically typed language.

The thing is, my friend and I are confused about how to write the MIPS code for a.x when we only know the type of a at runtime. See this pseudocode:

class A{private x;public A(){x=10}}
class B{public x;public B(){x=2}}
class C
{
   public static main(args)
   {
      n=input('integer');
      if(n>5)
         a=new A();
      else
         a=new B();
      write(a.x);
   }
}

      

We asked the teacher. She said that we store the types of variables in the symbol table, but those types are only known at runtime, which would mean that we have to build an interpreter, and that is what she said. But she seemed to forget that the value of n exists only in the MIPS code, in some register or on the stack (relative to the stack pointer $sp). We don't have the value of n in the C++ code, so we can't know the type of a unless the MIPS code somehow tells the C++ program that the value of n is, say, 1.

We could enumerate the possible types of a. In the code above, a is either of type A or of type B, and the MIPS code for a.x could be something like this:

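beq type(a), A, label1    # pseudo: x is private in A, so raise
lw  $a0, 0(a)             # pseudo: load the value of a.x
li  $v0, 1                # syscall code for print integer
syscall
j   done                  # skip the exception path
label1: raise exception   # pseudo
done: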

      

But a statement like a.b.c.d (and so on) makes things more complicated, because every step in the chain needs its own set of type checks, so this approach is terrible.

My friend asked the teacher to let us force the programmer to write the types, so that for a.b.c he would have to write, for example, A<a>.B<b>.C<c>, and the result would be either an exception (wrong cast, or private access) or the value of a.b.c. But the teacher refused, and I don't like it either.

The methods that I know:

1 - Store the value in the symbol table: this would make generating MIPS pointless, and the program would be pure C++ (it would no longer be a compiler but an interpreter).

2 - Define a value attribute for each symbol in the symbol table, but let the MIPS code change that value. For example, if we declare int x in C++ and then, when generating the code, we call (in C++) printf("sw $v0,0(%d)",&x), then the MIPS code will actually change x, because it stores the value of $v0 at an address that is the same as the address of x.

But this approach requires the assembler and the compiler to run in parallel, which is just too hard for us.

So what's the best way to handle this?



1 answer


If you were translating your dynamic language to C++, you could lay out your language's objects like this:

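#include <map>
#include <string>

class DynamicLanguageObject
{
public:
    int m_type;      // runtime type tag
    void* m_pValue;  // stores int, double, char*, etc. depending on m_type
    std::map<std::string, DynamicLanguageObject*> m_fields;  // field name -> field value
};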

The expression a.x in your dynamic language then corresponds to a.m_fields["x"] in C++. This dictionary-of-fields approach is how many dynamic languages, including JavaScript and Python, implement objects.

You "just" need to figure out how to implement a hash table in MIPS assembly language.
